
Consider using info field for reward vector #2

Closed
ffelten opened this issue Jun 17, 2022 · 2 comments

Comments

@ffelten
Collaborator

ffelten commented Jun 17, 2022

Hello,

Thanks for this repository, it will be very useful to the MORL community :-).

I was just wondering whether you think it would be a good idea to enforce gym compatibility by specifying rewards as scalars and providing the vector rewards elsewhere. The idea would be to use a field in the info dictionary, as they do in PGMORL. This would allow using existing RL algorithms and logging libraries out of the box (e.g. stable-baselines, tensorboard logs, ...).

For example: in a DST env, if you return only the treasure reward in the reward field, you can use the DQN implementation from stable-baselines and get insights into the average reward, as well as the episode length, in the tensorboard logs. Of course, you can still extract the full vector reward from the info dictionary in order to learn with MORL :-).
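To make the pattern concrete, here is a minimal sketch (the wrapper name, the `component` parameter, and the `vector_reward` info key are all illustrative, not an established API):

```python
import gym
import numpy as np

class ScalarRewardWrapper(gym.Wrapper):
    """Hypothetical wrapper sketching the pattern above: expose one
    component of the vector reward as the scalar gym reward, and keep
    the full vector in the info dict."""

    def __init__(self, env, component=0):
        super().__init__(env)
        self.component = component  # which objective to expose as scalar

    def step(self, action):
        obs, vec_reward, done, info = self.env.step(action)
        vec_reward = np.asarray(vec_reward)
        info["vector_reward"] = vec_reward  # full MORL signal (key name assumed)
        # Single-objective tooling (DQN, tensorboard logging, ...) sees a
        # plain scalar reward, as in a standard gym env.
        return obs, float(vec_reward[self.component]), done, info
```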

With kind regards,

Florian

@LucasAlegre
Member

Hi @ffelten!

I am glad that mo-gym is being useful :)

Regarding scalarizing rewards, that is exactly why I created the LinearReward wrapper (see the example in the README as well). It makes the env return the reward as a scalar and the vector reward in the info dict.
I believe it is possible to use stable-baselines when using this wrapper.
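For reference, a minimal usage sketch, assuming the wrapper signature shown in the README at the time (the env id, the weight vector, and the info key name are illustrative assumptions):

```python
import numpy as np
import mo_gym

# Wrap a multi-objective env so it returns the linearly weighted scalar
# reward; env id and weights are illustrative.
env = mo_gym.make("deep-sea-treasure-v0")
env = mo_gym.LinearReward(env, weight=np.array([0.5, 0.5]))

obs = env.reset()
obs, scalar_reward, done, info = env.step(env.action_space.sample())
# scalar_reward is the weighted sum; the original vector reward is kept
# in the info dict by the wrapper.
vec_reward = info.get("vector_reward")  # key name is an assumption
```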

Do you think it would be better the other way around, i.e., the default is the linear (scalar) reward and a wrapper makes it vectorized?

Best regards,
Lucas

@ffelten
Collaborator Author

ffelten commented Jun 18, 2022

Oh, I missed that one. I think we should be fine with a wrapper such as the one you mentioned. Good job :-).

@ffelten ffelten closed this as completed Jun 18, 2022