
Consider using info field for reward vector #2

Closed
ffelten opened this issue Jun 17, 2022 · 2 comments

Comments

@ffelten
Collaborator

ffelten commented Jun 17, 2022

Hello,

Thanks for this repository, it will be very useful to the MORL community :-).

I was just wondering whether you think it would be a good idea to enforce gym compatibility by specifying rewards as scalars and providing the vector rewards elsewhere. The idea would be to use a field in the info dictionary, as they do in PGMORL. This would allow using existing RL algorithms and logging libraries out of the box (e.g. stable-baselines, tensorboard logs, ...).

For example: in a DST env, if you return only the treasure reward in the reward field, you can use the DQN implementation from stable-baselines and get insights into the average reward, as well as the episode length, in the tensorboard logs. Of course, you can still extract the full vector reward from the info dictionary in order to learn with MORL :-).
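To make the pattern concrete, here is a minimal sketch (the wrapper name, the `component` parameter, and the `vector_reward` info key are all illustrative, not an established API):

```python
import gym
import numpy as np

class ScalarRewardWrapper(gym.Wrapper):
    """Hypothetical wrapper sketching the pattern above: expose one
    component of the vector reward as the scalar gym reward, and keep
    the full vector in the info dict."""

    def __init__(self, env, component=0):
        super().__init__(env)
        self.component = component  # which objective to expose as scalar

    def step(self, action):
        obs, vec_reward, done, info = self.env.step(action)
        vec_reward = np.asarray(vec_reward)
        info["vector_reward"] = vec_reward  # full MORL signal (key name assumed)
        # Single-objective tooling (DQN, tensorboard logging, ...) sees a
        # plain scalar reward, as in a standard gym env.
        return obs, float(vec_reward[self.component]), done, info
```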

With kind regards,

Florian

@LucasAlegre
Member

Hi @ffelten!

I am glad that mo-gym is being useful :)

Regarding scalarizing rewards, that is exactly why I created the LinearReward wrapper (see the example in the README as well). It makes the env return the reward as a scalar and the vector reward in the info dict.
I believe it is possible to use stable-baselines when using this wrapper.
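For reference, a minimal usage sketch, assuming the wrapper signature shown in the README at the time (the env id, the weight vector, and the info key name are illustrative assumptions):

```python
import numpy as np
import mo_gym

# Wrap a multi-objective env so it returns the linearly weighted scalar
# reward; env id and weights are illustrative.
env = mo_gym.make("deep-sea-treasure-v0")
env = mo_gym.LinearReward(env, weight=np.array([0.5, 0.5]))

obs = env.reset()
obs, scalar_reward, done, info = env.step(env.action_space.sample())
# scalar_reward is the weighted sum; the original vector reward is kept
# in the info dict by the wrapper.
vec_reward = info.get("vector_reward")  # key name is an assumption
```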

Do you think it would be better the other way around, i.e., the default is the linear (scalar) reward and a wrapper makes it vectorized?

Best regards,
Lucas

@ffelten
Collaborator Author

ffelten commented Jun 18, 2022

Oh, I missed that one. I think we should be fine with a wrapper such as the one you mentioned. Good job :-).

@ffelten ffelten closed this as completed Jun 18, 2022