
data recording and saving method #1079

Closed · Xiong5Heng opened this issue Mar 23, 2024 · 4 comments
Labels: question (Further information is requested)

Comments

Xiong5Heng commented Mar 23, 2024

  • I have marked all applicable categories:
    • exception-raising bug
    • RL algorithm bug
    • documentation request (i.e. "X is missing from the documentation.")
    • new feature request
    • design request (i.e. "X should be changed to Y.")
  • I have visited the source website
  • I have searched through the issue tracker for duplicates
  • I have mentioned version numbers, operating system and environment, where applicable:
    import tianshou, gymnasium as gym, torch, numpy, sys
    print(tianshou.__version__, gym.__version__, torch.__version__, numpy.__version__, sys.version, sys.platform)

Hi,

When I use SubprocVectorEnv, I want to record the rewards from all environments. Is there a function similar to VecMonitor in SB3 (https://stable-baselines3.readthedocs.io/en/master/guide/vec_envs.html#stable_baselines3.common.vec_env.VecMonitor)?

MischaPanch (Collaborator) commented

I would suggest using an environment wrapper for that. At the moment, tianshou is primarily an algorithm library and is not focused on wrappers. In fact, you could use the wrapper from SB3 together with tianshou.
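
For illustration, a minimal sketch (not from the original thread; the environment name and worker count are arbitrary) using gymnasium's built-in RecordEpisodeStatistics wrapper, which plays a role similar to SB3's VecMonitor, applied per worker:

```python
import gymnasium as gym

from tianshou.env import SubprocVectorEnv


def make_env():
    # RecordEpisodeStatistics adds the episode return and length to `info`
    # under the "episode" key whenever an episode ends.
    return gym.wrappers.RecordEpisodeStatistics(gym.make("CartPole-v1"))


# Each worker process constructs and wraps its own environment.
envs = SubprocVectorEnv([make_env for _ in range(4)])
```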

Let me know if this answers your question

MischaPanch added the question label on Mar 25, 2024
Xiong5Heng (Author) commented

Hi,
Thanks for your reply; I will try your solution.
But if I do not use the wrapper from SB3, is there another way to record the rewards from all of the vectorized environments?

MischaPanch (Collaborator) commented

The best way would be to use an env wrapper. Note that in all examples you can create your own env factory with your own wrapper. I'll try to add a tutorial on how to do that soon.
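
As a sketch of that idea (RewardRecorder and make_wrapped_env are hypothetical names for illustration, not tianshou API), an env factory can apply a custom reward-recording wrapper before the environment is handed to SubprocVectorEnv:

```python
import gymnasium as gym

from tianshou.env import SubprocVectorEnv


class RewardRecorder(gym.Wrapper):
    """Hypothetical wrapper: tracks the cumulative reward of each episode."""

    def reset(self, **kwargs):
        self.episode_reward = 0.0
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self.episode_reward += reward
        if terminated or truncated:
            # Expose the episode return through `info` so it reaches the collector.
            info["episode_reward"] = self.episode_reward
        return obs, reward, terminated, truncated, info


def make_wrapped_env():
    # The "env factory": builds and wraps a single environment.
    return RewardRecorder(gym.make("CartPole-v1"))


envs = SubprocVectorEnv([make_wrapped_env for _ in range(4)])
```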

Apart from that, you can probably use a custom logger. You can also access the buffer directly during training through the trainer; all rewards are saved there.
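
For example, a rough sketch of reading rewards back from the buffer, assuming train_collector is the Collector used during training and a tianshou version that provides ReplayBuffer.sample_indices:

```python
import numpy as np

# `train_collector` is assumed to be the tianshou Collector used in training.
buf = train_collector.buffer
indices = buf.sample_indices(0)  # batch_size=0 -> indices of all stored transitions
rewards = buf.rew[indices]       # per-step rewards across all vectorized envs
print("mean step reward so far:", np.mean(rewards))
```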

In the very near future we will add support for callbacks during training, which would then provide the simplest way to save custom data (see #977 #895).

Xiong5Heng (Author) commented

Thanks for your brilliant work!
