
data recording and saving method #1079

Closed · Xiong5Heng opened this issue Mar 23, 2024 · 4 comments
Labels: question (Further information is requested)

Comments

Xiong5Heng commented Mar 23, 2024

  • I have marked all applicable categories:
    • exception-raising bug
    • RL algorithm bug
    • documentation request (i.e. "X is missing from the documentation.")
    • new feature request
    • design request (i.e. "X should be changed to Y.")
  • I have visited the source website
  • I have searched through the issue tracker for duplicates
  • I have mentioned version numbers, operating system and environment, where applicable:
    import tianshou, gymnasium as gym, torch, numpy, sys
    print(tianshou.__version__, gym.__version__, torch.__version__, numpy.__version__, sys.version, sys.platform)

Hi,

When I use SubprocVectorEnv, I want to record the rewards from all environments. Is there a function similar to VecMonitor in SB3 (https://stable-baselines3.readthedocs.io/en/master/guide/vec_envs.html#stable_baselines3.common.vec_env.VecMonitor)?

MischaPanch (Collaborator) commented

I would suggest using an environment wrapper for that. At the moment, tianshou is primarily an algorithm library and is not focused on wrappers. In fact, you could use the wrapper from SB3 together with tianshou.
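
For illustration, a minimal sketch (not from the original thread; the environment name and worker count are arbitrary) using gymnasium's built-in RecordEpisodeStatistics wrapper, which plays a role similar to SB3's VecMonitor, applied per worker:

```python
import gymnasium as gym

from tianshou.env import SubprocVectorEnv


def make_env():
    # RecordEpisodeStatistics adds the episode return and length to `info`
    # under the "episode" key whenever an episode ends.
    return gym.wrappers.RecordEpisodeStatistics(gym.make("CartPole-v1"))


# Each worker process constructs and wraps its own environment.
envs = SubprocVectorEnv([make_env for _ in range(4)])
```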

Let me know if this answers your question

MischaPanch added the question label on Mar 25, 2024
Xiong5Heng (Author) commented

Hi,
Thanks for your reply; I will try your solution.
But if I do not use the wrapper from SB3, is there another way to record the rewards from all of the vectorized environments?

MischaPanch (Collaborator) commented

The best way would be to use an env wrapper. Note that in all examples you can create your own env factory with your own wrapper. I'll try to add a tutorial on how to do that soon.
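
As a sketch of that idea (RewardRecorder and make_wrapped_env are hypothetical names for illustration, not tianshou API), an env factory can apply a custom reward-recording wrapper before the environment is handed to SubprocVectorEnv:

```python
import gymnasium as gym

from tianshou.env import SubprocVectorEnv


class RewardRecorder(gym.Wrapper):
    """Hypothetical wrapper: tracks the cumulative reward of each episode."""

    def reset(self, **kwargs):
        self.episode_reward = 0.0
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self.episode_reward += reward
        if terminated or truncated:
            # Expose the episode return through `info` so it reaches the collector.
            info["episode_reward"] = self.episode_reward
        return obs, reward, terminated, truncated, info


def make_wrapped_env():
    # The "env factory": builds and wraps a single environment.
    return RewardRecorder(gym.make("CartPole-v1"))


envs = SubprocVectorEnv([make_wrapped_env for _ in range(4)])
```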

Apart from that, you can probably use a custom logger. You can also access the buffer directly during training through the trainer; all rewards are saved there.
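
For example, a rough sketch of reading rewards back from the buffer, assuming train_collector is the Collector used during training and a tianshou version that provides ReplayBuffer.sample_indices:

```python
import numpy as np

# `train_collector` is assumed to be the tianshou Collector used in training.
buf = train_collector.buffer
indices = buf.sample_indices(0)  # batch_size=0 -> indices of all stored transitions
rewards = buf.rew[indices]       # per-step rewards across all vectorized envs
print("mean step reward so far:", np.mean(rewards))
```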

In the very near future we will add support for callbacks during training, which would then provide the simplest way to save custom data (see #977 #895).

Xiong5Heng (Author) commented

Thanks for your brilliant work!
