
A question about the test_env_seed #18

Closed
MagiFeeney opened this issue Oct 28, 2023 · 4 comments

Comments

@MagiFeeney

I found that in your code, no matter what changes are made to the main seed, it doesn't influence test_env_seed at all. In other words, with different training seeds, evaluation always runs on the same seed (test_seed: int = 1841). Is there an intention behind this choice? Or should I split off one more seed for evaluation? Something like this:

# original seeding at https://github.com/facebookresearch/denoised_mdp/blob/main/main.py#L546C5-L546C104
torch_seed, np_seed, data_collect_env_seed, replay_buffer_seed = split_seed(cast(int, cfg.seed), 4)
# to
torch_seed, np_seed, data_collect_env_seed, replay_buffer_seed, test_env_seed = split_seed(cast(int, cfg.seed), 5)
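For context, a seed splitter compatible with the call above could be sketched like this. This is a hypothetical stand-in, not the repo's actual split_seed implementation; it assumes split_seed(seed, n) derives n independent integer child seeds deterministically from one root seed, here via NumPy's SeedSequence:

```python
import numpy as np

def split_seed(seed: int, n: int) -> list:
    # Hypothetical sketch of a split_seed helper: spawn n independent
    # child SeedSequences from the root seed and reduce each to an int.
    ss = np.random.SeedSequence(seed)
    return [int(child.generate_state(1)[0]) for child in ss.spawn(n)]

# Splitting five ways instead of four would give the extra test_env_seed
# a value that now varies with cfg.seed:
torch_seed, np_seed, data_collect_env_seed, replay_buffer_seed, test_env_seed = \
    split_seed(1841, 5)
```

Because the split is deterministic in the root seed, the same cfg.seed still reproduces the same run, but different training seeds would now also yield different test environment seeds.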
@ssnl
Contributor

ssnl commented Oct 28, 2023

It was an intentional choice so that training configs don't affect test settings. You can split that if you want. Just be mindful of the behavior changes.

@ssnl ssnl closed this as completed Oct 28, 2023
@MagiFeeney
Author

Thanks for your explanation. However, my main concern is that testing only on a fixed seed could increase the divergence between training and evaluation. To my knowledge, current RL algorithms either test on the same env used for training or at least keep a constant offset (i.e., test_env_seed = const + train_env_seed). I don't really have a clue how much these seeding choices affect performance. I hope you can clarify.

@ssnl
Contributor

ssnl commented Oct 28, 2023

I'm not sure how test seeds can cause training divergence.

You can make an argument for either case and I think they are both valid.

@MagiFeeney
Author

Ok, I will experiment more and see if there is an answer. Thanks for your time.
