Rewards Not Zero after Done #43
Comments
When using the Also just FYI:
Hope this helps :)
@lebrice Thank you very much for the help!
No! (Edit: maybe I'm misunderstanding your problem though) Are you saying that you get
@lebrice Yes.
We have some new reset logic plumbing that should resolve this issue in the next couple of days. Originally, we found that we didn't really need to reset during rollouts--we'd just run a rollout for a fixed episode length and then mask out frames after the point at which a
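The masking approach described above can be sketched in plain Python. This is only an illustration, not Brax's own code: the function names `mask_after_done` and `episode_returns` are hypothetical, and it assumes the reward on the step where `done` first becomes true still counts, with only later steps zeroed.

```python
from typing import List, Sequence


def mask_after_done(
    rewards: Sequence[Sequence[float]],
    dones: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Zero out rewards that occur after each batch member's first done.

    rewards[t][i] and dones[t][i] refer to timestep t and batch member i.
    Assumption: the reward at the step where done first fires is kept.
    """
    if not rewards:
        return []
    batch_size = len(rewards[0])
    alive = [1.0] * batch_size  # 1.0 until a member has finished, then 0.0
    masked = []
    for r_t, d_t in zip(rewards, dones):
        # Apply the mask computed from all *previous* steps.
        masked.append([r * a for r, a in zip(r_t, alive)])
        # Once a member is done it stays masked for the rest of the rollout.
        alive = [a * (0.0 if d else 1.0) for a, d in zip(alive, d_t)]
    return masked


def episode_returns(masked: Sequence[Sequence[float]]) -> List[float]:
    """Sum masked rewards over time for each batch member."""
    return [sum(column) for column in zip(*masked)]
```

For example, with `rewards = [[1, 1], [2, 2], [3, 3]]` and `dones = [[0, 0], [1, 0], [0, 0]]`, the first member finishes at the second step, so its third reward is dropped from its return while the second member keeps all three.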
OK! This should be addressed. Envs by default now reset after done=True. You can still get the old behavior if you wish to control auto-resetting yourself, by calling
Hello,
I have extended the PyTorch example with an Augmented Random Search implementation:
https://github.com/kayuksel/braxars/blob/main/braxars_multi.py
However, I have noticed that the reward of a batch member is not zero after it is done.
What are the values that are returned for "done" members? Should I treat their reward as zero?
For now I am resetting the whole environment when that happens, because I couldn't find a way to reset only the done members.
Another question I have is on the rendering. Are we able to render while using a notebook only?
Is it possible to render a selected (e.g. best-performing) batch member, or all members in the same render?