Update episode_v2.py with last_info_for #37382
@vymao Awesome! I'm all for adding this. Can you add 1-2 lines to ray/rllib/examples/custom_metrics_and_callbacks.py to illustrate and test the code path?
Added an example usage. The environment in the file (CartPole-v1) returns an empty info dict, so specific custom metrics probably aren't useful here.
Adding last_info_for method to return last info dict from the environment. This is currently present in Episode but not EpisodeV2 and is especially useful for custom environment values. Signed-off-by: Victor Mao <vctrm67@gmail.com> Signed-off-by: Victor <vctr.y.m@example.com>
Adding type EnvInfoDict to imports Signed-off-by: Victor Mao <vctrm67@gmail.com> Signed-off-by: Victor <vctr.y.m@example.com>
Signed-off-by: Victor <vctr.y.m@example.com>
…cks.py Signed-off-by: Victor <vctr.y.m@example.com>
Fixing lint Signed-off-by: Victor Mao <vctrm67@gmail.com> Signed-off-by: Victor <vctr.y.m@example.com>
Signed-off-by: Victor <vctr.y.m@example.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
@vymao Thanks for making these changes. I've added some more code to make the example a little more colourful.
    0.5 * (angle - self.last_angle) + 0.5 * self._pole_angle_vel
)
info["pole_angle_vel"] = self._pole_angle_vel
return obs, rew, term, trunc, info
Not strictly necessary for the example. But since we are working with the info dict here, let's also create the info ourselves in order to show a more end-to-end scenario.
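For readers following the snippet above, here is a minimal, dependency-free sketch of the idea: a wrapper that computes a smoothed pole angle velocity and writes it into the info dict returned by step(). `FakeCartPole` and `PoleAngleVelInfo` are hypothetical names invented for illustration. Treating `obs[2]` as the pole angle and updating `last_angle` after each step are assumptions; the actual example in the PR may be structured differently.

```python
class FakeCartPole:
    """Hypothetical stand-in env with a gymnasium-style 5-tuple step().

    obs[2] plays the role of the pole angle (assumption for this sketch).
    """

    def __init__(self):
        self._angle = 0.0

    def reset(self):
        self._angle = 0.0
        return [0.0, 0.0, self._angle, 0.0], {}

    def step(self, action):
        # Toy dynamics: action 1 tilts the pole one way, anything else the other.
        self._angle += 0.1 if action == 1 else -0.1
        obs = [0.0, 0.0, self._angle, 0.0]
        return obs, 1.0, False, False, {}  # empty info dict, as noted above


class PoleAngleVelInfo:
    """Hypothetical wrapper that adds `pole_angle_vel` to the info dict."""

    def __init__(self, env):
        self.env = env
        self.last_angle = 0.0
        self._pole_angle_vel = 0.0

    def reset(self):
        obs, info = self.env.reset()
        self.last_angle = obs[2]
        self._pole_angle_vel = 0.0
        return obs, info

    def step(self, action):
        obs, rew, term, trunc, info = self.env.step(action)
        angle = obs[2]
        # Exponentially smoothed angle difference, as in the reviewed snippet.
        self._pole_angle_vel = (
            0.5 * (angle - self.last_angle) + 0.5 * self._pole_angle_vel
        )
        self.last_angle = angle  # assumption: tracked per step
        info["pole_angle_vel"] = self._pole_angle_vel
        return obs, rew, term, trunc, info


env = PoleAngleVelInfo(FakeCartPole())
env.reset()
_, _, _, _, info1 = env.step(1)
_, _, _, _, info2 = env.step(1)
```

With the value in the info dict, a callback can read it back per step instead of reaching into the environment object.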
Why are these changes needed?
Adds a last_info_for method to EpisodeV2 that returns the last info dict from the environment. This method is currently present in Episode but not in EpisodeV2, and is especially useful for custom environment values that can then be used in callbacks.
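To illustrate the intended use, here is a rough sketch of a callback reading the last info dict from an episode. It deliberately avoids importing RLlib: `EpisodeSketch`, `set_last_info`, `_last_infos`, and the `"agent0"` default ID are hypothetical stand-ins, and the callbacks are plain functions rather than methods on RLlib's callbacks class. Only the method name `last_info_for` comes from this PR; the rest is an assumption for illustration.

```python
from typing import Any, Dict, List, Optional

AgentID = Any
EnvInfoDict = Dict[str, Any]
_DUMMY_AGENT_ID = "agent0"  # hypothetical default ID for single-agent envs


class EpisodeSketch:
    """Hypothetical stand-in for an episode object; not RLlib's EpisodeV2."""

    def __init__(self) -> None:
        self._last_infos: Dict[AgentID, EnvInfoDict] = {}
        self.user_data: Dict[str, List[float]] = {}
        self.custom_metrics: Dict[str, float] = {}

    def set_last_info(self, agent_id: AgentID, info: EnvInfoDict) -> None:
        # In a real sampler this would be updated after every env step.
        self._last_infos[agent_id] = info

    def last_info_for(
        self, agent_id: AgentID = _DUMMY_AGENT_ID
    ) -> Optional[EnvInfoDict]:
        # Returns None if no info has been recorded for this agent yet.
        return self._last_infos.get(agent_id)


def on_episode_step(episode: EpisodeSketch) -> None:
    # Collect a custom environment value from the last info dict, if present.
    info = episode.last_info_for()
    if info and "pole_angle_vel" in info:
        episode.user_data.setdefault("pole_angle_vel", []).append(
            info["pole_angle_vel"]
        )


def on_episode_end(episode: EpisodeSketch) -> None:
    # Reduce the per-step values to a single custom metric.
    values = episode.user_data.get("pole_angle_vel", [])
    if values:
        episode.custom_metrics["mean_pole_angle_vel"] = sum(values) / len(values)


episode = EpisodeSketch()
for value in (0.1, 0.3):
    episode.set_last_info(_DUMMY_AGENT_ID, {"pole_angle_vel": value})
    on_episode_step(episode)
on_episode_end(episode)
```

The callback guards against a missing or empty info dict because environments such as CartPole-v1 return an empty one by default.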
Related issue number
#37319
Checks
I've signed off every commit (git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
If I added a method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.