Update episode_v2.py with last_info_for #37382

Merged: 8 commits, Jul 28, 2023

Conversation

vymao
Contributor

@vymao vymao commented Jul 13, 2023

Why are these changes needed?

Adding last_info_for method to return last info dict from the environment. This is currently present in Episode but not EpisodeV2 and is especially useful for custom environment values that can then be used in callbacks.
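To illustrate the behaviour described above, here is a minimal sketch of what "return the last info dict" could look like. `EpisodeSketch`, `record_step`, and `DEFAULT_AGENT_ID` are hypothetical stand-ins written for this illustration; they are not RLlib's `EpisodeV2` or its internals, and only the method name `last_info_for` comes from this PR.

```python
from typing import Any, Dict, Optional

# Assumption: a placeholder default agent ID for the single-agent case.
DEFAULT_AGENT_ID = "agent0"


class EpisodeSketch:
    """Hypothetical stand-in for an episode object; not RLlib's EpisodeV2."""

    def __init__(self) -> None:
        # Most recent info dict per agent, overwritten on every step.
        self._last_infos: Dict[str, Dict[str, Any]] = {}

    def record_step(
        self, info: Dict[str, Any], agent_id: str = DEFAULT_AGENT_ID
    ) -> None:
        # Hypothetical helper: store the info dict the environment returned.
        self._last_infos[agent_id] = info

    def last_info_for(
        self, agent_id: str = DEFAULT_AGENT_ID
    ) -> Optional[Dict[str, Any]]:
        # Returns None when no info has been recorded for this agent yet.
        return self._last_infos.get(agent_id)


episode = EpisodeSketch()
before_first_step = episode.last_info_for()
episode.record_step({"pole_angle_vel": 0.1})
episode.record_step({"pole_angle_vel": 0.3})
after_two_steps = episode.last_info_for()
```

In this sketch only the newest info dict is kept, so a callback that needs a history would have to accumulate values itself.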

Related issue number

#37319

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@ArturNiederfahrenhorst
Contributor

@vymao Awesome! I'm all for adding this.

Can you add 1-2 lines to ray/rllib/examples/custom_metrics_and_callbacks.py to illustrate and test the code path?

@vymao
Contributor Author

vymao commented Jul 20, 2023

Added an example usage. The environment in the file (Cartpole-v1) returns an empty info dict, so specific custom metrics probably aren't useful here.
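Because the info dict can be empty (or missing before the first step), a callback that reads it probably wants to guard the lookup. The following is a rough sketch under that assumption; `StubEpisode` and `record_pole_angle_vel` are hypothetical names made up for this illustration, and the `pole_angle_vel` key is borrowed from the example discussed later in this PR.

```python
from typing import Any, Dict, List, Optional


class StubEpisode:
    """Hypothetical stub exposing only what this sketch needs."""

    def __init__(self, info: Optional[Dict[str, Any]]) -> None:
        self._info = info
        self.user_data: Dict[str, List[float]] = {}

    def last_info_for(self) -> Optional[Dict[str, Any]]:
        return self._info


def record_pole_angle_vel(episode: StubEpisode) -> bool:
    """Append info["pole_angle_vel"] to user_data if present; report success."""
    info = episode.last_info_for() or {}  # treat None like an empty dict
    if "pole_angle_vel" not in info:
        return False
    episode.user_data.setdefault("pole_angle_vels", []).append(
        info["pole_angle_vel"]
    )
    return True


empty = StubEpisode(info={})
filled = StubEpisode(info={"pole_angle_vel": 0.25})
recorded_empty = record_pole_angle_vel(empty)
recorded_filled = record_pole_angle_vel(filled)
```

With an empty info dict the helper records nothing and returns `False`, so the callback does not raise a `KeyError`.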

vymao and others added 6 commits July 19, 2023 23:41
Adding last_info_for method to return last info dict from the environment. This is currently present in Episode but not EpisodeV2 and is especially useful for custom environment values.

Signed-off-by: Victor Mao <vctrm67@gmail.com>
Signed-off-by: Victor <vctr.y.m@example.com>
Adding type EnvInfoDict to imports

Signed-off-by: Victor Mao <vctrm67@gmail.com>
Signed-off-by: Victor <vctr.y.m@example.com>
Signed-off-by: Victor <vctr.y.m@example.com>
…cks.py

Signed-off-by: Victor <vctr.y.m@example.com>
Fixing lint

Signed-off-by: Victor Mao <vctrm67@gmail.com>
Signed-off-by: Victor <vctr.y.m@example.com>
Signed-off-by: Victor <vctr.y.m@example.com>
@ArturNiederfahrenhorst
Contributor

@vymao Thanks for making these changes. I've added some more code to make the example a little more colourful.
I've also changed how much this script babbles.

0.5 * (angle - self.last_angle) + 0.5 * self._pole_angle_vel
)
info["pole_angle_vel"] = self._pole_angle_vel
return obs, rew, term, trunc, info

Not strictly necessary for the example. But since we are working with the info dict here, let's also create the info ourselves in order to show a more end-to-end scenario.
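The end-to-end idea in this comment can be sketched as a wrapper that computes a smoothed pole angular velocity and writes it into the info dict. This is an illustration only: `FakePoleEnv` and `PoleAngleVelInfo` are hypothetical classes, the fake environment returns the angle directly as its observation, and updating `last_angle` after each step is an assumption, since the diff excerpt above does not show that part.

```python
from typing import Any, Dict, Tuple

StepResult = Tuple[Any, float, bool, bool, Dict[str, Any]]


class FakePoleEnv:
    """Hypothetical inner env: the observation is just a growing pole angle."""

    def __init__(self) -> None:
        self._angle = 0.0

    def step(self, action: int) -> StepResult:
        self._angle += 0.5
        # Like Cartpole-v1 in this discussion, the info dict starts out empty.
        return self._angle, 1.0, False, False, {}


class PoleAngleVelInfo:
    """Hypothetical wrapper that adds "pole_angle_vel" to the info dict."""

    def __init__(self, env: FakePoleEnv) -> None:
        self.env = env
        self.last_angle = 0.0
        self._pole_angle_vel = 0.0

    def step(self, action: int) -> StepResult:
        obs, rew, term, trunc, info = self.env.step(action)
        angle = obs
        # Equal-weight blend of the newest angle change and the running estimate.
        self._pole_angle_vel = (
            0.5 * (angle - self.last_angle) + 0.5 * self._pole_angle_vel
        )
        self.last_angle = angle  # assumption: not shown in the excerpt
        info["pole_angle_vel"] = self._pole_angle_vel
        return obs, rew, term, trunc, info


env = PoleAngleVelInfo(FakePoleEnv())
_, _, _, _, first_info = env.step(0)
_, _, _, _, second_info = env.step(0)
```

A callback could then read the value through `episode.last_info_for()` after each step, instead of recomputing it from observations.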

@kouroshHakha added the "tests-ok" label (The tagger certifies test failures are unrelated and assumes personal liability.) Jul 28, 2023
@kouroshHakha kouroshHakha merged commit 7c07659 into ray-project:master Jul 28, 2023
37 of 40 checks passed
NripeshN pushed a commit to NripeshN/ray that referenced this pull request Aug 15, 2023
Adding last_info_for method to return last info dict from the environment. This is currently present in Episode but not EpisodeV2 and is especially useful for custom environment values.

Signed-off-by: Victor Mao <vctrm67@gmail.com>
Co-authored-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Signed-off-by: NripeshN <nn2012@hw.ac.uk>
harborn pushed a commit to harborn/ray that referenced this pull request Aug 17, 2023
Adding last_info_for method to return last info dict from the environment. This is currently present in Episode but not EpisodeV2 and is especially useful for custom environment values.

Signed-off-by: Victor Mao <vctrm67@gmail.com>
Co-authored-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Signed-off-by: harborn <gangsheng.wu@intel.com>
arvind-chandra pushed a commit to lmco/ray that referenced this pull request Aug 31, 2023
Adding last_info_for method to return last info dict from the environment. This is currently present in Episode but not EpisodeV2 and is especially useful for custom environment values.

Signed-off-by: Victor Mao <vctrm67@gmail.com>
Co-authored-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
3 participants