Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RLlib] Re-instantiate on_episode_created callback for multi-agent (for now). #43779

Merged

Conversation

sven1977
Copy link
Contributor

@sven1977 sven1977 commented Mar 7, 2024

Re-instantiate on_episode_created callback for multi-agent (for now).

The callback on_episode_created is currently not available on the new stack due to the usage of gym.vector.Env, which automatically resets sub-environments under the hood (so there is no chance for RLlib to get in between an episode ending and the reset of the next one).

Solution:
Single-agent: Find setting in gym to disable this behavior.
Multi-agent: Same as above, BUT also, RLlib currently does NOT even use gym.vector.Env for multi-agent, so here we might - for the time being - simply re-enable this callback.

Why are these changes needed?

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: sven1977 <svenmika1977@gmail.com>
@sven1977 sven1977 added the tests-ok The tagger certifies test failures are unrelated and assumes personal liability. label Mar 7, 2024
Copy link
Collaborator

@simonsays1980 simonsays1980 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Important detail with the new API. There is also no effort planned by Farama to change this. Maybe works with the asynchronous vector environment.

"sub-environments when they are terminated. Override the "
"`on_episode_start` method instead, which gets fired right after "
"the `env.reset()` call."
"When using the new API stack in single-agent and with EnvRunners, "
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's a problem I encountered when evaluating against benchmarks, i.e. running a benhcmark against the policy which means the environment needs to be reconfigured for a benchmark run. The solution is to run episodes and after x episodes switch the configuration and call the sampler again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Some users need this opportunity to send config-information to the env before(!) it is reset, but after(!) an episode ended.

We'll have to figure out a way around this for single-agent now, using vector Env. :|

@@ -350,6 +350,8 @@ def _sample_timesteps(

# Create a new episode instance.
self._episode = self._new_episode()
self._make_on_episode_callback("on_episode_created", self._episode)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could even write self.make_on_episode_callback("on_episode_created") as self._episode is used anyways.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed (in next PR :) )

@sven1977 sven1977 merged commit 6edc70c into ray-project:master Mar 7, 2024
9 checks passed
@sven1977 sven1977 deleted the reinstate_on_episode_created_callback branch March 8, 2024 11:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tests-ok The tagger certifies test failures are unrelated and assumes personal liability.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants