[RLlib] Re-instantiate `on_episode_created` callback for multi-agent (for now). #43779
Conversation
LGTM. An important detail about the new API: there is also no effort planned by Farama to change this behavior. It might work with the asynchronous vector environment.
"sub-environments when they are terminated. Override the "
"`on_episode_start` method instead, which gets fired right after "
"the `env.reset()` call."
"When using the new API stack in single-agent and with EnvRunners, "
Yeah, that's a problem I encountered when evaluating against benchmarks, i.e. running a benchmark against the policy, which means the environment needs to be reconfigured for a benchmark run. The solution is to run episodes and, after x episodes, switch the configuration and call the sampler again.
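A minimal sketch of the workaround described above, using toy stand-in classes (none of these names are RLlib APIs): run a fixed number of episodes per benchmark configuration, reconfigure the environment in between, then call the sampler again.

```python
# Toy stand-ins (NOT RLlib code) for the "switch config every x episodes" pattern.

class ToyBenchmarkEnv:
    """Hypothetical env that can be reconfigured between benchmark runs."""
    def __init__(self):
        self.cfg = None

    def reconfigure(self, cfg):
        self.cfg = cfg

def sample_episode(env):
    # Stand-in for "call the sampler again": just record the active config.
    return env.cfg

def evaluate(env, configs, episodes_per_config=2):
    results = []
    for cfg in configs:
        env.reconfigure(cfg)  # switch the benchmark config before the next batch
        results.extend(sample_episode(env) for _ in range(episodes_per_config))
    return results

env = ToyBenchmarkEnv()
print(evaluate(env, ["easy", "hard"]))  # ['easy', 'easy', 'hard', 'hard']
```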
Makes sense. Some users need this opportunity to send config information to the env before(!) it is reset, but after(!) an episode has ended.
We'll have to figure out a way around this for single-agent now, using vector Env. :|
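To illustrate the problem discussed above, here is a pure-Python sketch (toy classes, not Gymnasium or RLlib code) of `gym.vector.Env`-style auto-resetting: the vector wrapper resets a finished sub-env inside `step()`, so the caller never gets a hook between the episode ending and the next reset.

```python
# Toy sketch of vector-env auto-reset; all class names are made up.

class ToyEnv:
    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t  # first observation of a new episode

    def step(self, action):
        self.t += 1
        done = self.t >= 3  # toy episodes last exactly 3 steps
        return self.t, done

class AutoResetVector:
    """Resets finished sub-envs under the hood, like gym.vector.Env does."""
    def __init__(self, envs):
        self.envs = envs

    def reset(self):
        return [e.reset() for e in self.envs]

    def step(self, actions):
        results = []
        for env, a in zip(self.envs, actions):
            obs, done = env.step(a)
            if done:
                # Auto-reset happens HERE, hidden from the caller: there is
                # no chance to fire an `on_episode_created`-style callback
                # between episode end and the next reset.
                obs = env.reset()
            results.append((obs, done))
        return results

vec = AutoResetVector([ToyEnv(), ToyEnv()])
vec.reset()
vec.step([0, 0])
vec.step([0, 0])
out = vec.step([0, 0])  # episodes end; obs already belongs to the NEW episode
print(out)  # [(0, True), (0, True)]
```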
@@ -350,6 +350,8 @@ def _sample_timesteps(

# Create a new episode instance.
self._episode = self._new_episode()
self._make_on_episode_callback("on_episode_created", self._episode)
We could even write `self._make_on_episode_callback("on_episode_created")`, as `self._episode` is used anyways.
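A hypothetical sketch of that suggestion (names loosely follow the diff above; this is not the actual RLlib implementation): let the helper fall back to the runner's currently tracked `self._episode` when no episode argument is passed.

```python
# Hypothetical sketch only -- not RLlib's real EnvRunner.

class EnvRunnerSketch:
    def __init__(self):
        self._episode = None
        self.fired = []

    def _new_episode(self):
        return {"id": len(self.fired)}  # toy episode object

    def _make_on_episode_callback(self, which, episode=None):
        # Default to the currently tracked episode, as suggested above.
        episode = self._episode if episode is None else episode
        self.fired.append((which, episode))

runner = EnvRunnerSketch()
runner._episode = runner._new_episode()
runner._make_on_episode_callback("on_episode_created")  # no episode arg needed
print(runner.fired)  # [('on_episode_created', {'id': 0})]
```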
fixed (in next PR :) )
Re-instantiate `on_episode_created` callback for multi-agent (for now).

The callback `on_episode_created` is currently not available on the new stack due to the usage of `gym.vector.Env`, which automatically resets sub-environments under the hood (so there is no chance for RLlib to get in between an episode ending and the reset of the next one).

Solution:
Single-agent: Find a setting in gym to disable this behavior.
Multi-agent: Same as above, BUT also, RLlib currently does NOT even use `gym.vector.Env` for multi-agent, so here we might, for the time being, simply re-enable this callback.

Why are these changes needed?
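The callback ordering this PR restores for multi-agent can be sketched with toy classes (assumed names, not RLlib's actual runner code): `on_episode_created` fires before `env.reset()`, so it can push config into the env pre-reset, while `on_episode_start` fires after.

```python
# Toy sketch of the intended callback ordering; no RLlib imports.

events = []

class Env:
    def reset(self):
        events.append("env.reset")

class Callbacks:
    def on_episode_created(self, env):
        events.append("on_episode_created")
        env.difficulty = "hard"  # e.g. send config to the env BEFORE reset

    def on_episode_start(self, env):
        events.append("on_episode_start")

env, cb = Env(), Callbacks()
cb.on_episode_created(env)  # possible for multi-agent: no vector auto-reset
env.reset()
cb.on_episode_start(env)
print(events)  # ['on_episode_created', 'env.reset', 'on_episode_start']
```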
Related issue number
Checks
- I've signed off every commit (by using the -s flag, i.e. `git commit -s`) in this PR.
- I've run `scripts/format.sh` to lint the changes in this PR.
- If I'm adding a new method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.