
[RLlib] MultiAgentEpisode: Add concat() API. #44622

Merged
merged 27 commits into ray-project:master on Apr 12, 2024

Conversation

sven1977
Contributor

@sven1977 sven1977 commented Apr 10, 2024

MultiAgentEpisode: Add concat() API.

  • This API already exists for SingleAgentEpisode and should work analogously for MultiAgentEpisode.
  • It is mostly used by episode-based replay buffers, e.g. in DQN and SAC, to add incoming episode chunks (from an EnvRunner) to an already buffer-stored previous chunk of the same episode (see the sketch below).
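
As a rough illustration of that second point, here is a minimal sketch of how an episode-based buffer could use the new concat API when a later chunk of an already-stored MultiAgentEpisode arrives from an EnvRunner. The buffer dict, the `add_chunk` helper, and the `concat_episode`/`id_` names follow the SingleAgentEpisode convention and are assumptions here, not the PR's actual replay-buffer code.

```python
# Hypothetical sketch, not the PR's replay-buffer code: store incoming
# MultiAgentEpisode chunks by episode id and glue continuations onto the
# already-stored chunk. `concat_episode` and `id_` follow the
# SingleAgentEpisode naming and are assumed here for MultiAgentEpisode.
from ray.rllib.env.multi_agent_episode import MultiAgentEpisode

episodes_by_id: dict = {}  # episode id -> stored (possibly partial) episode


def add_chunk(chunk: MultiAgentEpisode) -> None:
    stored = episodes_by_id.get(chunk.id_)
    if stored is None:
        # First chunk of this episode: store it as-is.
        episodes_by_id[chunk.id_] = chunk
    else:
        # Continuation of a previously stored episode: concatenate the new
        # chunk onto the stored one (IDs and boundary timesteps must match).
        stored.concat_episode(chunk)
```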

Why are these changes needed?

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: sven1977 <svenmika1977@gmail.com>

…i_agent_episode_add_concat

# Conflicts:
#	rllib/env/multi_agent_episode.py
#	rllib/env/tests/test_multi_agent_episode.py
@sven1977 sven1977 marked this pull request as ready for review April 10, 2024 09:39
@sven1977 sven1977 closed this Apr 10, 2024
@sven1977 sven1977 deleted the multi_agent_episode_add_concat branch April 10, 2024 11:54
@sven1977 sven1977 restored the multi_agent_episode_add_concat branch April 10, 2024 11:56
@sven1977 sven1977 reopened this Apr 10, 2024
Collaborator

@simonsays1980 simonsays1980 left a comment

LGTM. Only thing missing: Tests for env_t_to_agent_t. We should add these.


In order for this to work, both chunks (`self` and `other`) must fit
together. This is checked by the IDs (must be identical), the time step counters
(`self.t` must be the same as `episode_chunk.t_started`), as well as the
Collaborator

Should this be self.env_t and self._env_t_started?

Contributor Author

done

In order for this to work, both chunks (`self` and `other`) must fit
together. This is checked by the IDs (must be identical), the time step counters
(`self.t` must be the same as `episode_chunk.t_started`), as well as the
observations/infos at the concatenation boundaries (`self.observations[-1]`
Collaborator

This is only true for SAEs (SingleAgentEpisodes), correct? In MAEs (MultiAgentEpisodes) we keep the observations in the SAEs, don't we?

Contributor Author

fixed

# Then store all agent data from the new episode chunk in self.
self.agent_episodes[agent_id] = other.agent_episodes[agent_id]
# Do not forget the env to agent timestep mapping.
self.env_t_to_agent_t[agent_id] = other.env_t_to_agent_t[agent_id]
Collaborator

Is env_t_to_agent_t already updated with the new env_t_started of the other, or do we have to add the other's starting point to env_t_to_agent_t here?

Contributor Author

They get concatenated as well. Note that - for now - env_t_to_agent_t always starts at 0, no matter what env_t_started is. It basically operates on a "simpler" time axis.
We might want to change this in the future (and use the true global env steps instead), but I'm not sure yet whether that might complicate things too much. After all, it's just an int offset that we are talking about here.
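
To make that concrete, here is a toy illustration of the 0-based local time axis. It is heavily simplified and assumed: plain Python lists and a plain "S" skip marker instead of RLlib's internal lookback buffer types.

```python
# Toy illustration only; the real env_t_to_agent_t uses RLlib-internal buffer
# types. This just shows the idea of the 0-based local env-time axis.
SKIP = "S"  # marker: the agent did not step at this env timestep

# Agent "a1" in the first chunk: acted at local env steps 0, 1, and 3.
env_t_to_agent_t_chunk_1 = [0, 1, SKIP, 2]

# The continuation chunk keeps counting the agent's own steps; its local
# env-time axis simply attaches to the end of the first chunk's axis.
env_t_to_agent_t_chunk_2 = [3, SKIP, 4]

# Concatenation extends the mapping (no env_t_started offset involved).
combined = env_t_to_agent_t_chunk_1 + env_t_to_agent_t_chunk_2
print(combined)  # -> [0, 1, 'S', 2, 3, 'S', 4]
```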

# If `self` has hanging agent values -> Add these to `other`'s agent
# SingleAgentEpisode (as a new timestep) and only then concatenate.
# Otherwise, the concatenation would fail b/c of missing data.
if agent_id in self._agent_buffered_actions:
Collaborator

Ah, great point! This was missing in the first implementation, but it's really important.

if agent_id in self._agent_buffered_actions:
assert agent_id in self._agent_buffered_extra_model_outputs
sa_episode.add_env_step(
observation=other.agent_episodes[agent_id].get_observations(0),
Collaborator

Is indices=0 the first "new" one from the other episode chunk for agents that were missing the next observation? The last observation in self is then in the other.observations' lookback buffer, correct?

Contributor Author

Nope, observations (and infos) always overlap by one ts (regardless of any lookback, so even if lookback=0)!

So for a multi-agent episode, if you do e.g. self.cut(), the returned successor's first observation (multi-agent dict) is identical to self's last observation (multi-agent dict). Same principle as for single-agent episodes.
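
A toy sketch of that overlap (plain lists of dummy multi-agent observation dicts, not actual RLlib episode objects):

```python
# Five dummy multi-agent observation dicts standing in for real observations.
obs = [{"a0": i, "a1": i} for i in range(5)]

# Cutting an episode after three env steps: the successor chunk starts with
# the predecessor's last observation, i.e. the chunks overlap by exactly 1 ts.
chunk_1_obs = [obs[0], obs[1], obs[2]]
chunk_2_obs = [obs[2], obs[3], obs[4]]

# This boundary condition is what a later concat of the two chunks checks.
assert chunk_1_obs[-1] == chunk_2_obs[0]
```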

Contributor Author

However, here we have to concatenate single-agent episodes (that sit inside a multi-agent one). For those single-agent episodes, in case agents are not always stepping with each env(!) step, they might "miss" this overlap. In this case, just for the purpose of concatenating the individual single-agent episodes, we have to "fix" this and add the overlap timestep to the first SA episode (so its observations overlap with the second SA episode's by 1 ts).
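
A toy sketch of that fix (again plain lists instead of the actual SingleAgentEpisode objects; the buffered-action handling shown in the diff above is omitted):

```python
# Agent "a1" stepped twice in the first chunk but did not step at the env
# step where the episode was cut, so its first chunk misses the overlap.
chunk_1_a1_obs = [10, 11]
chunk_2_a1_obs = [12, 13, 14]  # the agent's obs in the continuation chunk

# Without the fix, the boundary check would fail:
assert chunk_1_a1_obs[-1] != chunk_2_a1_obs[0]

# Fix: add the continuation's first observation (together with the buffered
# action, not shown here) as one artificial timestep to the first chunk ...
chunk_1_a1_obs.append(chunk_2_a1_obs[0])

# ... so the 1-ts overlap is restored and the two chunks can be concatenated.
assert chunk_1_a1_obs[-1] == chunk_2_a1_obs[0]
```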


# Validate.
self.validate()

Collaborator

AH, I have waited so long for this. Can't wait to try out the replay buffer now :)

Contributor Author

Yeah, hang in there. Almost there :)

)
check(episode_1.agent_buffers["agent_4"]["actions"].queue[0], buffered_action)
check(episode_1._agent_buffered_rewards, {"a1": 6.0})
check((a0.is_done, a1.is_done), (False, False))
Collaborator

We should also test for the correctness of env_t_to_agent_t here.

Contributor Author

yes, let me add tests for this as well.

Collaborator

Then hopefully we can merge and take this over into the MAERB PR.

Signed-off-by: sven1977 <svenmika1977@gmail.com>

…i_agent_episode_add_concat

# Conflicts:
#	rllib/env/multi_agent_episode.py
#	rllib/env/tests/test_multi_agent_episode.py
@sven1977 sven1977 merged commit a37bb30 into ray-project:master Apr 12, 2024
5 checks passed
@sven1977 sven1977 deleted the multi_agent_episode_add_concat branch April 12, 2024 11:58
harborn pushed a commit to harborn/ray that referenced this pull request Apr 18, 2024
ryanaoleary pushed a commit to ryanaoleary/ray that referenced this pull request Jun 7, 2024