[RLlib] Refine `MultiAgentEpisode` and add test cases. #40799

Merged

sven1977 merged 52 commits into ray-project:master from simonsays1980:testing-multi-agent-episode

Nov 28, 2023

Collaborator

simonsays1980 commented Oct 30, 2023

Why are these changes needed?

The MultiAgentEpisode needs buffers to record the action, reward, state, and extra_model_output of an agent that has acted at a timestep but has not yet received its next observation. This PR implements the corresponding logic for adding timesteps, together with extensive tests, because this area is complex and must execute precisely.
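The buffering idea can be illustrated with a minimal sketch. It is not the code in this PR: `TinyMultiAgentEpisode`, `AgentBuffer`, and the simplified `add_timestep` signature are hypothetical names chosen for illustration, and state handling is left out. An agent's action, reward, and extra model output are held in a buffer until that agent's next observation arrives, and only then is the transition recorded.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class AgentBuffer:
    """Hypothetical holder for data produced since an agent's last observation."""

    action: Optional[Any] = None
    reward: float = 0.0
    extra_model_output: Optional[Dict[str, Any]] = None


@dataclass
class TinyMultiAgentEpisode:
    """Hypothetical stand-in for illustration, not RLlib's MultiAgentEpisode."""

    # Completed (action, accumulated_reward, next_observation) tuples per agent.
    transitions: Dict[str, List[tuple]] = field(default_factory=dict)
    # Pending data for agents that acted but have not observed again yet.
    buffers: Dict[str, AgentBuffer] = field(default_factory=dict)

    def add_timestep(self, observations, actions, rewards, extra_model_outputs=None):
        extra_model_outputs = extra_model_outputs or {}
        # 1) Buffer every new action together with its extra model output.
        for agent_id, action in actions.items():
            buf = self.buffers.setdefault(agent_id, AgentBuffer())
            buf.action = action
            buf.extra_model_output = extra_model_outputs.get(agent_id)
        # 2) Accumulate rewards, including those that arrive without an observation.
        for agent_id, reward in rewards.items():
            self.buffers.setdefault(agent_id, AgentBuffer()).reward += reward
        # 3) Flush the buffer of each agent that now receives an observation.
        for agent_id, obs in observations.items():
            buf = self.buffers.get(agent_id)
            if buf is not None and buf.action is not None:
                self.transitions.setdefault(agent_id, []).append(
                    (buf.action, buf.reward, obs)
                )
                del self.buffers[agent_id]


episode = TinyMultiAgentEpisode()
# Step 1: both agents act, but only "a0" receives a next observation.
episode.add_timestep(
    observations={"a0": 1},
    actions={"a0": 0, "a1": 1},
    rewards={"a0": 1.0, "a1": 0.5},
)
# Step 2: "a1" finally observes again and collects a further reward.
episode.add_timestep(observations={"a1": 7}, actions={}, rewards={"a1": 0.25})
```

In this sketch the action of "a1" stays buffered after the first step and is written out, with both rewards summed, once its observation arrives in the second step.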

Related issue number

Closes #40746

Checks

- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
  - I've added any new APIs to the API Reference. For example, if I added a
    method in Tune, I've added it in doc/source/tune/api/ under the
    corresponding .rst file.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - Unit tests
  - Release tests
  - This PR is not tested :(

simonsays1980 added 30 commits

October 11, 2023 18:12


          Initialized MultiAgentEpisode.

afc0e17

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Added timestep mapping and necessary methods to MultiAgentEpisode.

c6c8275

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Added get_<data> - methods to 'MultiAgentEpisode' and refined initial…

7b423ce

…ization, timestep mapping and class data.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Added 'create_successor', 'get_state()' and 'from_state()'. Agent mod…

d7a50fa

…ule states will only be stored in the 'SingleAgentEpisode's.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Refactored 'self.t' and 'self.t_started'. Added 'to_sample_batch', 'g…

02ea5b8

…et_return' and '__len__'. Moved episode files into 'rllib/env'.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Added list conversion to 'from_sample_batch' in case episode is not d…

ac189c5

…one. Furthermore moved 'SingleAgentEpisode' and 'MultiAgentEpisode' towards 'rllib/env'. I also added unit testing for 'SingleAgentEpisode'.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Ran linter.

c84525a

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Adding docstrings.

f3d88ec

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Adding docstrings.

9112f38

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Merged Master

7f860a5

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          LINTER

a8b3b1d

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Added documentation to 'SingleAgentEpisode' and merged existing episo…

b4ea186

…de into branch episode.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Fixed some bugs found during testing and finished testing for 'Single…

fd71c17

…AgentEpisode'.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Implemented review from @sven1977 and changed docstrings a bit.

03cce90

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Changed import for 'SingleAgentEpisode' to fix some errors in CI tests.

47e88e6

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Refactored 'get_observations|rewards|actions' into helper function.

1a62530

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Refactored 'get_observations|rewards|actions' into helper function.

9056dfe

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Added infos to 'MultiAgentEpisode'.

0d8e324

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Fixed imports and the resulting bug in 'single_agent_gym_env_runner.py'.

879b080

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Added 'extra_model_outputs' to 'MultiAgentEpisode'.

b53e122

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Merge branch 'multi-agent-episode' into infos-and-extra-model-outs-fo…

77e6296

…r-mae

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Implemented @sven1977 's review.

96e9d21

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Implemented @sven1977 's review.

277282e

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Implemented @sven1977 's review.

ccce22c

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Merged branch 'multi-agent-epsiode' into branch.

72bc8f5

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Started testing for MultiAgentEpisode.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Added buffers to the 'MultiAgentEpisode' and a corresponding logic to…

c786e06

… 'add_timestep()'.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Refined 'add_timestep()' in 'MultiAgentEpisode' to handle also states…

68d5478

… and agents that terminate before stepping first time.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Added first test with multi-agent test environment and refined a coup…

d18d4a7

…le of functionalities in 'MultiAgentEpisode'.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Added test scenarios and modified testing environment.

de6bab0

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

simonsays1980 commented

rllib/env/multi_agent_episode.py

simonsays1980 added 13 commits

November 8, 2023 18:35


          Added test cases for 'create_successor' in the 'MultiAgentEpisode'. I…

b8b530c

…ntense testing led to further changes in the 'MultiAgentEpisode', specifically because the successor needs the results from 'get_observations()', 'get_infos', etc. as type 'List[MultiAgentDict]' and not 'MultiAgentDict'. Furthermore, 'terminateds' and 'truncateds' had to be provided with a getter. Needs more testing.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Added functionalities to transfer buffers to successors of 'MultiAgen…

8d6ac06

…tEpisode's. Also added corresponding tests. Renamed 'global_rewards' to 'partial_rewards'. This is an intermediate commit to have a safe state to return to.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Added more tests to the 'test_create_successor'. Found some inconsist…

2d375ee

…encies in the use of the 'partial_rewards'. Has to be fixed before stepping forward.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Removed states and related methods from 'MultiAgentEpisode' as they w…

9aabdac

…ere removed from 'SingleAgentEpisode' by @sven1977. Added test for getters. Needed to change '_getattr_by_index' as using buffered actions is non-trivial. Had to add a 'global_actions_t' timestep_mapping for the actions, as they could be buffered and the original timestep would otherwise get lost. Testing is not finished.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Finished 'get_actions()' and 'get_extra_model_outputs()' for the case…

53121d8

… of using buffered actions. Thereby refactored '_getattr_by_index()' extensively to be used more generically. Furthermore, wrote corresponding tests. Fixed a bug in the 'create_successor()' method. All tests run.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Finished testing for 'get_rewards()' without partial and buffered rew…

225eb68

…ards, but with receiving either 'MultiAgentDict' or 'List[MultiAgentDict]'. Made minor changes to '_IndexMapping' and '_getattr_by_index()'.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Added methods to '_IndexMapping' for more complex index searching. Ad…

f0fb853

…ded buffered rewards to the 'get_rewards' method.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Added functionality for buffered_rewards and partial rewards. 'get_re…

7d408aa

…wards' is complete now and needs to be tested more.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Added the file for testing the 'MultiAgentEpisode'. Worked on the fil…

6bdc9e6

…e for some days now.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Finished testing for getters and fixed some minor bugs I ran over whi…

b42decf

…le testing.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Fixed several minor bugs and modified 'create_successor()' and 'conca…

c0802ad

…t_episode()' to use the corresponding 'SingleAgentEpisode''s methods, to contain as initial observation in the successor always the last observation of an agent. Adjusted tests accordingly and added multiple new ones.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Finished testing for 'concat_episode()'. Also added a '_copy_buffer' …

040ad60

…method to 'MultiAgentEpisode' for immutable copying of buffers between episodes. Refined the test file.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>


          Added tests for 'to_sample_batch', '__len__', and 'get_returns'. Modi…

7d570e1

…fied these functions to account for empty episodes and agents that are done.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

sven1977 changed the title ~~Testing multi agent episode~~ [RLlib] Refine MultiAgentEpisode and add test cases.

sven1977 marked this pull request as ready for review

November 27, 2023 14:12

sven1977 requested review from sven1977, avnishn, ArturNiederfahrenhorst, smorad, maxpumperla and kouroshHakha as code owners

November 27, 2023 14:12

sven1977 added 2 commits

November 27, 2023 15:27

wip

d23f959

Signed-off-by: sven1977 <svenmika1977@gmail.com>

wip

a8937f3

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 approved these changes

Contributor

sven1977 left a comment

LGTM for now. Let's merge this to make some progress here.
We will have to:

- Rename the MAE APIs to: add_env_reset/add_env_step
- Enhance the getter APIs to allow for more options, such as one_hot_discrete, fill, etc., analogous to the upcoming SingleAgentEpisode enhancements.
- Add the infinite lookback buffer functionality to MAE.
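
As a rough illustration of what a getter option like one_hot_discrete could mean, here is a small sketch. It is one plausible reading only: this PR does not specify the option, and the `get_actions` function and its `num_actions` parameter below are hypothetical, not the RLlib API.

```python
from typing import List, Union


def one_hot(action: int, num_actions: int) -> List[float]:
    """Encode a discrete action index as a one-hot vector."""
    if not 0 <= action < num_actions:
        raise ValueError(f"action {action} outside [0, {num_actions})")
    vec = [0.0] * num_actions
    vec[action] = 1.0
    return vec


def get_actions(
    actions: List[int], num_actions: int, one_hot_discrete: bool = False
) -> List[Union[int, List[float]]]:
    """Hypothetical getter: return stored actions, optionally one-hot encoded."""
    if one_hot_discrete:
        return [one_hot(a, num_actions) for a in actions]
    # Return a copy so callers cannot mutate the stored list.
    return list(actions)


encoded = get_actions([0, 2], num_actions=3, one_hot_discrete=True)
raw = get_actions([0, 2], num_actions=3)
```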

sven1977 added 3 commits

November 28, 2023 10:05

wip

721ff00

Signed-off-by: sven1977 <svenmika1977@gmail.com>


          Merge branch 'master' of https://github.com/ray-project/ray into test…

3a64881

…ing-multi-agent-episode

wip

18ea675

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 merged commit 2ded325 into ray-project:master

14 of 15 checks passed

ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request


          [RLlib] Refine MultiAgentEpisode class and add test cases. (ray-pro…

278c60c

…ject#40799)

Reviewers

sven1977: approved these changes

avnishn: awaiting requested review

ArturNiederfahrenhorst: awaiting requested review (code owner)

smorad: awaiting requested review

maxpumperla: awaiting requested review

kouroshHakha: awaiting requested review