[RLlib]: Cleanup examples folder: Add example restoring 1 of n agents from a checkpoint. #45462

simonsays1980 · 2024-05-21T11:08:18Z

Why are these changes needed?

Restoring certain agents from a checkpoint is a frequent use case, and we should provide examples for this scenario. This PR adds such an example for the new API stack. The example does the following:

Training of n agents on Pendulum-v1 MultiEnv.
Choosing the best checkpoint by return.
Loading the module state for policy 0 from this checkpoint.
Training the agents with policy 0 restored from checkpoint.

This example shows that continuing training from a restored checkpoint, even when only a single agent is restored, results in faster convergence.
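The checkpoint selection and the partial restore (steps 2 and 3 above) can be sketched roughly as follows. This is a toy illustration only, not the code added in this PR: plain Python dicts stand in for RLlib checkpoints and module states, and the names `best_checkpoint`, `restore_one_policy`, `p0`, and `p1` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a checkpoint: (mean episode return, per-policy weights).
# Real RLlib checkpoints are directories on disk; this only illustrates the idea.
Checkpoint = Tuple[float, Dict[str, List[float]]]


def best_checkpoint(checkpoints: List[Checkpoint]) -> Checkpoint:
    """Pick the checkpoint with the highest mean return."""
    return max(checkpoints, key=lambda ckpt: ckpt[0])


def restore_one_policy(
    fresh_state: Dict[str, List[float]],
    checkpoint: Checkpoint,
    policy_id: str,
) -> Dict[str, List[float]]:
    """Return a copy of `fresh_state` where only `policy_id` is taken from the checkpoint."""
    _, saved_state = checkpoint
    restored = {pid: list(weights) for pid, weights in fresh_state.items()}
    restored[policy_id] = list(saved_state[policy_id])
    return restored


# Pendulum returns are negative, so the highest (least negative) return is best.
checkpoints = [
    (-900.0, {"p0": [0.1, 0.2], "p1": [0.3, 0.4]}),
    (-350.0, {"p0": [0.7, 0.8], "p1": [0.5, 0.6]}),
    (-600.0, {"p0": [0.4, 0.5], "p1": [0.9, 1.0]}),
]
# Freshly initialized weights for the second training run.
fresh = {"p0": [0.0, 0.0], "p1": [0.0, 0.0]}

best = best_checkpoint(checkpoints)
state = restore_one_policy(fresh, best, "p0")
print(state)
```

Only `p0` picks up trained weights here; `p1` keeps its fresh initialization and is trained from scratch in the second run.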

Related issue number

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

rllib/BUILD

rllib/examples/multi_agent/restore_1_of_n_agents_from_checkpoint.py

sven1977 · 2024-05-21T12:32:34Z

rllib/examples/multi_agent/restore_1_of_n_agents_from_checkpoint.py

@@ -0,0 +1,151 @@
+"""Simple example of loading module weights for 1 of n agents from checkpoint.


nit: Let's not use "simple".

"An example script showing how to load RLModule weights for 1 out of n agents from a checkpoint" ?

Yup, not simple for everyone lol. I know what you mean, it's actually quite complex to make this possible in MA scenarios - and so powerful.

sven1977 · 2024-05-21T12:47:26Z

rllib/examples/multi_agent/restore_1_of_n_agents_from_checkpoint.py

@@ -0,0 +1,151 @@
+"""Simple example of loading module weights for 1 of n agents from checkpoint.
+


Can we add a tiny paragraph here saying the usual:

This example:
- runs a multi-agent Pendulum experiment with ... policies ... blabla
- saves a checkpoint of the used MultiAgentRLModule every blabla iterations
- stops the experiment after the agents reach a combined return of ...
- picks the best of both trained policies (based on episode return) and restores only the corresponding RLModule.
- runs a second experiment with the restored RLModule (single-agent) .... blabla

sven1977

Looks good to me! Just a few nits on comments/docstrings.

Awesome example! One more down. :)

#45462 adds a new test by changing a bazel rule instead of adding a new test file; this case can only be covered by our previous logic of computing new tests. Recover this logic (in addition to the logic of computing new tests by looking at changed test files).
Test:
- CI
Signed-off-by: can <can@anyscale.com>

#45462 adds a new test by changing a bazel rule instead of adding a new test file; this case can only be covered by our previous logic of computing new tests. Recover this logic (in addition to the logic of computing new tests by looking at changed test files).
This is a redo of #45495, which got reverted. The difference now is that we run the bazel command in a container instead of in the current environment. bazel seems to have issues sharing the cache when calling bazel within bazel (https://buildkite.com/ray-project/microcheck/builds/444#018fa23a-6e31-435b-a0ea-412ca2d1017b/175-1476).
Test:
- CI
- full microcheck run: https://buildkite.com/ray-project/microcheck/builds/464
Signed-off-by: can <can@anyscale.com>

simonsays1980 added 7 commits May 10, 2024 12:16

Changed comment.

c748df8

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

Merge branch 'master' of https://github.com/ray-project/ray

6409007

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

Merge branch 'master' of https://github.com/ray-project/ray

d2f9030

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

Merge branch 'master' of https://github.com/ray-project/ray

a3416a8

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

Merge branch 'master' of https://github.com/ray-project/ray

8582ad9

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

Merge branch 'master' of https://github.com/ray-project/ray

b565f34

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

Added example to restore 1 of n agents from checkpoint using Pendulum…

c940fc4

… multi-agent environment. Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

simonsays1980 self-assigned this May 21, 2024

simonsays1980 added rllib RLlib related issues rllib-docs-or-examples Issues related to RLlib documentation or rllib/examples rllib-newstack labels May 21, 2024

Added example to BUILD file.

504eddd

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

sven1977 changed the title ~~[RLlib] - Example that restores 1 of n agents from checkpoint.~~ [RLlib]: Cleanup examples folder: Add example restoring 1 of n agents from a checkpoint. May 21, 2024

sven1977 marked this pull request as ready for review May 21, 2024 12:27

sven1977 requested review from sven1977, avnishn, ArturNiederfahrenhorst, maxpumperla and kouroshHakha as code owners May 21, 2024 12:27

sven1977 assigned sven1977 and unassigned simonsays1980 May 21, 2024

sven1977 reviewed May 21, 2024

rllib/BUILD

sven1977 reviewed May 21, 2024

rllib/examples/multi_agent/restore_1_of_n_agents_from_checkpoint.py Outdated

sven1977 reviewed May 21, 2024

rllib/examples/multi_agent/restore_1_of_n_agents_from_checkpoint.py Outdated

sven1977 reviewed May 21, 2024

sven1977 approved these changes May 21, 2024

simonsays1980 added 3 commits May 21, 2024 20:14

Modification due to @sven1977's review.

92a6cd3

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

Merge branch 'master' of https://github.com/ray-project/ray

c0eed1f

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

Merge branch 'master' into example-restore-1-of-n-agents-from-chkpt

540ad66

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

simonsays1980 added 3 commits May 22, 2024 09:52

Changed checkpoint frequency to 20 as test was not passing due to cac…

6f6dd8c

…he issues. In addition added 'no_main' tag to test in BUILD b/c linter errored out. Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

Removed '--as-test' argument from example file.

ed95651

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

Added 'no_main' tag to the old example in the BUILD file.

9acc2fb

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

sven1977 enabled auto-merge (squash) May 22, 2024 10:21

github-actions bot added the go add ONLY when ready to merge, run all tests label May 22, 2024

can-anyscale mentioned this pull request May 22, 2024

[ci][microcheck] recover the logic to compute new tests #45495

Merged

Added the file of the old stack test to 'main'.

2527f6e

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

github-actions bot disabled auto-merge May 22, 2024 15:49

Merge branch 'master' of https://github.com/ray-project/ray

341cb95

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

can-anyscale mentioned this pull request May 22, 2024

[ci][microcheck] recover the logic to compute new tests #45507

Merged

simonsays1980 added 3 commits May 24, 2024 13:45

Merge branch 'master' of https://github.com/ray-project/ray

b76807f

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

Merge branch 'master' into example-restore-1-of-n-agents-from-chkpt

3abb6a8

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

Removed old example from example folder and BUILD file.

a302bd1

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>

sven1977 merged commit 5cb7c09 into ray-project:master May 24, 2024
6 checks passed

ryanaoleary pushed a commit to ryanaoleary/ray that referenced this pull request Jun 6, 2024

[RLlib] Cleanup examples folder: Add example restoring 1 of n agents …

7c45041

…from a checkpoint. (ray-project#45462) Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>

ryanaoleary pushed a commit to ryanaoleary/ray that referenced this pull request Jun 6, 2024

[RLlib] Cleanup examples folder: Add example restoring 1 of n agents …

d81afae

…from a checkpoint. (ray-project#45462) Signed-off-by: Ryan O'Leary <ryanaoleary@google.com>

ryanaoleary pushed a commit to ryanaoleary/ray that referenced this pull request Jun 7, 2024

[RLlib] Cleanup examples folder: Add example restoring 1 of n agents …

fb16cfc

…from a checkpoint. (ray-project#45462)
