
[RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 #21652

Merged

Conversation

@sven1977 (Contributor) commented Jan 17, 2022

Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #3

  • Use min_time_s_per_reporting instead of the previously deprecated min_iter_time_s.
  • Added the Trainer._get_env_creator_from_env_id([env_id]) convenience method (a rough sketch of such a helper is shown after this list).
  • ARS and ES algos: Use Trainer.setup() instead of the previously deprecated Trainer._init(), and rename self._workers to self.workers to match all other algos in RLlib.
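
For readers who want a feel for what such an env-id resolution helper could look like, here is a minimal, hedged sketch. The ray.tune.registry calls below do exist, but the helper's actual signature and behavior in this PR may differ; treat the function as illustrative only.

from ray.tune.registry import ENV_CREATOR, _global_registry
import gym

def get_env_creator_from_env_id(env_id=None):
    """Resolve an env id string into a callable that builds env instances."""
    if env_id and _global_registry.contains(ENV_CREATOR, env_id):
        # The user registered a custom env under this id via tune.register_env().
        return _global_registry.get(ENV_CREATOR, env_id)
    if env_id:
        # Fall back to interpreting the id as a gym registry string.
        return lambda env_config: gym.make(env_id, **(env_config or {}))
    # No env configured at all (e.g. offline-only setups).
    return lambda env_config: None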

Why are these changes needed?

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@avnishn (Contributor) left a comment:

Minor comments. LMK what you think :)

self.validate_config(config)

# Generate `self.env_creator` callable to create an env instance.
self._get_env_creator_from_env_id(self._env_id)

Contributor:

For whatever reason this function isn't available on master, and I also didn't find its definition in this PR's diff, only when checking out the branch. Strange :(

Contributor (Author):

Could be a mistake on my part when splitting up my local branch (which contained more changes).

Contributor (Author):

Fixed.

self.validate_config(config)

# Generate `self.env_creator` callable to create an env instance.
self._get_env_creator_from_env_id(self._env_id)

Contributor:

Could we change the name of this function to _set_env_creator_from_env_id? It sets the self.env_creator attribute but doesn't return anything. Alternatively, we could have it return the env creator and set the attribute ourselves at the call site:

self.env_creator = self._get_env_creator_from_env_id(self._env_id)
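
To make the two options concrete, here is an illustrative sketch; resolve_env_creator() and TrainerSketch are hypothetical stand-ins for the actual lookup logic and Trainer class, not RLlib code:

def resolve_env_creator(env_id):
    # Hypothetical stand-in for the actual env-id lookup logic.
    return lambda env_config: None

class TrainerSketch:
    def _set_env_creator_from_env_id(self, env_id=None):
        # Option A: the name signals the side effect; stores, returns nothing.
        self.env_creator = resolve_env_creator(env_id)

    def _get_env_creator_from_env_id(self, env_id=None):
        # Option B: pure getter; the caller assigns:
        # self.env_creator = self._get_env_creator_from_env_id(self._env_id)
        return resolve_env_creator(env_id)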

Contributor (Author):

Thanks for the catch. I'll check ...

Contributor (Author):

Done.

@gjoliver (Member) left a comment:

Very nice cleanup PR, a couple of minor questions.
On a high level, maybe break this into multiple PRs in the future, so that simple changes like config-param renames can be merged much faster.
Thanks.

policy_mapping_fn=policy_mapping_fn,
policies_to_train=policies_to_train,
)
worker.add_policy(**kwargs)

Member:

Now that this is a one-line statement, why bother with the inline fn?
Maybe it would be cleaner to just call worker.add_policy(**kwargs) directly below.

Contributor (Author):

We need to pass a function/callable to foreach_worker() below anyway, so it's better to share that code (see the sketch below).
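
A rough sketch of the pattern being described; the surrounding method signature, argument names, and the evaluation-workers handling are assumptions for illustration, not the exact code in this PR:

class TrainerSketch:  # stand-in for the actual Trainer class
    def add_policy(self, policy_id, policy_cls, **kwargs):
        def fn(worker):
            # Single place that knows how to add the new policy to a rollout worker.
            worker.add_policy(policy_id=policy_id, policy_cls=policy_cls, **kwargs)

        # The same callable is shipped to all training workers ...
        self.workers.foreach_worker(fn)
        # ... and, if present, to the evaluation workers as well.
        if getattr(self, "evaluation_workers", None):
            self.evaluation_workers.foreach_worker(fn)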

Member:

I see, got it.

if workers:
workers.stop()
# Stop all optimizers.
if hasattr(self, "optimizer") and self.optimizer:

Member:

Why get rid of this?

Contributor (Author):

This seemed like really outdated code: Trainers no longer have a self.optimizer, only ES and ARS do, and those two optimizers do NOT have a stop() method, so this code would actually produce errors here.

If users want their Trainers to have a self.optimizer, they can simply override Trainer.cleanup() and implement the necessary shutdown logic there, as sketched below.
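
A minimal sketch of that suggestion, assuming the Trainer base class from ray.rllib.agents.trainer; the MyOptimizer class and its stop() method are purely illustrative, not RLlib API:

from ray.rllib.agents.trainer import Trainer

class MyOptimizer:
    """Hypothetical user-managed optimizer that needs an explicit shutdown."""
    def stop(self):
        print("optimizer stopped")

class MyTrainer(Trainer):
    def setup(self, config):
        super().setup(config)
        self.optimizer = MyOptimizer()

    def cleanup(self):
        # Shut down the user-managed optimizer first, then let the base
        # class stop workers etc. as usual.
        if getattr(self, "optimizer", None):
            self.optimizer.stop()
        super().cleanup()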

@sven1977 merged commit d5bfb7b into ray-project:master on Jan 25, 2022.