
[RLlib] Put notices and error out on invalid ModelV2/Policy related configs for RL Modules #37526

Merged — 14 commits merged into ray-project:master on Jul 27, 2023

Conversation


@ArturNiederfahrenhorst (Contributor) commented on Jul 18, 2023

Why are these changes needed?

Starting with Ray 2.6.0, we are rolling out RL Modules and Learners.
This requires notices in the docs about this migration so that users are not caught off guard.
Furthermore, this PR includes a hotfix for cases where users set a `custom_model` while using the RL Module API: `custom_model` belongs to the ModelV2 API, which is incompatible with RL Modules.

Solves #37085
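
For illustration, here is a minimal sketch (not part of the PR) of the kind of configuration clash this hotfix targets and the two ways around it; the environment and the custom model name are hypothetical:

```python
from ray.rllib.algorithms.ppo import PPOConfig

# Hypothetical clash: a ModelV2-style `custom_model` combined with the
# RL Module API. With this PR, `AlgorithmConfig.validate()` raises a
# ValueError for such a setup.
config = (
    PPOConfig()
    .environment("CartPole-v1")
    .rl_module(_enable_rl_module_api=True)
    .training(model={"custom_model": "my_modelv2_net"})  # hypothetical registered model
)

# Either migrate the custom model to a custom RLModule, or fall back to
# the legacy Policy/ModelV2 stack:
legacy_config = (
    PPOConfig()
    .environment("CartPole-v1")
    .rl_module(_enable_rl_module_api=False)
    .training(_enable_learner_api=False, model={"custom_model": "my_modelv2_net"})
)
```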

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
@ArturNiederfahrenhorst changed the title from "[RLlib] Put notices and algorithm config fix" to "[RLlib] Put notices and error out on invalid ModelV2/Policy related configs" on Jul 18, 2023
@ArturNiederfahrenhorst changed the title from "[RLlib] Put notices and error out on invalid ModelV2/Policy related configs" to "[RLlib] Put notices and error out on invalid ModelV2/Policy related configs for RL Modules" on Jul 18, 2023
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
"with the RLModule API. Please set `_enable_rl_module_api=False` "
"to use the legacy Policy and ModelV2 API."
)

Contributor Author

This is not compatible anymore, while `policies_to_train` and `policy_mapping_fn` can still be used.
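
For context, a minimal sketch (not from this PR) of what remains valid alongside the RL Module API: multi-agent settings such as `policies_to_train` and `policy_mapping_fn` can still be used, while ModelV2-style `custom_model` cannot. The environment name and policy IDs are hypothetical:

```python
from ray.rllib.algorithms.ppo import PPOConfig

# Multi-agent settings remain usable with the RL Module API enabled;
# only the ModelV2/Policy-specific options (e.g. `custom_model`) do not.
config = (
    PPOConfig()
    .environment("my_multi_agent_env")  # hypothetical registered multi-agent env
    .rl_module(_enable_rl_module_api=True)
    .multi_agent(
        policies={"learner_policy", "frozen_policy"},
        policy_mapping_fn=lambda agent_id, episode, worker, **kwargs: "learner_policy",
        policies_to_train=["learner_policy"],
    )
)
```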

@ArturNiederfahrenhorst (Contributor Author)

We should cherry-pick the docs part of this into 2.6.0 @kouroshHakha @bveeramani
I'll open a PR after this is merged.

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
…tch norm to old API for now

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
@@ -803,6 +803,8 @@ def get_default_policy_class(
) -> Optional[Type[Policy]]:
"""Returns a default Policy class to use, given a config.

Note that this method is ignored when the RLModule API is enabled.
Contributor

Make this an actual note after the next paragraph.

@@ -1011,6 +1011,20 @@ def validate(self) -> None:
else:
self.rl_module_spec = default_rl_module_spec

if self.model["custom_model"] is not None:
raise ValueError(
"Cannot use `custom_model` option with RLModule API."
Contributor

You should say what they should do instead if they want to use `custom_model` (i.e., which API to use to disable the RLModule API, or, if they want to migrate to RL Modules, that they should migrate their custom model to a custom RLModule). Give them clear instructions.

Comment on lines 1023 to 1025
"Cannot use `custom_model_config` option with RLModule API."
"`custom_model_config` is part of the ModelV2 API and Policy API, "
"which are not compatible with the RLModule API."
Contributor

same thing here. Give clear instructions to the user on what to fix.
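
For illustration, a sketch of the kind of check and message the review is asking for, written as a hypothetical standalone helper; the wording actually merged in this PR may differ:

```python
def check_modelv2_settings(model_config: dict, enable_rl_module_api: bool) -> None:
    """Hypothetical helper sketching the validation discussed above."""
    if not enable_rl_module_api:
        return
    if model_config.get("custom_model") is not None:
        raise ValueError(
            "Cannot use the `custom_model` option with the RLModule API. "
            "`custom_model` is part of the ModelV2 API, which is incompatible with "
            "RL Modules. Either disable the new stack via "
            "`config.rl_module(_enable_rl_module_api=False)` and "
            "`config.training(_enable_learner_api=False)`, or migrate your custom "
            "model to a custom RLModule and pass it via "
            "`config.rl_module(rl_module_spec=...)`."
        )
    if model_config.get("custom_model_config"):
        raise ValueError(
            "Cannot use the `custom_model_config` option with the RLModule API. "
            "Move these settings into the `model_config_dict` of your RLModule spec, "
            "or disable the new stack as described above."
        )
```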

@@ -210,6 +210,9 @@ def test_traj_view_attention_net(self):
sgd_minibatch_size=201,
num_sgd_iter=5,
)
# Batch-norm models have not been migrated to the RL Module API yet.
Contributor

Could you create a GitHub issue tracking this TODO in our backlog?

Contributor Author

Done #37683

Comment on lines +147 to +149
# Batch-norm models have not been migrated to the RL Module API yet.
.training(_enable_learner_api=False)
.rl_module(_enable_rl_module_api=False)
Contributor

link these two tests in the description of the issue, so the problem is clearly elaborated.

# Use GPUs iff `RLLIB_NUM_GPUS` env var set to > 0.
.resources(num_gpus=int(os.environ.get("RLLIB_NUM_GPUS", "0")))
# Batch-norm models have not been migrated to the RL Module API yet.
Contributor

same thing
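
Taken together, these test snippets follow one pattern, sketched in full below; the config calls are taken from the diff, while the environment choice is an assumption for illustration:

```python
import os

from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")  # illustrative env; the tests use their own setups
    # Use GPUs iff `RLLIB_NUM_GPUS` env var set to > 0.
    .resources(num_gpus=int(os.environ.get("RLLIB_NUM_GPUS", "0")))
    # Batch-norm models have not been migrated to the RL Module API yet,
    # so fall back to the legacy ModelV2/Policy stack for now.
    .training(_enable_learner_api=False)
    .rl_module(_enable_rl_module_api=False)
)
```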

Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Comment on lines 3 to 4
From Ray 2.6.0 onwards, RLlib is adopting a new stack for training and model customization, moving away from the ModelV2 API and some convoluted parts of the Policy API.
Starting with the PPO algorithm, we will gradually replace components with the `RLModule API <rllib-rlmodule.html>`__.
Contributor

I'm not familiar with all the technical details, but will this be sufficient for RLlib users to know what's going on?

A few thoughts/questions:

  1. The wording of "a new stack" feels internal/ambiguous as a user, maybe you can explicitly call out RLModules and Learners in this sentence instead?
  2. "Starting from PPO algorithm" - what does this mean to the user? Should they expect a new API for PPO in 2.6.0?
  3. If all the information is captured in the RLModule doc, maybe it'll be helpful to just say something like "For more details about this migration(?) see RLModules<...>"

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
rllib/algorithms/algorithm_config.py — outdated review thread, resolved
rllib/examples/custom_env.py — outdated review threads (4), resolved
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
@kouroshHakha merged commit 2df2428 into ray-project:master on Jul 27, 2023
37 of 40 checks passed
ArturNiederfahrenhorst added a commit to ArturNiederfahrenhorst/ray that referenced this pull request Jul 27, 2023
…onfigs for RL Modules (ray-project#37526)

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
rickyyx pushed a commit that referenced this pull request Jul 28, 2023
…onfigs for RL Modules (#37526) (#37875)

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
NripeshN pushed a commit to NripeshN/ray that referenced this pull request Aug 15, 2023
…onfigs for RL Modules (ray-project#37526)

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
Signed-off-by: NripeshN <nn2012@hw.ac.uk>
harborn pushed a commit to harborn/ray that referenced this pull request Aug 17, 2023
…onfigs for RL Modules (ray-project#37526)

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
Signed-off-by: harborn <gangsheng.wu@intel.com>
harborn pushed a commit to harborn/ray that referenced this pull request Aug 17, 2023
…onfigs for RL Modules (ray-project#37526)

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
arvind-chandra pushed a commit to lmco/ray that referenced this pull request Aug 31, 2023
…onfigs for RL Modules (ray-project#37526)

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
vymao pushed a commit to vymao/ray that referenced this pull request Oct 11, 2023
…onfigs for RL Modules (ray-project#37526)

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
Signed-off-by: Victor <vctr.y.m@example.com>