
[RLlib] Put notices and error out on invalid ModelV2/Policy related configs for RL Modules #37526

Merged — 14 commits merged into ray-project:master on Jul 27, 2023

Conversation


@ArturNiederfahrenhorst (Contributor) commented on Jul 18, 2023

Why are these changes needed?

Starting with Ray 2.6.0, we are rolling out RL Modules and Learners.
This requires notices in the docs about this migration so that users are not caught off guard.
Furthermore, this PR includes a hotfix for cases where users set a `custom_model` while using the RL Module API: `custom_model` belongs to the ModelV2 API, which is incompatible with RL Modules.

Solves #37085
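
For illustration, here is a minimal sketch (not part of the PR) of the kind of configuration clash this hotfix targets and the two ways around it; the environment and the custom model name are hypothetical:

```python
from ray.rllib.algorithms.ppo import PPOConfig

# Hypothetical clash: a ModelV2-style `custom_model` combined with the
# RL Module API. With this PR, `AlgorithmConfig.validate()` raises a
# ValueError for such a setup.
config = (
    PPOConfig()
    .environment("CartPole-v1")
    .rl_module(_enable_rl_module_api=True)
    .training(model={"custom_model": "my_modelv2_net"})  # hypothetical registered model
)

# Either migrate the custom model to a custom RLModule, or fall back to
# the legacy Policy/ModelV2 stack:
legacy_config = (
    PPOConfig()
    .environment("CartPole-v1")
    .rl_module(_enable_rl_module_api=False)
    .training(_enable_learner_api=False, model={"custom_model": "my_modelv2_net"})
)
```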

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
@ArturNiederfahrenhorst changed the title from "[RLlib] Put notices and algorithm config fix" to "[RLlib] Put notices and error out on invalid ModelV2/Policy related configs" on Jul 18, 2023
@ArturNiederfahrenhorst changed the title from "[RLlib] Put notices and error out on invalid ModelV2/Policy related configs" to "[RLlib] Put notices and error out on invalid ModelV2/Policy related configs for RL Modules" on Jul 18, 2023
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
"with the RLModule API. Please set `_enable_rl_module_api=False` "
"to use the legacy Policy and ModelV2 API."
)

Contributor Author

This is not compatible anymore, while `policies_to_train` and `policy_mapping_fn` can still be used.
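
For context, a minimal sketch (not from this PR) of what remains valid alongside the RL Module API: multi-agent settings such as `policies_to_train` and `policy_mapping_fn` can still be used, while ModelV2-style `custom_model` cannot. The environment name and policy IDs are hypothetical:

```python
from ray.rllib.algorithms.ppo import PPOConfig

# Multi-agent settings remain usable with the RL Module API enabled;
# only the ModelV2/Policy-specific options (e.g. `custom_model`) do not.
config = (
    PPOConfig()
    .environment("my_multi_agent_env")  # hypothetical registered multi-agent env
    .rl_module(_enable_rl_module_api=True)
    .multi_agent(
        policies={"learner_policy", "frozen_policy"},
        policy_mapping_fn=lambda agent_id, episode, worker, **kwargs: "learner_policy",
        policies_to_train=["learner_policy"],
    )
)
```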

@ArturNiederfahrenhorst (Contributor Author)

We should cherry-pick the docs part of this into 2.6.0 @kouroshHakha @bveeramani
I'll open a PR after this is merged.

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
…tch norm to old API for now

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
@@ -803,6 +803,8 @@ def get_default_policy_class(
) -> Optional[Type[Policy]]:
"""Returns a default Policy class to use, given a config.

Note that this method is ignored when the RLModule API is enabled.
Contributor

Make this an actual note after the next paragraph.

@@ -1011,6 +1011,20 @@ def validate(self) -> None:
else:
self.rl_module_spec = default_rl_module_spec

if self.model["custom_model"] is not None:
raise ValueError(
"Cannot use `custom_model` option with RLModule API."
Contributor

You should say what they should do instead if they want to use `custom_model` (i.e., which API to use to disable the RLModule API, or, if they want to migrate to RL Modules, that they should migrate their custom model to a custom RLModule). Give them clear instructions.

Comment on lines 1023 to 1025
"Cannot use `custom_model_config` option with RLModule API."
"`custom_model_config` is part of the ModelV2 API and Policy API, "
"which are not compatible with the RLModule API."
Contributor

same thing here. Give clear instructions to the user on what to fix.
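
For illustration, a sketch of the kind of check and message the review is asking for, written as a hypothetical standalone helper; the wording actually merged in this PR may differ:

```python
def check_modelv2_settings(model_config: dict, enable_rl_module_api: bool) -> None:
    """Hypothetical helper sketching the validation discussed above."""
    if not enable_rl_module_api:
        return
    if model_config.get("custom_model") is not None:
        raise ValueError(
            "Cannot use the `custom_model` option with the RLModule API. "
            "`custom_model` is part of the ModelV2 API, which is incompatible with "
            "RL Modules. Either disable the new stack via "
            "`config.rl_module(_enable_rl_module_api=False)` and "
            "`config.training(_enable_learner_api=False)`, or migrate your custom "
            "model to a custom RLModule and pass it via "
            "`config.rl_module(rl_module_spec=...)`."
        )
    if model_config.get("custom_model_config"):
        raise ValueError(
            "Cannot use the `custom_model_config` option with the RLModule API. "
            "Move these settings into the `model_config_dict` of your RLModule spec, "
            "or disable the new stack as described above."
        )
```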

@@ -210,6 +210,9 @@ def test_traj_view_attention_net(self):
sgd_minibatch_size=201,
num_sgd_iter=5,
)
# Batch-norm models have not been migrated to the RL Module API yet.
Contributor

Could you create a GitHub issue tracking this TODO in our backlog?

Contributor Author

Done #37683

Comment on lines +147 to +149
# Batch-norm models have not been migrated to the RL Module API yet.
.training(_enable_learner_api=False)
.rl_module(_enable_rl_module_api=False)
Contributor

link these two tests in the description of the issue, so the problem is clearly elaborated.

# Use GPUs iff `RLLIB_NUM_GPUS` env var set to > 0.
.resources(num_gpus=int(os.environ.get("RLLIB_NUM_GPUS", "0")))
# Batch-norm models have not been migrated to the RL Module API yet.
Contributor

same thing
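
Taken together, these test snippets follow one pattern, sketched in full below; the config calls are taken from the diff, while the environment choice is an assumption for illustration:

```python
import os

from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")  # illustrative env; the tests use their own setups
    # Use GPUs iff `RLLIB_NUM_GPUS` env var set to > 0.
    .resources(num_gpus=int(os.environ.get("RLLIB_NUM_GPUS", "0")))
    # Batch-norm models have not been migrated to the RL Module API yet,
    # so fall back to the legacy ModelV2/Policy stack for now.
    .training(_enable_learner_api=False)
    .rl_module(_enable_rl_module_api=False)
)
```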

Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Comment on lines 3 to 4
From Ray 2.6.0 onwards, RLlib is adopting a new stack for training and model customization, moving away from the ModelV2 API and some convoluted parts of the Policy API.
Starting with the PPO algorithm, we will gradually replace components with the `RLModule API <rllib-rlmodule.html>`__.
Contributor

I'm not familiar with all the technical details, but will this be sufficient for RLlib users to know what's going on?

A few thoughts/questions:

  1. The wording of "a new stack" feels internal/ambiguous as a user, maybe you can explicitly call out RLModules and Learners in this sentence instead?
  2. "Starting from PPO algorithm" - what does this mean to the user? Should they expect a new API for PPO in 2.6.0?
  3. If all the information is captured in the RLModule doc, maybe it'll be helpful to just say something like "For more details about this migration(?) see RLModules<...>"

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
rllib/algorithms/algorithm_config.py — outdated review thread, resolved
rllib/examples/custom_env.py — outdated review threads (4), resolved
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
@kouroshHakha merged commit 2df2428 into ray-project:master on Jul 27, 2023
37 of 40 checks passed
ArturNiederfahrenhorst added a commit to ArturNiederfahrenhorst/ray that referenced this pull request Jul 27, 2023
…onfigs for RL Modules (ray-project#37526)

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
rickyyx pushed a commit that referenced this pull request Jul 28, 2023
…onfigs for RL Modules (#37526) (#37875)

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
NripeshN pushed a commit to NripeshN/ray that referenced this pull request Aug 15, 2023
…onfigs for RL Modules (ray-project#37526)

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
Signed-off-by: NripeshN <nn2012@hw.ac.uk>
harborn pushed a commit to harborn/ray that referenced this pull request Aug 17, 2023
…onfigs for RL Modules (ray-project#37526)

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
Signed-off-by: harborn <gangsheng.wu@intel.com>
harborn pushed a commit to harborn/ray that referenced this pull request Aug 17, 2023
…onfigs for RL Modules (ray-project#37526)

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
arvind-chandra pushed a commit to lmco/ray that referenced this pull request Aug 31, 2023
…onfigs for RL Modules (ray-project#37526)

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
vymao pushed a commit to vymao/ray that referenced this pull request Oct 11, 2023
…onfigs for RL Modules (ray-project#37526)

Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
Signed-off-by: Victor <vctr.y.m@example.com>