[RLlib] Put notices and error out on invalid ModelV2/Policy related configs for RL Modules #37526
Conversation
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
rllib/algorithms/algorithm_config.py
Outdated
"with the RLModule API. Please set `_enable_rl_module_api=False` " | ||
"to use the legacy Policy and ModelV2 API." | ||
) | ||
|
This is not compatible anymore, while `policies_to_train` and `policy_mapping_fn` can still be used.
We should pick the docs part of this into 2.6.0 @kouroshHakha @bveeramani
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
…tch norm to old API for now Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
rllib/algorithms/algorithm.py
Outdated
@@ -803,6 +803,8 @@ def get_default_policy_class(
) -> Optional[Type[Policy]]:
    """Returns a default Policy class to use, given a config.

    Note that this method is ignored when the RLModule API is enabled.
Make this an actual note after the next paragraph.
rllib/algorithms/algorithm_config.py
Outdated
@@ -1011,6 +1011,20 @@ def validate(self) -> None:
    else:
        self.rl_module_spec = default_rl_module_spec

    if self.model["custom_model"] is not None:
        raise ValueError(
            "Cannot use `custom_model` option with RLModule API."
You should say what they should do instead: if they want to keep using `custom_model`, via which API do they disable the RL Module API? If they want to migrate to RLModules, they should migrate their custom model to a custom RLModule. Give them clear instructions.
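A minimal, dependency-free sketch of the kind of validation the reviewer is asking for: an error message that tells the user both how to opt out of the RL Module API and how to migrate. The function name and signature are illustrative, not RLlib's actual `validate()` code; the opt-out calls named in the message mirror the ones shown in this PR's diffs.

```python
# Hypothetical helper mirroring the validation added in this PR.
# It rejects ModelV2-era options when the RL Module API is enabled
# and tells the user exactly what to do instead.
def validate_model_config(model_config: dict, enable_rl_module_api: bool) -> None:
    """Raise a ValueError with actionable instructions on conflicts."""
    if not enable_rl_module_api:
        return
    for key in ("custom_model", "custom_model_config"):
        if model_config.get(key) is not None:
            raise ValueError(
                f"Cannot use the `{key}` option with the RL Module API. "
                f"`{key}` is part of the ModelV2 and Policy APIs, which are "
                "not compatible with the RL Module API. Either set "
                "`config.rl_module(_enable_rl_module_api=False)` and "
                "`config.training(_enable_learner_api=False)` to keep using "
                "ModelV2, or migrate your custom model to a custom RLModule."
            )
```

Usage: `validate_model_config({"custom_model": "my_net"}, True)` raises, while the same config with the RL Module API disabled passes silently.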
rllib/algorithms/algorithm_config.py
Outdated
"Cannot use `custom_model_config` option with RLModule API." | ||
"`custom_model_config` is part of the ModelV2 API and Policy API, " | ||
"which are not compatible with the RLModule API." |
Same thing here. Give clear instructions to the user on what to fix.
@@ -210,6 +210,9 @@ def test_traj_view_attention_net(self):
    sgd_minibatch_size=201,
    num_sgd_iter=5,
)
# Batch-norm models have not been migrated to the RL Module API yet.
Could you create a GitHub issue tracking this TODO in our backlog?
Done #37683
# Batch-norm models have not been migrated to the RL Module API yet.
.training(_enable_learner_api=False)
.rl_module(_enable_rl_module_api=False)
Link these two tests in the description of the issue, so the problem is clearly elaborated.
# Use GPUs iff `RLLIB_NUM_GPUS` env var set to > 0.
.resources(num_gpus=int(os.environ.get("RLLIB_NUM_GPUS", "0")))
# Batch-norm models have not been migrated to the RL Module API yet.
Same thing here.
Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com> Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
From Ray 2.6.0 onwards, RLlib is adopting a new stack for training and model customization, moving away from the ModelV2 API and some convoluted parts of the Policy API.
Starting with the PPO algorithm, we will gradually replace components with the `RLModule API <rllib-rlmodule.html>`__.
I'm not familiar with all the technical details, but will this be sufficient for RLlib users to know what's going on?
A few thoughts/questions:
- The wording of "a new stack" feels internal/ambiguous as a user, maybe you can explicitly call out RLModules and Learners in this sentence instead?
- "Starting from PPO algorithm" - what does this mean to the user? Should they expect a new API for PPO in 2.6.0?
- If all the information is captured in the RLModule doc, maybe it'll be helpful to just say something like "For more details about this migration(?) see RLModules<...>"
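For readers wondering what the opt-out mentioned in this thread looks like in practice, the test changes in this PR show the pattern. A short sketch, assuming a standard RLlib `PPOConfig`; the import path is conventional, and the underscored flags are the ones used in this PR's diffs (a config fragment, not a full training script):

```python
from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    # Keep using the legacy ModelV2/Policy stack by disabling
    # the new Learner and RL Module APIs.
    .training(_enable_learner_api=False)
    .rl_module(_enable_rl_module_api=False)
)
```

With these two flags set, ModelV2-era options such as `custom_model` remain valid instead of raising the new `ValueError`.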
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com> Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
…onfigs for RL Modules (ray-project#37526) Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com> Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com> Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
…onfigs for RL Modules (ray-project#37526) Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com> Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com> Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com> Signed-off-by: NripeshN <nn2012@hw.ac.uk>
…onfigs for RL Modules (ray-project#37526) Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com> Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com> Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com> Signed-off-by: harborn <gangsheng.wu@intel.com>
…onfigs for RL Modules (ray-project#37526) Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com> Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com> Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com> Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
…onfigs for RL Modules (ray-project#37526) Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com> Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com> Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com> Signed-off-by: Victor <vctr.y.m@example.com>
Why are these changes needed?
Starting with Ray 2.6.0, we are rolling out RLModules and Learners.
This requires notices in the docs about this migration so that users are not caught off-guard.
Furthermore, this PR includes a hotfix for when users attempt to set a `custom_model` while using the RL Module API; `custom_model` stems from the ModelV2 API, which is incompatible.

Solves #37085