[RLlib] SAC RLModule on new API stack. #42568

Merged 15 commits into ray-project:master on Jan 24, 2024

Conversation

simonsays1980
Collaborator

@simonsays1980 simonsays1980 commented Jan 22, 2024

Why are these changes needed?

To move SAC to the new API stack, it needs to be implemented as an RLModule. This PR adds the files needed to define and configure the SACRLModule. The implementation is in PyTorch.

Related issue number

Closes #37778

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>
Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>
@sven1977 sven1977 changed the title SAC-RLModule [RLlib] SAC RLModule on new API stack. Jan 23, 2024
@sven1977 sven1977 marked this pull request as ready for review January 23, 2024 13:14

@ExperimentalAPI
class SACRLModule(RLModule, RLModuleWithTargetNetworksInterface):
    def setup(self):
Contributor

Add override decorator.

Question: Should we call super here or not?

Contributor

We should try to be super consistent and clear about the two decorators:

@OverrideToImplementCustomLogic
and
@OverrideToImplementCustomLogic_CallToSuperRecommended

Can you check whether these already exist in the base RLModule.setup() and, if not, add them as applicable?

Collaborator Author

Both are used in the super class. I think I remember running into an error when calling super().setup(). I have to recheck when testing.
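To make the question about calling super concrete, here is a minimal, self-contained sketch of how the two kinds of annotation could be used. The decorators below are hypothetical stand-ins written only for illustration (they just tag the function and are not the RLlib implementations), and `BaseModule` and `MyModule` are made-up classes, not the classes in this PR.

```python
def override_to_implement_custom_logic(fn):
    """Hypothetical stand-in: marks a method that may be overridden freely."""
    fn._override_hint = "custom_logic"
    return fn


def override_call_to_super_recommended(fn):
    """Hypothetical stand-in: marks a method whose overrides should call super()."""
    fn._override_hint = "call_super_recommended"
    return fn


class BaseModule:
    def __init__(self):
        self.setup_calls = []
        self.setup()

    @override_call_to_super_recommended
    def setup(self):
        # Base-class bookkeeping that a subclass loses if it skips super().
        self.setup_calls.append("base")


class MyModule(BaseModule):
    def setup(self):
        # Honor the hint on the base method by calling super() first.
        super().setup()
        self.setup_calls.append("custom")


module = MyModule()
hint = BaseModule.setup._override_hint
```

If the base `setup()` really does raise when called from the subclass, as suspected above, that would argue for the plain "custom logic" annotation on the base method instead.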

self.action_dist_cls = catalog.get_action_dist_cls(framework=self.framework)

# Define the temperature.
self.alpha = self.config.model_config_dict["initial_alpha"]
Contributor

Dumb question: How do we get this config.initial_alpha into the model config dict?
I had the same problem for DreamerV3 (and other algos) and I think we need to come up with a non-hacky solution for this problem.

Constraint here:

  • We don't want to have to pass the entire AlgorithmConfig into RLModule constructors as we envision RLModules to be used completely outside of RLlib in production.

Collaborator Author

Here, we actually do not need it in the model_config_dict as it is now only used in the learner. In general we have to solve this more consistently.

It might play a role here, as the old stack uses one model config for the policy and one for the Q model. If we want to use something like this, we should implement such a solution at the same time.
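One possible shape for the constraint discussed in this thread, sketched under assumptions: the algorithm-level config copies only the hyperparameters the module needs into a plain dict, so the module never sees the full config. All names here (`AlgoConfigSketch`, `to_module_config`, `ModuleSketch`) are hypothetical and are not part of RLlib or of this PR.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class AlgoConfigSketch:
    """Hypothetical algorithm-level config holding many settings."""

    lr: float = 3e-4
    initial_alpha: float = 1.0
    model_config_dict: Dict[str, Any] = field(default_factory=dict)

    def to_module_config(self) -> Dict[str, Any]:
        # Copy the user's model config so the original dict is not mutated.
        merged = dict(self.model_config_dict)
        # Forward only what the module needs; an explicit user entry wins.
        merged.setdefault("initial_alpha", self.initial_alpha)
        return merged


class ModuleSketch:
    """Hypothetical module that depends on a plain dict only."""

    def __init__(self, model_config_dict: Dict[str, Any]):
        self.alpha = model_config_dict["initial_alpha"]


cfg = AlgoConfigSketch(
    initial_alpha=0.2,
    model_config_dict={"post_fcnet_activation": "relu"},
)
module = ModuleSketch(cfg.to_module_config())
```

Because `ModuleSketch` takes a dict and nothing else, it could be constructed outside the training library, which is the property the constraint above asks for.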

"post_fcnet_activation"
]

# We don't have the exact (framework specific) action dist class yet and thus
Contributor

Yeah, same in PPO. I feel like we should change the Catalog API here:
Provide the framework already in the c'tor (imo, there is no good reason NOT to do this), then users can already construct the pi-config in the c'tor, which is much cleaner.

Collaborator Author

Would this also work with dynamic action spaces?

Contributor

But do we already support dynamic action spaces?

Contributor

The thing is that right now, we construct one module config in the c'tor and the other one during build(), which is pretty bad imo. We should be consistent in where and when we do what. :)

Collaborator Author

No, we do not support them right now - at least not for the new stack I guess.
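The proposal in this thread (pass the framework to the constructor so that all configs are built in one place) could look roughly like the sketch below. `CatalogSketch`, its config dicts, and the placeholder distribution names are hypothetical illustrations, not the actual Catalog API, and a fixed action space is assumed.

```python
class CatalogSketch:
    """Hypothetical catalog that knows its framework at construction time."""

    # Illustrative mapping only; the values are placeholder strings.
    _ACTION_DISTS = {
        "torch": "TorchDistPlaceholder",
        "tf2": "Tf2DistPlaceholder",
    }

    def __init__(self, framework, model_config_dict):
        if framework not in self._ACTION_DISTS:
            raise ValueError(f"Unsupported framework: {framework!r}")
        self.framework = framework
        self.action_dist_cls_name = self._ACTION_DISTS[framework]
        # Both configs are created here instead of being split between
        # the constructor and a later build() call.
        self.encoder_config = {
            "hidden": model_config_dict.get("fcnet_hiddens", [256, 256]),
        }
        self.pi_config = {
            "hidden": model_config_dict.get("post_fcnet_hiddens", []),
            "action_dist": self.action_dist_cls_name,
        }


catalog = CatalogSketch("torch", {"fcnet_hiddens": [64, 64]})
```

With the framework known up front, an unsupported value fails at construction rather than at build time.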

simonsays1980 and others added 6 commits January 23, 2024 16:59
Co-authored-by: Sven Mika <sven@anyscale.io>
Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>
…-rl-module

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>
Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>
…es in 'SACLearner'.

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>
Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>
Co-authored-by: Sven Mika <sven@anyscale.io>
Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
simonsays1980 and others added 3 commits January 24, 2024 14:29
Co-authored-by: Sven Mika <sven@anyscale.io>
Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
Co-authored-by: Sven Mika <sven@anyscale.io>
Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
Co-authored-by: Sven Mika <sven@anyscale.io>
Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
Signed-off-by: Sven Mika <sven@anyscale.io>
Contributor

@sven1977 sven1977 left a comment

LGTM now. Thanks for the fixes @simonsays1980 !

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>
…-rl-module

Signed-off-by: Simon Zehnder <simon.zehnder@gmail.com>
@sven1977 sven1977 merged commit 93ee64c into ray-project:master Jan 24, 2024
9 checks passed
khluu pushed a commit to khluu/ray that referenced this pull request Jan 24, 2024
khluu pushed a commit to khluu/ray that referenced this pull request Jan 24, 2024
Development

Successfully merging this pull request may close these issues.

[RLlib] Enabling RLModule by default on SAC
2 participants