Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RLlib] Learner API Docs #37729

Merged
merged 13 commits into from
Aug 3, 2023
Merged

Conversation

avnishn
Copy link
Member

@avnishn avnishn commented Jul 24, 2023

Signed-off-by: Avnish avnishnarayan@gmail.com

Why are these changes needed?

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Avnish <avnishnarayan@gmail.com>
Signed-off-by: Avnish <avnishnarayan@gmail.com>
@avnishn
Copy link
Member Author

avnishn commented Jul 24, 2023

These docs are WIP right now. I still need to add some of the other sections. I'm almost through getting the doc snippets to run without errors.

Signed-off-by: Avnish <avnishnarayan@gmail.com>
Copy link
Contributor

@kouroshHakha kouroshHakha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make sure that everytime you use RLModule, Learner, and LearnerGroup you are referencing the actual API?

rllib/core/learner/learner_group.py Outdated Show resolved Hide resolved
doc/source/rllib/rllib-learner.rst Show resolved Hide resolved
doc/source/rllib/rllib-learner.rst Outdated Show resolved Hide resolved
doc/source/rllib/rllib-learner.rst Outdated Show resolved Hide resolved
doc/source/rllib/rllib-learner.rst Outdated Show resolved Hide resolved
doc/source/rllib/rllib-learner.rst Outdated Show resolved Hide resolved
doc/source/rllib/rllib-learner.rst Outdated Show resolved Hide resolved
doc/source/rllib/rllib-learner.rst Outdated Show resolved Hide resolved
doc/source/rllib/rllib-learner.rst Outdated Show resolved Hide resolved
doc/source/rllib/rllib-learner.rst Outdated Show resolved Hide resolved
Copy link
Contributor

@angelinalg angelinalg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some nits.

doc/source/rllib/rllib-learner.rst Show resolved Hide resolved
doc/source/rllib/rllib-learner.rst Outdated Show resolved Hide resolved
doc/source/rllib/rllib-learner.rst Outdated Show resolved Hide resolved
doc/source/rllib/rllib-learner.rst Outdated Show resolved Hide resolved
doc/source/rllib/rllib-learner.rst Outdated Show resolved Hide resolved
doc/source/rllib/rllib-learner.rst Outdated Show resolved Hide resolved
doc/source/rllib/rllib-learner.rst Outdated Show resolved Hide resolved
doc/source/rllib/rllib-learner.rst Outdated Show resolved Hide resolved
doc/source/rllib/rllib-learner.rst Outdated Show resolved Hide resolved
doc/source/rllib/rllib-learner.rst Show resolved Hide resolved
@rickyyx
Copy link
Contributor

rickyyx commented Jul 27, 2023

update? still targeting at 2.6.2?

@avnishn
Copy link
Member Author

avnishn commented Jul 27, 2023

yep.

Signed-off-by: Avnish <avnishnarayan@gmail.com>
Signed-off-by: Avnish <avnishnarayan@gmail.com>
Signed-off-by: Avnish <avnishnarayan@gmail.com>
Signed-off-by: Avnish <avnishnarayan@gmail.com>
Signed-off-by: Avnish <avnishnarayan@gmail.com>
Signed-off-by: Avnish <avnishnarayan@gmail.com>
Signed-off-by: Avnish <avnishnarayan@gmail.com>
@avnishn
Copy link
Member Author

avnishn commented Jul 31, 2023

update? still targeting at 2.6.2?

yes

Signed-off-by: Avnish <avnishnarayan@gmail.com>
:hide:
:skipif: True

from ray.rllib.algorithms.ppo.ppo import PPOConfig
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doc contains a lot of code that is not tested.
There exists a folder "doc/source/rllib/doc_code" that could hold a guide file to hold all this test code that actually gets executed there. You could put code there and grab code snippets from there to ensure these example snippets stay up to date.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think lets do that in a later iteration at this point since doc picks are due today.

But I agree that if thats the workaround then we'll go that way instead.

config.build() # test that the algorithm can be built with the given resources


.. note::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do this more like here?

  1. But the warning that this feature is in alpha and the supported algos up top
  2. Consolidate the information on how to enable it with the above information in section "Enabling Learner API in RLlib experiments".

@avnishn
Copy link
Member Author

avnishn commented Jul 31, 2023

I purposely disabled doc testing for code snippets in this PR at the request of @bveeramani

We're having ray rpc issues when running the tests on the CI, but they pass locally.


Adjust the amount of resources for training using the
`num_gpus_per_learner_worker`, `num_cpus_per_learner_worker`, and `num_learner_workers`
arguments in the `AlgorithmConfig`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we link to the API refs page at least once for these things?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

framework_hyperparameters=FrameworkHyperparameters(),
)

learner_group = LearnerGroup(learner_spec)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would become more useful if it would include the necessary imports so it would become copy-pasteable

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its going to add a lot of cognitive load to the guide. Maybe instead we put all of this in a file later then come back to it. We can do that when we update the examples.

learner_group.load_state(LEARNER_GROUP_CKPT_DIR)

Checkpoint the state of all learners in the `LearnerGroup` via `save_state` and
`load_state`. This includes all states including neural network weights and any
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make clear what the difference is between get/set/save/load state?
For many of these sections, we just start off with a code example without giving context.
I think both sections "Getting and setting state" and "Checkpointing" should begin with something like:

"You can checkpoint the state of a LearnerGroup to a directory by calling LearnerGroup.save_state(). This will call LearnerGroup.get_state() under the hood and therefore includes neural networks weights and optimizer states.
Loading a checkpoint works accordingly. Here is an example of how to save and load the state of a LearnerGroup:

.. testcode::
            :skipif: True

            learner_group.save_state(LEARNER_GROUP_CKPT_DIR)
            learner_group.load_state(LEARNER_GROUP_CKPT_DIR)

Note that since the state of all of the Learner s is identical, only the states from the first Learner need to be saved.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Signed-off-by: Avnish <avnishnarayan@gmail.com>

Implementation
==============
`Learner` has many APIs for flexible implementation, however the core ones that you need to implement are:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imho we should clarify what this section deals with.
Is this about implementing a learner from scratch? Or for customization of an existing learner?

The heading should be something like "Customizing Learner" or "Implementing your own Learner" or something like that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why can't it be both? I feel like this might be semantics.


Implementation
==============
:py:class:`~ray.rllib.core.learner.learner.Learner` has many APIs for flexible implementation, however the core ones that you need to implement are:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Learner class has a lot of abstract methods (13 or so).
I think we should clarify here that if you would want to only override the ones below, you'll have to subclass a framework specific implementation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what I was going for here is to describe the minimum functions that people need to implement or override for their custom learners, whether that be them overriding them or implementing a brand new one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imo this should read:

Writing or customizing Learners

If you want to write your own custom :py:class:~ray.rllib.core.learner.learner.Learner, you will want to subclass a framework-specific Learner. The following is a list of the most important methods that you'll want to implement. Similarely, these methods are likely what you'll want to override if you want to modify an existing Learner like TorchPPOLearner :

(...)

Comment on lines +1476 to +1477
.. testcode::
:skipif: True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just do code-block::. Ratched will only complain for code-block:: python

arguments in the :py:class:`~ray.rllib.algorithms.algorithm_config.AlgorithmConfig`.

.. testcode::
:hide:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this overindented?


.. testcode::
:hide:
:skipif: True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, instead of labeling all blocks with :skipif: True, we could add rllib-learner.rst to the exclude list in the BUILD file.

@kouroshHakha kouroshHakha merged commit 7dcad86 into ray-project:master Aug 3, 2023
46 of 51 checks passed
avnishn added a commit to avnishn/ray that referenced this pull request Aug 4, 2023
Signed-off-by: Avnish <avnishnarayan@gmail.com>
rickyyx pushed a commit that referenced this pull request Aug 7, 2023
Signed-off-by: Avnish <avnishnarayan@gmail.com>
NripeshN pushed a commit to NripeshN/ray that referenced this pull request Aug 15, 2023
Signed-off-by: Avnish <avnishnarayan@gmail.com>
Signed-off-by: NripeshN <nn2012@hw.ac.uk>
harborn pushed a commit to harborn/ray that referenced this pull request Aug 17, 2023
Signed-off-by: Avnish <avnishnarayan@gmail.com>
Signed-off-by: harborn <gangsheng.wu@intel.com>
harborn pushed a commit to harborn/ray that referenced this pull request Aug 17, 2023
Signed-off-by: Avnish <avnishnarayan@gmail.com>
arvind-chandra pushed a commit to lmco/ray that referenced this pull request Aug 31, 2023
Signed-off-by: Avnish <avnishnarayan@gmail.com>
Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
vymao pushed a commit to vymao/ray that referenced this pull request Oct 11, 2023
Signed-off-by: Avnish <avnishnarayan@gmail.com>
Signed-off-by: Victor <vctr.y.m@example.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants