[RLlib] Early improvements to Catalogs and RL Modules docs + Catalogs improvements #37245
Conversation
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
@@ -399,3 +399,50 @@ def setup(self):

module = spec.build()
# __pass-custom-marlmodule-shared-enc-end__
This addition solves a todo from the RL Modules doc to show checkpointing.
It's not very extensive, but I think we should put more info in a general section on checkpointing that covers S3 checkpointing, checkpointing with and without Tune, etc., outside of the RL Modules guide.
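As a rough illustration of the save/restore pattern such a checkpointing section would demonstrate, here is a minimal, self-contained sketch. `ToyModule`, `save_to_checkpoint`, and `from_checkpoint` are illustrative stand-ins, not RLlib's actual API:

```python
import json
import os
import tempfile

# Hypothetical stand-in for an RL Module's state; real modules hold
# framework-specific weights, but the save/restore pattern is the same.
class ToyModule:
    def __init__(self, weights):
        self.weights = weights

    def save_to_checkpoint(self, checkpoint_dir):
        # Persist state to a directory, mirroring checkpoint-style APIs.
        os.makedirs(checkpoint_dir, exist_ok=True)
        with open(os.path.join(checkpoint_dir, "state.json"), "w") as f:
            json.dump({"weights": self.weights}, f)

    @classmethod
    def from_checkpoint(cls, checkpoint_dir):
        # Rebuild a module instance from the persisted state.
        with open(os.path.join(checkpoint_dir, "state.json")) as f:
            state = json.load(f)
        return cls(state["weights"])

with tempfile.TemporaryDirectory() as tmp:
    module = ToyModule(weights=[0.1, 0.2])
    module.save_to_checkpoint(tmp)
    restored = ToyModule.from_checkpoint(tmp)
assert restored.weights == [0.1, 0.2]
```

The same round-trip shape (write state to a directory, reconstruct from it) is what a general checkpointing guide could build on, whether the target is local disk or S3.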
+1 on a separate user-guide on checkpointing and resuming experiments.
Added to backlog #37515
General sketch that shows a couple of components in little space, at the cost of being somewhat imprecise.
This is on purpose: it should only convey the general idea, to form a mental model, rather than specifics.
@@ -44,6 +44,7 @@ The model that tries to maximize the expected sum over all future rewards is cal

The RL simulation feedback loop repeatedly collects data, for one (single-agent case) or multiple (multi-agent case) policies, trains the policies on the collected data, and makes sure the policies' weights are kept in sync. Thereby, the collected environment data contains observations, taken actions, received rewards and so-called **done** flags, indicating the boundaries of different episodes the agents play through in the simulation.

The simulation iteration of action -> reward -> next state -> train -> repeat, until the end state, is called an **episode**, or in RLlib, a **rollout**.
The most common API to define environments is the `Farama-Foundation Gymnasium <rllib-env.html#gymnasium>`__ API, which we also use in most of our examples.
This sentence was previously further down in the file, in the Policies section, where it doesn't belong as far as I can see.
@@ -115,40 +116,32 @@ You can `configure the parallelism <rllib-training.html#specifying-resources>`__
Check out our `scaling guide <rllib-training.html#scaling-guide>`__ for more details here.

Policies
We should leave Policies out of the concepts from now on.
They are only used in the sampling stack and are therefore transparent.
doc/source/rllib/rllib-catalogs.rst
Outdated
Catalog (Alpha)
Catalog (Beta)
With this PR, the API should have largely settled.
===============

Catalogs are where `RLModules <rllib-rlmodule.html>`__ primarily get their models and action distributions from.
Each :py:class:`~ray.rllib.core.rl_module.rl_module.RLModule` has its own default
:py:class:`~ray.rllib.core.models.catalog.Catalog`. For example,
:py:class:`~ray.rllib.algorithms.ppo.ppo_torch_rl_module.PPOTorchRLModule` has the
:py:class:`~ray.rllib.algorithms.ppo.ppo_catalog.PPOCatalog`.
You can override Catalogs’ methods to alter the behavior of existing RLModules.
This makes Catalogs a means of configuration for RLModules.
This is misleading.
Users shouldn't randomly land on this page, read the first few sentences, and attempt to configure an RL Module through Catalogs. This becomes clearer further down.
===============

Catalogs are where `RLModules <rllib-rlmodule.html>`__ primarily get their models and action distributions from.
Each :py:class:`~ray.rllib.core.rl_module.rl_module.RLModule` has its own default
:py:class:`~ray.rllib.core.models.catalog.Catalog`. For example,
:py:class:`~ray.rllib.algorithms.ppo.ppo_torch_rl_module.PPOTorchRLModule` has the
:py:class:`~ray.rllib.algorithms.ppo.ppo_catalog.PPOCatalog`.
You can override Catalogs’ methods to alter the behavior of existing RLModules.
This makes Catalogs a means of configuration for RLModules.
You interact with Catalogs when making deeper customization to what :py:class:`~ray.rllib.core.models.Model` and :py:class:`~ray.rllib.models.distributions.Distribution` RLlib creates by default.
This is part of the information directly below.
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
:language: python
:start-after: __write-custom-marlmodule-shared-enc-begin__
:end-before: __write-custom-marlmodule-shared-enc-end__
.. literalinclude:: doc_code/rlmodule_guide.py
This was in its own tab, but with no alternative tabs.
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
This is high quality right there. Awesome PR.
I left some comments and modifications to make things a bit clearer.
Also, re: RL Module vs. RLModule: feel free to ignore that comment; I think you had something different in mind. We just need consistent spelling across all the docs. Make sure everything is spelled "RL Modules" in text.
doc/source/rllib/rllib-catalogs.rst
Outdated
Catalogs can be extended to offer more or different models or distributions to RLModules (e.g. to PPOTorchRLModule).
Catalogs can be written to build models for new RLModules (for new algorithms).
Change the first sentence above to the following:
Catalog is a utility abstraction that modularizes the construction of `RLModules <rllib-rlmodule.html>`__. It includes information such as how input observation spaces should be encoded, which action distributions should be used, and so on. Each ...
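To make the "utility abstraction" framing concrete, here is a minimal sketch of the idea, using toy dict-based spaces. `SketchCatalog` and its method names are illustrative only, not RLlib's real signatures:

```python
# A minimal sketch of the Catalog idea: a class that, given observation and
# action spaces, decides which encoder config and which action distribution
# class an RL Module should use. All names here are hypothetical.
class SketchCatalog:
    def __init__(self, observation_space, action_space):
        # Decisions are made once, at construction time.
        self.encoder_config = self._get_encoder_config(observation_space)
        self.action_dist_cls = self._get_dist_cls_from_action_space(action_space)

    def _get_encoder_config(self, observation_space):
        # Real catalogs inspect the space type (Box, Discrete, Dict, ...);
        # this toy version just reads a dict.
        return {"type": "mlp", "input_dim": observation_space["dim"]}

    def _get_dist_cls_from_action_space(self, action_space):
        # Discrete spaces get a categorical distribution, continuous ones
        # a diagonal Gaussian (names are placeholders for real classes).
        return "Categorical" if action_space["discrete"] else "DiagGaussian"

catalog = SketchCatalog({"dim": 4}, {"discrete": True})
assert catalog.action_dist_cls == "Categorical"
assert catalog.encoder_config["input_dim"] == 4
```

The point of the abstraction is that an RL Module can ask the catalog for these pieces instead of hard-coding them, which is what makes the construction modular.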
Catalogs can be extended to offer more or different models or distributions to RLModules (e.g. to PPOTorchRLModule).
Catalogs can be written to build models for new RLModules (for new algorithms).
To customize existing RLModules, either change the RLModule directly by inheriting from the class and changing the `setup()` method, or alternatively extend the `Catalog` class attributed to that `RLModule`. Use Catalogs only if your customization fits the abstractions provided by Catalog.
Added this to a manual commit with proper doc links.
doc/source/rllib/rllib-catalogs.rst
Outdated
Whenever you create a Catalog, the decision tree is executed to find suitable configs for models and classes for distributions.
By default this happens in :py:meth:`~ray.rllib.core.models.catalog.Catalog.get_encoder_config` and :py:meth:`~ray.rllib.core.models.catalog.Catalog._get_dist_cls_from_action_space`.
Whenever you build a model, the config is turned into a model.
Distributions are instantiated per forward pass of an RL Module and are therefore not built.
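A toy sketch of this lifecycle may help: a config is chosen when the catalog is created, the model is built from it later, and a distribution *class* is instantiated fresh per forward pass. All names below are hypothetical, not RLlib's API:

```python
from dataclasses import dataclass

# Configs are cheap descriptions chosen up front by the decision tree.
@dataclass
class MLPConfig:
    input_dim: int
    hidden_dim: int

    def build(self):
        # In RLlib this would return a framework model; here, a plain dict
        # stands in for the built network.
        return {"layers": [self.input_dim, self.hidden_dim]}

# Distribution classes are stored, not instances; an instance is created
# from fresh logits on every forward pass.
class Categorical:
    def __init__(self, logits):
        self.logits = logits

config = MLPConfig(input_dim=4, hidden_dim=32)  # decided at catalog creation
encoder = config.build()                        # built once, when needed
dist = Categorical(logits=[0.5, 0.5])           # instantiated per forward pass
assert encoder["layers"] == [4, 32]
```

This separation is why distributions are "not built": only their class is resolved ahead of time.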
RLModule
- Does the Algorithm need a special Encoder? Overwrite :py:meth:`~ray.rllib.core.models.catalog.Catalog._get_encoder_config`.
- Does the Algorithm need an additional network? Write a method to build it. You can use RLlib's model configurations to build models from dimensions.
- Does the Algorithm need a custom distribution? Overwrite :py:meth:`~ray.rllib.core.models.catalog.Catalog._get_dist_cls_from_action_space`.
- Does the Algorithm need a special tokenizer? Overwrite :py:meth:`~ray.rllib.core.models.catalog.Catalog.get_tokenizer_config`.
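For the "additional network" item in the checklist above, the pattern is to give a catalog subclass a new builder method that reuses dimensions the catalog already knows. This is a sketch with made-up names, not RLlib's actual classes:

```python
# Hypothetical base catalog that knows the encoder's output dimension.
class SketchCatalog:
    def __init__(self, latent_dim):
        self.latent_dim = latent_dim

    def build_encoder(self):
        return {"type": "encoder", "out": self.latent_dim}

# Subclass adds a builder for an extra head the new algorithm needs,
# constructed from dimensions the catalog already tracks.
class CatalogWithAuxHead(SketchCatalog):
    def build_aux_head(self):
        return {"type": "mlp_head", "in": self.latent_dim, "out": 1}

catalog = CatalogWithAuxHead(latent_dim=32)
assert catalog.build_aux_head() == {"type": "mlp_head", "in": 32, "out": 1}
```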
One thing that is not clear is the difference between a tokenizer and an encoder. We need a proper definition of tokenizer somewhere.
Added some more info above now!
doc/source/rllib/rllib-rlmodule.rst
Outdated
@@ -334,15 +330,21 @@ To construct this custom multi-agent RL module, pass the class to the :py:class:
Extending Existing RLlib RL Modules
-----------------------------------

RLlib provides a number of RL Modules for different frameworks (e.g., PyTorch, TensorFlow, etc.). Extend these modules by inheriting from them and overriding the methods you need to customize. For example, extend :py:class:`~ray.rllib.algorithms.ppo.torch.ppo_torch_rl_module.PPOTorchRLModule` and augment it with your own customization. Then pass the new customized class into the algorithm configuration.
RLlib provides a number of RL Modules for different frameworks (e.g., PyTorch, TensorFlow, etc.).
Extend these modules by inheriting from them and overriding the methods you need to customize.
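The inherit-and-override pattern described here can be sketched in plain Python. `ParentModule` stands in for an existing module class such as `PPOTorchRLModule`, and overriding `setup()` (as suggested in review) is the typical customization point; all attribute names are illustrative:

```python
# Hypothetical parent module that builds its components in setup().
class ParentModule:
    def __init__(self):
        self.setup()

    def setup(self):
        self.encoder = "default_encoder"
        self.head = "default_head"

# Subclass keeps the parent's construction and replaces only one piece.
class MyCustomModule(ParentModule):
    def setup(self):
        super().setup()               # keep the defaults...
        self.head = "my_custom_head"  # ...and override only what you need

module = MyCustomModule()
assert module.encoder == "default_encoder"
assert module.head == "my_custom_head"
```

The customized class would then be passed into the algorithm configuration in place of the default module class.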
Say which method people should override: `setup()`?
There is also a lint error. |
Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com> Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Kourosh's suggestions Co-authored-by: kourosh hakhamaneshi <31483498+kouroshHakha@users.noreply.github.com> Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
… improvements (ray-project#37245) Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com>
… improvements (ray-project#37245) Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com> Signed-off-by: NripeshN <nn2012@hw.ac.uk>
… improvements (ray-project#37245) Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com> Signed-off-by: harborn <gangsheng.wu@intel.com>
… improvements (ray-project#37245) Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com> Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
… improvements (ray-project#37245) Signed-off-by: Artur Niederfahrenhorst <attaismyname@googlemail.com> Signed-off-by: Victor <vctr.y.m@example.com>
Why are these changes needed?
As discussed offline, our first iteration of docs and examples on RL Modules and Catalogs needs some tweaking to provide better guidance on the most common user journeys and to clean up some general rough edges we left.
This PR executes on a collection of action items we collected offline.
This PR also includes some changes to Catalog itself that do not change functionality but make adjustments to naming based on user feedback.