[RLlib; docs] Docs do-over (new API stack): Rewrite/enhance "getting started" rst page. by sven1977 · Pull Request #49950 · ray-project/ray

sven1977 · 2025-01-18T18:00:11Z

Docs do-over (new API stack): Rewrite/enhance "getting started" rst page.

- Rename file from rllib-training.html to getting-started.html.
- Translate everything to the new API stack and simplify a little.
- Vale cleanup.
- Move example code into ..testcode blocks.

Why are these changes needed?

Related issue number

Checks

- I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
  - I've added any new APIs to the API Reference. For example, if I added a
    method in Tune, I've added it in doc/source/tune/api/ under the
    corresponding .rst file.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - Unit tests
  - Release tests
  - This PR is not tested :(

Signed-off-by: sven1977 <svenmika1977@gmail.com>

simonsays1980

LGTM. Some nits here and there. Great introduction for users into RLlib.

simonsays1980 · 2025-01-20T10:29:36Z

+In this tutorial, you learn how to design, customize, and run an end-to-end RLlib learning experiment
+from scratch. This includes picking and configuring an :py:class:`~ray.rllib.algorithms.algorithm.Algorithm`,
+running a couple of training iterations, saving the state of your
+:py:class:`~ray.rllib.algorithms.algorithm.Algorithm` from time to time, running a separate


Awesome! This is what most people are looking for.

simonsays1980 · 2025-01-20T10:30:24Z

+Python API
+~~~~~~~~~~
+
+RLlib's Python API provides all the flexibility required for applying the library to any


Do we have any other API than the Python one?

Nope :D We got rid of the CLI, b/c of the maintenance burden, its stark limitations, and it being more or less a duplicate of a subset of what the python API could do.

Well, we are working on the external access protocol for clients to connect to and communicate with RLlib, but that's heavily wip.

simonsays1980 · 2025-01-20T10:31:51Z

+    )
+
+
+To scale your setup and define, how many EnvRunner actors you want to leverage,


Shall we put all class names into ``?

Also, we might want to add that these EnvRunners are used to roll out the policy and collect samples?
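For context, here is a minimal sketch of the setting this comment refers to. It assumes `ray[rllib]` is installed and uses the new API stack's `env_runners()` config method. The helper name `scaled_ppo_config`, the `CartPole-v1` environment, and the count of 2 are illustrative choices, not taken from the PR.

```python
def scaled_ppo_config(num_env_runners=2):
    """Return a PPO config that samples with `num_env_runners` EnvRunner actors.

    Hypothetical helper for illustration; not part of RLlib.
    """
    # Imported lazily so this module can be loaded without Ray installed.
    from ray.rllib.algorithms.ppo import PPOConfig

    return (
        PPOConfig()
        # Assumption: any registered gymnasium env ID works here.
        .environment("CartPole-v1")
        # EnvRunner actors step through the env with the current policy and
        # hand the collected samples back for training.
        .env_runners(num_env_runners=num_env_runners)
    )


if __name__ == "__main__":
    config = scaled_ppo_config(num_env_runners=2)
    print(config.num_env_runners)
```

Building the Algorithm from this config starts the requested number of remote actors, so the value should match the CPUs available on the cluster.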

simonsays1980 · 2025-01-20T10:33:24Z

+.. testcode::
+
+    # Build the Algorithm (PPO).
+    ppo = config.build_algo()


Does build still work?

Yup, but you get a warning.
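The behavior described here, an old method name that keeps working but warns, follows a common deprecation pattern. The sketch below illustrates that pattern only. `SketchConfig` is a hypothetical stand-in, and this is not RLlib's actual implementation or warning text.

```python
import warnings


class SketchConfig:
    """Hypothetical stand-in for a config class; not RLlib's AlgorithmConfig."""

    def build_algo(self):
        # The preferred entry point. Returns a placeholder object here.
        return {"built": True}

    def build(self):
        # Deprecated alias: warn the caller, then forward to the new method.
        warnings.warn(
            "`build()` is deprecated; use `build_algo()` instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.build_algo()


if __name__ == "__main__":
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        algo = SketchConfig().build()
    print(algo, [str(w.message) for w in caught])
```

Forwarding from the alias keeps both names behaving identically, so existing user scripts continue to run while the warning points them to the new name.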

simonsays1980 · 2025-01-20T10:33:59Z

+    from pprint import pprint
+
+    for _ in range(5):
+        pprint(ppo.train())


simonsays1980 · 2025-01-20T10:47:34Z

+        # Define your custom env class by subclassing gymnasium.Env:
+
+        class ParrotEnv(gym.Env):
+            """Environment in which the agent learns to repeat the seen observations.


Haha! Awesome!

simonsays1980 · 2025-01-20T10:49:32Z

+        # Point your config to your custom env class:
+        config = (
+            PPOConfig()
+            .environment(ParrotEnv)  # add `env_config=[some Box space] to customize the env


Maybe a missing " ` "?

done and clarified more. Also fixed the env accepting this suggested setting.

simonsays1980 · 2025-01-20T10:50:10Z

+        class CustomTorchRLModule(TorchRLModule):
+            def setup(self):
+                # You have access here to the following already set attributes:
+                # self.observation_space


Great description!!

simonsays1980 · 2025-01-20T10:51:55Z

+    :hide:

-At the end of your script, RLlib evaluates the trained Algorithm:
+    algo.stop()


Haha. Yes that is needed.

We might however show it explicitly as otherwise users might run into problems.

... in their own code

Great idea. Will add a one-liner for this API.
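One way users could guard against forgetting the `stop()` call in their own code is a small wrapper that always releases resources, even when training raises. This is a sketch under assumptions: `stopping` is a hypothetical helper, not an RLlib API, and `FakeAlgo` is a stand-in so the example runs without Ray.

```python
from contextlib import contextmanager


@contextmanager
def stopping(algo):
    """Yield `algo` and call its `stop()` method on exit, even after an error.

    Hypothetical helper for illustration; not part of RLlib.
    """
    try:
        yield algo
    finally:
        algo.stop()


class FakeAlgo:
    """Stand-in exposing only the two methods used below."""

    def __init__(self):
        self.iterations = 0
        self.stopped = False

    def train(self):
        self.iterations += 1
        return {"training_iteration": self.iterations}

    def stop(self):
        self.stopped = True


if __name__ == "__main__":
    with stopping(FakeAlgo()) as algo:
        for _ in range(3):
            print(algo.train())
    print("stopped:", algo.stopped)
```

With a real Algorithm, the object returned by `config.build_algo()` would take the place of `FakeAlgo()`. A plain `try`/`finally` around the training loop achieves the same effect.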

simonsays1980 · 2025-01-20T10:55:33Z

+
+        The `state` of an instantiated Algorithm can be retrieved by calling its
+        `get_state` method. It contains all information necessary
+        to create the Algorithm from scratch. No access to the original code (e.g.


Does this work now also with algorithms that had defined new attributes/methods? If the class is available it should imo.

Yeah, I think so. Users can decide to override the get_state/set_state APIs to add more stateful stuff to their state-dicts, but the basic functionality (restoring EnvRunners, RLModule, Learner optimizer states, connector pipelines, etc.) works across all algos.
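The override pattern mentioned in this reply can be sketched as follows. `BaseComponent` is a hypothetical stand-in for a stateful class such as an Algorithm. Its method signatures are simplified assumptions and don't mirror RLlib's real `get_state`/`set_state` parameters.

```python
class BaseComponent:
    """Hypothetical stand-in for a stateful base class; not RLlib's Algorithm."""

    def __init__(self):
        self.weights = [0.0, 0.0]

    def get_state(self):
        # Return a plain dict holding everything needed to restore this object.
        return {"weights": list(self.weights)}

    def set_state(self, state):
        self.weights = list(state["weights"])


class CustomComponent(BaseComponent):
    """Subclass that tracks one extra attribute and adds it to the state dict."""

    def __init__(self):
        super().__init__()
        self.best_return = float("-inf")

    def get_state(self):
        # Start from the base state, then add the subclass-specific entry.
        state = super().get_state()
        state["best_return"] = self.best_return
        return state

    def set_state(self, state):
        super().set_state(state)
        # Tolerate state dicts written before the extra key existed.
        self.best_return = state.get("best_return", float("-inf"))


if __name__ == "__main__":
    source = CustomComponent()
    source.best_return = 42.0
    target = CustomComponent()
    target.set_state(source.get_state())
    print(target.get_state())
```

Calling `super()` in both methods keeps the base functionality intact, which matches the point made above that the basic restore logic works without any user overrides.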

angelinalg

Just some style nits.

angelinalg · 2025-01-22T19:27:56Z

 - `[Course] Applied Reinforcement Learning with RLlib <https://applied-rl-course.netlify.app/>`_
 - `[Blog] Intro to RLlib: Example Environments <https://medium.com/distributed-computing-with-ray/intro-to-rllib-example-environments-3a113f532c70>`_
- :doc:`[Guide] Getting Started with RLlib </rllib/rllib-training>`
+- :doc:`[Guide] Getting Started with RLlib </rllib/getting-started>`


Suggested change

-- :doc:`[Guide] Getting Started with RLlib </rllib/getting-started>`
+- :doc:`[Guide] Getting started with RLlib </rllib/getting-started>`

angelinalg · 2025-01-22T19:28:31Z

+
+.. _rllib-getting-started:
+
+Getting Started


Suggested change

-Getting Started
+Getting started

angelinalg · 2025-01-23T01:13:59Z

+RLlib's Python API provides all the flexibility required for applying the library to any
+type of RL problem.
+
+You manage RLlib experiments through an instance of the :py:class:`~ray.rllib.algorithms.algorithm.Algorithm`


Suggested change

-You manage RLlib experiments through an instance of the :py:class:`~ray.rllib.algorithms.algorithm.Algorithm`
+Manage RLlib experiments using an instance of the :py:class:`~ray.rllib.algorithms.algorithm.Algorithm`

angelinalg · 2025-01-23T01:14:09Z

+class. An :py:class:`~ray.rllib.algorithms.algorithm.Algorithm` typically holds a neural
+network for computing actions, called ``policy``, the :ref:`RL environment <rllib-key-concepts-environments>`
+that you want to optimize against, a loss function, an optimizer, and some code describing the
+algorithm's execution logic, like determining when to collect samples, when to update your model, etc..


Suggested change

-algorithm's execution logic, like determining when to collect samples, when to update your model, etc..
+algorithm's execution logic, like determining when to collect samples, when to update your model, etc.

angelinalg · 2025-01-23T01:14:37Z

+In :ref:`multi-agent training <rllib-multi-agent-environments-doc>`,
+:py:class:`~ray.rllib.algorithms.algorithm.Algorithm` manages the querying and optimization of multiple policies at once.
+
+Through the algorithm's interface, you can train the policy, compute actions, or store your


Suggested change

-Through the algorithm's interface, you can train the policy, compute actions, or store your
+Using the algorithm's interface, you can train the policy, compute actions, or store the

angelinalg · 2025-01-23T01:35:51Z

+
+        pip install "gymnasium[atari,accept-rom-license,mujoco]"
+
+This is all, you can now start coding against RLlib. Here is an example for running the :ref:`PPO Algorithm <ppo>` on the


Suggested change

-This is all, you can now start coding against RLlib. Here is an example for running the :ref:`PPO Algorithm <ppo>` on the
+You are ready to start coding against RLlib. The following is an example for running the :ref:`PPO Algorithm <ppo>` on the

angelinalg · 2025-01-23T01:36:02Z

 `Taxi domain <https://gymnasium.farama.org/environments/toy_text/taxi/>`__.
-You first create a `config` for the algorithm, which defines the RL environment and
-any other needed settings and parameters.
+You first create a `config` for the algorithm, which defines the :ref:`RL environment <rllib-key-concepts-environments>` and any other needed settings and parameters.


Suggested change

-You first create a `config` for the algorithm, which defines the :ref:`RL environment <rllib-key-concepts-environments>` and any other needed settings and parameters.
+First create a `config` for the algorithm, which defines the :ref:`RL environment <rllib-key-concepts-environments>` and any other needed settings and parameters.

angelinalg · 2025-01-23T01:36:28Z

+    for _ in range(5):
+        pprint(algo.train())
+
+At the end of your script, you evaluate the trained Algorithm and release all its resources:


Suggested change

-At the end of your script, you evaluate the trained Algorithm and release all its resources:
+At the end of the script, evaluate the trained Algorithm and release all its resources:

angelinalg · 2025-01-23T01:37:08Z

+:py:class:`~ray.rllib.env.env_runner.EnvRunner` actors through the ``config.evaluation()`` method.

-`See here <rllib-training.html#using-the-python-api>`_, if you want to learn more about the RLlib training APIs.
+:ref:`See here <rllib-python-api>`, if you want to learn more about the RLlib training APIs.


Suggested change

-:ref:`See here <rllib-python-api>`, if you want to learn more about the RLlib training APIs.
+See :ref:`rllib-python-api`, to learn more about the RLlib training APIs.

angelinalg · 2025-01-23T01:38:40Z

+
+        The `state` of an instantiated Algorithm can be retrieved by calling its
+        `get_state` method. It contains all information necessary
+        to create the Algorithm from scratch. No access to the original code (e.g.


Suggested change

-to create the Algorithm from scratch. No access to the original code (e.g.
+to create the Algorithm from scratch. No access to the original code (e.g.,

sven1977 added 12 commits January 3, 2025 22:26

wip

bed55cc

Signed-off-by: sven1977 <svenmika1977@gmail.com>

Merge branch 'master' of https://github.com/ray-project/ray into docs…

5892337

…_redo_getting_started Signed-off-by: sven1977 <svenmika1977@gmail.com> # Conflicts: # doc/source/rllib/rllib-training.rst

Merge branch 'master' of https://github.com/ray-project/ray into docs…

e33624c

…_redo_getting_started Signed-off-by: sven1977 <svenmika1977@gmail.com> # Conflicts: # doc/source/rllib/rllib-training.rst

wip

042909d

Signed-off-by: sven1977 <svenmika1977@gmail.com>

Merge branch 'master' of https://github.com/ray-project/ray into docs…

1ef61cc

…_redo_getting_started

wip

84b63f4

Signed-off-by: sven1977 <svenmika1977@gmail.com>

Merge branch 'master' of https://github.com/ray-project/ray into docs…

8d67d4f

…_redo_getting_started

Merge branch 'master' of https://github.com/ray-project/ray into docs…

6b2c97a

…_redo_getting_started

Merge branch 'master' of https://github.com/ray-project/ray into docs…

42a3a59

…_redo_getting_started

wip

e0d6ce6

Signed-off-by: sven1977 <svenmika1977@gmail.com>

Merge branch 'master' of https://github.com/ray-project/ray into docs…

570d73c

…_redo_getting_started

wip

bcd594c

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 requested review from a team, maxpumperla and simonsays1980 as code owners January 18, 2025 18:00

sven1977 added 2 commits January 18, 2025 19:01

wip

b6001a1

Signed-off-by: sven1977 <svenmika1977@gmail.com>

vale and lint

5812923

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 assigned simonsays1980 and angelinalg Jan 18, 2025

sven1977 added rllib RLlib related issues rllib-docs-or-examples Issues related to RLlib documentation or rllib/examples rllib-newstack labels Jan 18, 2025

sven1977 added 5 commits January 20, 2025 09:13

wip

e46558e

Signed-off-by: sven1977 <svenmika1977@gmail.com>

Merge branch 'master' of https://github.com/ray-project/ray into docs…

28da682

…_redo_getting_started

fix

be93ff1

Signed-off-by: sven1977 <svenmika1977@gmail.com>

Merge branch 'master' of https://github.com/ray-project/ray into docs…

5b888e1

…_redo_getting_started

fix

00ea51c

Signed-off-by: sven1977 <svenmika1977@gmail.com>

simonsays1980 approved these changes Jan 20, 2025

wip

85fd123

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 added 8 commits January 20, 2025 14:35

Merge branch 'master' of https://github.com/ray-project/ray into docs…

89023b9

…_redo_getting_started

fix

e5203d2

Signed-off-by: sven1977 <svenmika1977@gmail.com>

wip

512806d

Signed-off-by: sven1977 <svenmika1977@gmail.com>

Merge branch 'master' of https://github.com/ray-project/ray into docs…

5a6e766

…_redo_getting_started

wip

7286da0

Signed-off-by: sven1977 <svenmika1977@gmail.com>

Merge branch 'master' of https://github.com/ray-project/ray into docs…

5c14bc3

…_redo_metrics_logger Signed-off-by: sven1977 <svenmika1977@gmail.com> # Conflicts: # doc/source/rllib/package_ref/algorithm.rst

wip

79ab310

Signed-off-by: sven1977 <svenmika1977@gmail.com>

Merge branch 'master' of https://github.com/ray-project/ray into docs…

1bd500c

…_redo_getting_started

angelinalg approved these changes Jan 23, 2025

sven1977 enabled auto-merge (squash) January 23, 2025 10:20

github-actions Bot added the go add ONLY when ready to merge, run all tests label Jan 23, 2025

Merge branch 'master' of https://github.com/ray-project/ray into docs…

c018b6c

…_redo_getting_started Signed-off-by: sven1977 <svenmika1977@gmail.com> # Conflicts: # rllib/algorithms/algorithm.py

github-actions Bot disabled auto-merge January 23, 2025 10:22

sven1977 added 7 commits January 23, 2025 12:26

fixes

3231c1c

Signed-off-by: sven1977 <svenmika1977@gmail.com>

wip

a699e56

Signed-off-by: sven1977 <svenmika1977@gmail.com>

Merge branch 'master' of https://github.com/ray-project/ray into docs…

ffcb4c0

…_redo_getting_started

wip

1ec6e68

Signed-off-by: sven1977 <svenmika1977@gmail.com>

wip

5703024

Signed-off-by: sven1977 <svenmika1977@gmail.com>

wip

3612b6a

Signed-off-by: sven1977 <svenmika1977@gmail.com>

wip

7886251

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 enabled auto-merge (squash) January 23, 2025 15:30

sven1977 merged commit 66602b1 into ray-project:master Jan 24, 2025

sven1977 deleted the docs_redo_getting_started branch January 24, 2025 09:39

srinathk10 pushed a commit that referenced this pull request Feb 2, 2025

[RLlib; docs] Docs do-over (new API stack): Rewrite/enhance "getting …

c521524

…started" rst page. (#49950)

xsuler pushed a commit to antgroup/ant-ray that referenced this pull request Mar 4, 2025

[RLlib; docs] Docs do-over (new API stack): Rewrite/enhance "getting …

351ff99

…started" rst page. (ray-project#49950)

xsuler pushed a commit to antgroup/ant-ray that referenced this pull request Mar 4, 2025

[RLlib; docs] Docs do-over (new API stack): Rewrite/enhance "getting …

cdc1f3c

…started" rst page. (ray-project#49950)

park12sj pushed a commit to park12sj/ray that referenced this pull request Mar 18, 2025

[RLlib; docs] Docs do-over (new API stack): Rewrite/enhance "getting …

9e9a1c7

…started" rst page. (ray-project#49950)

hainesmichaelc added the community-backlog label May 22, 2025
