forked from ray-project/ray
Commit
[RLlib] Documentation do-over 01: Announce new API stack as alpha; add hints to all RLlib pages; describe how to use it in new page. (ray-project#44090)
1 parent 9d16d30, commit d1ff9ad
Showing 44 changed files with 442 additions and 24 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
.. note:: | ||
|
||
Ray 2.10.0 introduces the alpha stage of RLlib's "new API stack". | ||
The Ray Team plans to transition algorithms, example scripts, and documentation to the new code base,
thereby incrementally replacing the "old API stack" (e.g., ModelV2, Policy, RolloutWorker) over the subsequent minor releases leading up to Ray 3.0.
|
||
Note, however, that so far only PPO (single- and multi-agent) and SAC (single-agent only) | ||
support the "new API stack", and both continue to run with the old APIs by default. | ||
You can continue to use the existing custom (old stack) classes. | ||
|
||
`See </rllib/package_ref/rllib-new-api-stack.html>`__ for more details on how to use the new API stack. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
.. note:: | ||
|
||
This doc is related to RLlib's `new API stack </rllib/package_ref/rllib-new-api-stack.html>`__ and is therefore experimental. |
This file was deleted.
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,135 @@ | ||
# __enabling-new-api-stack-sa-ppo-begin__ | ||
|
||
from ray.rllib.algorithms.ppo import PPOConfig | ||
from ray.rllib.env.single_agent_env_runner import SingleAgentEnvRunner | ||
|
||
|
||
config = ( | ||
PPOConfig().environment("CartPole-v1") | ||
# Switch the new API stack flag to True (False by default). | ||
# This enables the use of the RLModule (replaces ModelV2) AND Learner (replaces | ||
# Policy) classes. | ||
.experimental(_enable_new_api_stack=True) | ||
# However, the above flag only activates the RLModule and Learner APIs. In order | ||
# to utilize all of the new API stack's classes, you also have to specify the | ||
# EnvRunner (replaces RolloutWorker) to use. | ||
# Note that this step will be fully automated in the next release. | ||
# Set the `env_runner_cls` to `SingleAgentEnvRunner` for single-agent setups and | ||
# `MultiAgentEnvRunner` for multi-agent cases. | ||
.rollouts(env_runner_cls=SingleAgentEnvRunner) | ||
# We are using a simple 1-CPU setup here for learning. However, as the new stack | ||
# supports arbitrary scaling on the learner axis, feel free to set | ||
# `num_learner_workers` to the number of available GPUs for multi-GPU training (and | ||
# `num_gpus_per_learner_worker=1`). | ||
.resources( | ||
num_learner_workers=0, # <- in most cases, set this value to the number of GPUs | ||
num_gpus_per_learner_worker=0, # <- set this to 1, if you have at least 1 GPU | ||
num_cpus_for_local_worker=1, | ||
) | ||
# When using RLlib's default models (RLModules) AND the new EnvRunners, you should | ||
# set this flag in your model config. Setting it will no longer be required in the | ||
# near future. It does yield a small performance advantage, as value function | ||
# predictions for PPO no longer have to happen on the sampler side (they are | ||
# now fully located on the learner side, which might have GPUs available). | ||
.training(model={"uses_new_env_runners": True}) | ||
) | ||
|
||
# __enabling-new-api-stack-sa-ppo-end__ | ||
|
||
# Test whether it works. | ||
print(config.build().train()) | ||
|
||
|
||
# __enabling-new-api-stack-ma-ppo-begin__ | ||
|
||
from ray.rllib.algorithms.ppo import PPOConfig # noqa | ||
from ray.rllib.env.multi_agent_env_runner import MultiAgentEnvRunner # noqa | ||
from ray.rllib.examples.env.multi_agent import MultiAgentCartPole # noqa | ||
|
||
|
||
# A typical multi-agent setup (otherwise using the exact same parameters as before) | ||
# looks like this. | ||
config = ( | ||
PPOConfig().environment(MultiAgentCartPole, env_config={"num_agents": 2}) | ||
# Switch the new API stack flag to True (False by default). | ||
# This enables the use of the RLModule (replaces ModelV2) AND Learner (replaces | ||
# Policy) classes. | ||
.experimental(_enable_new_api_stack=True) | ||
# However, the above flag only activates the RLModule and Learner APIs. In order | ||
# to utilize all of the new API stack's classes, you also have to specify the | ||
# EnvRunner (replaces RolloutWorker) to use. | ||
# Note that this step will be fully automated in the next release. | ||
# Set the `env_runner_cls` to `SingleAgentEnvRunner` for single-agent setups and | ||
# `MultiAgentEnvRunner` for multi-agent cases. | ||
.rollouts(env_runner_cls=MultiAgentEnvRunner) | ||
# We are using a simple 1-CPU setup here for learning. However, as the new stack | ||
# supports arbitrary scaling on the learner axis, feel free to set | ||
# `num_learner_workers` to the number of available GPUs for multi-GPU training (and | ||
# `num_gpus_per_learner_worker=1`). | ||
.resources( | ||
num_learner_workers=0, # <- in most cases, set this value to the number of GPUs | ||
num_gpus_per_learner_worker=0, # <- set this to 1, if you have at least 1 GPU | ||
num_cpus_for_local_worker=1, | ||
) | ||
# When using RLlib's default models (RLModules) AND the new EnvRunners, you should | ||
# set this flag in your model config. Setting it will no longer be required in the | ||
# near future. It does yield a small performance advantage, as value function | ||
# predictions for PPO no longer have to happen on the sampler side (they are | ||
# now fully located on the learner side, which might have GPUs available). | ||
.training(model={"uses_new_env_runners": True}) | ||
# Because you are in a multi-agent env, you have to set up the usual multi-agent | ||
# parameters: | ||
.multi_agent( | ||
policies={"p0", "p1"}, | ||
# Map agent 0 to p0 and agent 1 to p1. | ||
policy_mapping_fn=lambda agent_id, episode, **kwargs: f"p{agent_id}", | ||
) | ||
) | ||
|
||
# __enabling-new-api-stack-ma-ppo-end__ | ||
|
||
# Test whether it works. | ||
print(config.build().train()) | ||
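The `policy_mapping_fn` lambda in the multi-agent config above can be checked in isolation, without building an algorithm. The following is a minimal sketch, assuming the two agents carry the integer IDs 0 and 1 (as the "Map agent 0 to p0 and agent 1 to p1" comment implies); the named function and the `agent_ids` list are illustrative and not part of the commit.

```python
def policy_mapping_fn(agent_id, episode=None, **kwargs):
    """Same rule as the lambda in the config: agent N is handled by policy "pN"."""
    return f"p{agent_id}"


# The policy IDs declared in `.multi_agent(policies=...)`.
policies = {"p0", "p1"}

# Assumption: the environment with `num_agents=2` uses agent IDs 0 and 1.
agent_ids = [0, 1]

# Resolve each agent to its policy, the way the config would at rollout time.
mapping = {agent_id: policy_mapping_fn(agent_id, episode=None) for agent_id in agent_ids}

# Every mapped policy ID must be one of the declared policies.
unknown = set(mapping.values()) - policies
if unknown:
    raise ValueError(f"Mapping returned undeclared policies: {sorted(unknown)}")

print(mapping)
```

A check like this catches a mismatch between the declared policy IDs and the IDs the mapping function returns before any training starts.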
|
||
|
||
# __enabling-new-api-stack-sa-sac-begin__ | ||
|
||
from ray.rllib.algorithms.sac import SACConfig # noqa | ||
from ray.rllib.env.single_agent_env_runner import SingleAgentEnvRunner # noqa | ||
|
||
|
||
config = ( | ||
SACConfig().environment("Pendulum-v1") | ||
# Switch the new API stack flag to True (False by default). | ||
# This enables the use of the RLModule (replaces ModelV2) AND Learner (replaces | ||
# Policy) classes. | ||
.experimental(_enable_new_api_stack=True) | ||
# However, the above flag only activates the RLModule and Learner APIs. In order | ||
# to utilize all of the new API stack's classes, you also have to specify the | ||
# EnvRunner (replaces RolloutWorker) to use. | ||
# Note that this step will be fully automated in the next release. | ||
.rollouts(env_runner_cls=SingleAgentEnvRunner) | ||
# We are using a simple 1-CPU setup here for learning. However, as the new stack | ||
# supports arbitrary scaling on the learner axis, feel free to set | ||
# `num_learner_workers` to the number of available GPUs for multi-GPU training (and | ||
# `num_gpus_per_learner_worker=1`). | ||
.resources( | ||
num_learner_workers=0, # <- in most cases, set this value to the number of GPUs | ||
num_gpus_per_learner_worker=0, # <- set this to 1, if you have at least 1 GPU | ||
num_cpus_for_local_worker=1, | ||
) | ||
# When using RLlib's default models (RLModules) AND the new EnvRunners, you should | ||
# set this flag in your model config. Setting it will no longer be required in the | ||
# near future. (For PPO, it yields a small performance advantage, as value function | ||
# predictions no longer have to happen on the sampler side, but are now fully | ||
# located on the learner side, which might have GPUs available.) | ||
.training( | ||
model={"uses_new_env_runners": True}, | ||
replay_buffer_config={"type": "EpisodeReplayBuffer"}, | ||
) | ||
) | ||
# __enabling-new-api-stack-sa-sac-end__ | ||
|
||
|
||
# Test whether it works. | ||
print(config.build().train()) |
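The inline comments on `.resources(...)` in the three configs above give a rule of thumb: set `num_learner_workers` to the number of GPUs, and `num_gpus_per_learner_worker` to 1 if at least one GPU is available. A minimal sketch of that rule follows; `learner_resources` is a hypothetical helper, not part of RLlib or of this commit, and it only builds the keyword arguments.

```python
def learner_resources(num_gpus: int) -> dict:
    """Return `.resources()` kwargs following the rule of thumb in the comments.

    Hypothetical helper for illustration only.
    """
    if num_gpus < 0:
        raise ValueError("num_gpus must be >= 0")
    return {
        # "In most cases, set this value to the number of GPUs."
        "num_learner_workers": num_gpus,
        # "Set this to 1, if you have at least 1 GPU."
        "num_gpus_per_learner_worker": 1 if num_gpus > 0 else 0,
        # The examples keep a single CPU for the local worker.
        "num_cpus_for_local_worker": 1,
    }


# With no GPUs, this reproduces the values used in the examples.
cpu_only = learner_resources(0)
# With four GPUs: four learner workers, one GPU each.
four_gpus = learner_resources(4)
print(cpu_only)
print(four_gpus)
```

Under these assumptions, the result could be passed as `config.resources(**learner_resources(n))`, where `n` is the number of GPUs in the cluster.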
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,6 @@ | ||
|
||
.. include:: /_includes/rllib/new_api_stack.rst | ||
|
||
.. _rllib-advanced-api-doc: | ||
|
||
Advanced Python APIs | ||
|