Merge pull request #164 from Toni-SM/develop
Merge develop
Toni-SM committed Jun 24, 2024
2 parents 631613a + e2d86be commit 636936f
Showing 164 changed files with 1,530 additions and 1,246 deletions.
3 changes: 3 additions & 0 deletions .github/ISSUE_TEMPLATE/bug_report.yaml
@@ -30,11 +30,14 @@ body:
description: The skrl version can be obtained with the command `pip show skrl`.
options:
- ---
- 1.2.0
- 1.1.0
- 1.0.0
- 1.0.0-rc2
- 1.0.0-rc1
- 0.10.2 or 0.10.1
- 0.10.0 or earlier
- develop branch
validations:
required: true
- type: input
14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,20 @@

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## [1.2.0] - 2024-06-23
### Added
- Define the `environment_info` trainer config to log environment info (PyTorch implementation)
- Add support to automatically compute the write and checkpoint intervals and make it the default option
- Single forward-pass in shared models
- Distributed multi-GPU and multi-node learning (PyTorch implementation)

### Changed
- Update Orbit-related source code and docs to Isaac Lab

### Fixed
- Move the batch sampling inside gradient step loop for DDPG and TD3
- Perform JAX computation on the selected device

## [1.1.0] - 2024-02-12
### Added
- MultiCategorical mixin to operate MultiDiscrete action spaces
2 changes: 1 addition & 1 deletion README.md
@@ -17,7 +17,7 @@
<h2 align="center" style="border-bottom: 0 !important;">SKRL - Reinforcement Learning library</h2>
<br>

-**skrl** is an open-source modular library for Reinforcement Learning written in Python (on top of [PyTorch](https://pytorch.org/) and [JAX](https://jax.readthedocs.io)) and designed with a focus on modularity, readability, simplicity, and transparency of algorithm implementation. In addition to supporting the OpenAI [Gym](https://www.gymlibrary.dev) / Farama [Gymnasium](https://gymnasium.farama.org) and [DeepMind](https://github.com/deepmind/dm_env) and other environment interfaces, it allows loading and configuring [NVIDIA Isaac Gym](https://developer.nvidia.com/isaac-gym/), [NVIDIA Isaac Orbit](https://isaac-orbit.github.io/orbit/index.html) and [NVIDIA Omniverse Isaac Gym](https://docs.omniverse.nvidia.com/isaacsim/latest/tutorial_gym_isaac_gym.html) environments, enabling agents' simultaneous training by scopes (subsets of environments among all available environments), which may or may not share resources, in the same run.
+**skrl** is an open-source modular library for Reinforcement Learning written in Python (on top of [PyTorch](https://pytorch.org/) and [JAX](https://jax.readthedocs.io)) and designed with a focus on modularity, readability, simplicity, and transparency of algorithm implementation. In addition to supporting the OpenAI [Gym](https://www.gymlibrary.dev) / Farama [Gymnasium](https://gymnasium.farama.org) and [DeepMind](https://github.com/deepmind/dm_env) and other environment interfaces, it allows loading and configuring [NVIDIA Isaac Gym](https://developer.nvidia.com/isaac-gym/), [NVIDIA Omniverse Isaac Gym](https://docs.omniverse.nvidia.com/isaacsim/latest/tutorial_gym_isaac_gym.html) and [NVIDIA Isaac Lab](https://isaac-sim.github.io/IsaacLab/index.html) environments, enabling agents' simultaneous training by scopes (subsets of environments among all available environments), which may or may not share resources, in the same run.

<br>

4 changes: 4 additions & 0 deletions docs/source/api/agents/a2c.rst
@@ -232,6 +232,10 @@ Support for advanced features is described in the next table
- RNN, LSTM, GRU and any other variant
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\square`
* - Distributed
- Single Program Multi Data (SPMD) multi-GPU
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\square`
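
The "Distributed" row refers to Single Program Multi Data training: every process runs the same script on its own share of the data, and the per-process gradients are typically averaged before each optimizer step. The following is a toy, pure-Python illustration of that averaging step only. It is not skrl's implementation; `all_reduce_mean` and the sample gradient values are hypothetical names and numbers chosen for the example.

```python
def all_reduce_mean(per_worker_grads):
    """Average gradients element-wise across workers (simulated all-reduce)."""
    world_size = len(per_worker_grads)
    # sum the i-th gradient component over all workers
    summed = [sum(values) for values in zip(*per_worker_grads)]
    # divide by the number of workers so every worker ends up with the mean
    return [value / world_size for value in summed]


# two simulated workers: same program, different data, different local gradients
local_grads = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
synced = all_reduce_mean(local_grads)
```

In a real PyTorch job this role is usually played by a collective such as `torch.distributed.all_reduce`, with one process per GPU.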

.. raw:: html

4 changes: 4 additions & 0 deletions docs/source/api/agents/amp.rst
@@ -237,6 +237,10 @@ Support for advanced features is described in the next table
- \-
- .. centered:: :math:`\square`
- .. centered:: :math:`\square`
* - Distributed
- Single Program Multi Data (SPMD) multi-GPU
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\square`

.. raw:: html

4 changes: 4 additions & 0 deletions docs/source/api/agents/cem.rst
@@ -175,6 +175,10 @@ Support for advanced features is described in the next table
- \-
- .. centered:: :math:`\square`
- .. centered:: :math:`\square`
* - Distributed
- \-
- .. centered:: :math:`\square`
- .. centered:: :math:`\square`

.. raw:: html

8 changes: 6 additions & 2 deletions docs/source/api/agents/ddpg.rst
@@ -47,10 +47,10 @@ Learning algorithm

|
| :literal:`_update(...)`
-| :green:`# sample a batch from memory`
-| [:math:`s, a, r, s', d`] :math:`\leftarrow` states, actions, rewards, next_states, dones of size :guilabel:`batch_size`
| :green:`# gradient steps`
| **FOR** each gradient step up to :guilabel:`gradient_steps` **DO**
+| :green:`# sample a batch from memory`
+| [:math:`s, a, r, s', d`] :math:`\leftarrow` states, actions, rewards, next_states, dones of size :guilabel:`batch_size`
| :green:`# compute target values`
| :math:`a' \leftarrow \mu_{\theta_{target}}(s')`
| :math:`Q_{_{target}} \leftarrow Q_{\phi_{target}}(s', a')`
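
The effect of this hunk is that each gradient step draws its own batch instead of reusing one batch sampled before the loop. A minimal sketch of the new control flow, assuming a plain Python list as a stand-in for the replay memory (`update`, `memory` and `rng` are hypothetical names, and the actual loss and optimization steps are elided):

```python
import random


def update(memory, batch_size, gradient_steps, rng):
    """Return the batch used by each gradient step (sampling inside the loop)."""
    batches = []
    for _ in range(gradient_steps):
        # sample a fresh batch from memory for this gradient step
        batch = rng.sample(memory, batch_size)
        # ... compute target values, critic loss and policy loss here ...
        batches.append(batch)
    return batches


memory = list(range(100))  # stand-in for stored transitions (s, a, r, s', d)
batches = update(memory, batch_size=4, gradient_steps=3, rng=random.Random(0))
```

With sampling outside the loop, all `gradient_steps` updates would see the same transitions; moving it inside lets each update use a different draw.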
@@ -236,6 +236,10 @@ Support for advanced features is described in the next table
- RNN, LSTM, GRU and any other variant
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\square`
* - Distributed
- Single Program Multi Data (SPMD) multi-GPU
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\square`

.. raw:: html

4 changes: 4 additions & 0 deletions docs/source/api/agents/ddqn.rst
@@ -184,6 +184,10 @@ Support for advanced features is described in the next table
- \-
- .. centered:: :math:`\square`
- .. centered:: :math:`\square`
* - Distributed
- Single Program Multi Data (SPMD) multi-GPU
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\square`

.. raw:: html

4 changes: 4 additions & 0 deletions docs/source/api/agents/dqn.rst
@@ -184,6 +184,10 @@ Support for advanced features is described in the next table
- \-
- .. centered:: :math:`\square`
- .. centered:: :math:`\square`
* - Distributed
- Single Program Multi Data (SPMD) multi-GPU
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\square`

.. raw:: html

4 changes: 4 additions & 0 deletions docs/source/api/agents/ppo.rst
@@ -248,6 +248,10 @@ Support for advanced features is described in the next table
- RNN, LSTM, GRU and any other variant
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\square`
* - Distributed
- Single Program Multi Data (SPMD) multi-GPU
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\square`

.. raw:: html

4 changes: 4 additions & 0 deletions docs/source/api/agents/rpo.rst
@@ -285,6 +285,10 @@ Support for advanced features is described in the next table
- RNN, LSTM, GRU and any other variant
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\square`
* - Distributed
- Single Program Multi Data (SPMD) multi-GPU
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\square`

.. raw:: html

4 changes: 4 additions & 0 deletions docs/source/api/agents/sac.rst
@@ -244,6 +244,10 @@ Support for advanced features is described in the next table
- RNN, LSTM, GRU and any other variant
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\square`
* - Distributed
- Single Program Multi Data (SPMD) multi-GPU
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\square`

.. raw:: html

8 changes: 6 additions & 2 deletions docs/source/api/agents/td3.rst
@@ -47,10 +47,10 @@ Learning algorithm

|
| :literal:`_update(...)`
-| :green:`# sample a batch from memory`
-| [:math:`s, a, r, s', d`] :math:`\leftarrow` states, actions, rewards, next_states, dones of size :guilabel:`batch_size`
| :green:`# gradient steps`
| **FOR** each gradient step up to :guilabel:`gradient_steps` **DO**
+| :green:`# sample a batch from memory`
+| [:math:`s, a, r, s', d`] :math:`\leftarrow` states, actions, rewards, next_states, dones of size :guilabel:`batch_size`
| :green:`# target policy smoothing`
| :math:`a' \leftarrow \mu_{\theta_{target}}(s')`
| :math:`noise \leftarrow \text{clip}(` :guilabel:`smooth_regularization_noise` :math:`, -c, c) \qquad` with :math:`c` as :guilabel:`smooth_regularization_clip`
@@ -258,6 +258,10 @@ Support for advanced features is described in the next table
- RNN, LSTM, GRU and any other variant
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\square`
* - Distributed
- Single Program Multi Data (SPMD) multi-GPU
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\square`

.. raw:: html

4 changes: 4 additions & 0 deletions docs/source/api/agents/trpo.rst
@@ -282,6 +282,10 @@ Support for advanced features is described in the next table
- RNN, LSTM, GRU and any other variant
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\square`
* - Distributed
- Single Program Multi Data (SPMD) multi-GPU
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\square`

.. raw:: html

59 changes: 59 additions & 0 deletions docs/source/api/config/frameworks.rst
@@ -7,6 +7,65 @@ Configurations for behavior modification of Machine Learning (ML) frameworks.

<br><hr>

PyTorch
-------

PyTorch specific configuration

.. raw:: html

<br>

API
^^^

.. py:data:: skrl.config.torch.device
:type: torch.device
:value: "cuda:${LOCAL_RANK}" | "cpu"

Default device

The default device, unless specified, is ``cuda:0`` (or ``cuda:LOCAL_RANK`` in a distributed environment) if CUDA is available, ``cpu`` otherwise
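
The selection rule described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's code: `default_device` is a hypothetical helper, and CUDA availability is passed in as a flag so the example runs without PyTorch installed.

```python
import os


def default_device(cuda_available, environ=None):
    """Return the default device string following the documented rule."""
    environ = os.environ if environ is None else environ
    if not cuda_available:
        return "cpu"
    # LOCAL_RANK is only set in a distributed launch; fall back to GPU 0
    local_rank = int(environ.get("LOCAL_RANK", "0"))
    return f"cuda:{local_rank}"
```

With PyTorch installed, the flag would typically come from `torch.cuda.is_available()`.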

.. py:data:: skrl.config.local_rank
:type: int
:value: 0

The rank of the worker/process (e.g.: GPU) within a local worker group (e.g.: node)

This property reads from the ``LOCAL_RANK`` environment variable (``0`` if it doesn't exist).
See `torch.distributed <https://pytorch.org/docs/stable/distributed.html>`_ for more details

.. py:data:: skrl.config.rank
:type: int
:value: 0

The rank of the worker/process (e.g.: GPU) within a worker group (e.g.: across all nodes)

This property reads from the ``RANK`` environment variable (``0`` if it doesn't exist).
See `torch.distributed <https://pytorch.org/docs/stable/distributed.html>`_ for more details

.. py:data:: skrl.config.world_size
:type: int
:value: 1

The total number of workers/processes (e.g.: GPUs) in a worker group (e.g.: across all nodes)

This property reads from the ``WORLD_SIZE`` environment variable (``1`` if it doesn't exist).
See `torch.distributed <https://pytorch.org/docs/stable/distributed.html>`_ for more details

.. py:data:: skrl.config.is_distributed
:type: bool
:value: False

Whether the process is running in a distributed environment

This property is ``True`` when PyTorch's distributed environment variable ``WORLD_SIZE`` is greater than ``1``

.. raw:: html

<br>
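
The four distributed properties documented above can be derived from the environment roughly as follows. This is a minimal sketch, assuming only the environment variable names and defaults stated in the text; `DistributedInfo` and `read_distributed_info` are hypothetical names, not part of skrl's API.

```python
import os
from typing import Mapping, NamedTuple, Optional


class DistributedInfo(NamedTuple):
    local_rank: int
    rank: int
    world_size: int
    is_distributed: bool


def read_distributed_info(environ: Optional[Mapping[str, str]] = None) -> DistributedInfo:
    """Read LOCAL_RANK, RANK and WORLD_SIZE using the documented defaults."""
    environ = os.environ if environ is None else environ
    local_rank = int(environ.get("LOCAL_RANK", "0"))
    rank = int(environ.get("RANK", "0"))
    world_size = int(environ.get("WORLD_SIZE", "1"))
    # a run counts as distributed only when more than one process takes part
    return DistributedInfo(local_rank, rank, world_size, world_size > 1)
```

A launcher such as `torchrun` sets these variables for each spawned process, so a single-process run falls back to the defaults and reports a non-distributed setup.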

JAX
---

12 changes: 6 additions & 6 deletions docs/source/api/envs.rst
@@ -7,16 +7,16 @@ Environments
Wrapping (single-agent) <envs/wrapping>
Wrapping (multi-agents) <envs/multi_agents_wrapping>
Isaac Gym environments <envs/isaac_gym>
-Isaac Orbit environments <envs/isaac_orbit>
Omniverse Isaac Gym environments <envs/omniverse_isaac_gym>
+Isaac Lab environments <envs/isaaclab>

The environment plays a fundamental and crucial role in defining the RL setup. It is the place where the agent interacts, and it is responsible for providing the agent with information about its current state, as well as the rewards/penalties associated with each action.

.. raw:: html

<br><hr>

-Grouped in this section you will find how to load environments from NVIDIA Isaac Gym, Isaac Orbit and Omniverse Isaac Gym with a simple function.
+Grouped in this section you will find how to load environments from NVIDIA Isaac Gym, Omniverse Isaac Gym and Isaac Lab with a simple function.

In addition, you will be able to :doc:`wrap single-agent <envs/wrapping>` and :doc:`multi-agent <envs/multi_agents_wrapping>` RL environment interfaces.

@@ -29,10 +29,10 @@ In addition, you will be able to :doc:`wrap single-agent <envs/wrapping>` and :d
* - :doc:`Isaac Gym environments <envs/isaac_gym>`
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\blacksquare`
-* - :doc:`Isaac Orbit environments <envs/isaac_orbit>`
+* - :doc:`Omniverse Isaac Gym environments <envs/omniverse_isaac_gym>`
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\blacksquare`
-* - :doc:`Omniverse Isaac Gym environments <envs/omniverse_isaac_gym>`
+* - :doc:`Isaac Lab environments <envs/isaaclab>`
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\blacksquare`

@@ -57,10 +57,10 @@ In addition, you will be able to :doc:`wrap single-agent <envs/wrapping>` and :d
* - Isaac Gym (previews)
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\blacksquare`
-* - Isaac Orbit
+* - Omniverse Isaac Gym |_5| |_5| |_5| |_5| |_2|
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\blacksquare`
-* - Omniverse Isaac Gym |_5| |_5| |_5| |_5| |_2|
+* - Isaac Lab
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\blacksquare`
* - PettingZoo
90 changes: 0 additions & 90 deletions docs/source/api/envs/isaac_orbit.rst

This file was deleted.
