Merge pull request #106 from Toni-SM/develop
Develop
Toni-SM committed Aug 11, 2023
2 parents 00a2fd3 + 0000bdf commit 1a32596
Showing 335 changed files with 3,307 additions and 2,166 deletions.
10 changes: 10 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,16 @@

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## [1.0.0-rc.2] - Unreleased
### Added
- Get truncation from `time_outs` info in Isaac Gym, Isaac Orbit and Omniverse Isaac Gym environments
- Time-limit (truncation) bootstrapping in on-policy actor-critic agents
- Model instantiators `initial_log_std` parameter to set the log standard deviation's initial value

### Changed
- Structure environment loaders and wrappers file hierarchy coherently [**breaking change**]
- Drop support for versions prior to PyTorch 1.9 (1.8.0 and 1.8.1)
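
The time-limit bootstrapping entry above can be illustrated with a minimal sketch. It is not skrl's implementation: the function name, argument names and the plain-list formulation are assumptions made for illustration. The idea is that when an episode is cut off by a time limit (truncated) rather than ending in a terminal state, the discounted value estimate of the state that follows is added to the reward before returns are computed.

```python
def bootstrap_truncated_rewards(rewards, next_values, truncated, discount_factor=0.99):
    """Return rewards with a discounted value estimate added on truncated steps.

    All names here are hypothetical; this only sketches the general technique.
    """
    return [
        reward + discount_factor * value if was_truncated else reward
        for reward, value, was_truncated in zip(rewards, next_values, truncated)
    ]


# Only the second step was cut off by the time limit, so only it is adjusted.
adjusted = bootstrap_truncated_rewards(
    rewards=[1.0, 1.0, 1.0],
    next_values=[0.5, 2.0, 4.0],
    truncated=[False, True, False],
    discount_factor=0.5,
)
print(adjusted)
```

Steps that end in a true terminal state are left untouched, since no future value should be credited there.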

## [1.0.0-rc.1] - 2023-07-25
### Added
- JAX support (with Flax and Optax)
2 changes: 1 addition & 1 deletion README.md
@@ -15,7 +15,7 @@
<h2 align="center" style="border-bottom: 0 !important;">SKRL - Reinforcement Learning library</h2>
<br>

**skrl** is an open-source modular library for Reinforcement Learning written in Python (on top of [PyTorch](https://pytorch.org/) and [JAX](https://jax.readthedocs.io) and designed with a focus on modularity, readability, simplicity, and transparency of algorithm implementation. In addition to supporting the OpenAI [Gym](https://www.gymlibrary.dev) / Farama [Gymnasium](https://gymnasium.farama.org) and [DeepMind](https://github.com/deepmind/dm_env) and other environment interfaces, it allows loading and configuring [NVIDIA Isaac Gym](https://developer.nvidia.com/isaac-gym/), [NVIDIA Isaac Orbit](https://isaac-orbit.github.io/orbit/index.html) and [NVIDIA Omniverse Isaac Gym](https://docs.omniverse.nvidia.com/app_isaacsim/app_isaacsim/tutorial_gym_isaac_gym.html) environments, enabling agents' simultaneous training by scopes (subsets of environments among all available environments), which may or may not share resources, in the same run.
**skrl** is an open-source modular library for Reinforcement Learning written in Python (on top of [PyTorch](https://pytorch.org/) and [JAX](https://jax.readthedocs.io)) and designed with a focus on modularity, readability, simplicity, and transparency of algorithm implementation. In addition to supporting the OpenAI [Gym](https://www.gymlibrary.dev) / Farama [Gymnasium](https://gymnasium.farama.org) and [DeepMind](https://github.com/deepmind/dm_env) and other environment interfaces, it allows loading and configuring [NVIDIA Isaac Gym](https://developer.nvidia.com/isaac-gym/), [NVIDIA Isaac Orbit](https://isaac-orbit.github.io/orbit/index.html) and [NVIDIA Omniverse Isaac Gym](https://docs.omniverse.nvidia.com/isaacsim/latest/tutorial_gym_isaac_gym.html) environments, enabling agents' simultaneous training by scopes (subsets of environments among all available environments), which may or may not share resources, in the same run.

<br>

Binary file modified docs/source/_static/imgs/example_parallel.jpg
2 changes: 1 addition & 1 deletion docs/source/_static/imgs/model_categorical-dark.svg
2 changes: 1 addition & 1 deletion docs/source/_static/imgs/model_categorical-light.svg
2 changes: 1 addition & 1 deletion docs/source/_static/imgs/model_deterministic-dark.svg
2 changes: 1 addition & 1 deletion docs/source/_static/imgs/model_deterministic-light.svg
2 changes: 1 addition & 1 deletion docs/source/_static/imgs/model_gaussian-dark.svg
2 changes: 1 addition & 1 deletion docs/source/_static/imgs/model_gaussian-light.svg
4 changes: 2 additions & 2 deletions docs/source/api/agents/a2c.rst
@@ -143,8 +143,8 @@ Configuration and hyperparameters

.. literalinclude:: ../../../../skrl/agents/torch/a2c/a2c.py
:language: python
:lines: 18-54
:linenos:
:start-after: [start-config-dict-torch]
:end-before: [end-config-dict-torch]
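
The hunk above swaps a fixed `:lines:` range for `:start-after:` / `:end-before:` markers, so the included snippet no longer shifts when the source file is edited. A rough sketch of the selection these options perform follows. The helper `extract_between_markers` and the sample source are hypothetical, and Sphinx's real implementation differs in detail.

```python
def extract_between_markers(text, start_after, end_before):
    """Return the lines strictly between the first line containing
    `start_after` and the next line containing `end_before`."""
    selected = []
    started = False
    for line in text.splitlines():
        if not started:
            # Skip everything up to and including the start marker line.
            if start_after in line:
                started = True
            continue
        if end_before in line:
            break
        selected.append(line)
    return "\n".join(selected)


# Hypothetical source file with marker comments around a config dictionary.
SAMPLE_SOURCE = '''import copy

# [start-config-dict-torch]
EXAMPLE_DEFAULT_CONFIG = {
    "rollouts": 16,
}
# [end-config-dict-torch]

class Agent:
    pass
'''

snippet = extract_between_markers(
    SAMPLE_SOURCE, "[start-config-dict-torch]", "[end-config-dict-torch]"
)
print(snippet)
```

Because the markers travel with the code, adding imports or comments above the dictionary does not change what the documentation includes.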

.. raw:: html

4 changes: 2 additions & 2 deletions docs/source/api/agents/amp.rst
@@ -139,8 +139,8 @@ Configuration and hyperparameters

.. literalinclude:: ../../../../skrl/agents/torch/amp/amp.py
:language: python
:lines: 18-71
:linenos:
:start-after: [start-config-dict-torch]
:end-before: [end-config-dict-torch]

.. raw:: html

4 changes: 2 additions & 2 deletions docs/source/api/agents/cem.rst
@@ -98,8 +98,8 @@ Configuration and hyperparameters

.. literalinclude:: ../../../../skrl/agents/torch/cem/cem.py
:language: python
:lines: 15-44
:linenos:
:start-after: [start-config-dict-torch]
:end-before: [end-config-dict-torch]

.. raw:: html

4 changes: 2 additions & 2 deletions docs/source/api/agents/ddpg.rst
@@ -138,8 +138,8 @@ Configuration and hyperparameters

.. literalinclude:: ../../../../skrl/agents/torch/ddpg/ddpg.py
:language: python
:lines: 16-56
:linenos:
:start-after: [start-config-dict-torch]
:end-before: [end-config-dict-torch]

.. raw:: html

4 changes: 2 additions & 2 deletions docs/source/api/agents/ddqn.rst
@@ -98,8 +98,8 @@ Configuration and hyperparameters

.. literalinclude:: ../../../../skrl/agents/torch/dqn/ddqn.py
:language: python
:lines: 16-55
:linenos:
:start-after: [start-config-dict-torch]
:end-before: [end-config-dict-torch]

.. raw:: html

4 changes: 2 additions & 2 deletions docs/source/api/agents/dqn.rst
@@ -98,8 +98,8 @@ Configuration and hyperparameters

.. literalinclude:: ../../../../skrl/agents/torch/dqn/dqn.py
:language: python
:lines: 16-55
:linenos:
:start-after: [start-config-dict-torch]
:end-before: [end-config-dict-torch]

.. raw:: html

6 changes: 3 additions & 3 deletions docs/source/api/agents/ppo.rst
@@ -71,7 +71,7 @@ Learning algorithm
| :green:`# mini-batches loop`
| **FOR** each mini-batch [:math:`s, a, logp, V, R, A`] up to :guilabel:`mini_batches` **DO**
| :math:`logp' \leftarrow \pi_\theta(s, a)`
| :green:`# compute aproximate KL divergence`
| :green:`# compute approximate KL divergence`
| :math:`ratio \leftarrow logp' - logp`
| :math:`KL_{_{divergence}} \leftarrow \frac{1}{N} \sum_{i=1}^N ((e^{ratio} - 1) - ratio)`
| :green:`# early stopping with KL divergence`
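
The approximate KL divergence in the pseudocode above, :math:`\frac{1}{N} \sum_{i=1}^N ((e^{ratio} - 1) - ratio)` with :math:`ratio = logp' - logp`, can be checked numerically with a small sketch. The function name and the plain-list inputs are assumptions for illustration and are not taken from skrl.

```python
import math


def approximate_kl_divergence(new_log_probs, old_log_probs):
    """Mean of (exp(ratio) - 1) - ratio, where ratio = new_log_prob - old_log_prob."""
    ratios = [new - old for new, old in zip(new_log_probs, old_log_probs)]
    return sum((math.exp(ratio) - 1.0) - ratio for ratio in ratios) / len(ratios)


# Identical log-probabilities: every ratio is 0, so the estimate is 0.
kl_same = approximate_kl_divergence([-1.0, -2.0], [-1.0, -2.0])

# Shifted log-probabilities: each term is non-negative, so the estimate grows.
kl_shifted = approximate_kl_divergence([-1.0, -2.0], [-1.5, -1.5])
print(kl_same, kl_shifted)
```

Each term :math:`(e^{x} - 1) - x` is non-negative for any real :math:`x`, which is why the estimate can be compared directly against a threshold for early stopping.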
@@ -159,8 +159,8 @@ Configuration and hyperparameters

.. literalinclude:: ../../../../skrl/agents/torch/ppo/ppo.py
:language: python
:lines: 18-61
:linenos:
:start-after: [start-config-dict-torch]
:end-before: [end-config-dict-torch]

.. raw:: html

4 changes: 2 additions & 2 deletions docs/source/api/agents/q_learning.rst
@@ -78,8 +78,8 @@ Configuration and hyperparameters

.. literalinclude:: ../../../../skrl/agents/torch/q_learning/q_learning.py
:language: python
:lines: 14-35
:linenos:
:start-after: [start-config-dict-torch]
:end-before: [end-config-dict-torch]

.. raw:: html

6 changes: 3 additions & 3 deletions docs/source/api/agents/rpo.rst
@@ -110,7 +110,7 @@ Learning algorithm
| :green:`# mini-batches loop`
| **FOR** each mini-batch [:math:`s, a, logp, V, R, A`] up to :guilabel:`mini_batches` **DO**
| :math:`logp' \leftarrow \pi_\theta(s, a)`
| :green:`# compute aproximate KL divergence`
| :green:`# compute approximate KL divergence`
| :math:`ratio \leftarrow logp' - logp`
| :math:`KL_{_{divergence}} \leftarrow \frac{1}{N} \sum_{i=1}^N ((e^{ratio} - 1) - ratio)`
| :green:`# early stopping with KL divergence`
@@ -198,8 +198,8 @@ Configuration and hyperparameters

.. literalinclude:: ../../../../skrl/agents/torch/rpo/rpo.py
:language: python
:lines: 18-62
:linenos:
:start-after: [start-config-dict-torch]
:end-before: [end-config-dict-torch]

.. raw:: html

4 changes: 2 additions & 2 deletions docs/source/api/agents/sac.rst
@@ -139,8 +139,8 @@ Configuration and hyperparameters

.. literalinclude:: ../../../../skrl/agents/torch/sac/sac.py
:language: python
:lines: 18-56
:linenos:
:start-after: [start-config-dict-torch]
:end-before: [end-config-dict-torch]

.. raw:: html

4 changes: 2 additions & 2 deletions docs/source/api/agents/sarsa.rst
@@ -78,8 +78,8 @@ Configuration and hyperparameters

.. literalinclude:: ../../../../skrl/agents/torch/sarsa/sarsa.py
:language: python
:lines: 14-35
:linenos:
:start-after: [start-config-dict-torch]
:end-before: [end-config-dict-torch]

.. raw:: html

4 changes: 2 additions & 2 deletions docs/source/api/agents/td3.rst
@@ -148,8 +148,8 @@ Configuration and hyperparameters

.. literalinclude:: ../../../../skrl/agents/torch/td3/td3.py
:language: python
:lines: 19-63
:linenos:
:start-after: [start-config-dict-torch]
:end-before: [end-config-dict-torch]

.. raw:: html

4 changes: 2 additions & 2 deletions docs/source/api/agents/trpo.rst
@@ -195,8 +195,8 @@ Configuration and hyperparameters

.. literalinclude:: ../../../../skrl/agents/torch/trpo/trpo.py
:language: python
:lines: 18-61
:linenos:
:start-after: [start-config-dict-torch]
:end-before: [end-config-dict-torch]

.. raw:: html

2 changes: 1 addition & 1 deletion docs/source/api/envs.rst
@@ -65,7 +65,7 @@ In addition, you will be able to :doc:`wrap single-agent <envs/wrapping>` and :d
- .. centered:: :math:`\blacksquare`
* - PettingZoo
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\square`
- .. centered:: :math:`\blacksquare`
* - robosuite
- .. centered:: :math:`\blacksquare`
- .. centered:: :math:`\square`
6 changes: 3 additions & 3 deletions docs/source/api/envs/isaac_gym.rst
@@ -98,7 +98,7 @@ Usage
API
^^^

.. autofunction:: skrl.envs.torch.loaders.load_isaacgym_env_preview4
.. autofunction:: skrl.envs.loaders.torch.load_isaacgym_env_preview4

.. raw:: html

@@ -181,7 +181,7 @@ Usage
API
^^^

.. autofunction:: skrl.envs.torch.loaders.load_isaacgym_env_preview3
.. autofunction:: skrl.envs.loaders.torch.load_isaacgym_env_preview3

.. raw:: html

@@ -260,4 +260,4 @@ Usage
API
^^^

.. autofunction:: skrl.envs.torch.loaders.load_isaacgym_env_preview2
.. autofunction:: skrl.envs.loaders.torch.load_isaacgym_env_preview2
2 changes: 1 addition & 1 deletion docs/source/api/envs/isaac_orbit.rst
@@ -87,4 +87,4 @@ Usage
API
^^^

.. autofunction:: skrl.envs.torch.loaders.load_isaac_orbit_env
.. autofunction:: skrl.envs.loaders.torch.load_isaac_orbit_env
16 changes: 8 additions & 8 deletions docs/source/api/envs/multi_agents_wrapping.rst
@@ -82,7 +82,7 @@ Usage
API (PyTorch)
-------------

.. autofunction:: skrl.envs.torch.wrappers.wrap_env
.. autofunction:: skrl.envs.wrappers.torch.wrap_env

.. raw:: html

@@ -91,7 +91,7 @@ API (PyTorch)
API (JAX)
---------

.. autofunction:: skrl.envs.jax.wrappers.wrap_env
.. autofunction:: skrl.envs.wrappers.jax.wrap_env

.. raw:: html

@@ -100,7 +100,7 @@ API (JAX)
Internal API (PyTorch)
----------------------

.. autoclass:: skrl.envs.torch.wrappers.MultiAgentEnvWrapper
.. autoclass:: skrl.envs.wrappers.torch.MultiAgentEnvWrapper
:undoc-members:
:show-inheritance:
:members:
@@ -117,14 +117,14 @@ Internal API (PyTorch)
A list of all possible_agents the environment could generate

.. autoclass:: skrl.envs.torch.wrappers.BiDexHandsWrapper
.. autoclass:: skrl.envs.wrappers.torch.BiDexHandsWrapper
:undoc-members:
:show-inheritance:
:members:

.. automethod:: __init__

.. autoclass:: skrl.envs.torch.wrappers.PettingZooWrapper
.. autoclass:: skrl.envs.wrappers.torch.PettingZooWrapper
:undoc-members:
:show-inheritance:
:members:
@@ -138,7 +138,7 @@ Internal API (PyTorch)
Internal API (JAX)
------------------

.. autoclass:: skrl.envs.jax.wrappers.MultiAgentEnvWrapper
.. autoclass:: skrl.envs.wrappers.jax.MultiAgentEnvWrapper
:undoc-members:
:show-inheritance:
:members:
@@ -155,14 +155,14 @@ Internal API (JAX)
A list of all possible_agents the environment could generate

.. autoclass:: skrl.envs.jax.wrappers.BiDexHandsWrapper
.. autoclass:: skrl.envs.wrappers.jax.BiDexHandsWrapper
:undoc-members:
:show-inheritance:
:members:

.. automethod:: __init__

.. autoclass:: skrl.envs.jax.wrappers.PettingZooWrapper
.. autoclass:: skrl.envs.wrappers.jax.PettingZooWrapper
:undoc-members:
:show-inheritance:
:members:
2 changes: 1 addition & 1 deletion docs/source/api/envs/omniverse_isaac_gym.rst
@@ -159,4 +159,4 @@ In this approach, the RL algorithm is executed on a secondary thread while the s
API
^^^

.. autofunction:: skrl.envs.torch.loaders.load_omniverse_isaacgym_env
.. autofunction:: skrl.envs.loaders.torch.load_omniverse_isaacgym_env
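
The renames in the hunks above move the framework name to the end of the module path: `skrl.envs.torch.loaders` becomes `skrl.envs.loaders.torch`, and likewise for `wrappers` and `jax`. A small helper for rewriting old dotted paths in user code is sketched below. It is hypothetical and not part of skrl, and it only covers the `loaders` and `wrappers` modules that appear in this diff.

```python
import re

# Matches the old layout: skrl.envs.<framework>.<module>
_OLD_PATH = re.compile(r"\bskrl\.envs\.(torch|jax)\.(loaders|wrappers)\b")


def migrate_import_path(source):
    """Rewrite old-style paths to the new skrl.envs.<module>.<framework> layout."""
    return _OLD_PATH.sub(r"skrl.envs.\2.\1", source)


old_line = "from skrl.envs.torch.wrappers import wrap_env"
new_line = migrate_import_path(old_line)
print(new_line)
```

Paths that already follow the new layout do not match the pattern, so running the helper twice leaves them unchanged.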
