[Doc] Update README.md (#2155)
Shixian Sheng committed May 6, 2024
1 parent 730cfb4 commit 0707e07
Showing 1 changed file: README.md (32 additions, 32 deletions).
It provides PyTorch- and **Python-first**, low- and high-level abstractions for RL that are intended to be **efficient**, **modular**, **documented** and properly **tested**.
The code is aimed at supporting research in RL. Most of it is written in Python in a highly modular way, so that researchers can easily swap components, transform them or write new ones with little effort.

This repo attempts to align with the existing PyTorch ecosystem libraries in that it has a dataset pillar ([torchrl/envs](https://github.com/pytorch/rl/blob/main/torchrl/envs)), [transforms](https://github.com/pytorch/rl/blob/main/torchrl/envs/transforms), [models](https://github.com/pytorch/rl/blob/main/torchrl/modules), data utilities (e.g. collectors and containers), etc.
TorchRL aims at having as few dependencies as possible (the Python standard library, NumPy and PyTorch). Common environment libraries (e.g. OpenAI Gym) are optional.

On the low-level end, TorchRL comes with a set of highly reusable functionals for cost functions, returns and data processing.
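
As an illustration of these utilities, here is a minimal sketch, not taken from this README, of computing advantages and value targets with the GAE value estimator on a toy batch of transitions (observation size, keys and shapes are hypothetical):

```python
import torch
from tensordict import TensorDict
from tensordict.nn import TensorDictModule
from torch import nn
from torchrl.objectives.value import GAE

# hypothetical critic: 3-dim observation -> scalar state value
value_net = TensorDictModule(nn.Linear(3, 1), in_keys=["observation"], out_keys=["state_value"])
gae = GAE(gamma=0.99, lmbda=0.95, value_network=value_net)

# a fake batch of 10 transitions, laid out the way TorchRL environments write them
data = TensorDict(
    {
        "observation": torch.randn(10, 3),
        "next": TensorDict(
            {
                "observation": torch.randn(10, 3),
                "reward": torch.randn(10, 1),
                "done": torch.zeros(10, 1, dtype=torch.bool),
                "terminated": torch.zeros(10, 1, dtype=torch.bool),
            },
            batch_size=[10],
        ),
    },
    batch_size=[10],
)

gae(data)  # writes "advantage" and "value_target" entries into the tensordict
```
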
Here is an example of how the [environment API](https://pytorch.org/rl/stable/reference/envs.html)
relies on tensordict to carry data from one function to another during a rollout
execution:
![Rollout execution carried by a TensorDict](https://github.com/pytorch/rl/blob/main/docs/source/_static/img/rollout.gif)

`TensorDict` makes it easy to reuse pieces of code across environments, models and
algorithms.
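
For instance, a hand-written rollout loop only needs to pass a single TensorDict between the policy and the environment. The sketch below (function and variable names are illustrative, not from this README) shows the pattern:

```python
import torch
from torchrl.envs.utils import step_mdp

def rollout(env, policy, steps=100):
    # every component reads from and writes to the same TensorDict,
    # so the loop does not depend on the environment or policy internals
    data = []
    td = env.reset()
    for _ in range(steps):
        td = policy(td)        # reads "observation", writes "action"
        td = env.step(td)      # reads "action", writes the "next" entries
        data.append(td.clone())
        td = step_mdp(td)      # promotes the "next" entries to the root for the next step
    return torch.stack(data)   # TensorDicts stack like regular tensors
```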

## Features

- A common [interface for environments](https://github.com/pytorch/rl/blob/main/torchrl/envs)
which supports common libraries (OpenAI Gym, DeepMind Control Suite, etc.)<sup>(1)</sup> and stateless execution
(e.g. model-based environments).
The [batched environments](https://github.com/pytorch/rl/blob/main/torchrl/envs/batched_envs.py) containers allow parallel execution<sup>(2)</sup>.
A common PyTorch-first [tensor specification class](https://github.com/pytorch/rl/blob/main/torchrl/data/tensor_specs.py) is also provided.
TorchRL's environments API is simple but stringent and specific. Check the
[documentation](https://pytorch.org/rl/stable/reference/envs.html)
and [tutorial](https://pytorch.org/rl/stable/tutorials/pendulum.html) to learn more!
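
<details>
<summary>Code</summary>

The code example originally collapsed here is not shown in this view; as a stand-in, here is a minimal sketch (assuming the optional Gym backend is installed) of building, transforming and rolling out an environment:

```python
from torchrl.envs import GymEnv, TransformedEnv
from torchrl.envs.transforms import StepCounter

# wrap a Gym environment and append a transform that counts steps
env = TransformedEnv(GymEnv("Pendulum-v1"), StepCounter())
env.set_seed(0)

# the rollout is a TensorDict of batch_size [10] holding observations,
# actions, rewards and the "next" states
rollout = env.rollout(max_steps=10)
print(rollout)
```
</details>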

- multiprocess and distributed [data collectors](https://github.com/pytorch/rl/blob/main/torchrl/collectors/collectors.py)<sup>(2)</sup>
that work synchronously or asynchronously.
Through the use of TensorDict, TorchRL's training loops are made very similar
to regular training loops in supervised learning:
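
<details>
<summary>Code</summary>

The collapsed example is not reproduced in this view; the following is a minimal sketch (Gym backend assumed, with a toy linear policy) of iterating over a `SyncDataCollector` much like a dataloader:

```python
import torch.nn as nn
from tensordict.nn import TensorDictModule
from torchrl.collectors import SyncDataCollector
from torchrl.envs import GymEnv

env = GymEnv("Pendulum-v1")
# toy policy: maps the observation to a 1-dim action
policy = TensorDictModule(
    nn.LazyLinear(env.action_spec.shape[-1]),
    in_keys=["observation"],
    out_keys=["action"],
)

collector = SyncDataCollector(env, policy, frames_per_batch=64, total_frames=256)
for data in collector:          # batches of experience arrive as TensorDicts
    print(data["next", "reward"].mean())
collector.shutdown()
```
</details>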

Check our [distributed collector examples](https://github.com/pytorch/rl/blob/main/examples/distributed/collectors) to
learn more about ultra-fast data collection with TorchRL.

- efficient<sup>(2)</sup> and generic<sup>(1)</sup> [replay buffers](https://github.com/pytorch/rl/blob/main/torchrl/data/replay_buffers/replay_buffers.py) with modularized storage:
<details>
<summary>Code</summary>

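The original snippet is collapsed in this view; here is a minimal sketch of a replay buffer backed by memory-mapped storage that stores and samples TensorDicts (the storage backend is swappable):

```python
import torch
from tensordict import TensorDict
from torchrl.data import LazyMemmapStorage, TensorDictReplayBuffer

# storage is modular: LazyMemmapStorage can be swapped for ListStorage, LazyTensorStorage, ...
buffer = TensorDictReplayBuffer(storage=LazyMemmapStorage(max_size=1000), batch_size=32)

data = TensorDict(
    {"observation": torch.randn(64, 3), "reward": torch.randn(64, 1)},
    batch_size=[64],
)
buffer.extend(data)       # add a batch of transitions
sample = buffer.sample()  # returns a TensorDict of batch_size [32]
```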
</details>


- cross-library [environment transforms](https://github.com/pytorch/rl/blob/main/torchrl/envs/transforms/transforms.py)<sup>(1)</sup>,
executed on device and in a vectorized fashion<sup>(2)</sup>,
which process and prepare the data coming out of the environments to be used by the agent:
<details>
<summary>Code</summary>

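The collapsed example is omitted in this view; as a stand-in, a minimal sketch (Gym backend assumed) of composing vectorized, on-device transforms on top of an environment:

```python
from torchrl.envs import Compose, GymEnv, TransformedEnv
from torchrl.envs.transforms import GrayScale, Resize, ToTensorImage

# the same transforms work regardless of the backend library providing the env
env = TransformedEnv(
    GymEnv("Pendulum-v1", from_pixels=True),
    Compose(ToTensorImage(), Resize(84, 84), GrayScale()),
)
td = env.reset()
print(td["pixels"].shape)  # processed image, ready to be consumed by the agent
```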
</details>

- various tools for distributed learning (e.g. [memory mapped tensors](https://github.com/pytorch/tensordict/blob/main/tensordict/memmap.py))<sup>(2)</sup>;
- various [architectures](https://github.com/pytorch/rl/blob/main/torchrl/modules/models/) and models (e.g. [actor-critic](https://github.com/pytorch/rl/blob/main/torchrl/modules/tensordict_module/actors.py))<sup>(1)</sup>:
<details>
<summary>Code</summary>

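The collapsed example is omitted here; as an illustration, a minimal sketch of an actor-critic pair built from TorchRL modules (network sizes and keys are placeholders):

```python
import torch.nn as nn
from tensordict.nn import NormalParamExtractor, TensorDictModule
from torchrl.modules import MLP, ProbabilisticActor, TanhNormal, ValueOperator

# actor: observation -> (loc, scale) -> TanhNormal -> action
actor_net = nn.Sequential(
    MLP(out_features=2, num_cells=[64, 64]),  # 2 = loc and scale for a 1-dim action
    NormalParamExtractor(),
)
actor_module = TensorDictModule(actor_net, in_keys=["observation"], out_keys=["loc", "scale"])
actor = ProbabilisticActor(actor_module, in_keys=["loc", "scale"], distribution_class=TanhNormal)

# critic: observation -> state value, written under the "state_value" key
critic = ValueOperator(MLP(out_features=1, num_cells=[64, 64]), in_keys=["observation"])
```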
</details>

- exploration [wrappers](https://github.com/pytorch/rl/blob/main/torchrl/modules/tensordict_module/exploration.py) and
[modules](https://github.com/pytorch/rl/blob/main/torchrl/modules/models/exploration.py) to easily swap between exploration and exploitation<sup>(1)</sup>:
<details>
<summary>Code</summary>

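The collapsed example is omitted in this view; below is a minimal sketch (toy 4-dim observation, 1-dim action) of switching the same stochastic policy between exploration and exploitation:

```python
import torch
import torch.nn as nn
from tensordict import TensorDict
from tensordict.nn import NormalParamExtractor, TensorDictModule
from torchrl.envs.utils import ExplorationType, set_exploration_type
from torchrl.modules import ProbabilisticActor, TanhNormal

net = nn.Sequential(nn.Linear(4, 2), NormalParamExtractor())
module = TensorDictModule(net, in_keys=["observation"], out_keys=["loc", "scale"])
actor = ProbabilisticActor(module, in_keys=["loc", "scale"], distribution_class=TanhNormal)

td = TensorDict({"observation": torch.randn(8, 4)}, batch_size=[8])
with set_exploration_type(ExplorationType.RANDOM):
    explore_action = actor(td.clone())["action"]   # sampled from the distribution: exploration
with set_exploration_type(ExplorationType.MODE):
    exploit_action = actor(td.clone())["action"]   # mode of the distribution: exploitation
```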

</details>

- a generic [trainer class](https://github.com/pytorch/rl/blob/main/torchrl/trainers/trainers.py)<sup>(1)</sup> that
executes the aforementioned training loop. Through a hooking mechanism,
it also supports any logging or data transformation operation at any given
time.

- various [recipes](https://github.com/pytorch/rl/blob/main/torchrl/trainers/helpers/models.py) to build models that
correspond to the environment being deployed.

If you feel a feature is missing from the library, please submit an issue!
If you would like to contribute to new features, check our [call for contributions](https://github.com/pytorch/rl/issues/509) and our [contribution](https://github.com/pytorch/rl/blob/main/CONTRIBUTING.md) page.


## Examples, tutorials and demos

A series of [examples](https://github.com/pytorch/rl/blob/main/examples/) is provided for illustrative purposes:
- [DQN](https://github.com/pytorch/rl/blob/main/sota-implementations/dqn)
- [DDPG](https://github.com/pytorch/rl/blob/main/sota-implementations/ddpg/ddpg.py)
- [IQL](https://github.com/pytorch/rl/blob/main/sota-implementations/iql/iql_offline.py)
- [CQL](https://github.com/pytorch/rl/blob/main/sota-implementations/cql/cql_offline.py)
- [TD3](https://github.com/pytorch/rl/blob/main/sota-implementations/td3/td3.py)
- [A2C](https://github.com/pytorch/rl/blob/main/examples/a2c_old/a2c.py)
- [PPO](https://github.com/pytorch/rl/blob/main/sota-implementations/ppo/ppo.py)
- [SAC](https://github.com/pytorch/rl/blob/main/sota-implementations/sac/sac.py)
- [REDQ](https://github.com/pytorch/rl/blob/main/sota-implementations/redq/redq.py)
- [Dreamer](https://github.com/pytorch/rl/blob/main/sota-implementations/dreamer/dreamer.py)
- [Decision Transformers](https://github.com/pytorch/rl/blob/main/sota-implementations/decision_transformer)
- [RLHF](https://github.com/pytorch/rl/blob/main/examples/rlhf)

and many more to come!

Check the [examples](https://github.com/pytorch/rl/blob/main/sota-implementations/) directory for more details
about handling the various configuration settings.

We also provide [tutorials and demos](https://pytorch.org/rl/stable#tutorials) that give a sense of
what the library can do.

If an error is raised when importing the library, it usually means that the C++ extensions were not installed or not found.

Versioning issues can cause error messages of the type `undefined symbol`
and such. For these, refer to the [versioning issues document](https://github.com/pytorch/rl/blob/main/knowledge_base/VERSIONING_ISSUES.md)
for a complete explanation and proposed workarounds.

## Asking a question

For more general questions about RL in PyTorch, post them on
the [PyTorch forum](https://discuss.pytorch.org/c/reinforcement-learning/6).
## Contributing

Contributions to TorchRL are welcome! Feel free to fork, submit issues and PRs.
You can check out the detailed contribution guide [here](https://github.com/pytorch/rl/blob/main/CONTRIBUTING.md).
As mentioned above, a list of open contributions can be found [here](https://github.com/pytorch/rl/issues/509).

Contributors are recommended to install [pre-commit hooks](https://pre-commit.com/) (using `pre-commit install`). pre-commit will check for linting-related issues when the code is committed locally. You can disable the check by appending `-n` to your commit command: `git commit -m <commit message> -n`.

BC-breaking changes are likely to happen, but they will be introduced with a deprecation
warning after a few release cycles.

# License
TorchRL is licensed under the MIT License. See [LICENSE](https://github.com/pytorch/rl/blob/main/LICENSE) for details.
