A playground for reinforcement learning using PyTorch.
This repo is for quickly testing out new ideas in RL, with an emphasis on simplicity.
Features • Installation • Getting Started
Implemented algorithms:

- SAC (N-step)
- AWAC (N-step)
- PPO (N-step)
Supported action distributions:

| | Continuous | Categorical | Gumbel |
|---|---|---|---|
| SAC | X | X | X |
| AWAC | X | - | X |
| PPO | X | X | X |
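Here, Gumbel presumably refers to the Gumbel-softmax (relaxed categorical) distribution, which gives a differentiable sample over discrete actions. A minimal PyTorch illustration, independent of this repo's code:

```python
import torch
import torch.nn.functional as F

# Differentiable, approximately one-hot sample over 4 discrete actions.
logits = torch.randn(1, 4, requires_grad=True)
action = F.gumbel_softmax(logits, tau=1.0, hard=True)  # straight-through one-hot sample
action.sum().backward()                                # gradients flow back to the logits
```

With `hard=True` the forward pass yields a one-hot action while gradients flow through the soft relaxation.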
Supported network architectures:

| | MLP | Convolutional |
|---|---|---|
| SAC | X | X |
| AWAC | X | X |
| PPO | X | X |
Supported environments:

- Gym
- Unity ML-Agents
- N-step returns are computed with matrix multiplications, without the need for Python for-loops, which significantly improves performance (see core/tools/gamma_matrix.py).
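The idea is that the discounted sum over an N-step window is just a matrix product between the reward sequence and a matrix of powers of gamma. A rough sketch of that computation (the function names below are illustrative, not the actual API in core/tools/gamma_matrix.py):

```python
import torch

def make_gamma_matrix(gamma: float, n_steps: int) -> torch.Tensor:
    # Upper-triangular matrix with entry [t, k] = gamma ** (k - t) for k >= t, else 0.
    steps = torch.arange(n_steps, dtype=torch.float32)
    exponents = steps.unsqueeze(0) - steps.unsqueeze(1)  # entry [t, k] = k - t
    return torch.where(exponents >= 0, gamma ** exponents, torch.zeros_like(exponents))

def n_step_returns(rewards: torch.Tensor, gamma_matrix: torch.Tensor) -> torch.Tensor:
    # rewards: (batch, n_steps). One matmul replaces the per-step Python loop:
    # result[b, t] = sum over k >= t of gamma ** (k - t) * rewards[b, k].
    return rewards @ gamma_matrix.T

returns = n_step_returns(torch.ones(2, 5), make_gamma_matrix(0.99, 5))
```

For a five-step window and gamma = 0.99, the call above returns the rest-of-window discounted return at every step of the window.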
Either run the following, or go to the docker section below:
```
conda env create -f conda_env.yaml
conda activate agency
python -m pip install -e .
```

Launch the tests:

```
python -m pytest tests/functional
```

Launch the training tests (tuned on a system with a 6 core CPU and an Nvidia 1080 GPU):

```
python -m pytest tests/training
```

Launch one of the example training scripts:

```
python examples/gym/train_sac_mlp_continuous_lunarlander.py
python examples/gym/train_sac_vision_categorical_pong.py
```

The best place to start is to take a look in examples/tutorials.
train_sac_with_helper.py demonstrates how to set up a basic MLP network and use it to train lunar lander with Soft Actor-Critic.
It can be launched with the following command (training takes around 1 minute on an Nvidia 1080 GPU):

```
python examples/tutorials/train_sac_with_helper.py
```

The examples folder contains many more files that show how to train using different algorithms, distributions and network architectures.
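For a sense of what "a basic MLP network" means here, the sketch below builds a small Gaussian-policy MLP sized for LunarLanderContinuous (8-dimensional observations, 2-dimensional actions). The class name and layer sizes are illustrative assumptions; the tutorial's helper constructs the actual networks:

```python
import torch
import torch.nn as nn

# Hypothetical sketch of a small Gaussian actor for LunarLanderContinuous.
# The real networks are built by the helper in train_sac_with_helper.py.
class MlpGaussianActor(nn.Module):
    def __init__(self, obs_dim: int = 8, action_dim: int = 2, hidden_dim: int = 256):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(obs_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.mean = nn.Linear(hidden_dim, action_dim)
        self.log_std = nn.Linear(hidden_dim, action_dim)

    def forward(self, obs: torch.Tensor):
        features = self.body(obs)
        return self.mean(features), self.log_std(features)
```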
See train_sac_mlp_continuous_lunarlander.py for an example of how to randomize hyperparameters:

```
python examples/gym/train_sac_mlp_continuous_lunarlander.py --sweep --n 10
```
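As a sketch of what such hyperparameter randomization typically looks like (the parameter names and ranges below are assumptions for illustration, not the script's actual settings):

```python
import random

# Illustrative only: sample a random hyperparameter set per sweep run;
# --sweep --n 10 launches ten randomized runs.
def sample_hyper_params() -> dict:
    return {
        "learning_rate": 10 ** random.uniform(-4.5, -3.0),  # log-uniform sample
        "batch_size": random.choice([128, 256, 512]),
        "gamma": random.choice([0.98, 0.99, 0.995]),
    }

for run_index in range(10):
    hyper_params = sample_hyper_params()
    print(f"run {run_index}: {hyper_params}")
    # each run would then train with its own sampled hyperparameters
```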
Docker:

- Install docker and nvidia-docker.
- Build the container:

  ```
  ./docker_build.sh
  ```

- Run the tests using code from the current working directory:

  ```
  ./docker_run.sh python -m pytest tests
  ```

- Launch a training run:

  ```
  ./docker_run.sh python examples/gym/train_sac_mlp_continuous_lunarlander.py
  ```

WSL:

Install CUDA: https://docs.nvidia.com/cuda/wsl-user-guide/index.html
At the time of writing, this consists of:
```
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.4.0/local_installers/cuda-repo-wsl-ubuntu-11-4-local_11.4.0-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-11-4-local_11.4.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-wsl-ubuntu-11-4-local/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda
```

Run a single test by name:

```
python -m pytest tests/medium/test_discrete_vision_minigrid.py::test_name
```

Run the medium tests with console output:

```
python -m pytest -s tests/medium
```

TODO:

- Unity env data collection.
- Add support for action branching.
- Add support for multiple brains.