Re-implementations of Deep Reinforcement Learning (DRL) algorithms, written in PyTorch.
git clone https://github.com/liyc-ai/RL-pytorch.git
cd RL-pytorch
pip install .
# Reinstall after making local modifications
pip install --upgrade .
# Alternatively, use an editable install so that changes take effect without reinstalling:
# pip install -e . --config-settings editable_mode=compat
- Deep Q Networks (DQN) [paper] [official code]
- Deep Double Q Networks (DDQN) [paper]
- Dueling Network Architectures for Deep Reinforcement Learning (DuelDQN) [paper]
- Continuous control with deep reinforcement learning (DDPG) [paper]
- Addressing Function Approximation Error in Actor-Critic Methods (TD3) [paper] [official code]
- Soft Actor-Critic Algorithms and Applications (SAC) [paper] [official code]
- Trust Region Policy Optimization (TRPO) [paper] [official code]
- Proximal Policy Optimization (PPO) [paper] [official code]
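As a small illustration of one of the algorithms listed above, the clipped surrogate objective from the PPO paper can be sketched in a few lines. This is a minimal, dependency-free sketch in plain Python rather than PyTorch, and it is not the code used in this repository: the function name `ppo_clip_loss`, its arguments, and the default `clip_eps=0.2` are assumptions chosen for illustration.

```python
import math


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Negative clipped surrogate objective, averaged over a batch.

    Hypothetical helper for illustration only; all names and the default
    clip_eps value are assumptions, not this repository's API.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(new_lp - old_lp)
        # Clip the ratio to the interval [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    # Negate so that minimizing the loss maximizes the surrogate objective.
    return -total / len(advantages)
```

When the new and old policies agree, the ratio is 1 and the loss reduces to the negative mean advantage. When the ratio leaves the clipping interval in the direction that would increase the objective, the clipped term takes over and limits the size of the update.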
python scripts/train_agent.py agent=ppo env.id=Hopper-v4
By default, the results are stored in the runs directory.
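The `key=value` arguments in the training command above (`agent=ppo`, `env.id=Hopper-v4`) look like dotted configuration overrides, where a key such as `env.id` addresses a nested field. The sketch below shows how such arguments could map onto a nested dictionary. It is a simplified illustration under that assumption, not the parser used by `scripts/train_agent.py`, and the helper name `parse_overrides` is hypothetical.

```python
def parse_overrides(args):
    """Turn ["a=1", "b.c=2"] into {"a": "1", "b": {"c": "2"}}.

    Hypothetical helper for illustration; values are kept as strings.
    """
    config = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {arg!r}")
        # Walk (and create) the nested dictionaries named by the dotted key.
        *parents, leaf = key.split(".")
        node = config
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return config


overrides = parse_overrides(["agent=ppo", "env.id=Hopper-v4"])
print(overrides)  # {'agent': 'ppo', 'env': {'id': 'Hopper-v4'}}
```

Read this way, the command selects the `ppo` agent configuration and sets the environment id to `Hopper-v4`, so other listed algorithms or environments would be chosen by changing those two values.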
Over the course of this project, I found many excellent open-source references online, and I am deeply grateful to their authors for their efforts. They are listed below. I would also like to thank my friends from LAMDA-RL for our helpful discussions.
Codebase
- tianshou
- stable-baselines3
- stable-baselines-contrib
- stable-baselines
- spinningup
- RL-Adventure2
- unstable_baselines
- d4rl_evaluations
- TD3
- pytorch-trpo
Blog
Tutorial