LWDRL is a deep reinforcement learning (RL) library inspired by several other deep RL code bases (e.g., the Spinning Up repository, Stable-Baselines3, the Fujimoto TD3 repository, and the Tonic repository).
LWDRL provides additional tricks to push the performance of state-of-the-art algorithms, potentially beyond their original papers, so that every user can obtain professional-level results with just a few lines of code.
Algorithm | Continuous control | On-policy / off-policy
---|---|---
Vanilla Policy Gradient (VPG) | ✅ | on-policy |
Proximal Policy Optimization (PPO) | ✅ | on-policy |
Deep Deterministic Policy Gradients (DDPG) | ✅ | off-policy |
Twin Delayed Deep Deterministic Policy Gradients (TD3) | ✅ | off-policy |
Soft Actor-Critic (SAC) | ✅ | off-policy |
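As a concrete example of the off-policy methods listed above, the sketch below shows the clipped double-Q target from the TD3 paper (Fujimoto et al., 2018). It is a minimal PyTorch illustration of the update rule only, not LWDRL's actual implementation; the function names and default hyperparameters are illustrative.

```python
# Minimal sketch of the TD3 target (Fujimoto et al., 2018) -- not LWDRL's code.
import torch

def td3_target(reward, not_done, next_state,
               actor_target, critic1_target, critic2_target,
               max_action, discount=0.99, policy_noise=0.2, noise_clip=0.5):
    """Clipped double-Q target: r + gamma * min(Q1', Q2') at a smoothed action."""
    with torch.no_grad():
        # Target policy smoothing: add clipped noise to the target action.
        next_action = actor_target(next_state)
        noise = (torch.randn_like(next_action) * policy_noise).clamp(-noise_clip, noise_clip)
        next_action = (next_action + noise).clamp(-max_action, max_action)
        # Clipped double-Q learning: take the minimum of the two target critics.
        target_q = torch.min(critic1_target(next_state, next_action),
                             critic2_target(next_state, next_action))
        return reward + not_done * discount * target_q
```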
# python 3.6 (apt)
# pytorch 1.4.0 (pip)
# tensorflow 1.14.0 (pip)
# DeepMind Control Suite and MuJoCo
cd dockerfiles
docker build . -t lwdrl
For other dockerfiles, see RL Dockerfiles.
Run with the scripts `batch_off_policy_mujoco_cuda.sh` / `batch_off_policy_dmc_cuda.sh` / `batch_on_policy_mujoco_cuda.sh` / `batch_on_policy_dmc_cuda.sh`:
# e.g.
bash batch_off_policy_mujoco_cuda.sh Hopper-v2 TD3 0  # env_name: Hopper-v2, algorithm: TD3, CUDA device: 0
# e.g. Note: `-l` sets the legend label, `data/DDPG-Hopper-v2/` is the directory of collected results,
# and `-s` sets the smoothing value.
python spinupUtils/plot.py \
data/DDPG-Hopper-v2/ \
-l DDPG -s 10
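For reference, the `-s` option smooths the learning curves before plotting. Below is a minimal sketch of such smoothing, assuming a simple sliding-window average; the exact behavior of `spinupUtils/plot.py` may differ in detail.

```python
# Sketch of the window smoothing a `-s`-style option applies; the exact
# implementation in spinupUtils/plot.py may differ.
import numpy as np

def smooth(returns, window=10):
    """Sliding-window average of episodic returns with window size `window`."""
    x = np.asarray(returns, dtype=float)
    if window <= 1:
        return x
    kernel = np.ones(window)
    # Normalize by the number of valid points so curve edges are not biased.
    return np.convolve(x, kernel, mode="same") / np.convolve(np.ones_like(x), kernel, mode="same")
```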
Run with the scripts `render_dmc.py` / `render_mujoco.py`:
# e.g.
python render_dmc.py --env swimmer-swimmer6  # env_name: swimmer-swimmer6
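For context, rendering a MuJoCo Gym task typically follows the old Gym (pre-0.26) step API that the `*-v2` environments use. The sketch below substitutes a random policy for a trained one; loading an LWDRL checkpoint is omitted.

```python
# Sketch of rendering a Gym MuJoCo task with the old (pre-0.26) Gym API;
# the random action below is a stand-in for a trained LWDRL policy.
import gym

env = gym.make("Hopper-v2")
obs = env.reset()
for _ in range(1000):
    env.render()
    action = env.action_space.sample()  # replace with policy(obs)
    obs, reward, done, info = env.step(action)
    if done:
        obs = env.reset()
env.close()
```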
Including Ant-v2, HalfCheetah-v2, Hopper-v2, Humanoid-v2, Swimmer-v2, and Walker2d-v2.
@misc{QingLi2021lwdrl,
  author = {Qing Li},
  title = {LWDRL: LightWeight Deep Reinforcement Learning Library},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/LQNew/LWDRL}}
}