ChainerRL is a deep reinforcement learning library that implements various state-of-the-art deep reinforcement learning algorithms in Python using Chainer, a flexible deep learning framework.
ChainerRL is tested with Python 2.7+ and 3.5.1+. For other requirements, see requirements.txt.
ChainerRL can be installed via PyPI:
pip install chainerrl
It can also be installed from the source code:
python setup.py install
Refer to Installation for more information on installation.
For more information, you can refer to ChainerRL's documentation.
|Algorithm|Discrete Action|Continuous Action|Recurrent Model|CPU Async Training|
|---|---|---|---|---|
|DQN (including DoubleDQN etc.)|✓|✓ (NAF)|✓|x|
|NSQ (N-step Q-learning)|✓|✓ (NAF)|✓|✓|
|PCL (Path Consistency Learning)|✓|✓|✓|✓|
The following algorithms have been implemented in ChainerRL:
- A3C (Asynchronous Advantage Actor-Critic)
- ACER (Actor-Critic with Experience Replay)
- Asynchronous N-step Q-learning
- Categorical DQN
- DQN (including Double DQN, Persistent Advantage Learning (PAL), Double PAL, Dynamic Policy Programming (DPP))
- DDPG (Deep Deterministic Policy Gradients) (including SVG(0))
- PGT (Policy Gradient Theorem)
- PCL (Path Consistency Learning)
- PPO (Proximal Policy Optimization)
- TRPO (Trust Region Policy Optimization)
Q-function based algorithms such as DQN can utilize a Normalized Advantage Function (NAF) to tackle continuous-action problems as well as DQN-like discrete output networks.
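To illustrate why NAF makes continuous actions tractable for Q-learning, here is a minimal sketch of the usual NAF parameterization, Q(s, a) = V(s) - 0.5 (a - mu)^T L L^T (a - mu), in which the greedy action is simply mu. The function name `naf_q_value` and its arguments are hypothetical and are not part of ChainerRL's API; in practice V, mu, and the lower-triangular matrix L would be outputs of a neural network.

```python
def naf_q_value(v, mu, L, action):
    """Q(s, a) under a NAF-style quadratic advantage (illustrative only).

    v      -- state value V(s), a float
    mu     -- greedy action, a list of floats
    L      -- lower-triangular matrix as a list of rows; P = L L^T
    action -- the action to evaluate, a list of floats
    """
    n = len(mu)
    diff = [a - m for a, m in zip(action, mu)]
    # y = L^T (a - mu), so that (a - mu)^T L L^T (a - mu) = sum(y_j ** 2)
    y = [sum(L[i][j] * diff[i] for i in range(n)) for j in range(n)]
    advantage = -0.5 * sum(x * x for x in y)  # always <= 0, zero at a == mu
    return v + advantage


# Hypothetical values that a network might output for one state.
v = 3.0
mu = [0.5, -0.25]
L = [[1.0, 0.0],
     [0.5, 2.0]]

q_at_mu = naf_q_value(v, mu, L, mu)                # equals V(s)
q_elsewhere = naf_q_value(v, mu, L, [1.5, -0.25])  # strictly lower
```

Because the advantage term is never positive, maximizing Q over actions needs no search: the maximizer is mu and the maximum is V(s).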
Environments that support the subset of OpenAI Gym's interface (reset and step methods) can be used.
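As a rough illustration of that interface, the sketch below defines a toy environment with Gym-style `reset` and `step` methods and a loop that drives it. `CoinFlipEnv` and `run_episode` are hypothetical names made up for this example. It assumes the classic Gym convention in which `step` returns an `(observation, reward, done, info)` tuple, and it uses plain integers as observations for brevity, whereas a real agent would usually expect array observations.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: guess the coin shown in the observation."""

    def __init__(self, episode_length=5, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.t = 0
        self.coin = 0

    def reset(self):
        # Start a new episode and return the initial observation.
        self.t = 0
        self.coin = self.rng.randint(0, 1)
        return self.coin

    def step(self, action):
        # Reward 1.0 if the action matches the coin that was observed.
        reward = 1.0 if action == self.coin else 0.0
        self.t += 1
        done = self.t >= self.episode_length
        self.coin = self.rng.randint(0, 1)  # next observation
        return self.coin, reward, done, {}


def run_episode(env, policy):
    """Run one episode with `policy` (observation -> action); return total reward."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total


# A policy that copies the observation is always correct in this toy task.
total_reward = run_episode(CoinFlipEnv(), lambda obs: obs)
```

Any object with these two methods could stand in for a Gym environment in a loop like this; no inheritance from a Gym base class is assumed here.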
Any kind of contribution to ChainerRL would be highly appreciated! If you are interested in contributing to ChainerRL, please read CONTRIBUTING.md.