Tianshou (天授) is a reinforcement learning platform based on pure PyTorch. Unlike existing reinforcement learning libraries, which are mainly based on TensorFlow and suffer from many nested classes, unfriendly APIs, or slow speed, Tianshou provides a fast framework and a Pythonic API for building deep reinforcement learning agents. The supported algorithms and their interfaces include:
- Policy Gradient: tianshou.policy.PGPolicy
- Deep Q-Network: tianshou.policy.DQNPolicy
- Double DQN: tianshou.policy.DQNPolicy
- Dueling DQN: tianshou.policy.DQNPolicy
- C51: tianshou.policy.C51Policy
- Advantage Actor-Critic: tianshou.policy.A2CPolicy
- Deep Deterministic Policy Gradient: tianshou.policy.DDPGPolicy
- Proximal Policy Optimization: tianshou.policy.PPOPolicy
- Twin Delayed DDPG: tianshou.policy.TD3Policy
- Soft Actor-Critic: tianshou.policy.SACPolicy
- Discrete Soft Actor-Critic: tianshou.policy.DiscreteSACPolicy
- Imitation Learning: tianshou.policy.ImitationPolicy
- Discrete Batch-Constrained deep Q-Learning: tianshou.policy.DiscreteBCQPolicy
- Posterior Sampling Reinforcement Learning: tianshou.policy.PSRLPolicy
- Prioritized Experience Replay: tianshou.data.PrioritizedReplayBuffer
- Generalized Advantage Estimator: tianshou.policy.BasePolicy.compute_episodic_return
Here are Tianshou's other features:
- Elegant framework, using only ~2000 lines of code
- Support parallel environment simulation (synchronous or asynchronous) for all algorithms: parallel_sampling
- Support recurrent state representation in actor network and critic network (RNN-style training for POMDP): rnn_training
- Support any type of environment state/action (e.g. a dict, a self-defined class, ...): self_defined_env
- Support customized training process: customize_training
- Support n-step return estimation (tianshou.policy.BasePolicy.compute_nstep_return) and prioritized experience replay (tianshou.data.PrioritizedReplayBuffer) for all Q-learning based algorithms; GAE, n-step, and PER are very fast thanks to Numba JIT functions and vectorized NumPy operations
- Support multi-agent RL: /tutorials/tictactoe
- Comprehensive unit tests, including functional checking, RL pipeline checking, documentation checking, PEP8 code-style checking, and type checking
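To make the n-step return feature in the list above concrete, here is a minimal pure-Python sketch of the quantity being estimated. It is an illustration only and is not Tianshou's implementation of compute_nstep_return. The function name `nstep_returns`, its parameters, and the convention that `bootstrap_values[t]` holds a value estimate for the state reached after step t (for example, a target network's maximum Q-value) are all assumptions made for this example.

```python
from typing import List, Sequence


def nstep_returns(
    rewards: Sequence[float],
    dones: Sequence[bool],
    bootstrap_values: Sequence[float],
    gamma: float = 0.99,
    n_step: int = 3,
) -> List[float]:
    """Illustrative n-step return targets (hypothetical helper, not Tianshou's API).

    bootstrap_values[t] is assumed to be a value estimate of the state
    reached after step t, e.g. max_a Q_target(s_{t+1}, a).
    """
    horizon = len(rewards)
    returns = []
    for t in range(horizon):
        total, discount, bootstrap = 0.0, 1.0, 0.0
        # Accumulate up to n_step discounted rewards starting at step t.
        for k in range(t, min(t + n_step, horizon)):
            total += discount * rewards[k]
            discount *= gamma
            if dones[k]:
                # The episode ended: there is no future value to bootstrap from.
                bootstrap = 0.0
                break
            bootstrap = bootstrap_values[k]
        returns.append(total + discount * bootstrap)
    return returns


if __name__ == "__main__":
    print(
        nstep_returns(
            rewards=[1.0, 1.0, 1.0, 1.0],
            dones=[False, False, False, True],
            bootstrap_values=[10.0, 10.0, 10.0, 10.0],
            gamma=0.5,
            n_step=2,
        )
    )
```

The explicit Python loop is chosen for readability; a production version would typically vectorize the accumulation, which is the kind of optimization the feature list attributes to Numba and NumPy.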
The Chinese documentation is available at https://tianshou.readthedocs.io/zh/latest/
Tianshou is currently hosted on PyPI and conda-forge. It requires Python >= 3.6.
You can simply install Tianshou from PyPI with the following command:
$ pip install tianshou
If you use Anaconda or Miniconda, you can install Tianshou from conda-forge through the following command:
$ conda install -c conda-forge tianshou
You can also install the newest version from GitHub:
$ pip install git+https://github.com/thu-ml/tianshou.git@master --upgrade
After installation, open your Python console and type:
import tianshou
print(tianshou.__version__)
If no error occurs, you have successfully installed Tianshou.
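If you prefer a scripted check over the interactive one, the following sketch tests the Python version requirement and whether the package can be found, without importing it. The helper name `check_environment` and the shape of its return value are assumptions made for this example; it relies only on the standard library.

```python
import importlib.util
import sys
from typing import Dict, Tuple


def check_environment(
    package: str = "tianshou", min_python: Tuple[int, int] = (3, 6)
) -> Dict[str, bool]:
    """Report whether the interpreter is new enough and the package is importable.

    Hypothetical helper for illustration; it does not import the package itself.
    """
    python_ok = sys.version_info[:2] >= min_python
    # find_spec returns None when a top-level package cannot be located.
    installed = importlib.util.find_spec(package) is not None
    return {"python_ok": python_ok, "installed": installed}


if __name__ == "__main__":
    print(check_environment())
```

A result with both values set to True suggests the installation succeeded; printing tianshou.__version__ as shown above remains the way to confirm which version is installed.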
Tianshou is still under development. You can also check out the documentation for the stable version at tianshou.readthedocs.io/en/stable/.
tutorials/dqn tutorials/concepts tutorials/batch tutorials/tictactoe tutorials/trick tutorials/cheatsheet
api/tianshou.data api/tianshou.env api/tianshou.policy api/tianshou.trainer api/tianshou.exploration api/tianshou.utils
contributing contributor