Welcome to Tianshou!

Tianshou (天授) is a reinforcement learning platform based on pure PyTorch. Unlike existing reinforcement learning libraries, which are mainly based on TensorFlow and suffer from deeply nested classes, unfriendly APIs, or slow execution, Tianshou provides a fast framework and a Pythonic API for building deep reinforcement learning agents. The supported algorithms and techniques include:

~tianshou.policy.PGPolicy Policy Gradient
~tianshou.policy.DQNPolicy Deep Q-Network
~tianshou.policy.DQNPolicy Double DQN
~tianshou.policy.DQNPolicy Dueling DQN
~tianshou.policy.C51Policy C51
~tianshou.policy.A2CPolicy Advantage Actor-Critic
~tianshou.policy.DDPGPolicy Deep Deterministic Policy Gradient
~tianshou.policy.PPOPolicy Proximal Policy Optimization
~tianshou.policy.TD3Policy Twin Delayed DDPG
~tianshou.policy.SACPolicy Soft Actor-Critic
~tianshou.policy.DiscreteSACPolicy Discrete Soft Actor-Critic
~tianshou.policy.ImitationPolicy Imitation Learning
~tianshou.policy.DiscreteBCQPolicy Discrete Batch-Constrained deep Q-Learning
~tianshou.policy.PSRLPolicy Posterior Sampling Reinforcement Learning
~tianshou.data.PrioritizedReplayBuffer Prioritized Experience Replay
~tianshou.policy.BasePolicy.compute_episodic_return Generalized Advantage Estimator
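As a rough illustration of the last entry, the sketch below computes Generalized Advantage Estimator values for a single episode in plain Python. It is not Tianshou's implementation of compute_episodic_return; the function name gae_advantages and its parameters are hypothetical, and it assumes one uninterrupted episode whose value after the final step is last_value.

```python
from typing import List, Sequence


def gae_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    last_value: float = 0.0,
    gamma: float = 0.99,
    gae_lambda: float = 0.95,
) -> List[float]:
    """Illustrative GAE for one episode (hypothetical helper, not Tianshou's API).

    rewards[t] is the reward received at step t, values[t] is the critic's
    estimate V(s_t), and last_value is the estimate for the state that
    follows the final step (0.0 if the episode terminated).
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    next_value = last_value
    # Walk backwards so each step can reuse the advantage of the step after it.
    for t in reversed(range(len(rewards))):
        # One-step TD error: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_value - values[t]
        # GAE recursion: A_t = delta_t + gamma * lambda * A_{t+1}
        running = delta + gamma * gae_lambda * running
        advantages[t] = running
        next_value = values[t]
    return advantages
```

With gae_lambda=0 this reduces to the one-step TD error, and with gae_lambda=1 and zero value estimates it reduces to the discounted return.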

Here are Tianshou's other features:

Elegant framework, using only ~2000 lines of code
Support parallel environment simulation (synchronous or asynchronous) for all algorithms: parallel_sampling
Support recurrent state representation in actor network and critic network (RNN-style training for POMDP): rnn_training
Support any type of environment state/action (e.g. a dict, a self-defined class, ...): self_defined_env
Support customized training processes: customize_training
Support n-step return estimation (~tianshou.policy.BasePolicy.compute_nstep_return) and prioritized experience replay (~tianshou.data.PrioritizedReplayBuffer) for all Q-learning based algorithms; GAE, n-step returns, and PER are very fast thanks to numba JIT functions and vectorized numpy operations
Support multi-agent RL: /tutorials/tictactoe
Comprehensive unit tests, including functional checking, RL pipeline checking, documentation checking, PEP8 code-style checking, and type checking
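To make the prioritized experience replay feature more concrete, here is a minimal sketch of the proportional sampling rule it is based on: transition i is drawn with probability proportional to its priority raised to alpha, and each sample gets an importance weight (N * P(i)) ** (-beta). This is plain Python for illustration only, not the code in tianshou.data.PrioritizedReplayBuffer; the helper names, the default alpha and beta values, and the choice to normalize weights by the largest weight in the batch are assumptions. Priorities are assumed to be strictly positive.

```python
import random
from typing import List, Optional, Sequence, Tuple


def per_probabilities(priorities: Sequence[float], alpha: float = 0.6) -> List[float]:
    """Sampling probabilities P(i) = p_i**alpha / sum_k p_k**alpha."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def per_sample(
    priorities: Sequence[float],
    batch_size: int,
    alpha: float = 0.6,
    beta: float = 0.4,
    rng: Optional[random.Random] = None,
) -> Tuple[List[int], List[float]]:
    """Draw indices by priority and return them with importance weights.

    Hypothetical helper for illustration; weights are normalized so that
    the largest weight in the returned batch equals 1.0.
    """
    rng = rng or random.Random()
    probs = per_probabilities(priorities, alpha)
    n = len(priorities)
    # Sample with replacement, proportionally to the scaled priorities.
    indices = rng.choices(range(n), weights=probs, k=batch_size)
    # Importance weights correct the bias introduced by non-uniform sampling.
    weights = [(n * probs[i]) ** (-beta) for i in indices]
    max_weight = max(weights)
    return indices, [w / max_weight for w in weights]
```

Setting alpha=0 recovers uniform sampling, and beta=1 fully compensates for the non-uniform sampling probabilities.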

Chinese documentation is available at https://tianshou.readthedocs.io/zh/latest/

Installation

Tianshou is currently hosted on PyPI and conda-forge. It requires Python >= 3.6.

You can simply install Tianshou from PyPI with the following command:

$ pip install tianshou

If you use Anaconda or Miniconda, you can install Tianshou from conda-forge through the following command:

$ conda install -c conda-forge tianshou

You can also install the newest version from GitHub:

$ pip install git+https://github.com/thu-ml/tianshou.git@master --upgrade

After installation, open your Python console and type:

import tianshou
print(tianshou.__version__)

If no error occurs, you have successfully installed Tianshou.
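If the import fails, a quick check like the following sketch can help narrow down whether the interpreter meets the Python >= 3.6 requirement and whether the package is visible to it. It uses only the standard library; the function name check_environment is hypothetical and not part of Tianshou.

```python
import importlib.util
import sys
from typing import Dict, Tuple


def check_environment(
    package: str = "tianshou", min_python: Tuple[int, int] = (3, 6)
) -> Dict[str, bool]:
    """Report whether the interpreter and package look usable (illustrative helper)."""
    # Compare only (major, minor) against the documented minimum version.
    python_ok = sys.version_info[:2] >= min_python
    # find_spec returns None when the top-level package cannot be located.
    package_found = importlib.util.find_spec(package) is not None
    return {"python_ok": python_ok, "package_found": package_found}


if __name__ == "__main__":
    print(check_environment())
```

If package_found is False, the package was probably installed into a different Python environment than the one running the console.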

Tianshou is still under development. You can also check out the documentation for the stable version at tianshou.readthedocs.io/en/stable/.

tutorials/dqn tutorials/concepts tutorials/batch tutorials/tictactoe tutorials/trick tutorials/cheatsheet

api/tianshou.data api/tianshou.env api/tianshou.policy api/tianshou.trainer api/tianshou.exploration api/tianshou.utils

contributing contributor

Indices and tables

genindex
modindex
search
