Generic reinforcement learning codebase in TensorFlow
Reinforcement Learning Codebase

Modular codebase for reinforcement learning models training, testing and visualization.

Contributors: Bryan M. Li, Alexander Cowen-Rivers, Piotr Kozakowski, David Tao, Siddhartha Rao Kamalakara, Nitarshan Rajkumar, Hariharan Sezhiyan, Sicong Huang, Aidan N. Gomez


Examples of recorded environments with various RL agents.

MountainCar-v0 Pendulum-v0 VideoPinball-v0 Tennis-v0


It is recommended to install the codebase in a virtual environment (virtualenv or conda).

Quick install

Configure use_gpu and (if on OSX) mac_package_manager (either macports or homebrew) params in, then run it as


Manual setup

You need to install the following for your system:

Quick Start

# start training
python --sys ... --hparams ... --output_dir ...
# run tensorboard
tensorboard --logdir ...
# test agent
python --sys ... --hparams ... --output_dir ... --training False --render True


See init_flags() for the default hyper-parameters, and see hparams/ for agent-specific hyper-parameter examples.

  • hparams: Which hparams to use, defined under rl/hparams
  • sys: Which system environment to use.
  • env: Which RL environment to use.
  • output_dir: The directory for model checkpoints and TensorBoard summary.
  • train_steps: Number of steps to train the agent.
  • test_episodes: Number of episodes to test the agent.
  • eval_episodes: Number of episodes to evaluate the agent.
  • training: Whether to train (True) or test (False) the agent.
  • copies: Number of independent training/testing runs to do.
  • render: Render game play.
  • record_video: Record game play.
  • num_workers: Number of workers.
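
The flags listed above can be pictured as an ordinary command-line parser. The following is a minimal sketch using Python's standard argparse module, not the codebase's own init_flags() implementation. The helper names build_parser and str2bool are hypothetical, the defaults are placeholders (None), and the boolean defaults are only an assumption inferred from the test command in Quick Start, which passes --training False and --render True explicitly.

```python
import argparse


def str2bool(value):
    """Parse 'True'/'False' style strings, as in `--training False`."""
    if isinstance(value, bool):
        return value
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)


def build_parser():
    """Hypothetical parser mirroring the flag list; not the real init_flags()."""
    parser = argparse.ArgumentParser(description="Illustrative RL flag parser")
    # String flags. Defaults are placeholders, not the codebase's real defaults.
    parser.add_argument("--hparams", type=str, default=None,
                        help="Which hparams to use.")
    parser.add_argument("--sys", type=str, default=None,
                        help="Which system environment to use.")
    parser.add_argument("--env", type=str, default=None,
                        help="Which RL environment to use.")
    parser.add_argument("--output_dir", type=str, default=None,
                        help="Directory for checkpoints and TensorBoard summary.")
    # Integer flags.
    parser.add_argument("--train_steps", type=int, default=None,
                        help="Number of steps to train the agent.")
    parser.add_argument("--test_episodes", type=int, default=None,
                        help="Number of episodes to test the agent.")
    parser.add_argument("--eval_episodes", type=int, default=None,
                        help="Number of episodes to evaluate the agent.")
    parser.add_argument("--copies", type=int, default=None,
                        help="Number of independent training/testing runs.")
    parser.add_argument("--num_workers", type=int, default=None,
                        help="Number of workers.")
    # Boolean flags. These defaults are assumptions, not taken from the codebase.
    parser.add_argument("--training", type=str2bool, default=True,
                        help="Train (True) or test (False) the agent.")
    parser.add_argument("--render", type=str2bool, default=False,
                        help="Render game play.")
    parser.add_argument("--record_video", type=str2bool, default=False,
                        help="Record game play.")
    return parser


if __name__ == "__main__":
    # Print the parsed flags for whatever arguments were supplied.
    print(vars(build_parser().parse_args()))
```

Parsing booleans through a converter such as str2bool keeps the `--training False` spelling from Quick Start working, because a plain `type=bool` in argparse would treat the non-empty string "False" as true.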


More detailed documentation can be found here.


We'd love to accept your contributions to this project. Please feel free to open an issue, or submit a pull request as necessary. Contact us for potential collaborations and joining
