Skip to content
Tensorflow implementation of "The Predictron: End-To-End Learning and Planning"
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
.gitignore .gitignore Dec 31, 2016
LICENSE update LICENSE Jan 5, 2017
README.md update some results Jan 9, 2017
maze.py reformat Jan 6, 2017
predictron.py reformat codes Jan 19, 2017
tf_py3.yml conda environment file Jan 19, 2017
train.py fix py3 Jan 20, 2017
train_multigpu.py reformat codes Jan 19, 2017
util.py reformat Jan 6, 2017

README.md

The Predictron

A TensorFlow implementation of

The Predictron: End-To-End Learning and Planning
David Silver, Hado van Hasselt, Matteo Hessel, Tom Schaul, Arthur Guez, et al.

arXiv

Run

Try it with

python ./train_multigpu.py --max_depths=<2,4,8,16> --batch_size=128 \
  --num_gpus=<your available number of GPUs>

We assume batch_size can be divided by num_gpus.

Requirements:

tensorflow=0.12.1

Some Results

max_step = 16 and Training on 8 NVIDIA GTX TITAN X GPUs. It takes quite a long time to achieve the same numbers of steps reported in the paper. Here are some results during the training procedure.

INFO:multigpu_train:2017-01-09 01:11:36.704713: step 159760, loss = 0.0043, loss_preturns = 0.0018, loss_lambda_preturns = 0.0004 (2679.7 examples/sec; 0.048 sec/batch)
INFO:multigpu_train:2017-01-09 01:11:39.854633: step 159770, loss = 0.0038, loss_preturns = 0.0017, loss_lambda_preturns = 0.0002 (2888.7 examples/sec; 0.044 sec/batch)
INFO:multigpu_train:2017-01-09 01:11:43.026452: step 159780, loss = 0.0067, loss_preturns = 0.0031, loss_lambda_preturns = 0.0002 (2848.1 examples/sec; 0.045 sec/batch)
INFO:multigpu_train:2017-01-09 01:11:46.252385: step 159790, loss = 0.0099, loss_preturns = 0.0035, loss_lambda_preturns = 0.0014 (3272.3 examples/sec; 0.039 sec/batch)
INFO:multigpu_train:2017-01-09 01:11:49.477405: step 159800, loss = 0.0032, loss_preturns = 0.0013, loss_lambda_preturns = 0.0003 (3051.7 examples/sec; 0.042 sec/batch)
INFO:multigpu_train:2017-01-09 01:11:52.570256: step 159810, loss = 0.0046, loss_preturns = 0.0020, loss_lambda_preturns = 0.0003 (3314.3 examples/sec; 0.039 sec/batch)
INFO:multigpu_train:2017-01-09 01:11:55.710512: step 159820, loss = 0.0040, loss_preturns = 0.0017, loss_lambda_preturns = 0.0003 (3374.0 examples/sec; 0.038 sec/batch)
You can’t perform that action at this time.