# Agent Trainer Results

Collection of agent-trainer training sessions, each containing its resulting network, metrics, and visualizations. More details in this blogpost.

## Setup

Since GitHub has a limit of 100 MB per file, the TensorFlow network metric files were zipped and split into smaller files. To decompress them and place them in the correct folder, run the commands below (a sketch of the underlying split/recombine approach follows them):

```bash
# From the root folder of agent-trainer-results, execute these commands:
$ chmod u+x decompress_metrics.sh
$ ./decompress_metrics.sh
```
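For orientation, here is a hypothetical sketch of the split/recombine approach; the chunk size, archive names, and folder layout are illustrative assumptions, and the repo's actual decompress_metrics.sh is the authoritative version:

```bash
#!/usr/bin/env bash
# Illustrative sketch only (NOT the repo's actual script): GitHub rejects
# files over 100MB, so a large zip is split into chunks for committing,
# then joined and unzipped again after cloning.

# Splitting (done once, before committing), assuming a 90MB chunk size:
zip -r metrics.zip metrics-q-network          # compress the metrics folder
split -b 90m metrics.zip metrics.zip.part-    # -> metrics.zip.part-aa, -ab, ...

# Recombining (what a decompress step boils down to):
cat metrics.zip.part-* > metrics.zip          # join the chunks in order
unzip -o metrics.zip                          # restore the original files
```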

## Usage

Use tensorboard [1] to explore the network's metrics:

```bash
# To see the metrics for a specific run:
$ tensorboard --logdir=/folder/where/repo/cloned/201609040550_5010eps/metrics-q-network

# To see the metrics for all runs
# (tensorboard takes a while to load all runs after starting, so you may
# have to wait a few seconds until they are all ready for viewing):
$ tensorboard --logdir=/folder/where/repo/cloned
```

Use agent-trainer to play the game with a trained network, to build a session's metrics, or to build a t-SNE visualization. For example:

```bash
$ cd /your/agent/trainer/folder/
$ python -m agent play -s 201609171218_175eps --resultspath /folder/where/repo/cloned/
```

## Contents Description

| Session | Machine | Training game mode | Learning rate decay |
| --- | --- | --- | --- |
| 201609040550_5010eps | a) | timed; easy | without learning rate decay |
| 201609111241_2700eps | b) | timed; easy | unclipped learning rate decay |
| 201609111241_7300eps | b) | timed; easy | unclipped learning rate decay |
| 201609160922_54eps | b) | unlimited time | without learning rate decay |
| 201609171218_175eps | b) | unlimited time | unclipped learning rate decay |

Machines used for training:

- **(a)** AMD Athlon(tm) II X2 250 Processor @ 3GHz; 2GB RAM DDR3-1333 SDRAM; SSD 500 GB: Samsung 850 EVO (CPU-only training)
- **(b)** AWS EC2 g2.2xlarge (GPU-enabled instance) with 200 GB General Purpose SSD (GP2) [2]


---

[1] As of Tensorboard 16 (tested on OSX El Capitan), using a tilde (~) to represent the home folder in --logdir will not work properly, and tensorboard won't be able to detect the metrics files (for example, tensorboard --logdir=~/temp/agent-trainer-results).
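One workaround, assuming a POSIX-style shell, is to let the shell expand the home folder itself via $HOME (or to pass an absolute path), so tensorboard never sees a literal tilde:

```bash
# The shell expands $HOME before tensorboard sees the argument:
$ tensorboard --logdir="$HOME/temp/agent-trainer-results"
```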

[2] Each training run can reach up to 25 GB worth of replay memories, which need to be accessed randomly during the training process. Since these cannot fit into the instance's 16 GB of RAM, about 600 IOPS are required to keep training performance acceptable. GP2 volumes provide more IOPS as their size increases, hence the allocation of a 200 GB General Purpose SSD (GP2, 600 IOPS), which turns out to be more cost-effective than a smaller 30 GB, 600 IOPS Provisioned IOPS SSD (IO1).
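For reference, the 600 IOPS figure lines up with GP2's baseline of roughly 3 IOPS per provisioned GB (an AWS characteristic at the time of writing, treated here as an assumption for the arithmetic):

```bash
# GP2 baseline IOPS scale with volume size (assumed: ~3 IOPS per GB):
$ echo "$((200 * 3)) IOPS"   # 200 GB General Purpose SSD (GP2)
600 IOPS
```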