

Documentation

Initial ideas

The overall algorithm is written down in the file algorithm.m (in semi-pseudocode)

Installation

Linux (or Mac)

Python

  • We use Python 2.7 (32-bit)
  • Requirements:
    • Pillow (image processing library), NumPy and SciPy
      • sudo apt-get install python-pil python-numpy python-scipy python-dev python-pip python-nose g++ libopenblas-dev git
    • Theano (a quick import check follows this list)
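
A quick way to verify that the Python requirements are in place is a minimal import check like the one below (it only confirms that the packages are importable, nothing more):

    # Check that the required Python packages are importable (Python 2.7).
    import PIL.Image
    import numpy
    import scipy
    import theano

    print "numpy:", numpy.__version__
    print "scipy:", scipy.__version__
    print "theano:", theano.__version__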

Atari framework

  • Libraries needed:
    • sudo apt-get install libsdl-gfx1.2-dev libsdl-image1.2-dev libsdl1.2-dev
    • sudo apt-get install imagemagick (may be needed for the .show() function)
  • Installing Arcade Learning Environment (ALE)
    • Run ./install.sh from root directory of the repository
    • ALE will be compiled in ./libraries/ale
    • The ALE executable is: ./libraries/ale/ale (a minimal launch sketch follows this list)
    • ROMs are stored under ./libraries/ale/roms
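
Once compiled, ALE can be driven over its FIFO interface. A minimal launch sketch in Python (the Breakout ROM name is only an example; -game_controller fifo selects ALE's stdin/stdout interface):

    import subprocess

    # Start ALE with the FIFO game controller; it then communicates
    # over stdin/stdout. Any ROM under ./libraries/ale/roms can be used.
    ale = subprocess.Popen(
        ["./libraries/ale/ale", "-game_controller", "fifo",
         "./libraries/ale/roms/breakout.bin"],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    # In ALE's FIFO protocol the first line sent is the screen size
    # ("width-height").
    print ale.stdout.readline()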

Installing cuda-convnet2 (Convnet Branch)

The convnet branch makes use of the cuda-convnet2 code written by Alex Krizhevsky. To run this code you MUST have an NVIDIA CUDA-capable GPU of compute capability 3.5. The convnet page says it has to be at least 3.5, but we have found that 5.0 (Maxwell architecture) does not work (the bug is being fixed by NVIDIA as we speak).

To begin with, you need to set up your computer to use the graphics card. Download and install the CUDA Toolkit and the driver for the GPU. We have had success with Toolkit 6.0 (https://developer.nvidia.com/cuda-toolkit-60). We strongly advise following the installation instructions (as well as the pre-installation and post-installation steps) in the Getting Started Guide (http://developer.download.nvidia.com/compute/cuda/6_0/rel/docs/CUDA_Getting_Started_Linux.pdf).

Once you have succeeded in running a "hello world" in CUDA, you can move on to installing cuda-convnet2. Download and install the dependencies and git-checkout the code as described in https://code.google.com/p/cuda-convnet2/wiki/Compiling

Change the environment variables in the build.sh file in the main directory; normally you just have to change the location of your CUDA installation. Then run "./build.sh" and cuda-convnet2 should compile in a few minutes. Nevertheless, this is not enough: to run our code, you will need to tweak the cuda-convnet2 code a bit, as described in the following section.

Modifying cuda-convnet2 code (Convnet Branch)

It seems that certain aspects of the cuda-convnet2 code are designed to work with images that have either 1 input channel (grayscale) or 3 channels (RGB). We, however, have 4 input channels (the 4 frames). To accommodate our case without getting assertion errors, one needs to change the file at:

cuda-convnet2/cudaconv3/src/weight_acts.cu

In line 2023 replace:
numImgColors <= 3 with numImgColors <= 4

and in line 2059 replace:
if (numFilterColors > 3) with if (numFilterColors > 4)

This should allow the system to deal with 4 input channels. After making the modifications, recompile the program by running "./build.sh" in the "cuda-convnet2/" folder.

Comments about the algorithm

Pre-processing

  1. We convert the Atari NTSC 128-color palette to RGB using this table. (might be different in the original paper)
  2. We do NOT use the formula 0.21*R + 0.71*G + 0.07*B to convert RGB to grayscale, but 0.5*R + 0.5*G + 0.5*B (link; might be different in the original paper). A sketch of this conversion follows.
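
A minimal sketch of step 2 (assuming the frame is already an H x W x 3 NumPy array after the palette lookup):

    import numpy as np

    def to_grayscale(frame_rgb):
        # Equal 0.5 weights for all channels, as described above,
        # instead of the usual luminosity weights.
        return 0.5 * frame_rgb[:, :, 0] + 0.5 * frame_rgb[:, :, 1] \
             + 0.5 * frame_rgb[:, :, 2]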

Basic tests to the system

  • Issue #7 (make sure that learning changes the weight values at all):
    make sure that learning_rate > 0 and uncomment the last line in "NeuralNet.train()", which prints some parameters after every training event
  • Issue #10 (make sure that different inputs produce different outputs):
    the easiest way to check this (without writing a specific function) is to set learning_rate = 0.0, so that the network weights do not change (you can verify this as in Issue #7), and uncomment the line print "estimated q", estimated_Q in "NeuralNet.train()". Starting from the second frame, when we already have two different images to compare, the Q-values estimated during minibatch training should take more than one distinct value (at the second frame they can take 2 possible values, at the 3rd step 3 possible values, and so on). A hypothetical sketch of such a check follows this list.
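
A hypothetical sketch of the Issue #10 check (the method name predict is illustrative, not the exact interface of our NeuralNet class):

    import numpy as np

    def outputs_differ(net, frame_a, frame_b):
        # With learning_rate = 0.0 the weights stay fixed, so any
        # difference in the outputs must come from the inputs themselves.
        q_a = net.predict(frame_a)  # 'predict' is an assumed method name
        q_b = net.predict(frame_b)
        return not np.allclose(q_a, q_b)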

Possible differences from the original article

  • Preprocessing: color to grayscale
  • Gradient descent learning rate
  • Gradient descent regularisation: none, L1, L2 or both
  • Momentum in RMSProp
  • Initialization of the network: weights (mean, std) + biases
  • What to do with initial frames which do not have previous memory?
  • Death has no penalty?
  • Implementation of the error: we have zeros in all non-taken actions (see the sketch after this list)
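
A minimal sketch of that last point (array names are illustrative; estimated_Q is the minibatch of Q-values as in the test above):

    import numpy as np

    def q_error(estimated_Q, actions, targets):
        # estimated_Q: (batch, n_actions) Q-values from the network.
        # actions: index of the action taken in each transition.
        # targets: one-step Q-learning target for the taken action.
        # The error is zero for every action that was not taken.
        rows = np.arange(estimated_Q.shape[0])
        err = np.zeros_like(estimated_Q)
        err[rows, actions] = targets - estimated_Q[rows, actions]
        return err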

Random comments