Atari-2600-Deep-Learning-Agent

This program was created as my final year project for my undergraduate Computer Science degree. It is used for performing deep reinforcement learning experiments on the Atari 2600. It can launch a range of game environments using OpenAI Gym and then train agents to play them using a pre-determined set of hyperparameters. The user can set several program parameters depending on the experiment they wish to carry out.

Program Requirements

The following software is required:

  • Python 3.6 or later (Recommended: Anaconda for easy installation)
  • (Python Package) OpenAI Gym
  • (Python Package) TensorFlow
  • Microsoft Build Tools for Visual Studio 2015/2017

Windows Installation Procedure

To prepare a Windows system to run this program, install Anaconda on the machine, then open a CLI and enter the following commands:

conda install git
conda update --all
conda clean -a
pip install git+https://github.com/Kojoley/atari-py.git
pip install gym[atari]
pip install --upgrade tensorflow

If you have an NVIDIA GPU and want to optimise the performance of the program, the GPU version of TensorFlow should be installed. This requires a few extra downloads and configuration steps, which can be found in TensorFlow's documentation: https://www.tensorflow.org/install
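
For the older 1.x releases of TensorFlow the GPU build was shipped as a separate package, so depending on the version you target the install command may look like the line below; check TensorFlow's documentation for the CUDA and cuDNN versions each release requires:

pip install --upgrade tensorflow-gpu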

Launching a Training Session

Download all the required project files and save them in a directory of your choice. From a CLI, navigate to that directory. A session with the default program parameters can be launched with:

python agent.py

This will launch the program and ask which game you wish to test. Type the name of the game and press enter to begin.

Command Line Arguments

For finer control of the program, several command line arguments can be used to set certain parameters:

Name             Abbreviation   Description                                       Default Value
--gameName       -gn            Name of the game to be loaded into the program   N/A
--episodeCount   -ec            The number of episodes to be run in this trial   10
--render         -r             Should the game be rendered to the screen        False
--noTrain        -nt            Should the model be trained                       True
--save           -s             Should the model be saved                         False
--load           -l             Load a trained model by episode number           0

An example of using these arguments to launch an experiment on the game "Breakout" for 1000 episodes, with the game rendered to the screen and the model saved, would look like:

python agent.py -gn Breakout -ec 1000 -r -s
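
These flags map naturally onto Python's argparse module. The snippet below is an illustrative sketch of how such arguments could be defined, not necessarily the exact code used in agent.py:

import argparse

parser = argparse.ArgumentParser(description="Deep Q-learning agent for Atari 2600 games")
parser.add_argument("--gameName", "-gn", type=str, help="Name of the game to be loaded")
parser.add_argument("--episodeCount", "-ec", type=int, default=10, help="Number of episodes to run in this trial")
parser.add_argument("--render", "-r", action="store_true", help="Render the game to the screen")
parser.add_argument("--noTrain", "-nt", action="store_true", help="Disable training of the model")
parser.add_argument("--save", "-s", action="store_true", help="Save the model during training")
parser.add_argument("--load", "-l", type=int, default=0, help="Load a trained model by episode number")
args = parser.parse_args()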

List of Compatible Atari 2600 Games

The full list of game names compatible with this program is: Alien, Asterix, Asteroids, Atlantis, BattleZone, BeamRider, Berzerk, Bowling, Boxing, Breakout, ChopperCommand, CrazyClimber, Defender, DemonAttack, ElevatorAction, Enduro, FishinDerby, Freeway, FrostBite, Gravitar, Hero, IceHockey, JamesBond, Kangaroo, Kaboom, Krull, KungFuMaster, MontezumaRevenge, MsPacman, NameThisGame, Phoenix, Pitfall, Pong, PrivateEye, Qbert, Riverraid, RoadRunner, Robotank, Seaquest, Solaris, SpaceInvaders, StarGunner, TimePilot, UpNDown, Venture, YarsRevenge, Zaxxon.

Default Hyper Parameters

Name                      Value      Description
Alpha                     0.0001     Learning rate for the Adam optimiser
Gamma                     0.99       Discount rate of subsequent states' value
Epsilon Minimum           0.05       Lowest value epsilon can reach
Epsilon Maximum           1.0        Highest (and starting) value of epsilon
Epsilon Delta             0.000002   Amount by which epsilon anneals each step
Replay Size               150000     Number of experiences stored in memory
Update Frequency          4          Number of steps between network updates
Batch Size                32         Number of experiences sampled for one update
Target Update Frequency   10000      Number of steps between target network updates
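
As an example of how these values interact, epsilon is annealed linearly from its maximum to its minimum by Epsilon Delta on every step. A minimal sketch of that schedule using the default values above:

EPSILON_MAX = 1.0
EPSILON_MIN = 0.05
EPSILON_DELTA = 0.000002

def epsilon_at(step):
    # Linear annealing: start at 1.0 and decrease by 2e-6 per step,
    # never dropping below the 0.05 floor.
    return max(EPSILON_MIN, EPSILON_MAX - EPSILON_DELTA * step)

# With these defaults epsilon reaches its minimum after
# (1.0 - 0.05) / 0.000002 = 475,000 steps.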

Deep Q Reinforcement Learning Design

This agent uses a range of different techniques to improve performance; separate target and value networks, an experience replay buffer and batch updating are the main concepts.
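
In outline, a batch update samples past experiences from the replay buffer and regresses the value network towards targets produced by the periodically synchronised target network. The following is a simplified sketch of that update (assuming a Keras-style model with a predict method), not the project's exact implementation:

import numpy as np

def dqn_targets(batch, target_network, gamma=0.99):
    # batch: list of (state, action, reward, next_state, done) tuples
    # sampled uniformly from the replay buffer.
    states, actions, rewards, next_states, dones = map(np.array, zip(*batch))

    # Bootstrap from the frozen target network: y = r + gamma * max_a Q_target(s', a)
    next_q = target_network.predict(next_states).max(axis=1)
    return rewards + gamma * next_q * (1.0 - dones)

# Every Update Frequency steps a batch of Batch Size experiences is sampled
# from the replay buffer and the value network is fitted to these targets;
# every Target Update Frequency steps the target network's weights are
# copied from the value network.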

The neural network itself is a scaled-down version of a similar structure suggested by DeepMind. It is a convolutional neural network (CNN) with the following structure:

CNN structure
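
For reference, DeepMind's published network had the shape sketched below in Keras; the network used in this project is a scaled-down variant, so treat the layer sizes as illustrative rather than this project's exact configuration (the figure above shows the sizes actually used):

from tensorflow.keras import layers, models

def build_q_network(n_actions, input_shape=(84, 84, 4)):
    # Layer sizes follow DeepMind's published network; the agent in this
    # repository uses a scaled-down variant of this layout.
    return models.Sequential([
        layers.Conv2D(32, 8, strides=4, activation="relu", input_shape=input_shape),
        layers.Conv2D(64, 4, strides=2, activation="relu"),
        layers.Conv2D(64, 3, strides=1, activation="relu"),
        layers.Flatten(),
        layers.Dense(512, activation="relu"),
        layers.Dense(n_actions, activation="linear"),
    ])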

To improve performance, pre-processing is also performed on each frame: reducing it from RGB to grey-scale, down-sampling, and then stacking 4 consecutive frames into a 3D tensor:

Preprocessing
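
A minimal sketch of this preprocessing pipeline is shown below; the exact down-sampling resolution used in the project may differ from the naive halving used here:

import numpy as np
from collections import deque

def preprocess_frame(frame):
    # Convert the RGB frame to grey-scale and down-sample it.
    grey = frame.mean(axis=2).astype(np.uint8)   # RGB -> grey-scale
    return grey[::2, ::2]                        # naive 2x down-sampling

frame_stack = deque(maxlen=4)

def stacked_state(new_frame):
    # Append the latest processed frame and stack the last 4 into a 3D tensor.
    frame_stack.append(preprocess_frame(new_frame))
    while len(frame_stack) < 4:
        frame_stack.append(frame_stack[-1])      # pad at the start of an episode
    return np.stack(frame_stack, axis=-1)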

Further detail on the design, rationale and mathematics can be found in the full project report which is included in the res directory.

Accuracy and Test Results

Overall, this agent can achieve super-human results in some games (Pong and Breakout) and some level of learning in most others. Full tests on all the available games have not yet been carried out, but the results of 5,000,000 steps in 5 different games can be seen here:

Learning Results Graphs

Additional Details

There are also additional options to run a variation of the standard agent to test two novel methods for varying the epsilon value whilst training: Stepped Annealing Epsilon (SAE) and Variable Epsilon (VE). Instructions on how to test these methods and a write-up of their performance can be found in the full project report, which is included in the res directory.

Acknowledgments

I would like to acknowledge some of the developers of the “Arcade Learning Environment”, Nicolai Czempin and Marc Bellemare, who happily provided me with details on the inner workings of the emulator and assisted me with debugging and defining the scope of my project. I would also like to thank Nikita Kniazev, who created a Windows branch of the Python “Arcade Learning Environment” (Atari-Py) which works with OpenAI Gym.

License

Copyright (c) 2015-2021 Chris Haynes and others

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
