Exploring Evolutionary Learning Techniques in Interactive Environments

In this project, NeuroEvolution of Augmenting Topologies (NEAT) and Evolution Strategies (ES) are used in various OpenAI Gym environments:

CartPole-v1
LunarLanderContinuous-v2
BipedalWalker-v3
CarRacing-v0
Pong-ram-v4

The project also includes custom code for distributed learning, which is available as an option when training Pong.
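For readers unfamiliar with ES, the following is a minimal sketch of a basic Gaussian evolution strategy on a toy objective. It is not the project's es.py: the names evolution_strategy and toy_fitness, the parameters, and the quadratic objective are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import random


def evolution_strategy(fitness, dim, generations=200, population=50,
                       sigma=0.1, learning_rate=0.05, seed=0):
    """Maximize `fitness` with a basic Gaussian evolution strategy (illustrative only)."""
    rng = random.Random(seed)
    theta = [0.0] * dim  # current parameter vector
    for _ in range(generations):
        # Sample one Gaussian perturbation per population member.
        noises = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                  for _ in range(population)]
        # Score each perturbed copy of the parameters.
        scores = [fitness([t + sigma * n for t, n in zip(theta, noise)])
                  for noise in noises]
        baseline = sum(scores) / population  # mean score, used to reduce variance
        # Move theta along the score-weighted average of the perturbations.
        for i in range(dim):
            step = sum((s - baseline) * noise[i]
                       for s, noise in zip(scores, noises))
            theta[i] += learning_rate * step / (population * sigma)
    return theta


def toy_fitness(params):
    """Hypothetical objective: highest (zero) at the point (3, -2)."""
    return -((params[0] - 3.0) ** 2 + (params[1] + 2.0) ** 2)
```

Calling evolution_strategy(toy_fitness, dim=2) should return parameters close to (3, -2). In a Gym setting, the fitness function would instead run an episode and return the total reward.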

Installation

This project uses Python 3 and has been tested on Ubuntu 18.04 and Windows 10 using venv. It should be fairly straightforward to install using Anaconda as well, but instructions for this are not listed.

Ubuntu 18.04

Install the required packages: sudo apt install libsdl2-dev swig cmake python3-tk
- Other dependencies may be required on some systems. If installation fails, check the output for missing packages.
Install the project dependencies with pip install -r requirements.txt.

Windows 10

Getting the project to work on Windows is more complicated because of the build requirements of some of the libraries it uses. Two tools must be downloaded:

Microsoft Visual C++ Build Tools
SWIG 3
Add the resulting SWIG folder to your system's PATH (it only needs to be accessible during Python module installation)

Additionally, ale_c.dll is required to run any Atari environments. It has been included in the res folder. This file should be copied to <python folder>/Lib/site-packages/atari_py/ale_interface/.

Now, you can install the project dependencies with pip install -r requirements.txt.
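The ale_c.dll copy step described above can also be scripted. This is a minimal sketch, not part of the project: copy_ale_dll is a hypothetical helper, and it assumes atari_py is already present in site-packages so that the atari_py/ale_interface folder exists.

```python
import shutil
import sysconfig
from pathlib import Path


def copy_ale_dll(dll_path, site_packages=None):
    """Copy ale_c.dll into <site-packages>/atari_py/ale_interface and return the new path."""
    if site_packages is None:
        # Default to the active interpreter's site-packages directory.
        site_packages = sysconfig.get_paths()["purelib"]
    target_dir = Path(site_packages) / "atari_py" / "ale_interface"
    if not target_dir.is_dir():
        raise FileNotFoundError(f"{target_dir} not found; is atari_py installed?")
    return Path(shutil.copy2(dll_path, target_dir))


# Example call from the project root (assumes res/ale_c.dll exists):
# copy_ale_dll("res/ale_c.dll")
```

Run it with the same Python interpreter (or venv) that will run the Atari environments, so the default site-packages path points at the right installation.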

Usage

The following files can be run with python <filename> to train an environment with the specified technique. Note that the applications may be difficult to stop because of Python's multiprocessing; the easiest way to quit is probably to close the terminal running the file. When a generation finishes, a simulation window shows that generation's best individual interacting with the environment. This window may open behind others, so keep your main display clear.

Algorithm	Environment	File
NEAT	`XOR`	`xor_neat.py`
NEAT	`CartPole-v1`	`cartpole_neat.py`
NEAT	`LunarLanderContinuous-v2`	`lunarlander_neat.py`
NEAT	`BipedalWalker-v3`	`bipedalwalker_neat.py`
NEAT	`CarRacing-v0`	`carracing_neat.py`
NEAT	`Pong-ram-v4`	`pong_neat.py`
NEAT	`Pong-ram-v4 (distributed)`	`pong_neat_distrib.py`
ES	`XOR`	`xor_es.py`
ES	`CartPole-v1`	`cartpole_es.py`
ES	`LunarLanderContinuous-v2`	`lunarlander_es.py`
ES	`BipedalWalker-v3`	`bipedalwalker_es.py`
ES	`CarRacing-v0`	`carracing_es.py`
ES	`Pong-ram-v4`	`pong_es.py`
ES	`Pong-ram-v4 (distributed)`	`pong_es_distrib.py`

Distributed Training

To train Pong with distributed computing, there are a few things you must do:

Install this project on any machine that will be used.
Set up port forwarding so workers can connect to the host machine. The default port is 4919.
For each worker, run python worker.py -t <num threads> -a <server address> -p <port>.
On the host system, run either distributed file (pong_neat_distrib.py or pong_es_distrib.py).
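Before starting workers, it can help to confirm that the host's port is reachable from a worker machine once the host is listening. The following is a minimal sketch using only the standard library; can_connect is a hypothetical helper and not part of the project, and the address in the usage comment is the placeholder used elsewhere in this README.

```python
import socket


def can_connect(host, port=4919, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call from a worker machine:
# can_connect("12.34.56.77", 4919)
```

A False result usually points at port forwarding or a firewall rather than the training code itself.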

The output of the host should look something like this:

$ python pong_<alg>_distrib.py
Waiting for connections...
Starting server at :4919
Worker connected at 12.34.56.78:31415 with 12 thread(s) and 31.0GB of memory
Worker connected at 12.34.56.79:14142 with 16 thread(s) and 55.0GB of memory
Worker connected at 12.34.56.80:17320 with 16 thread(s) and 37.0GB of memory
Done!
.
.
.

The output of a worker should look something like this:

$ python worker.py -t 12 -a 12.34.56.77 -p 4919
Starting worker process with 12 thread(s)
Connecting to 12.34.56.77:4919
Connected!
New shared data from server
Task from server: pong_<alg> (id=1)
Task from server: pong_<alg> (id=4)
Task from server: pong_<alg> (id=7)
.
.
.

Viewing Results

The only environment that stores its generational fitness history is CarRacing-v0, because its terminal output contains too much debug text. For all other environments, the terminal output must be manually copied and filtered into this format:

<alg> Best, <alg> Average
<gen 1 best>, <gen 1 avg>
<gen 2 best>, <gen 2 avg>
.
.
.
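The format above can be produced and read back with a few lines of Python. This is a sketch under assumptions: format_history and parse_history are hypothetical helpers, not functions from graph.py, and the fitness values are assumed to be plain numbers.

```python
import csv
import io


def format_history(alg, best, average):
    """Build history text: a header line, then one 'best, average' line per generation."""
    lines = [f"{alg} Best, {alg} Average"]
    lines += [f"{b}, {a}" for b, a in zip(best, average)]
    return "\n".join(lines) + "\n"


def parse_history(text):
    """Return (header, best, average) parsed from history text in the format above."""
    # skipinitialspace drops the space that follows each comma.
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    rows = [row for row in reader if row]  # ignore blank lines
    header, data = rows[0], rows[1:]
    best = [float(row[0]) for row in data]
    average = [float(row[1]) for row in data]
    return header, best, average
```

The text returned by format_history can be written to a file and passed to graph.py as one of the history files.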

For your convenience, there are pre-processed history files in the models folder. To view fitness graphs, use python graph.py <env name> <neat hist file> <es hist file>.

As an example, python graph.py XOR models/xor_neat_hist.txt models/xor_es_hist.txt generates the fitness graph for XOR.
