universe-starter-agent (for continuous control)

A basic implementation of the A3C algorithm, adapted for continuous control.

To-Dos

Add a command-line argument to specify the scenario file
Decide how to save output files (saving should be optional)
Non-Zero Entropy

Known issue: the ATE3 environment uses float64, while universe-starter-agent uses float32. This dtype mismatch is a significant problem.
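One possible workaround for the dtype mismatch is to cast everything the environment returns to float32 before it reaches the agent. The sketch below is only an illustration under assumptions: `to_float32` and `Float32Observations` are hypothetical names that do not exist in this repository, and the wrapped environment is assumed to follow the classic gym interface, where `reset()` returns an observation and `step(action)` returns `(observation, reward, done, info)`.

```python
import numpy as np


def to_float32(observation):
    """Return the observation as a float32 array (copies only when a cast is needed)."""
    return np.asarray(observation, dtype=np.float32)


class Float32Observations(object):
    """Hypothetical wrapper that casts a gym-style env's outputs to float32.

    Assumes env.reset() -> obs and env.step(action) -> (obs, reward, done, info).
    """

    def __init__(self, env):
        self.env = env

    def reset(self):
        return to_float32(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        # Cast both the observation and the reward so downstream
        # float32 tensors never receive float64 values.
        return to_float32(obs), np.float32(reward), done, info
```

Wrapping the environment once, where it is created, keeps the cast in a single place instead of scattering `astype` calls through the training code.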

Dependencies

Python 2.7 or 3.5
Golang
six (for py2/3 compatibility)
TensorFlow 0.12
tmux (the start script opens up a tmux session with multiple windows)
htop (shown in one of the tmux windows)
gym
gym[atari]
libjpeg-turbo (brew install libjpeg-turbo)
universe
opencv-python
numpy
scipy

Getting Started

conda create --name universe-starter-agent python=3.5
source activate universe-starter-agent

brew install tmux htop cmake golang libjpeg-turbo      # On Linux use sudo apt-get install -y tmux htop cmake golang libjpeg-dev

pip install "gym[atari]"
pip install universe
pip install six
pip install tensorflow
conda install -y -c https://conda.binstar.org/menpo opencv3
conda install -y numpy
conda install -y scipy

Add the following line to your .bashrc so that you have the correct environment when the train.py script spawns new bash shells:

source activate universe-starter-agent

ATE3 Simulator

python train.py --num-workers 2 --log-dir /tmp/ate3

The command above trains an agent on the ATE3 simulator. It starts two workers that learn in parallel (--num-workers flag) and writes intermediate results to the given directory (--log-dir flag).

The code will launch the following processes:

worker-0 - a process that runs policy gradient
worker-1 - a process identical to worker-0, except that it receives different random noise from the environment
ps - the parameter server, which synchronizes the parameters among the different workers
tb - a tensorboard process for convenient display of the statistics of learning
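To make the process layout above concrete, here is a minimal sketch of how the parameter server and the workers could be assigned network addresses. It is an illustration rather than the repository's code: the host, the starting port, and the function name `cluster_spec` are assumptions. A dictionary of this shape, mapping each job name to a list of addresses, is the form that `tf.train.ClusterSpec` accepts.

```python
def cluster_spec(num_workers, num_ps=1, host="127.0.0.1", base_port=12222):
    """Build a {'ps': [...], 'worker': [...]} mapping of job name to addresses.

    host and base_port are illustrative defaults, not values taken from
    the repository.
    """
    port = base_port
    cluster = {"ps": [], "worker": []}
    # Parameter servers take the first ports ...
    for _ in range(num_ps):
        cluster["ps"].append("{}:{}".format(host, port))
        port += 1
    # ... and each worker gets the next free port.
    for _ in range(num_workers):
        cluster["worker"].append("{}:{}".format(host, port))
        port += 1
    return cluster


if __name__ == "__main__":
    spec = cluster_spec(num_workers=2)
    for job, addresses in sorted(spec.items()):
        for index, address in enumerate(addresses):
            print("{}-{} -> {}".format(job, index, address))
```

With `--num-workers 2`, this layout gives one `ps` entry and two `worker` entries, matching the ps, worker-0, and worker-1 processes listed above. The tb process is not part of the cluster because it only reads the log directory.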

Once you start the training process, it will create a tmux session with a window for each of these processes. You can connect to them by typing tmux a in the console. Once in the tmux session, you can see all your windows with ctrl-b w. To switch to window number 0, type: ctrl-b 0. Look up tmux documentation for more commands.

To access TensorBoard to see various monitoring metrics of the agent, open http://localhost:12345/ in a browser.

Other commands:

python worker.py --job-name ps --num-workers 1
- creates a single parameter server
python worker.py --job-name worker --num-workers 1
- creates a single worker to interact with environment
