This repository has been archived by the owner on Feb 2, 2021. It is now read-only.

zahay117/universe-starter-agent

 
 


universe-starter-agent (for continuous control)

A basic implementation of the A3C algorithm with continuous control.

To-Dos

  • Command line argument to specify scenario file
  • How to save output files (optionally save output file)
  • Non-Zero Entropy

Known issue: the ATE3 environment uses float64, whereas universe-starter-agent uses float32. This dtype mismatch is a significant open problem.
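One possible way to bridge the dtype mismatch is to cast observations to float32 before they reach the agent. The snippet below is a minimal sketch under that assumption, not the fix used in this repository; the helper name `to_float32` is hypothetical, and the sample array merely stands in for an ATE3 observation.

```python
import numpy as np


def to_float32(observation):
    """Return the observation as a float32 array if it is float64.

    Hypothetical helper: arrays of any other dtype are returned unchanged.
    """
    arr = np.asarray(observation)
    if arr.dtype == np.float64:
        return arr.astype(np.float32)
    return arr


# NumPy creates float64 arrays from Python floats by default,
# which stands in here for an observation coming from the environment.
obs64 = np.array([0.1, 0.2, 0.3])
obs32 = to_float32(obs64)
print(obs64.dtype, obs32.dtype)
```

Casting at the boundary between environment and agent keeps the rest of the code on a single dtype, at the cost of reduced precision in the observations.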

Dependencies

Getting Started

conda create --name universe-starter-agent python=3.5
source activate universe-starter-agent

brew install tmux htop cmake golang libjpeg-turbo      # On Linux use sudo apt-get install -y tmux htop cmake golang libjpeg-dev

pip install "gym[atari]"
pip install universe
pip install six
pip install tensorflow
conda install -y -c https://conda.binstar.org/menpo opencv3
conda install -y numpy
conda install -y scipy

Add the following to your .bashrc so that you'll have the correct environment when the train.py script spawns new bash shells:

source activate universe-starter-agent

ATE3 Simulator

python train.py --num-workers 2 --log-dir /tmp/ate3

The command above trains an agent on the ATE3 simulator. It starts two workers that learn in parallel (--num-workers flag) and writes intermediate results to the given directory (--log-dir flag).

The code will launch the following processes:

  • worker-0 - a process that runs policy gradient
  • worker-1 - a process identical to worker-0, except that it receives different random noise from the environment
  • ps - the parameter server, which synchronizes the parameters among the different workers
  • tb - a tensorboard process for convenient display of the statistics of learning
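The process list above follows directly from the --num-workers value. The sketch below shows how those names could be derived; the function `process_names` and the ordering of the list are illustrative assumptions, not code taken from train.py.

```python
def process_names(num_workers):
    """List the process names described above for a given worker count.

    Hypothetical helper: one parameter server, one process per worker,
    and one TensorBoard process.
    """
    names = ["ps"]
    names += ["worker-{}".format(i) for i in range(num_workers)]
    names.append("tb")
    return names


# With --num-workers 2 this yields the four processes listed above.
print(process_names(2))
```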

Once you start the training process, it will create a tmux session with a window for each of these processes. You can connect to them by typing tmux a in the console. Once in the tmux session, you can see all your windows with ctrl-b w. To switch to window number 0, type: ctrl-b 0. Look up tmux documentation for more commands.

To access TensorBoard to see various monitoring metrics of the agent, open http://localhost:12345/ in a browser.

Other commands:

  • python worker.py --job-name ps --num-workers 1
    • creates a single parameter server
  • python worker.py --job-name worker --num-workers 1
    • creates a single worker to interact with environment
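For readers unfamiliar with this style of command line, the sketch below shows how flags such as --job-name and --num-workers might be parsed with argparse. It is a hypothetical stand-in, not the actual contents of worker.py; the defaults and the restriction to the two job names shown above are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the two flags used in the commands above.

    Hypothetical stand-in for worker.py: defaults are assumed.
    """
    parser = argparse.ArgumentParser(description="Sketch of worker flags")
    parser.add_argument("--job-name", choices=["ps", "worker"], default="worker",
                        help="role of this process: parameter server or worker")
    parser.add_argument("--num-workers", type=int, default=1,
                        help="total number of workers in the cluster")
    return parser


# Parse the arguments of the first command shown above.
args = build_parser().parse_args(["--job-name", "ps", "--num-workers", "1"])
print(args.job_name, args.num_workers)
```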

About

OpenAI's universal starter agent (A3C algorithm) with continuous control


Languages

  • Python 99.5%
  • Shell 0.5%