
█████╗ ███╗   ██╗ ██████╗ ███████╗██╗      █████╗
██╔══██╗████╗  ██║██╔════╝ ██╔════╝██║     ██╔══██╗
███████║██╔██╗ ██║██║  ███╗█████╗  ██║     ███████║
██╔══██║██║╚██╗██║██║   ██║██╔══╝  ██║     ██╔══██║
██║  ██║██║ ╚████║╚██████╔╝███████╗███████╗██║  ██║
╚═╝  ╚═╝╚═╝  ╚═══╝ ╚═════╝ ╚══════╝╚══════╝╚═╝  ╚═╝

ANGELA: Artificial Neural Generated Environment Learning Agent

Introduction

Angela is a deep reinforcement learning agent, or rather a collection of agents, capable of solving a variety of environments. She implements several different RL algorithms and neural network models.

She can work with both discrete and continuous action spaces, allowing her to tackle anything from Atari games to robotic control problems.

She is written in Python 3 and PyTorch, and is getting smarter every day :).

Basically, I use Angela as a modular way to test out different RL algorithms in a variety of environments. It's great for prototyping and getting an agent training quickly without having to rewrite a lot of boilerplate.

While it is fairly easy to throw a new environment at her using one of the supported algorithms, it often requires some hyperparameter tuning to succeed at a specific problem. Configuration files with good hyperparameters, along with training results and some of my notes, are included for all the environments below.

Visualizations are provided for some of the environments just to whet your appetite.

Features

Environments

Algorithms

  • dqn: Deep Q Networks with experience replay, fixed Q-targets, double DQN and prioritized experience replay
  • hc: Hill Climbing with adaptive noise, steepest ascent and simulated annealing
  • pg: Vanilla Policy Gradient (REINFORCE); see the sketch after this list
  • ppo: Proximal Policy Optimization
  • ddpg: Deep Deterministic Policy Gradient
  • maddpg: Multi-Agent Deep Deterministic Policy Gradient with shared (v1) and separate (v2) actor/critic for each agent
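
To give a flavor of the simplest of these, here is a minimal REINFORCE update in PyTorch. This is an illustrative sketch, not Angela's actual implementation; the network shape, learning rate and environment are assumptions.

import torch
import torch.nn as nn
import gym

# Minimal REINFORCE (vanilla policy gradient) sketch -- illustrative only,
# not Angela's implementation. Network shape and hyperparameters are assumed.
env = gym.make('CartPole-v0')
policy = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)
gamma = 0.99

for episode in range(1000):
    state, log_probs, rewards, done = env.reset(), [], [], False
    while not done:
        logits = policy(torch.as_tensor(state, dtype=torch.float32))
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        state, reward, done, _ = env.step(action.item())
        rewards.append(reward)
    # Discounted return from each timestep, then the REINFORCE loss:
    # maximize expected return by ascending log-prob weighted by return.
    returns, R = [], 0.0
    for r in reversed(rewards):
        R = r + gamma * R
        returns.insert(0, R)
    loss = -(torch.stack(log_probs) * torch.as_tensor(returns)).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()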

Models

  • dqn: multi-layer perceptron, dueling networks (sketched after this list), CNN
  • hc: single-layer perceptron
  • pg: multi-layer perceptron, CNN
  • ppo: multi-layer perceptron, CNN
  • ddpg: low dimensional state spaces
  • maddpg: low dimensional state spaces
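
As an example of the dueling architecture mentioned above: the Q-network splits into a state-value stream and an advantage stream, which are recombined into Q-values. A minimal PyTorch sketch; the layer sizes are assumptions, not Angela's actual model.

import torch.nn as nn

class DuelingQNetwork(nn.Module):
    """Illustrative dueling DQN head; layer sizes are assumed, not Angela's."""
    def __init__(self, state_size, action_size, hidden=64):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(state_size, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)                # state-value stream V(s)
        self.advantage = nn.Linear(hidden, action_size)  # advantage stream A(s, a)

    def forward(self, state):
        x = self.feature(state)
        v, a = self.value(x), self.advantage(x)
        # Recombine, centering the advantages so V and A are identifiable.
        return v + a - a.mean(dim=-1, keepdim=True)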

Misc

  • loads hyperparameters from configuration files (see the sketch after this list)
  • outputs training stats via console, tensorboard and matplotlib
  • summarizes model structure
  • saves and loads model weights
  • renders an agent in action
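
The configuration files live under cfg/ and are plain Python. Below is a hypothetical sketch of the kind of hyperparameters such a file might hold; the parameter names are illustrative assumptions, not the repo's actual schema.

# Hypothetical config sketch -- parameter names are illustrative assumptions,
# not Angela's actual schema; see the files under cfg/ for the real format.
env_name = 'CartPole-v0'
algorithm = 'ppo'
solve_score = 195.0      # training stops once the average score reaches this
learning_rate = 3e-4
gamma = 0.99             # discount factor
hidden_units = (64, 64)  # MLP layer sizes
max_episodes = 2000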

Installation

The process below works on macOS, but should be easily adapted for Windows. For AWS, see the separate instructions.

Step 1: Install dependencies

Create an Anaconda environment that contains all the required dependencies for the project. If you want to work with MuJoCo environments, see the additional requirements. Note that ppaquette_gym_super_mario downgrades gym to 0.10.5.

git clone https://github.com/danielnbarbosa/angela.git
conda create -y -n angela python=3.6 anaconda
source activate angela
conda install -y pytorch torchvision -c pytorch
conda install -y pip swig opencv scikit-image
conda uninstall -y ffmpeg # needed for gym monitor
conda install -y -c conda-forge opencv ffmpeg  # needed for gym monitor
pip install torchsummary tensorboardX dill gym Box2D box2d-py unityagents pygame ppaquette_gym_super_mario
cd ..

brew install fceux  # this is the NES emulator

Step 2: Install environment toolkits

git clone https://github.com/openai/gym.git
cd gym
pip install -e '.[atari]'
cd ..

git clone https://github.com/Unity-Technologies/ml-agents.git
cd ml-agents/ml-agents
pip install .
cd ../..

git clone https://github.com/ntasfi/PyGame-Learning-Environment
cd PyGame-Learning-Environment
pip install -e .
cd ..
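
As a quick smoke test (my own suggestion, not part of the repo's instructions), confirm that the core pieces import and an environment can be built:

python -c "import torch, gym; gym.make('CartPole-v0').reset(); print('torch', torch.__version__, 'and gym look OK')"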

Usage

To start training, use main.py and pass in the path to the desired configuration file. Training stops when the agent reaches the target solve score. For example, to train on the CartPole environment using the PPO algorithm (which takes about 6 seconds on my laptop):

./main.py --cfg=cfg/gym/cartpole/cartpole_ppo.py

To load a saved model:

./main.py --cfg=cfg/gym/cartpole/cartpole_ppo.py --load=checkpoints/last_run/solved.pth

To render an agent during training:

./main.py --cfg=cfg/gym/cartpole/cartpole_ppo.py --render

To render a saved model:

./main.py --cfg=cfg/gym/cartpole/cartpole_ppo.py --render --load=checkpoints/last_run/solved.pth
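
The stopping behavior described above follows the usual stop-on-solve pattern: track a rolling window of recent episode scores and stop once their mean crosses the target. The sketch below is a generic illustration of that pattern, not main.py's actual code; run_episode, the window size and the solve score are assumptions.

from collections import deque
import numpy as np

def run_episode():
    # Hypothetical stand-in for playing one episode and returning its score.
    return float(np.random.uniform(0.0, 200.0))

# Generic stop-on-solve loop -- an illustration of the pattern, not main.py.
scores = deque(maxlen=100)   # rolling window of recent episode scores
solve_score = 195.0          # assumed target, e.g. CartPole-v0's threshold

for episode in range(1, 10001):
    scores.append(run_episode())
    if len(scores) == scores.maxlen and np.mean(scores) >= solve_score:
        print('Solved in {} episodes!'.format(episode))
        break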

Project layout

The directory structure is as follows:

  • cfg: Configuration files with saved hyperparameters.
  • checkpoints: Saved model weights.
  • compiled_unity_environments: Pre-compiled Unity environments for use with ML-Agents.
  • docs: Auxiliary documentation.
  • libs: Shared libraries. Code for agents, environments and various utility functions.
  • logs: Copies of configs, weights and logging for various training runs.
  • results: Current best training results for each environment.
  • runs: Output of tensorboard logging.
  • scripts: Helper scripts.

Acknowledgements

Code from the following repos has been used to build this project:
