Update benchmark + doc
araffin committed Nov 21, 2018
1 parent fe36594 commit 866e273
Showing 5 changed files with 18 additions and 34 deletions.
15 changes: 12 additions & 3 deletions README.md
@@ -1,6 +1,6 @@
[![Build Status](https://travis-ci.com/araffin/rl-baselines-zoo.svg?branch=master)](https://travis-ci.com/araffin/rl-baselines-zoo)

-# RL Baselines Zoo: a Collection of Trained Reinforcement Learning Agents
+# RL Baselines Zoo: a Collection of Pre-Trained Reinforcement Learning Agents

<img src="images/BipedalWalkerHardcorePPO2.gif" align="right" width="35%"/>

@@ -52,6 +52,15 @@ Continue training (here, load pretrained agent for Breakout and continue trainin
python train.py --algo a2c --env BreakoutNoFrameskip-v4 -i trained_agents/a2c/BreakoutNoFrameskip-v4.pkl -n 5000
```

+## Record a Video of a Trained Agent
+
+Record 1000 steps:
+
+```
+python -m utils.record_video --algo ppo2 --env BipedalWalkerHardcore-v2 -n 1000
+```
+
+
## Current Collection: 70+ Trained Agents!

Scores can be found in `benchmark.md`. To compute them, simply run `python -m utils.benchmark`.
@@ -64,7 +73,7 @@ Scores can be found in `benchmark.md`. To compute them, simply run `python -m ut
|----------|--------------------|--------------------|--------------------|-------|-------|--------------------|--------------------|
| A2C | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |:heavy_check_mark:| :heavy_check_mark: | :heavy_check_mark:| :heavy_check_mark:|
| ACER | :heavy_check_mark: | |:heavy_check_mark:|:heavy_check_mark: |:heavy_check_mark:|:heavy_check_mark:| :heavy_check_mark: |
-| ACKTR |:heavy_check_mark:| :heavy_check_mark:| |:heavy_check_mark:| :heavy_check_mark:| :heavy_check_mark:| :heavy_check_mark: |
+| ACKTR |:heavy_check_mark:| :heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:| :heavy_check_mark:| :heavy_check_mark:| :heavy_check_mark: |
| PPO2 |:heavy_check_mark:|:heavy_check_mark:| :heavy_check_mark: |:heavy_check_mark: |:heavy_check_mark:|:heavy_check_mark:| :heavy_check_mark: |
| DQN |:heavy_check_mark:| :heavy_check_mark: |:heavy_check_mark:| :heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|

@@ -73,7 +82,7 @@ Additional Atari Games (to be completed):
| RL Algo | MsPacman |
|----------|-------------|
| A2C |:heavy_check_mark:|
-| ACER | |
+| ACER |:heavy_check_mark:|
| ACKTR | |
| PPO2 |:heavy_check_mark: |
| DQN | |
2 changes: 2 additions & 0 deletions benchmark.md
@@ -18,6 +18,7 @@
|acer |EnduroNoFrameskip-v4 | 0.000| 0.000| 149574| 45|
|acer |LunarLander-v2 | 185.210| 64.829| 149415| 248|
|acer |MountainCar-v0 | -131.213| 32.541| 149976| 1143|
+|acer |MsPacmanNoFrameskip-v4 | 3908.105| 585.407| 148924| 95|
|acer |PongNoFrameskip-v4 | 20.667| 0.507| 148275| 57|
|acer |QbertNoFrameskip-v4 | 18880.469| 1648.937| 148617| 64|
|acer |SeaquestNoFrameskip-v4 | 872.121| 25.555| 149650| 66|
@@ -26,6 +27,7 @@
|acktr|BeamRiderNoFrameskip-v4 | 3760.976| 1826.059| 147414| 41|
|acktr|BreakoutNoFrameskip-v4 | 448.514| 88.882| 143118| 37|
|acktr|CartPole-v1 | 487.573| 63.866| 149685| 307|
+|acktr|EnduroNoFrameskip-v4 | 0.000| 0.000| 149574| 45|
|acktr|LunarLander-v2 | 96.822| 64.020| 149905| 176|
|acktr|MountainCar-v0 | -111.917| 21.422| 149969| 1340|
|acktr|PongNoFrameskip-v4 | 19.224| 3.697| 147753| 67|
35 changes: 4 additions & 31 deletions hyperparams/ddpg.yml
@@ -30,44 +30,17 @@ BipedalWalker-v2:
 Walker2DBulletEnv-v0:
   n_timesteps: !!float 2e6
   policy: 'LnMlpPolicy'
-  noise_type: 'adaptive-param'
-  noise_std: 0.2
+  noise_type: 'ornstein-uhlenbeck'
+  noise_std: 0.1
   memory_limit: 1000000
   batch_size: 64
   normalize_observations: True

-HalfCheetahBulletEnv-v0:
-  n_timesteps: !!float 2e6
-  policy: 'LnMlpPolicy'
-  noise_type: 'adaptive-param'
-  noise_std: 0.2
-  memory_limit: 1000000
-  batch_size: 64
-  normalize_observations: True
-
-AntBulletEnv-v0:
-  n_timesteps: !!float 2e6
-  policy: 'LnMlpPolicy'
-  noise_type: 'adaptive-param'
-  noise_std: 0.2
-  memory_limit: 1000000
-  batch_size: 64
-  normalize_observations: True
-
-HopperBulletEnv-v0:
-  n_timesteps: !!float 2e6
-  policy: 'LnMlpPolicy'
-  noise_type: 'adaptive-param'
-  noise_std: 0.2
-  memory_limit: 1000000
-  batch_size: 64
-  normalize_observations: True
-
 ReacherBulletEnv-v0:
   n_timesteps: !!float 2e6
   policy: 'LnMlpPolicy'
-  noise_type: 'adaptive-param'
-  noise_std: 0.2
+  noise_type: 'ornstein-uhlenbeck'
+  noise_std: 0.1
   memory_limit: 1000000
   batch_size: 64
   normalize_observations: True
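
Note on the `hyperparams/ddpg.yml` hunk above: it switches `Walker2DBulletEnv-v0` and `ReacherBulletEnv-v0` from `adaptive-param` to `ornstein-uhlenbeck` noise and lowers `noise_std` from 0.2 to 0.1. For readers unfamiliar with the latter, here is a minimal, standard-library-only sketch of an Ornstein-Uhlenbeck process for a single action dimension. It is not the code the zoo or its RL library uses. The class name, the `theta`, `dt` and `mu` defaults, and the mapping of `noise_std` onto `sigma` are all assumptions made for illustration.

```python
import math
import random


class OrnsteinUhlenbeckNoise:
    """Illustrative (hypothetical) 1-D Ornstein-Uhlenbeck noise generator.

    Discretised update:
        x <- x + theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
    Successive samples are correlated in time, unlike independent Gaussian noise.
    """

    def __init__(self, sigma=0.1, theta=0.15, dt=1e-2, mu=0.0, seed=None):
        self.sigma = sigma  # noise scale (assumed to correspond to `noise_std`)
        self.theta = theta  # mean-reversion rate (assumed default)
        self.dt = dt        # time step (assumed default)
        self.mu = mu        # long-run mean the process is pulled towards
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Restart the process at its mean, e.g. at the start of an episode.
        self.x = self.mu

    def sample(self):
        # Pull towards the mean, then add a scaled Gaussian increment.
        drift = self.theta * (self.mu - self.x) * self.dt
        diffusion = self.sigma * math.sqrt(self.dt) * self.rng.gauss(0.0, 1.0)
        self.x += drift + diffusion
        return self.x


if __name__ == "__main__":
    noise = OrnsteinUhlenbeckNoise(sigma=0.1, seed=0)
    print([round(noise.sample(), 4) for _ in range(5)])
```

In an off-policy agent such as DDPG, each sample would typically be added to the deterministic action before it is sent to the environment, with `reset()` called between episodes.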
Binary file added trained_agents/acer/MsPacmanNoFrameskip-v4.pkl
Binary file not shown.
Binary file added trained_agents/acktr/EnduroNoFrameskip-v4.pkl
Binary file not shown.
