Update doc: hyperparameter tuning for rl zoo (#330)
* Update doc: hyperparam tuning for rl zoo

* Add colab notebook link
araffin authored and hill-a committed May 29, 2019
1 parent 90ab67a commit e78a29d
Showing 5 changed files with 36 additions and 7 deletions.
7 changes: 3 additions & 4 deletions README.md
@@ -42,12 +42,11 @@ This toolset is a fork of OpenAI Baselines, with a major structural refactoring,

Documentation is available online: [https://stable-baselines.readthedocs.io/](https://stable-baselines.readthedocs.io/)

## RL Baselines Zoo: A Collection of 70+ Trained RL Agents
## RL Baselines Zoo: A Collection of 100+ Trained RL Agents

[RL Baselines Zoo](https://github.com/araffin/rl-baselines-zoo) is a collection of pre-trained Reinforcement Learning agents using
Stable-Baselines.
[RL Baselines Zoo](https://github.com/araffin/rl-baselines-zoo) is a collection of pre-trained Reinforcement Learning agents using Stable-Baselines.

It also provides basic scripts for training, evaluating agents and recording videos.
It also provides basic scripts for training, evaluating agents, tuning hyperparameters and recording videos.

Goals of this repository:

7 changes: 7 additions & 0 deletions docs/guide/custom_env.rst
@@ -7,6 +7,13 @@ To use the rl baselines with custom environments, they just need to follow the *
That is to say, your environment must implement the following methods (and inherit from the OpenAI Gym class):


.. note::

    If you are using images as input, the input values must be in [0, 255],
    as the observation is normalized (divided by 255 to have values in [0, 1])
    when using CNN policies.

.. code-block:: python

    import gym
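The code block in this hunk is truncated after the first import. As a rough sketch only (not the documentation's actual example), a custom environment following the gym interface could look like the following; the class name, spaces, and constant reward are placeholders:

```python
import gym
import numpy as np
from gym import spaces


class CustomEnv(gym.Env):
    """A minimal custom environment following the gym interface (sketch)."""

    def __init__(self):
        super(CustomEnv, self).__init__()
        # Two discrete actions; 64x64 RGB image observations.
        # Image values stay in [0, 255]: CNN policies normalize by 255 internally.
        self.action_space = spaces.Discrete(2)
        self.observation_space = spaces.Box(
            low=0, high=255, shape=(64, 64, 3), dtype=np.uint8)

    def reset(self):
        # Return the initial observation.
        return self.observation_space.sample()

    def step(self, action):
        # Apply the action; return (observation, reward, done, info).
        obs = self.observation_space.sample()
        return obs, 0.0, False, {}

    def render(self, mode='human'):
        pass
```

An instance of this class can then be wrapped in a `DummyVecEnv` and passed to any Stable-Baselines model constructor, like any built-in gym environment.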
2 changes: 2 additions & 0 deletions docs/guide/examples.rst
@@ -13,13 +13,15 @@ notebooks:
- `Monitor Training and Plotting`_
- `Atari Games`_
- `Breakout`_ (trained agent included)
- `RL Baselines zoo`_

.. _Getting Started: https://colab.research.google.com/drive/1_1H5bjWKYBVKbbs-Kj83dsfuZieDNcFU
.. _Training, Saving, Loading: https://colab.research.google.com/drive/1KoAQ1C_BNtGV3sVvZCnNZaER9rstmy0s
.. _Multiprocessing: https://colab.research.google.com/drive/1ZzNFMUUi923foaVsYb4YjPy4mjKtnOxb
.. _Monitor Training and Plotting: https://colab.research.google.com/drive/1L_IMo6v0a0ALK8nefZm6PqPSy0vZIWBT
.. _Atari Games: https://colab.research.google.com/drive/1iYK11yDzOOqnrXi1Sfjm1iekZr4cxLaN
.. _Breakout: https://colab.research.google.com/drive/14NwwEHwN4hdNgGzzySjxQhEVDff-zr7O
.. _RL Baselines zoo: https://colab.research.google.com/drive/1cPGK3XrCqEs3QLqiijsfib9OFht3kObX

.. |colab| image:: ../_static/img/colab.svg

25 changes: 23 additions & 2 deletions docs/guide/rl_zoo.rst
@@ -6,7 +6,7 @@ RL Baselines Zoo

`RL Baselines Zoo <https://github.com/araffin/rl-baselines-zoo>`_ is a collection of pre-trained Reinforcement Learning agents using
Stable-Baselines.
It also provides basic scripts for training, evaluating agents and recording videos.
It also provides basic scripts for training, evaluating agents, tuning hyperparameters and recording videos.

Goals of this repository:

@@ -22,7 +22,7 @@ Installation
::

    apt-get install swig cmake libopenmpi-dev zlib1g-dev ffmpeg
    pip install stable-baselines>=2.2.1 box2d box2d-kengz pyyaml pybullet==2.1.0 pytablewriter
    pip install stable-baselines box2d box2d-kengz pyyaml pybullet optuna pytablewriter

2. Clone the repository:

@@ -81,6 +81,27 @@ For example, enjoy A2C on Breakout for 5000 timesteps:
python enjoy.py --algo a2c --env BreakoutNoFrameskip-v4 --folder trained_agents/ -n 5000


Hyperparameter Optimization
---------------------------

We use `Optuna <https://optuna.org/>`_ for optimizing the hyperparameters.


Tune the hyperparameters for PPO2 using a random sampler and a median pruner, with 2 parallel jobs,
a budget of 1000 trials and a maximum of 50000 training steps:

::

    python train.py --algo ppo2 --env MountainCar-v0 -n 50000 -optimize --n-trials 1000 --n-jobs 2 \
      --sampler random --pruner median


Colab Notebook: Try it Online!
------------------------------

You can train agents online using this Google `Colab notebook <https://colab.research.google.com/drive/1cPGK3XrCqEs3QLqiijsfib9OFht3kObX>`_.


.. note::

    You can find more information about the RL baselines zoo in the repo
    `README <https://github.com/araffin/rl-baselines-zoo>`_, for instance, how to record a video of a trained agent.
2 changes: 1 addition & 1 deletion docs/index.rst
@@ -13,7 +13,7 @@ Github repository: https://github.com/hill-a/stable-baselines

RL Baselines Zoo (collection of pre-trained agents): https://github.com/araffin/rl-baselines-zoo

RL Baselines zoo also offers a simple interface to train and evaluate agents.
RL Baselines zoo also offers a simple interface to train and evaluate agents, and to tune hyperparameters.

You can read a detailed presentation of Stable Baselines in the
Medium article: `link <https://medium.com/@araffin/stable-baselines-a-fork-of-openai-baselines-reinforcement-learning-made-easy-df87c4b2fc82>`_
