Update doc: hyperparameter tuning for rl zoo (#330)
* Update doc: hyperparam tuning for rl zoo

* Add colab notebook link
araffin authored and hill-a committed May 29, 2019
1 parent 90ab67a commit e78a29d
Showing 5 changed files with 36 additions and 7 deletions.
7 changes: 3 additions & 4 deletions README.md
@@ -42,12 +42,11 @@ This toolset is a fork of OpenAI Baselines, with a major structural refactoring,

Documentation is available online: [https://stable-baselines.readthedocs.io/](https://stable-baselines.readthedocs.io/)

## RL Baselines Zoo: A Collection of 70+ Trained RL Agents
## RL Baselines Zoo: A Collection of 100+ Trained RL Agents

[RL Baselines Zoo](https://github.com/araffin/rl-baselines-zoo) is a collection of pre-trained Reinforcement Learning agents using
Stable-Baselines.
[RL Baselines Zoo](https://github.com/araffin/rl-baselines-zoo) is a collection of pre-trained Reinforcement Learning agents using Stable-Baselines.

It also provides basic scripts for training, evaluating agents and recording videos.
It also provides basic scripts for training, evaluating agents, tuning hyperparameters and recording videos.

Goals of this repository:

7 changes: 7 additions & 0 deletions docs/guide/custom_env.rst
@@ -7,6 +7,13 @@ To use the rl baselines with custom environments, they just need to follow the *
That is to say, your environment must implement the following methods (and inherit from the OpenAI Gym class):


.. note::

    If you are using images as input, the input values must be in [0, 255],
    as the observation is normalized (divided by 255 to have values in [0, 1])
    when using CNN policies.

.. code-block:: python

    import gym
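The code block in this hunk is truncated after the first import. As a rough sketch only (not the documentation's actual example), a custom environment following the gym interface could look like the following; the class name, spaces, and constant reward are placeholders:

```python
import gym
import numpy as np
from gym import spaces


class CustomEnv(gym.Env):
    """A minimal custom environment following the gym interface (sketch)."""

    def __init__(self):
        super(CustomEnv, self).__init__()
        # Two discrete actions; 64x64 RGB image observations.
        # Image values stay in [0, 255]: CNN policies normalize by 255 internally.
        self.action_space = spaces.Discrete(2)
        self.observation_space = spaces.Box(
            low=0, high=255, shape=(64, 64, 3), dtype=np.uint8)

    def reset(self):
        # Return the initial observation.
        return self.observation_space.sample()

    def step(self, action):
        # Apply the action; return (observation, reward, done, info).
        obs = self.observation_space.sample()
        return obs, 0.0, False, {}

    def render(self, mode='human'):
        pass
```

An instance of this class can then be wrapped in a `DummyVecEnv` and passed to any Stable-Baselines model constructor, like any built-in gym environment.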
2 changes: 2 additions & 0 deletions docs/guide/examples.rst
@@ -13,13 +13,15 @@ notebooks:
- `Monitor Training and Plotting`_
- `Atari Games`_
- `Breakout`_ (trained agent included)
- `RL Baselines zoo`_

.. _Getting Started: https://colab.research.google.com/drive/1_1H5bjWKYBVKbbs-Kj83dsfuZieDNcFU
.. _Training, Saving, Loading: https://colab.research.google.com/drive/1KoAQ1C_BNtGV3sVvZCnNZaER9rstmy0s
.. _Multiprocessing: https://colab.research.google.com/drive/1ZzNFMUUi923foaVsYb4YjPy4mjKtnOxb
.. _Monitor Training and Plotting: https://colab.research.google.com/drive/1L_IMo6v0a0ALK8nefZm6PqPSy0vZIWBT
.. _Atari Games: https://colab.research.google.com/drive/1iYK11yDzOOqnrXi1Sfjm1iekZr4cxLaN
.. _Breakout: https://colab.research.google.com/drive/14NwwEHwN4hdNgGzzySjxQhEVDff-zr7O
.. _RL Baselines zoo: https://colab.research.google.com/drive/1cPGK3XrCqEs3QLqiijsfib9OFht3kObX

.. |colab| image:: ../_static/img/colab.svg

25 changes: 23 additions & 2 deletions docs/guide/rl_zoo.rst
@@ -6,7 +6,7 @@ RL Baselines Zoo

`RL Baselines Zoo <https://github.com/araffin/rl-baselines-zoo>`_ is a collection of pre-trained Reinforcement Learning agents using
Stable-Baselines.
It also provides basic scripts for training, evaluating agents and recording videos.
It also provides basic scripts for training, evaluating agents, tuning hyperparameters and recording videos.

Goals of this repository:

@@ -22,7 +22,7 @@ Installation
::

    apt-get install swig cmake libopenmpi-dev zlib1g-dev ffmpeg
    pip install stable-baselines>=2.2.1 box2d box2d-kengz pyyaml pybullet==2.1.0 pytablewriter
    pip install stable-baselines box2d box2d-kengz pyyaml pybullet optuna pytablewriter

2. Clone the repository:

@@ -81,6 +81,27 @@ For example, enjoy A2C on Breakout for 5000 timesteps:
python enjoy.py --algo a2c --env BreakoutNoFrameskip-v4 --folder trained_agents/ -n 5000


Hyperparameter Optimization
---------------------------

We use `Optuna <https://optuna.org/>`_ for optimizing the hyperparameters.


Tune the hyperparameters for PPO2 using a random sampler and a median pruner, with 2 parallel jobs,
a budget of 1000 trials and a maximum of 50000 training steps:

::

    python train.py --algo ppo2 --env MountainCar-v0 -n 50000 -optimize --n-trials 1000 --n-jobs 2 \
      --sampler random --pruner median


Colab Notebook: Try it Online!
------------------------------

You can train agents online using this Google `Colab notebook <https://colab.research.google.com/drive/1cPGK3XrCqEs3QLqiijsfib9OFht3kObX>`_.


.. note::

    You can find more information about the RL baselines zoo in the repo
    `README <https://github.com/araffin/rl-baselines-zoo>`_, for instance, how to record a video of a trained agent.
2 changes: 1 addition & 1 deletion docs/index.rst
@@ -13,7 +13,7 @@ Github repository: https://github.com/hill-a/stable-baselines

RL Baselines Zoo (collection of pre-trained agents): https://github.com/araffin/rl-baselines-zoo

RL Baselines zoo also offers a simple interface to train and evaluate agents.
RL Baselines zoo also offers a simple interface to train and evaluate agents, and to tune hyperparameters.

You can read a detailed presentation of Stable Baselines in the
Medium article: `link <https://medium.com/@araffin/stable-baselines-a-fork-of-openai-baselines-reinforcement-learning-made-easy-df87c4b2fc82>`_
