# RL Baselines3 Zoo: Training in Colab



Github Repo: [https://github.com/DLR-RM/rl-baselines3-zoo](https://github.com/DLR-RM/rl-baselines3-zoo)

Stable-Baselines3 Repo: [https://github.com/DLR-RM/stable-baselines3](https://github.com/DLR-RM/stable-baselines3)


# Install Dependencies



In [2]:
!apt-get install swig cmake ffmpeg freeglut3-dev xvfb

Reading package lists... Done
Building dependency tree       
Reading state information... Done
freeglut3-dev is already the newest version (2.8.1-3).
swig is already the newest version (3.0.12-1).
cmake is already the newest version (3.10.2-1ubuntu2.18.04.1).
ffmpeg is already the newest version (7:3.4.8-0ubuntu0.2).
xvfb is already the newest version (2:1.19.6-1ubuntu4.9).
0 upgraded, 0 newly installed, 0 to remove and 39 not upgraded.


## Clone RL Baselines3 Zoo Repo

In [3]:
!git clone --recursive https://github.com/DLR-RM/rl-baselines3-zoo

fatal: destination path 'rl-baselines3-zoo' already exists and is not an empty directory.


In [4]:
%cd /content/rl-baselines3-zoo/

/content/rl-baselines3-zoo


### Install pip dependencies

In [5]:
!pip install -r requirements.txt

Collecting sphinxcontrib.spelling; extra == "docs"
  Using cached https://files.pythonhosted.org/packages/50/9d/7fd15b645c7eec20c6fe85b392ae296ccd893fec8179645dc81d0bae4ad8/sphinxcontrib_spelling-7.2.1-py3-none-any.whl
Collecting docutils<0.17
  Downloading https://files.pythonhosted.org/packages/81/44/8a15e45ffa96e6cf82956dd8d7af9e666357e16b0d93b253903475ee947f/docutils-0.16-py2.py3-none-any.whl (548kB)
     |████████████████████████████████| 552kB 7.4MB/s 
ERROR: sphinxcontrib-spelling 7.2.1 has requirement Sphinx>=3.0.0, but you'll have sphinx 1.8.5 which is incompatible.
ERROR: sphinx-autodoc-typehints 1.12.0 has requirement Sphinx>=3.0, but you'll have sphinx 1.8.5 which is incompatible.
ERROR: datascience 0.10.6 has requirement coverage==3.7.1, but you'll have coverage 5.5 which is incompatible.
ERROR: datascience 0.10.6 has requirement folium==0.2.1, but you'll have folium 0.8.3 which is incompatible.
Installing collected packages: sp

## Train an RL Agent


The trained agent is saved in the `logs/` folder.

To train on Pong (Atari), I only need to pass `--env PongNoFrameskip-v4`.
I usually resume training from a previously trained model by passing its path with `-i`; `-n` overrides the number of training timesteps.

In [6]:
!python train.py --algo dqn --env PongNoFrameskip-v4 -i logs/dqn/PongNoFrameskip-v4_1/PongNoFrameskip-v4.zip -n 5000


Seed: 1074910586
Default hyperparameters for environment (ones being tuned will be overridden):
OrderedDict([('batch_size', 32),
             ('buffer_size', 10000),
             ('env_wrapper',
              ['stable_baselines3.common.atari_wrappers.AtariWrapper']),
             ('exploration_final_eps', 0.01),
             ('exploration_fraction', 0.1),
             ('frame_stack', 4),
             ('gradient_steps', 1),
             ('learning_rate', 0.0001),
             ('learning_starts', 100000),
             ('n_timesteps', 10000000.0),
             ('optimize_memory_usage', True),
             ('policy', 'CnnPolicy'),
             ('target_update_interval', 1000),
             ('train_freq', 4)])
Using 1 environments
Overwriting n_timesteps with n=5000
Creating test environment
Stacking 4 frames
Wrapping into a VecTransposeImage
Stacking 4 frames
Wrapping into a VecTransposeImage
Loading pretrained agent
Log path: logs/dqn/PongNoFrameskip-v4_1
---------------------------------
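The `exploration_fraction` and `exploration_final_eps` values printed above describe a linear epsilon-greedy decay. Below is a minimal sketch of such a schedule, assuming an initial epsilon of 1.0 (the log does not show it) and a decay that spans `exploration_fraction * n_timesteps` steps. `linear_epsilon` is a hypothetical helper written for illustration, not a function from the zoo.

```python
def linear_epsilon(step, total_timesteps, exploration_fraction=0.1,
                   initial_eps=1.0, final_eps=0.01):
    """Epsilon after `step` steps of a linear decay (illustrative helper)."""
    decay_steps = exploration_fraction * total_timesteps
    if decay_steps <= 0 or step >= decay_steps:
        # Decay finished: epsilon stays at its final value
        return final_eps
    progress = step / decay_steps
    return initial_eps + progress * (final_eps - initial_eps)


# With the overridden budget of 5000 steps, the decay would last 500 steps.
for step in (0, 250, 500, 5000):
    print(step, round(linear_epsilon(step, total_timesteps=5000), 4))
```

Under these assumptions, a short `-n 5000` run would spend only its first 500 steps exploring heavily, compared with one million steps for the default budget of 10,000,000.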

### Evaluate trained agent


Remove `--folder logs/` to evaluate a pretrained agent instead.

In [7]:
!python enjoy.py --algo dqn --env PongNoFrameskip-v4 --no-render --n-timesteps 1000 --folder logs/

Loading latest experiment, id=3
Loading logs/dqn/PongNoFrameskip-v4_3/PongNoFrameskip-v4.zip
Stacking 4 frames
Wrapping the env in a VecTransposeImage.
Atari Episode Score: -21.00
Atari Episode Length 3056
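The log above loads the latest experiment (`PongNoFrameskip-v4_3`), while the training command earlier resumed from `PongNoFrameskip-v4_1`. To resume from the most recent run, the highest-numbered folder can be looked up first. This is a rough sketch that assumes the `logs/<algo>/<env>_<id>/` layout seen in the paths above; `latest_run_dir` is a hypothetical helper, not part of the zoo.

```python
import re
from pathlib import Path


def latest_run_dir(log_root, algo, env_id):
    """Return the highest-numbered `<env_id>_<n>` folder under `<log_root>/<algo>`, or None."""
    algo_dir = Path(log_root) / algo
    if not algo_dir.is_dir():
        return None
    pattern = re.compile(re.escape(env_id) + r"_(\d+)$")
    best_id, best_dir = -1, None
    for child in algo_dir.iterdir():
        match = pattern.match(child.name)
        if child.is_dir() and match and int(match.group(1)) > best_id:
            best_id, best_dir = int(match.group(1)), child
    return best_dir


run_dir = latest_run_dir("logs", "dqn", "PongNoFrameskip-v4")
if run_dir is not None:
    # The model file name follows the path printed in the log above
    print(run_dir / "PongNoFrameskip-v4.zip")
```

The printed path can then be passed to `train.py` through `-i`. Run ids are compared as integers so that `_10` sorts after `_9`.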


### Record a Video

In [8]:
# Set up display; otherwise rendering will fail
import os
os.system("Xvfb :1 -screen 0 1024x768x24 &")
os.environ['DISPLAY'] = ':1'
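The cell above sets `DISPLAY` even if `Xvfb` failed to start, which can make a later rendering error harder to trace. A slightly more defensive variant is sketched below, assuming `Xvfb` was installed by the `apt-get` step; `start_virtual_display` is a hypothetical helper name.

```python
import os
import shutil
import subprocess


def start_virtual_display(display=":1", binary="Xvfb"):
    """Start a virtual X server if `binary` is installed; return the process or None."""
    if shutil.which(binary) is None:
        # Binary not found: leave DISPLAY untouched and signal failure
        return None
    process = subprocess.Popen(
        [binary, display, "-screen", "0", "1024x768x24"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # Point subsequent rendering calls at the virtual display
    os.environ["DISPLAY"] = display
    return process
```

Calling `start_virtual_display()` returns `None` when the binary is missing, so the notebook can stop early instead of failing later during recording.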

In [9]:
!python -m utils.record_video --algo dqn --env PongNoFrameskip-v4 --exp-id 0 -f logs/ -n 1000

Loading latest experiment, id=1
Stacking 4 frames
Saving video to /content/rl-baselines3-zoo/logs/dqn/PongNoFrameskip-v4_1/videos/final-model-dqn-PongNoFrameskip-v4-step-0-to-step-10000.mp4


### Display the video

In [None]:
import base64
from pathlib import Path

from IPython import display as ipythondisplay

def show_videos(video_path='', prefix=''):
    """
    Taken from https://github.com/eleurent/highway-env

    :param video_path: (str) Path to the folder containing videos
    :param prefix: (str) Filter the videos, showing only the ones starting with this prefix
    """
    html = []
    for mp4 in Path(video_path).glob("{}*.mp4".format(prefix)):
        # Embed each video inline as a base64 data URI
        video_b64 = base64.b64encode(mp4.read_bytes())
        html.append('''<video alt="{}" autoplay
                    loop controls style="height: 400px;">
                    <source src="data:video/mp4;base64,{}" type="video/mp4" />
                </video>'''.format(mp4, video_b64.decode('ascii')))
    ipythondisplay.display(ipythondisplay.HTML(data="<br>".join(html)))

In [None]:
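# Folder and prefix match the file reported by the recording step above
show_videos(video_path='logs/dqn/PongNoFrameskip-v4_1/videos/', prefix='final-model-dqn')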
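If nothing is displayed, the folder or prefix probably does not match where the recording was written; the recording log above reports the file under `logs/dqn/PongNoFrameskip-v4_1/videos/`. A recursive search is a quick way to check, sketched here with a hypothetical `find_videos` helper that assumes all recordings live somewhere under `logs/`.

```python
from pathlib import Path


def find_videos(root="logs", pattern="*.mp4"):
    """Map each folder under `root` that holds matching videos to its file names."""
    root_path = Path(root)
    if not root_path.is_dir():
        return {}
    found = {}
    for mp4 in sorted(root_path.rglob(pattern)):
        found.setdefault(str(mp4.parent), []).append(mp4.name)
    return found


for folder, names in find_videos("logs").items():
    print(folder, names)
```

Any folder printed here can be passed as `video_path`, with the start of the file name as `prefix`.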