# RL Baselines3 Zoo: Training in Colab



GitHub Repo: [https://github.com/DLR-RM/rl-baselines3-zoo](https://github.com/DLR-RM/rl-baselines3-zoo)

Stable-Baselines3 Repo: [https://github.com/DLR-RM/stable-baselines3](https://github.com/DLR-RM/stable-baselines3)


# Install Dependencies



In [1]:
# for autoformatting
# %load_ext jupyter_black

In [1]:
!apt-get update && apt-get install -y swig cmake ffmpeg freeglut3-dev xvfb

Get:1 http://deb.debian.org/debian buster InRelease [122 kB]
Get:2 http://deb.debian.org/debian-security buster/updates InRelease [34.8 kB]
Get:3 http://deb.debian.org/debian buster-updates InRelease [56.6 kB]
Get:4 http://deb.debian.org/debian buster/main amd64 Packages [7,909 kB]
Get:5 http://deb.debian.org/debian-security buster/updates/main amd64 Packages [494 kB]
Get:6 http://deb.debian.org/debian buster-updates/main amd64 Packages [8,788 B]
Fetched 8,625 kB in 5s (1,876 kB/s)




cmake is already the newest version (3.13.4-1).
The following additional packages will be installed:
  freeglut3 i965-va-driver intel-media-va-driver libaacs0 libaom0 libasound2
  libasound2-data libass9 libasyncns0 libavc1394-0 libavcodec58 libavdevice58
  libavfilter7 libavformat58 libavresample4 libavutil56 libbdplus0 libbluray2
  libbs2b0 libcaca0 libcap2 libcdio-cdda2 libcdio-paranoia2 libcdio18
  libchromaprint1 libcodec2-0.8.1 libcrystalhd3 libdc1394-22 libdrm-amdgpu1
  libdrm-common libdrm-dev li

## Clone RL Baselines3 Zoo Repo

In [2]:
!git clone https://github.com/DLR-RM/rl-baselines3-zoo

Cloning into 'rl-baselines3-zoo'...
remote: Enumerating objects: 5206, done.
remote: Counting objects: 100% (49/49), done.
remote: Compressing objects: 100% (39/39), done.
remote: Total 5206 (delta 17), reused 20 (delta 8), pack-reused 5157
Receiving objects: 100% (5206/5206), 3.77 MiB | 10.69 MiB/s, done.
Resolving deltas: 100% (3436/3436), done.
Checking out files: 100% (313/313), done.


In [3]:
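# /content is the working directory in Colab; in other environments, use the path where the repo was cloned
%cd /content/rl-baselines3-zoo/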

[Errno 2] No such file or directory: '/content/rl-baselines3-zoo/'
/


In [4]:
ls


rl-baselines3-zoo/


### Install pip dependencies

In [5]:
!pip install -r requirements.txt

ERROR: Could not open requirements file: [Errno 2] No such file or directory: 'requirements.txt'
You should consider upgrading via the '/root/venv/bin/python -m pip install --upgrade pip' command.

## Train an RL Agent


The trained agent is saved in the `logs/` folder.

Here we train A2C on the CartPole-v1 environment for 100,000 steps.


To train on Pong (Atari) instead, pass `--env PongNoFrameskip-v4`.

Note: To support a new environment, add its hyperparameters to `hyperparams/algo.yml` (where `algo` is the algorithm name). You can open the file from the side panel of Google Colab (see https://stackoverflow.com/questions/46986398/import-data-into-google-colaboratory).

In [None]:
!python -m rl_zoo3.train --algo a2c --env CartPole-v1 --n-timesteps 100000
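
To switch environments without retyping the command, the arguments can be assembled in Python. The sketch below is only illustrative: `build_train_command` is a hypothetical helper, not part of `rl_zoo3`, and it uses only the flags that appear in this notebook (`--algo`, `--env`, `--n-timesteps`).

```python
import sys


def build_train_command(algo, env_id, n_timesteps):
    """Hypothetical helper: assemble the rl_zoo3.train call as an argument list."""
    return [
        sys.executable, "-m", "rl_zoo3.train",
        "--algo", algo,
        "--env", env_id,
        "--n-timesteps", str(n_timesteps),
    ]


# CartPole run described in the text, then the Pong variant.
cartpole_cmd = build_train_command("a2c", "CartPole-v1", 100_000)
pong_cmd = build_train_command("a2c", "PongNoFrameskip-v4", 100_000)

# Show the command without the interpreter path.
print(" ".join(cartpole_cmd[1:]))
```

Passing one of these lists to `subprocess.run(..., check=True)` would launch the run from Python instead of a `!` shell cell.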

#### Evaluate trained agent


Remove the `--folder logs/` argument to evaluate a pretrained agent instead.

In [None]:
!python -m rl_zoo3.enjoy --algo a2c --env CartPole-v1 --no-render --n-timesteps 5000 --folder logs/

#### Tune Hyperparameters

We use [Optuna](https://optuna.org/) for optimizing the hyperparameters.

Tune the hyperparameters for PPO using a TPE sampler and a median pruner, with 2 parallel jobs,
a budget of 1000 trials, and a maximum of 50000 steps per trial.

In [None]:
!python -m rl_zoo3.train --algo ppo --env MountainCar-v0 -n 50000 -optimize --n-trials 1000 --n-jobs 2 --sampler tpe --pruner median
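
To give a feel for what a median pruner does, here is a simplified sketch of the median rule: a trial is stopped early when its intermediate result is worse than the median of earlier trials at the same step. This is not Optuna's implementation. The function name, the data layout, and the reward values are made up for illustration, and a "higher is better" objective is assumed.

```python
from statistics import median


def should_prune(value, step, completed_histories, n_startup_trials=5):
    """Simplified median rule (assumes higher values are better).

    completed_histories: list of dicts mapping step -> intermediate value,
    one dict per earlier trial (hypothetical layout for this sketch).
    """
    values_at_step = [h[step] for h in completed_histories if step in h]
    if len(values_at_step) < n_startup_trials:
        return False  # too few earlier trials to compare against
    return value < median(values_at_step)


# Hypothetical mean rewards reported by five earlier trials at evaluation step 2.
histories = [{2: r} for r in (-200.0, -180.0, -150.0, -120.0, -110.0)]

print(should_prune(-190.0, 2, histories))  # below the median of -150.0
print(should_prune(-130.0, 2, histories))  # above the median of -150.0
```

Pruning this way spends most of the step budget on promising trials, which matters when each trial can run for up to 50000 steps.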

### Record a Video

In [None]:
# Set up display; otherwise rendering will fail
import os
os.system("Xvfb :1 -screen 0 1024x768x24 &")
os.environ['DISPLAY'] = ':1'

In [None]:
!python -m rl_zoo3.record_video --algo a2c --env CartPole-v1 --exp-id 0 -f logs/ -n 1000

### Display the video

In [None]:
import base64
from pathlib import Path

from IPython import display as ipythondisplay


def show_videos(video_path="", prefix=""):
    """
    Taken from https://github.com/eleurent/highway-env

    :param video_path: (str) Path to the folder containing videos
    :param prefix: (str) Filter the videos, showing only the ones starting with this prefix
    """
    html = []
    for mp4 in Path(video_path).glob("{}*.mp4".format(prefix)):
        video_b64 = base64.b64encode(mp4.read_bytes())
        html.append(
            """<video alt="{}" autoplay 
                    loop controls style="height: 400px;">
                    <source src="data:video/mp4;base64,{}" type="video/mp4" />
                </video>""".format(
                mp4, video_b64.decode("ascii")
            )
        )
    ipythondisplay.display(ipythondisplay.HTML(data="<br>".join(html)))

In [None]:
show_videos(video_path='logs/a2c/CartPole-v1_1/videos/', prefix='')
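
The path above hard-codes the run folder `CartPole-v1_1`. If training has been launched more than once, the newest run may live in a different folder. The sketch below looks up the highest-numbered run. It assumes runs are stored as `logs/<algo>/<env>_<n>/`, as the path above suggests; `latest_run_dir` is a hypothetical helper, not part of `rl_zoo3`.

```python
import re
from pathlib import Path


def latest_run_dir(log_root, algo, env_id):
    """Return the highest-numbered `<env_id>_<n>` folder under `<log_root>/<algo>/`, or None."""
    pattern = re.compile(re.escape(env_id) + r"_(\d+)")
    runs = []
    for path in (Path(log_root) / algo).glob(env_id + "_*"):
        match = pattern.fullmatch(path.name)
        if path.is_dir() and match:
            runs.append((int(match.group(1)), path))
    if not runs:
        return None
    # Compare run numbers numerically so that _10 sorts after _2.
    return max(runs, key=lambda run: run[0])[1]


# Prints None when no matching run folder exists yet.
print(latest_run_dir("logs", "a2c", "CartPole-v1"))
```

The returned path with `/ "videos"` appended could then be passed to `show_videos` as `video_path`.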

### Continue Training

Here, we continue training the previous model.

In [None]:
!python -m rl_zoo3.train --algo a2c --env CartPole-v1 --n-timesteps 50000 -i logs/a2c/CartPole-v1_1/CartPole-v1.zip

In [None]:
!python -m rl_zoo3.enjoy --algo a2c --env CartPole-v1 --no-render --n-timesteps 1000 --folder logs/
