<a href="https://colab.research.google.com/github/decoderkurt/HUF_RL_2022/blob/main/19/atari_games.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Stable Baselines3 - Train on Atari Games

GitHub repo: [https://github.com/DLR-RM/stable-baselines3](https://github.com/DLR-RM/stable-baselines3)


[RL Baselines3 Zoo](https://github.com/DLR-RM/rl-baselines3-zoo) is a collection of pre-trained Reinforcement Learning agents using Stable-Baselines3.

It also provides basic scripts for training, evaluating agents, tuning hyperparameters and recording videos.

Documentation is available online: [https://stable-baselines3.readthedocs.io/](https://stable-baselines3.readthedocs.io/)

## Install Dependencies and Stable Baselines Using Pip


```
pip install stable-baselines3[extra]
```

In [8]:
!pip install stable-baselines3[extra]



## Import the RL Agent and Environment Helpers

In [9]:
from stable_baselines3 import A2C
from stable_baselines3.common.env_util import make_atari_env
from stable_baselines3.common.vec_env import VecFrameStack

## Training on Atari

We will use the Atari wrapper, which downsamples each frame and converts it to grayscale.

About Atari preprocessing: [Frame Skipping and Pre-Processing for Deep Q-Networks on Atari 2600 Games](https://danieltakeshi.github.io/2016/11/25/frame-skipping-and-preprocessing-for-deep-q-networks-on-atari-2600-games/)

![Pong](https://cdn-images-1.medium.com/max/800/1*UHYJE7lF8IDZS_U5SsAFUQ.gif)
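To make the two preprocessing steps concrete, here is a minimal NumPy sketch of grayscale conversion followed by downsampling. It is a stand-in for illustration only, not the wrapper's actual implementation: the function name `preprocess_frame`, the luminance weights, the nearest-neighbour resampling, and the random test frame are all assumptions made for this example. The 210 x 160 input and 84 x 84 output sizes follow the usual Atari convention.

```python
import numpy as np


def preprocess_frame(frame: np.ndarray, size: int = 84) -> np.ndarray:
    """Convert an RGB frame of shape (H, W, 3) to a (size, size) uint8 grayscale image."""
    # Luminance-weighted sum of the R, G, B channels (weights are an assumption).
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = frame.astype(np.float32) @ weights
    # Nearest-neighbour downsampling: pick `size` evenly spaced rows and columns.
    rows = np.linspace(0, gray.shape[0] - 1, size).round().astype(int)
    cols = np.linspace(0, gray.shape[1] - 1, size).round().astype(int)
    small = gray[np.ix_(rows, cols)]
    return small.round().clip(0, 255).astype(np.uint8)


# A random stand-in for a raw Atari frame (210 x 160 RGB).
rng = np.random.default_rng(0)
raw_frame = rng.integers(0, 256, size=(210, 160, 3), dtype=np.uint8)
obs = preprocess_frame(raw_frame)
print(obs.shape, obs.dtype)
```

The result has roughly 14 times fewer values than the raw frame, which is the main reason this preprocessing speeds up training.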

In [13]:
# Download the Atari ROM archive, extract it, and register the ROMs with atari_py.
! wget http://www.atarimania.com/roms/Roms.rar
! mkdir -p /content/ROM/
! unrar e /content/Roms.rar /content/ROM/
! python -m atari_py.import_roms /content/ROM/

# make_atari_env creates the environments and applies the standard Atari wrappers.
env = make_atari_env('PongNoFrameskip-v4', n_envs=4, seed=0)
# Stack the 4 most recent frames so the agent can infer motion
env = VecFrameStack(env, n_stack=4)

--2022-01-17 23:25:35--  http://www.atarimania.com/roms/Roms.rar
Resolving www.atarimania.com (www.atarimania.com)... 195.154.81.199
Connecting to www.atarimania.com (www.atarimania.com)|195.154.81.199|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 11128004 (11M) [application/x-rar-compressed]
Saving to: ‘Roms.rar’


2022-01-17 23:25:49 (809 KB/s) - ‘Roms.rar’ saved [11128004/11128004]


UNRAR 5.50 freeware      Copyright (c) 1993-2017 Alexander Roshal


Extracting from /content/Roms.rar

Extracting  /content/ROM/HC ROMS.zip                                      36%  OK 
Extracting  /content/ROM/ROMS.zip                                         74% 99%  OK 
All OK
copying adventure.bin from ROMS/Adventure (1980) (Atari, Warren Robinett) (CX2613, CX2613P) (PAL).bin to /usr/local/lib/python3.7/dist-packages/atari_py/atari_roms/adventure.bin
copying air_raid.bin from ROMS/Air Raid (Men-A-Vision) (PAL) ~.bin to /usr/local/lib/python3.7/dist-pac
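
A single frame does not show which way the ball is moving, which is why the cell above stacks four frames with `VecFrameStack(env, n_stack=4)`. The sketch below illustrates the idea for one environment with NumPy. The `FrameStack` class is hypothetical and written only for this example. The real wrapper also handles batches of environments and episode resets, which are omitted here.

```python
import numpy as np


class FrameStack:
    """Keep the most recent `n_stack` frames, stacked along the last axis."""

    def __init__(self, frame_shape, n_stack=4):
        self.n_stack = n_stack
        # Start with an all-zero history.
        self.stacked = np.zeros((*frame_shape, n_stack), dtype=np.uint8)

    def push(self, frame):
        # Shift older frames one slot toward index 0 (the oldest wraps to the end).
        self.stacked = np.roll(self.stacked, shift=-1, axis=-1)
        # Overwrite the last slot with the newest frame.
        self.stacked[..., -1] = frame
        return self.stacked


stack = FrameStack((84, 84), n_stack=4)
# Push six constant-valued frames so the history is easy to read.
for t in range(1, 7):
    obs = stack.push(np.full((84, 84), t, dtype=np.uint8))

print(obs.shape)   # one observation now carries four frames
print(obs[0, 0])   # the four most recent frame values at pixel (0, 0)
```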

In [14]:
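# CnnPolicy uses a convolutional network, which suits image observations.
model = A2C('CnnPolicy', env, verbose=1)
# Train for 10,000 environment timesteps.
model.learn(total_timesteps=10000)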

Using cuda device
Wrapping the env in a VecTransposeImage.
------------------------------------
| time/                 |          |
|    fps                | 391      |
|    iterations         | 100      |
|    time_elapsed       | 5        |
|    total_timesteps    | 2000     |
| train/                |          |
|    entropy_loss       | -1.4     |
|    explained_variance | 0.011    |
|    learning_rate      | 0.0007   |
|    n_updates          | 99       |
|    policy_loss        | -0.141   |
|    value_loss         | 0.173    |
------------------------------------
------------------------------------
| rollout/              |          |
|    ep_len_mean        | 3.41e+03 |
|    ep_rew_mean        | -20.8    |
| time/                 |          |
|    fps                | 407      |
|    iterations         | 200      |
|    time_elapsed       | 9        |
|    total_timesteps    | 4000     |
| train/                |          |
|    entropy_loss       | -1.76    |
|    explained_v

KeyboardInterrupt: ignored
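
The `iterations` and `total_timesteps` columns in the log above are linked by simple arithmetic: each iteration collects a short rollout from every parallel environment. The sketch below reproduces the numbers, assuming A2C's default rollout length of 5 steps per environment (`n_steps` is not set explicitly in this notebook, so that value is an assumption). The helper `iterations_needed` is hypothetical.

```python
n_envs = 4    # from make_atari_env(..., n_envs=4)
n_steps = 5   # assumed: A2C's default rollout length per environment

# Timesteps collected across all environments in one iteration.
steps_per_iteration = n_envs * n_steps


def iterations_needed(total_timesteps: int) -> int:
    """Iterations required to reach `total_timesteps`, rounded up."""
    return -(-total_timesteps // steps_per_iteration)


print(steps_per_iteration)          # timesteps per iteration
print(100 * steps_per_iteration)    # compare with total_timesteps at iterations=100
print(iterations_needed(10_000))    # iterations for the full learn() call
```

Under that assumption, 100 iterations correspond to the 2000 timesteps shown in the first log table, and the full 10,000-timestep run would take 500 iterations.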

## Download / Upload Trained Agent and Continue Training

Save and download the trained model:

In [15]:
from google.colab import files

In [16]:
model.save("a2c_pong")
files.download("a2c_pong.zip")


Upload the trained agent from your local machine:

In [None]:
files.upload()

In [None]:
!du -h a2c*

7.0M	a2c_pong.zip
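
Because `model.save` writes a regular zip archive, its contents can be inspected with the standard library. The following is a small sketch. The helper `describe_archive` is a hypothetical name introduced here, and the entries it lists depend on the installed Stable-Baselines3 version, so none are assumed.

```python
import os
import zipfile
from typing import List, Tuple


def describe_archive(path: str) -> List[Tuple[str, int]]:
    """Return (entry name, uncompressed size in bytes) for each file in a zip archive."""
    with zipfile.ZipFile(path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


# Only inspect the checkpoint if it exists in the working directory.
if os.path.exists("a2c_pong.zip"):
    for name, size in describe_archive("a2c_pong.zip"):
        print(f"{name}: {size / 1e6:.2f} MB")
```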


Load the agent and attach a freshly created environment; you can then continue training:

In [None]:
trained_model = A2C.load("a2c_pong", verbose=1)
env = make_atari_env('PongNoFrameskip-v4', n_envs=4, seed=0)
env = VecFrameStack(env, n_stack=4)
trained_model.set_env(env)

Wrapping the env in a VecTransposeImage.


In [None]:
# Continue training for another 500,000 timesteps.
trained_model.learn(total_timesteps=int(0.5e6))

In [None]:
trained_model.save("a2c_pong_2")
files.download("a2c_pong_2.zip")