<a href="https://colab.research.google.com/github/xaviercallens/lab/blob/master/amadeus_student_get_started_with_reinforcement_leanning_openai_gym.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Get started with OpenAI Gym
> Learn how to use OpenAI Gym and load an environment to test Reinforcement Learning strategies.

- toc: false
- badges: true
- comments: true
- author: dzlab
- categories: [tensorflow, reinforcement]


This article shows how to get started quickly with [OpenAI Gym](https://github.com/openai/gym), a toolkit of environments for training and evaluating RL agents. Later, we will use Gym to test intelligent agents implemented with TensorFlow.

To fully install OpenAI Gym and use it in a notebook environment such as [Google Colaboratory](https://colab.research.google.com/), we need to install a set of dependencies:

- [xvfb](https://en.wikipedia.org/wiki/Xvfb): a virtual X11 display server that lets us render Gym environments in a notebook
- [gym (atari)](https://github.com/openai/gym): the Gym environments for Arcade games
- [atari-py](https://github.com/openai/atari-py): an interface to the Arcade Learning Environment. We will use it to load Atari game ROMs into Gym
- [gym-notebook-wrapper](https://github.com/ymd-h/gym-notebook-wrapper): a rendering helper that we will use to display OpenAI Gym games in a notebook

> Note: atari-py is deprecated and has been replaced by [ale-py](https://github.com/mgbellemare/Arcade-Learning-Environment). However, we can still use it.
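
Because either Atari backend may be present, it can help to check which one is importable before going further. The following is a minimal sketch using only the standard library; the helper name `available_atari_backends` is hypothetical, and it assumes the packages are importable as `ale_py` and `atari_py`.

```python
import importlib.util


def available_atari_backends(candidates=("ale_py", "atari_py")):
    """Return a dict mapping each module name to True if it can be found."""
    # find_spec returns None when a top-level module is not installed
    return {name: importlib.util.find_spec(name) is not None for name in candidates}


print(available_atari_backends())
```

Running this after the installation cell shows which of the two packages the notebook can actually import.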

In [None]:
%%capture
%%bash

apt-get install -y xvfb
pip install gym[atari]
pip install gym-notebook-wrapper
pip install atari-py

After installation we can check that Gym was installed properly and list the first ten available environment names in alphabetical order:

In [None]:
from gym import envs
# In recent Gym releases the registry is a dict keyed by environment ID
# (older releases exposed the entries through envs.registry.all())
env_names = list(envs.registry.keys())
# Sort all names first, then keep the first ten
for name in sorted(env_names)[:10]:
    print(name)

ALE/Adventure-ram-v5
ALE/Adventure-v5
ALE/AirRaid-ram-v5
ALE/AirRaid-v5
ALE/Alien-ram-v5
ALE/Alien-v5
ALE/Amidar-ram-v5
ALE/Amidar-v5
ALE/Assault-ram-v5
ALE/Assault-v5
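
The names in this listing follow a regular shape: an optional namespace (`ALE/`), a game name, an optional `-ram` marker, and a version suffix (`-v5`). As a rough sketch, the IDs can be split into those parts and grouped by game. The helpers `parse_env_id` and `group_by_game` are hypothetical, the regular expression is an assumption inferred from the listing rather than Gym's own parser, and treating `-ram` as a variant of the same game is likewise an assumption.

```python
import re
from collections import defaultdict

# Assumed ID shape: optional "namespace/", then a name, then "-v<number>"
ENV_ID_PATTERN = re.compile(
    r"^(?:(?P<namespace>[\w:-]+)/)?(?P<name>[\w:.-]+?)-v(?P<version>\d+)$"
)


def parse_env_id(env_id):
    """Split an environment ID into (namespace, name, version)."""
    match = ENV_ID_PATTERN.match(env_id)
    if match is None:
        raise ValueError(f"Unrecognized environment ID: {env_id!r}")
    return match.group("namespace"), match.group("name"), int(match.group("version"))


def group_by_game(env_ids):
    """Group IDs by game name, ignoring the '-ram' observation variant."""
    games = defaultdict(list)
    for env_id in env_ids:
        _, name, _ = parse_env_id(env_id)
        base = name[:-4] if name.endswith("-ram") else name
        games[base].append(env_id)
    return dict(games)


print(group_by_game(["ALE/Alien-ram-v5", "ALE/Alien-v5", "ALE/Amidar-v5"]))
```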


Next, we need to install the Atari ROMs so that we can load those games into Gym:
1. Download the [Roms.rar](http://www.atarimania.com/roms/Roms.rar) archive that contains the games
2. Import the ROMs to make them accessible to Gym

In [None]:
%%capture
%%bash

curl -O http://www.atarimania.com/roms/Roms.rar
mkdir roms
yes | unrar e Roms.rar roms/
python -m atari_py.import_roms roms/

Now we are ready to play with Gym using one of the available games (e.g. Alien-v4). We start the display server, then repeatedly execute a sampled action for our agent and check the result. If the agent dies, we start a new episode.

In [None]:
!pip install gym[accept-rom-license]

Collecting autorom~=0.4.2 (from autorom[accept-rom-license]~=0.4.2; extra == "accept-rom-license"->gym[accept-rom-license])
  Downloading AutoROM-0.4.2-py3-none-any.whl.metadata (2.8 kB)
Collecting AutoROM.accept-rom-license (from autorom[accept-rom-license]~=0.4.2; extra == "accept-rom-license"->gym[accept-rom-license])
  Downloading AutoROM.accept-rom-license-0.6.1.tar.gz (434 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 434.7/434.7 kB 22.2 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Downloading AutoROM-0.4.2-py3-none-any.whl (16 kB)
Building wheels for collected packages: AutoROM.accept-rom-license
  Building wheel for AutoROM.accept-rom-license (pyproject.toml) ... done
  Created wheel for AutoROM.accept-rom-license: filename=AutoROM.accept_rom_license-0.6.1-py3-none-any.whl siz

In [None]:
!ale-import-roms roms/




Imported 0 / 0 ROMs
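
A result of `Imported 0 / 0 ROMs` suggests that the importer found no ROM files in `roms/`, for example because the directory was empty or missing when the command ran. One way to check is to count the candidate files before importing. This is only a sketch: `count_rom_files` is a hypothetical helper, and the `.bin` suffix is an assumption about how the ROM files are named (the downloaded archive may contain nested archives instead).

```python
from pathlib import Path


def count_rom_files(directory, suffixes=(".bin",)):
    """Count files under `directory` whose suffix is in `suffixes` (case-insensitive)."""
    root = Path(directory)
    if not root.is_dir():
        # A missing directory simply means there is nothing to import
        return 0
    return sum(
        1 for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in suffixes
    )


print(count_rom_files("roms"))
```

If the count is zero, it is worth inspecting the download and extraction step before running the import again.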


In [None]:
%%capture
%%bash

apt-get install -y xvfb
pip install gym[atari]
pip install gym-notebook-wrapper
pip install atari-py
pip install gym[accept-rom-license] # Install ROMs with license acceptance

In [None]:
%%bash

rm -rf game/*
mkdir -p game

In [None]:
%%capture
%%bash

curl -O http://www.atarimania.com/roms/Roms.rar
mkdir -p roms
yes | unrar e Roms.rar roms/
ale-import-roms roms/ # Import ROMs using ale-import-roms

In [None]:
import gnwrapper
import gym

# Wrap the environment so that episodes are recorded as videos in ./game;
# render_mode='rgb_array' makes the environment produce frames that can be recorded
env = gnwrapper.Monitor(gym.make('Alien-v4', render_mode='rgb_array'), directory="./game")

# Start the first episode (the return value is not needed here)
env.reset()

# Take 1000 actions by randomly sampling from the action space
for _ in range(1000):
    action = env.action_space.sample()
    # Recent Gym releases return five values from step()
    observation, reward, terminated, truncated, info = env.step(action)
    # Combine terminated and truncated into a single done flag
    done = terminated or truncated
    if done:
        # The episode is over, so start a new one
        env.reset()

# Display the recorded episodes as videos
env.display()

Moviepy - Building video /content/game/rl-video-episode-0.mp4.
Moviepy - Writing video /content/game/rl-video-episode-0.mp4



                                                  

Moviepy - Done !
Moviepy - video ready /content/game/rl-video-episode-0.mp4




Moviepy - Building video /content/game/rl-video-episode-1.mp4.
Moviepy - Writing video /content/game/rl-video-episode-1.mp4





Moviepy - Done !
Moviepy - video ready /content/game/rl-video-episode-1.mp4


'rl-video-episode-0.mp4'

'rl-video-episode-1.mp4'

> Notice that more than one video is displayed. This is because when an episode finishes (i.e. the agent dies) we reset the environment with `env.reset()` to start a new episode, so each video corresponds to one episode of the game.
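
The link between `done` flags and the number of episodes can be sketched without Gym. The hypothetical helper `count_episodes` takes one `done` flag per step and counts how many episodes a run touched. It assumes that a partially played final episode also counts; whether such an episode is saved as a video depends on the recorder.

```python
def count_episodes(done_flags):
    """Count episodes touched by a run, given one done flag per step."""
    finished = sum(1 for done in done_flags if done)
    # A trailing partial episode exists if the run did not stop on a done step
    has_partial = bool(done_flags) and not done_flags[-1]
    return finished + (1 if has_partial else 0)


# Two episodes finish (steps 3 and 5) and a third one has just started
flags = [False, False, True, False, True, False]
print(count_episodes(flags))  # 3
```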

The following explains the variables returned by `env.step(action)` in the previous script:

- `observation` (Object): Observation returned by the environment. The object could be the RGB pixel data from the screen/camera, RAM contents, joint angles and joint velocities of a robot, and so on, depending on the environment.
- `reward` (Float): Reward for the previous action that was sent to the environment. The range of the Float value varies with each environment, but irrespective of the environment, a higher reward is always better and the goal of the agent should be to maximize the total reward.
- `done` (Boolean): Indicates whether the environment should be reset before the next step. In the script above it is computed as `terminated or truncated`: `terminated` means the episode ended inside the game (for example, the agent lost its last life), while `truncated` means it was cut off from outside (for example, by a time limit).
- `info` (Dict): Some additional information that can optionally be sent out by an environment as a dictionary of arbitrary key-value pairs. The agent we develop should not rely on any of the information in this dictionary for taking action. It may be used (if available) for debugging purposes.
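
To see these values without installing any game, here is a small sketch of a made-up environment that follows the same five-value `step` convention. `CoinFlipEnv` and `run_episode` are hypothetical names used only for illustration; they are not part of Gym. The agent guesses a coin flip, a wrong guess ends the episode, and a step limit truncates it.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment using the five-value step convention."""

    def __init__(self, max_steps=10, seed=None):
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.steps = 0

    def reset(self):
        self.steps = 0
        return 0, {}  # observation, info

    def step(self, action):
        self.steps += 1
        coin = self.rng.randint(0, 1)
        reward = 1.0 if action == coin else 0.0
        terminated = action != coin               # a wrong guess ends the episode
        truncated = self.steps >= self.max_steps  # the step limit cuts it off
        info = {"coin": coin}                     # debugging data only
        return coin, reward, terminated, truncated, info


def run_episode(env, policy):
    """Play one episode and return (total reward, terminated, truncated)."""
    observation, info = env.reset()
    total = 0.0
    while True:
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total += reward
        if terminated or truncated:
            return total, terminated, truncated


print(run_episode(CoinFlipEnv(seed=0), lambda obs: random.randint(0, 1)))
```

The policy here ignores `info` on purpose, in line with the advice above that an agent should not depend on it.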

Here are some links I found useful:
- Run and Render OpenAI Gym on Google Colab (Gym-Notebook-Wrapper) - [link](https://ymd_h.gitlab.io/ymd_blog/posts/gym_on_google_colab_with_gnwrapper/)
- T81-558: Applications of Deep Neural Networks - [link](https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_12_01_ai_gym.ipynb)

I hope you enjoyed this article. Feel free to leave a comment or reach out on Twitter [@bachiirc](https://twitter.com/bachiirc)