![](imgs/gym.png)
# Introduction to OpenAI Gym

[Gym](https://gym.openai.com/) is a toolkit for developing and comparing reinforcement learning algorithms. It supports teaching agents everything from walking to playing games like Pong or Pinball. It makes no assumptions about the structure of your agent, and is compatible with any numerical computation library, such as TensorFlow or Theano.

## Installation

First, we need to install `gym` on our local machine. The simplest way is with pip.

In [None]:
!pip install gym

## Gym Environments

OpenAI Gym provides a diverse suite of simulation environments, ranging from easy to challenging tasks, that we can use to develop and test our reinforcement learning algorithms. These include **Classic control**, **Atari**, **MuJoCo**, **Robotics** and many more. You can find out more about gym environments [here](https://gym.openai.com/envs).

In this course, we will focus on **Toy text**, **Classic control** and **Atari** environments.

### Classic control and Toy text

These environments include small-scale tasks, mostly from the RL literature.

<img src="imgs/mountain_car.gif" width="400" align="left">

### Atari

These include classic Atari games, which have had a big impact on reinforcement learning research.

<img src="imgs/breakout.gif" width="200" align="left">

## Creating your first environment

We can create an environment object by calling `gym.make("env_name")`, where `"env_name"` is the name of the environment we want to create. The names of all available environments can be found [here]().

In [43]:
# importing openai gym library
import gym
import numpy as np

# create the classic mountain-car env
env = gym.make('MountainCar-v0')

In [46]:
# reset/initialize the env
env.reset()

for _ in range(200):
    env.step(env.action_space.sample()) # take a random action
    env.render() # render the environment

# close the rendering
env.close()

We interact with the environment through two main methods:

1. `env.reset()`
2. `env.step(action)`

> `obs = env.reset()` initializes the environment and returns the initial observation (or state). We will learn more about gym observations later.

> `obs_next, reward, done, info = env.step(action)` applies `action` to the environment and returns four values: `obs_next` (the next observation, an element of `env.observation_space`), `reward` (a float), `done` (a bool that is `True` once the episode has ended) and `info` (a dict of auxiliary diagnostic information).
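The `reset`/`step` contract described above can be sketched without gym itself. The example below is only an illustration: `CoinFlipEnv` and `run_episode` are hypothetical names invented here, not part of the gym library. It shows the usual episode loop, which keeps calling `step` until `done` becomes `True`.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment that mimics the reset/step contract."""

    def __init__(self, max_steps=5, seed=0):
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.steps = 0

    def reset(self):
        # Start a new episode and return the initial observation.
        self.steps = 0
        return 0

    def step(self, action):
        # Flip a coin; the reward is 1.0 when the action matches the coin.
        coin = self.rng.randint(0, 1)
        reward = 1.0 if action == coin else 0.0
        self.steps += 1
        done = self.steps >= self.max_steps  # episode ends after max_steps
        info = {"coin": coin}                # auxiliary diagnostic data
        return coin, reward, done, info


def run_episode(env, policy):
    """Run one episode and return (total_reward, number_of_steps)."""
    obs = env.reset()
    total_reward, n_steps, done = 0.0, 0, False
    while not done:
        action = policy(obs)
        obs, reward, done, info = env.step(action)
        total_reward += reward
        n_steps += 1
    return total_reward, n_steps


env = CoinFlipEnv(max_steps=5)
total_reward, n_steps = run_episode(env, policy=lambda obs: 0)
print(total_reward, n_steps)
```

The same loop shape should carry over to a real gym environment that returns the four values listed above, with `policy` replaced by, for example, `lambda obs: env.action_space.sample()`.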

## Spaces

Every gym environment comes with an `action_space` and an `observation_space`. The formats of the actions and observations of an environment are defined by `env.action_space` and `env.observation_space` respectively, both of which are of type `Space`.

Types of gym `spaces`:

- `gym.spaces.Discrete(n)`: a fixed range of non-negative integers from 0 to n-1.
- `gym.spaces.Box`: an n-dimensional box, where the lower and upper bounds of each dimension are given by `Box.low` and `Box.high`.
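To make the two definitions concrete, here is a minimal pure-Python sketch of what "sampling from" and "belonging to" each kind of space means. `ToyDiscrete` and `ToyBox` are hypothetical stand-ins written for this illustration; they are not gym's implementation and skip details such as dtypes and unbounded dimensions.

```python
import random


class ToyDiscrete:
    """Hypothetical stand-in for Discrete(n): the integers 0, 1, ..., n-1."""

    def __init__(self, n):
        self.n = n

    def sample(self, rng=random):
        # Pick one of the n integers uniformly at random.
        return rng.randrange(self.n)

    def contains(self, x):
        return isinstance(x, int) and 0 <= x < self.n


class ToyBox:
    """Hypothetical stand-in for a bounded Box with per-dimension limits."""

    def __init__(self, low, high):
        assert len(low) == len(high), "low and high need the same length"
        self.low = [float(v) for v in low]
        self.high = [float(v) for v in high]

    def sample(self, rng=random):
        # Draw each coordinate uniformly between its own bounds.
        return [rng.uniform(lo, hi) for lo, hi in zip(self.low, self.high)]

    def contains(self, x):
        return len(x) == len(self.low) and all(
            lo <= v <= hi for lo, hi, v in zip(self.low, self.high, x)
        )


discrete = ToyDiscrete(8)
box = ToyBox(low=[0, 0, -1], high=[1, 1, 1])
print(discrete.sample(), discrete.contains(8))
print(box.sample(), box.contains([0.5, 0.5, -2.0]))
```

The key difference is that a discrete space enumerates a finite set of choices, while a box describes a continuous region with separate bounds for each dimension.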

Let's explore these two spaces.

In [12]:
# import spaces module from gym
from gym import spaces

In [14]:
space = spaces.Discrete(8) # Set with 8 elements {0, 1, 2, ..., 7}

In [15]:
space

Discrete(8)

In [16]:
space.sample()

4

In [27]:
low_value = np.array([0,0,-1])
high_value = np.array([1,1,1])

box = spaces.Box(low_value, high_value)

In [28]:
box

Box(-1.0, 1.0, (3,), float32)

In [29]:
box.low

array([ 0.,  0., -1.], dtype=float32)

In [30]:
box.high

array([1., 1., 1.], dtype=float32)

We can now check the action and observation spaces of the cart-pole environment. You can find out more about the spaces of the cart-pole environment [here](https://github.com/openai/gym/blob/master/gym/envs/classic_control/cartpole.py#L26).

In [31]:
env = gym.make('CartPole-v0')

print(env.action_space)

print(env.observation_space)

Discrete(2)
Box(-3.4028234663852886e+38, 3.4028234663852886e+38, (4,), float32)


In [33]:
env.observation_space.low

array([-4.8000002e+00, -3.4028235e+38, -4.1887903e-01, -3.4028235e+38],
      dtype=float32)

In [34]:
env.observation_space.high

array([4.8000002e+00, 3.4028235e+38, 4.1887903e-01, 3.4028235e+38],
      dtype=float32)
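The numbers in these bounds are easier to read once we see where they come from. The sketch below uses only the standard library and rests on one assumption taken from the linked `cartpole.py` source: the four dimensions are cart position, cart velocity, pole angle and pole angular velocity, and the position and angle bounds are twice the termination thresholds of 2.4 units and 12 degrees. The helper `to_float32` is a name introduced here for illustration.

```python
import math
import struct


def to_float32(x):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Largest finite float32: marks the two velocity dimensions as unbounded.
float32_max = (2 - 2**-23) * 2.0**127

# Assumed thresholds from cartpole.py; the space bounds are twice as large.
position_bound = to_float32(2 * 2.4)            # cart position limit
angle_bound = to_float32(2 * math.radians(12))  # pole angle limit, radians

print(float32_max)
print(position_bound)
print(angle_bound, math.degrees(angle_bound))
```

So 3.4028235e+38 is not a physical limit but simply the largest value a float32 can hold, while 0.41887903 radians corresponds to roughly 24 degrees.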