# OpenAI Gym

This repository demonstrates a working OpenAI Gym environment in a Jupyter Notebook. It can be used as a starting point for training an agent with reinforcement learning algorithms.

OpenAI Gym's repository can be found <a href='https://github.com/openai/gym'>here</a>, and its extensive documentation, including installation instructions, can be found <a href='https://gym.openai.com/docs/'>here</a>.

First the `gym` library needs to be imported.

In [1]:
import gym

A gym environment called `env` can be created that simulates a car on a mountain. 

In [2]:
env = gym.make('MountainCar-v0')

The environment first needs to be reset to become active. 

In [3]:
observation = env.reset()
print('The car is at position {} with velocity {} after being reset.'.format(observation[0], observation[1]))

The car is at position -0.41194042991768265 with velocity 0.0 after being reset.


The environment can now be displayed.

In [4]:
env.render()

True

The state of the car is given by two values: its position and its velocity. The shape of the state can also be determined by executing the following command.

In [5]:
env.observation_space

Box(2,)

Both the upper and lower boundaries of the observation space can be determined by:

In [6]:
print('The upper boundary of the observation space is {}'.format(env.observation_space.high))
print('The lower boundary of the observation space is {}'.format(env.observation_space.low))

The upper boundary of the observation space is [0.6  0.07]
The lower boundary of the observation space is [-1.2  -0.07]
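
These bounds are useful when a learning algorithm needs a finite set of states. The sketch below is one possible way to map a continuous observation onto a grid of integer bins. The helper name `discretize` and the choice of 10 bins per dimension are assumptions made for illustration, and the bounds are copied by hand from the output above (first entry position, second entry velocity) rather than read from `env`.

```python
# Bounds copied from the printed output above: (position, velocity).
LOW = (-1.2, -0.07)
HIGH = (0.6, 0.07)


def discretize(observation, low=LOW, high=HIGH, bins=(10, 10)):
    """Map a continuous observation to a tuple of integer bin indices.

    Hypothetical helper: the bin counts are arbitrary and only for illustration.
    """
    indices = []
    for value, lo, hi, n in zip(observation, low, high, bins):
        # Fraction of the way from the lower to the upper bound.
        fraction = (value - lo) / (hi - lo)
        index = int(fraction * n)
        # Clamp so that values on or beyond the bounds stay in a valid bin.
        indices.append(min(max(index, 0), n - 1))
    return tuple(indices)


print(discretize((-0.41, 0.01)))  # prints (4, 5)
```

With a live environment, `env.observation_space.low` and `env.observation_space.high` could be passed in place of the hand-copied constants.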


The actions available to the car can be determined by:

In [7]:
env.action_space

Discrete(3)

This shows that there are three discrete actions that the car can take: 0 pushes it to the left, 1 applies no push, and 2 pushes it to the right. The car performs an action through the `step` method. For example, `env.step(0)` pushes the car to the left. The `step` method returns four values: `observation, reward, done, info`. To see the effect of an action, the `render` method needs to be called.
A simple example in which the car always chooses action 0 is shown below:

In [8]:
env.reset()

done = False

# Keep pushing left (action 0) until the episode ends.
while not done:
    observation, reward, done, info = env.step(0)
    env.render()

As can be seen, `env.step(0)` pushes the car to the left. The car can also execute an action chosen at random. The code below shows the car doing exactly this.

In [9]:
env.reset()

done = False

# Sample a random action at every step until the episode ends.
while not done:
    observation, reward, done, info = env.step(env.action_space.sample())
    env.render()
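
Both loops above follow the same pattern: reset, then step until `done` is true. A minimal sketch of that pattern as a reusable function is given below. The name `run_episode`, the `policy` callable, and the `max_steps` safety limit are hypothetical additions, and the code assumes the four-value `step` return used in this notebook (newer Gym releases return five values and would need a small adjustment).

```python
def run_episode(env, policy, max_steps=1000, render=False):
    """Run one episode and return (total_reward, steps).

    Assumes: observation, reward, done, info = env.step(action).
    `policy` is any callable that maps an observation to an action.
    """
    observation = env.reset()
    total_reward = 0.0
    steps = 0
    done = False
    # Stop when the environment reports done or the safety limit is hit.
    while not done and steps < max_steps:
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
        steps += 1
        if render:
            env.render()
    return total_reward, steps
```

With the environment from this notebook, `run_episode(env, lambda obs: 0)` would correspond to the always-left loop, and `run_episode(env, lambda obs: env.action_space.sample())` to the random one.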

In order to close the environment that was created, the following command needs to be executed:    

In [10]:
env.close()

## Conclusion


As seen in this notebook, OpenAI Gym allows a user to create a toy environment in which an object can be controlled. In this notebook, the Mountain Car environment was used. The car is controlled by three actions that change its state. This setup allows a user to write a reinforcement learning algorithm that would allow the car to surmount the mountain.
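
As a point of comparison for any learned policy, the sketch below tries a hand-written baseline: always push in the direction the car is already moving. It is not a reinforcement learning algorithm, and it runs on a small stand-in model instead of Gym itself. The constants and the update rule are assumptions based on the classic Mountain Car formulation and should be checked against the installed Gym version. All function names are hypothetical.

```python
import math

# Constants assumed to match the classic Mountain Car formulation.
MIN_POSITION, MAX_POSITION = -1.2, 0.6
MAX_SPEED = 0.07
GOAL_POSITION = 0.5
FORCE, GRAVITY = 0.001, 0.0025


def simulate_step(position, velocity, action):
    """Advance the stand-in car model by one step (action in {0, 1, 2})."""
    # Action 0 pushes left, 1 does nothing, 2 pushes right.
    velocity += (action - 1) * FORCE - math.cos(3 * position) * GRAVITY
    velocity = max(-MAX_SPEED, min(MAX_SPEED, velocity))
    position += velocity
    position = max(MIN_POSITION, min(MAX_POSITION, position))
    if position == MIN_POSITION and velocity < 0:
        velocity = 0.0  # the car stops when it hits the left wall
    return position, velocity


def momentum_policy(velocity):
    """Push right when moving right (or at rest), otherwise push left."""
    return 2 if velocity >= 0 else 0


def steps_to_goal(position=-0.5, velocity=0.0, max_steps=200):
    """Return the number of steps needed to reach the goal, or None."""
    for step in range(1, max_steps + 1):
        action = momentum_policy(velocity)
        position, velocity = simulate_step(position, velocity, action)
        if position >= GOAL_POSITION:
            return step
    return None


print(steps_to_goal())
```

The same `momentum_policy` idea could be applied to the real environment by reading the velocity from `observation[1]` and passing the chosen action to `env.step`.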