# OpenAI Gym CartPole Environment

![](CartPole.png)

The OpenAI Gym website, at [https://gym.openai.com/](https://gym.openai.com/), describes widely used environments for testing reinforcement learning algorithms. This notebook walks through the Getting Started material at [Getting Started with Gym](https://gym.openai.com/docs/).

OpenAI Gym also has a GitHub site at [https://github.com/openai/gym/](https://github.com/openai/gym/). More information about the CartPole problem can be found at [CartPole Wiki](https://github.com/openai/gym/wiki/CartPole-v0).

## OpenAI Gym Installation ##
OpenAI Gym is not included in the Anaconda Python or standard Python distributions, so you need to install it from your command prompt or terminal.

For Anaconda Python, remember to activate your Python (base) environment first.
```text
conda activate base
```
Then use *conda* to install gym.
```text
conda install -c conda-forge gym
```

Use *pip* for the standard Python install.

```text
pip install gym
```
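To confirm that the install worked before going further, a quick check along the following lines may help. This is only a sketch: the helper name `is_installed` is made up for illustration, and it relies on the standard-library function `importlib.util.find_spec`, which looks a module up without importing it.

```python
import importlib.util


def is_installed(module_name):
    """Return True if `module_name` can be found by the current Python interpreter.

    Hypothetical helper for illustration; it is not part of gym.
    """
    return importlib.util.find_spec(module_name) is not None


# Check whether gym is visible to this interpreter.
gym_available = is_installed("gym")
print("gym importable:", gym_available)
```

If this prints `False`, the package was most likely installed into a different Python environment than the one running the check.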


### Getting Started with Gym ###
Next, we will go through the [Getting Started with Gym](https://gym.openai.com/docs/) examples.

#### Getting Started Example 1 ####

You may be excited to try the first example and see the cart and pole move around on your screen. I know that I was! However, there are problems rendering graphics from within the Jupyter environment. To circumvent these problems you can place the code in a .py file and then run Python on it from the terminal or command prompt window.

You can also run the code from within the Jupyter environment using the Launcher. Select Textfile from the Launcher and paste the code below into the window. Then rename the file to 'cartpole1.py' and save it. Open a terminal from the Launcher and run the command
```bash
python cartpole1.py
```

Below are the contents of the cartpole1.py file.
```python
import gym
env = gym.make('CartPole-v0')
env.reset()
for _ in range(100):
    env.render()
    env.step(env.action_space.sample()) # take a random action
env.close()
```

In the figure below you can see the cartpole1.py code after it has been pasted into a Textfile.

![](PythonFile1.png)

In this figure you see the terminal window after running python on the file cartpole1.py. (Ignore the warning for now.)

![](Terminal1.png)

Here you see one frame of the rendered image.

![](RenderedCartPole.png)

#### Getting Started Example 2 ####

Below is the second example. I used the same process to create a .py file.
```python
import gym
env = gym.make('CartPole-v0')
for i_episode in range(20):
    observation = env.reset()
    for t in range(100):
        env.render()
        print(observation)
        action = env.action_space.sample()
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t+1))
            break
env.close()
```

Upon execution, this renders the cart pole problem until the 'done' flag is True or 100 timesteps have elapsed, and then starts a new episode. This is repeated for 20 episodes. In addition to rendering a movie of the cart and pole, it prints the observation at each step and, once the done flag is set to True, the number of timesteps the episode lasted, as shown below.

![](Terminal2.png)
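The control flow of this example (an outer loop over episodes, an inner loop capped at a fixed number of timesteps) can be tried without a display using a stand-in environment. The sketch below is an assumption-laden illustration: `StubEnv` and `run_episodes` are hypothetical names, the stub only imitates the `reset()`/`step()` return shapes used in the example above, and its 10% chance of ending an episode has nothing to do with real cart pole physics.

```python
import random


class StubEnv:
    """Hypothetical stand-in that mimics the reset()/step() shapes used above.

    It is not part of gym and does not simulate the cart and pole.
    """

    def __init__(self, seed=0):
        self._rng = random.Random(seed)

    def reset(self):
        # Return a dummy four-value observation.
        return [0.0, 0.0, 0.0, 0.0]

    def step(self, action):
        # Pretend each step has a 10% chance of ending the episode.
        done = self._rng.random() < 0.1
        return [0.0, 0.0, 0.0, 0.0], 1.0, done, {}


def run_episodes(env, n_episodes=20, max_steps=100):
    """Run the same nested loop as the example and record each episode's length."""
    lengths = []
    for _ in range(n_episodes):
        env.reset()
        for t in range(max_steps):
            _, _, done, _ = env.step(0)
            if done:
                break
        # t + 1 is the number of timesteps taken, capped at max_steps.
        lengths.append(t + 1)
    return lengths


episode_lengths = run_episodes(StubEnv())
print("episodes:", len(episode_lengths))
print("longest episode:", max(episode_lengths))
```

The outer loop always runs 20 times, while each inner loop stops at whichever comes first: the done flag or the 100-step cap.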

#### Example ####
1. Import the gym environment
2. Instantiate the Cart Pole environment
3. Print information on the observation space
4. Print information about the action space

```python
import gym
env = gym.make('CartPole-v0')
print('Observation Space:', env.observation_space)
print('Action Space:', env.action_space)
```

```text
Observation Space: Box(-3.4028234663852886e+38, 3.4028234663852886e+38, (4,), float32)
Action Space: Discrete(2)
```
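`Discrete(2)` means there are two possible actions, numbered 0 and 1. For CartPole, 0 pushes the cart to the left and 1 pushes it to the right. The sketch below imitates what random sampling from such a space amounts to, using only the standard library; `sample_action` and `ACTION_MEANINGS` are illustrative names and not part of gym.

```python
import random

# Meaning of the two CartPole actions.
ACTION_MEANINGS = {0: "push cart to the left", 1: "push cart to the right"}


def sample_action(rng, n=2):
    """Pick a uniformly random integer in [0, n), as a stand-in for sampling Discrete(n)."""
    return rng.randrange(n)


rng = random.Random(42)
actions = [sample_action(rng) for _ in range(10)]
for action in actions[:3]:
    print(action, "->", ACTION_MEANINGS[action])
```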


#### Example ####
1. Print the largest values in the observation space
2. Print the smallest values in the observation space

```python
print('high:',env.observation_space.high)
print('low: ', env.observation_space.low)
```

```text
high: [4.8000002e+00 3.4028235e+38 4.1887903e-01 3.4028235e+38]
low:  [-4.8000002e+00 -3.4028235e+38 -4.1887903e-01 -3.4028235e+38]
```
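These bounds are not arbitrary. To my understanding of the CartPole source, an episode ends when the cart moves more than 2.4 units from the center or the pole tilts more than 12 degrees, and the observation bounds are set to twice those thresholds; the two velocities are bounded only by the largest float32 value. The sketch below rebuilds the `high` vector from those assumed constants with plain Python, so the variable names are illustrative rather than gym's own.

```python
import math

# Largest finite float32 value, computed in double precision.
FLOAT32_MAX = (2 - 2**-23) * 2.0**127

# Assumed termination thresholds for CartPole.
x_threshold = 2.4                     # cart position limit
theta_threshold = math.radians(12)    # pole angle limit, in radians

# Observation bounds are twice the termination thresholds.
high = [2 * x_threshold, FLOAT32_MAX, 2 * theta_threshold, FLOAT32_MAX]
low = [-value for value in high]

print("high:", high)
print("low: ", low)
print("angle bound in degrees:", math.degrees(high[2]))
```

The small differences from the printed output (for example 4.8000002 instead of 4.8) come from gym storing the bounds as float32.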


#### Example ####
Reset the environment to place the cart and pole into an initial state. The four observation values are set to small random numbers to start the problem. These values will be different each time you start the problem. The four values are

1. Position of the cart
2. Velocity of the cart
3. Angle of the pole from vertical, in radians
4. Angular velocity of the pole

```python
env.reset()
```

```text
array([ 0.01128073,  0.02070885, -0.02623478,  0.02326702])
```
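To make the four numbers easier to read, they can be unpacked into named fields. The sketch below does this for the sample output above using a `namedtuple`; the type name `CartPoleState` and its field names are illustrative choices, not gym identifiers. It also checks the values against the range of -0.05 to 0.05, which is the starting range I understand CartPole to draw from.

```python
from collections import namedtuple

# Illustrative container for the four observation values (not part of gym).
CartPoleState = namedtuple(
    "CartPoleState",
    ["cart_position", "cart_velocity", "pole_angle", "pole_angular_velocity"],
)

# Sample observation copied from the reset output above.
observation = [0.01128073, 0.02070885, -0.02623478, 0.02326702]
state = CartPoleState(*observation)

# Assumed starting range for every component after a reset.
in_start_range = all(-0.05 <= value <= 0.05 for value in state)

print("pole angle (rad):", state.pole_angle)
print("all values within [-0.05, 0.05]:", in_start_range)
```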