# Using Gym and other RL frameworks

[Gym](https://github.com/openai/gym/) provides a set of environments to test reinforcement learning algorithms. In this quick intro we will discuss:

* How to use it
* How to create our own wrappers
* How to use some other environments from different libraries

In [2]:
# single-import required for gym :D
import gym

# some other functionality we might need
import numpy as np
import matplotlib.pyplot as plt
from tqdm import tqdm
%matplotlib inline

Gym provides various environments, each accessible through its own environment ID. The full list can be found [here](https://github.com/openai/gym/blob/master/docs/environments.md).

In [7]:
# create an environment through its ID
env = gym.make( 'CartPole-v0' )
# inspect it a little bit
print( dir( env ) )

['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattr__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_elapsed_steps', '_max_episode_steps', 'action_space', 'class_name', 'close', 'compute_reward', 'env', 'metadata', 'observation_space', 'render', 'reset', 'reward_range', 'seed', 'spec', 'step', 'unwrapped']


In [10]:
env.action_space.n

2
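
Besides `n`, the spaces attached to an environment describe what valid actions and observations look like. The cell below is a small sketch, assuming the same Gym version used in this notebook: it prints both spaces for `CartPole-v0` and draws a random valid action with `sample()`, which is checked with `contains()`.

```python
import gym

env = gym.make('CartPole-v0')

# discrete action space with two actions: 0 (push left) and 1 (push right)
print(env.action_space)

# continuous observation space with 4 components:
# cart position, cart velocity, pole angle, pole angular velocity
print(env.observation_space)
print(env.observation_space.shape)

# draw a random valid action directly from the action space
a = env.action_space.sample()
print(a, env.action_space.contains(a))

env.close()
```

Sampling from the space itself avoids hard-coding the range of valid actions, so the same line should also work for environments whose action space is not discrete.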

Recall the RL agent-environment interface. Gym provides just that through the following API:

* **env.reset()** : resets the environment to an initial configuration drawn from the initial state distribution $\rho_{0}$, and returns the corresponding initial state
* **env.step( action )** : executes the given **action** in the environment, and returns the transition information, consisting of **next state(s)**, **reward(s)**, and **termination flag(s)**, plus an **info** dictionary with auxiliary diagnostics.

Additionally, most environments provide some kind of visualization, which can be accessed through the **env.render()** method.

In [17]:
# reset environment and grab state from initial distribution
s = env.reset()
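while True :
    # sample a random action uniformly from the discrete action set
    a = np.random.randint( 0, env.action_space.n )
    # apply it; the last returned item (the info dict) is not needed here
    snext, r, done, _ = env.step( a )
    # draw the current frame
    env.render()

    # stop once the episode terminates
    if done :
        break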
env.close()
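
The loop above can be packaged into a reusable function that also accumulates the reward collected during the episode. The following is a minimal sketch, not part of Gym: `run_episode`, `random_policy` and `CountdownEnv` are hypothetical names introduced here for illustration. `CountdownEnv` is a toy stand-in that follows the same `reset()` / `step()` contract (four return values from `step`, as in this notebook), so the cell runs without a display or a Gym installation.

```python
import random


class CountdownEnv:
    """Hypothetical toy environment (not part of Gym) mimicking reset/step."""

    def __init__(self, horizon=5):
        self.horizon = horizon  # number of steps before the episode ends
        self.t = 0              # current time step, also used as the state

    def reset(self):
        # go back to the initial state and return it
        self.t = 0
        return self.t

    def step(self, action):
        # the action is ignored: every step gives reward 1 until the horizon
        self.t += 1
        reward = 1.0
        done = self.t >= self.horizon
        return self.t, reward, done, {}


def run_episode(env, policy, max_steps=1000):
    """Run one episode and return (total reward, number of steps taken)."""
    s = env.reset()
    total_reward, n_steps = 0.0, 0
    while n_steps < max_steps:
        a = policy(s)
        s, r, done, _ = env.step(a)
        total_reward += r
        n_steps += 1
        if done:
            break
    return total_reward, n_steps


def random_policy(state):
    # pick 0 or 1 uniformly at random (both ends are inclusive here)
    return random.randint(0, 1)


total_reward, n_steps = run_episode(CountdownEnv(horizon=5), random_policy)
print(total_reward, n_steps)  # prints: 5.0 5
```

Under the assumption that the environment returns four values from `step`, passing `gym.make( 'CartPole-v0' )` instead of `CountdownEnv()` should work the same way. The `max_steps` cap is a design choice that keeps the loop bounded for environments that never signal termination.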