# Intro to the Gym API

The tutorials are taken from https://deeplizard.com/learn/video/QK_PP_2KgGE


In [12]:
import numpy as np
import gym
import random
import time
from IPython.display import clear_output


## Creating The Environment

Next, to create our environment, we call `gym.make()` and pass the name of the environment we want to set up as a string. We'll be using the environment `FrozenLake-v1`. All the environments and their corresponding names are listed on [Gym's website](https://gym.openai.com/envs/#classic_control).


In [13]:
env = gym.make("FrozenLake-v1")


## Loading The Q-Table

We're now going to load the Q-table that we trained in `L-07-gym-Q-learning`.
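
As a minimal sketch of what a text-file round trip involves, the example below writes a small array with `np.savetxt` and reads it back with `np.loadtxt`. The file name `demo_q_table.txt` and the mostly zero 16 by 4 table are placeholders for illustration, not the contents of `L-07-Q_table.txt`.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for a trained table: 16 states x 4 actions.
demo_table = np.zeros((16, 4))
demo_table[14, 1] = 0.84  # arbitrary value so the table is not all zeros

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_q_table.txt")  # placeholder file name
    np.savetxt(path, demo_table)   # writes one text row per state
    restored = np.loadtxt(path)    # reads the rows back as a 2-D float array

print(restored.shape)
print(np.allclose(restored, demo_table))
```

Because the table is stored as plain text, the saved file can also be opened and inspected in any editor.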

Remember, the number of rows in the table is equal to the size of the environment's state space, and the number of columns is equal to the size of its action space. We can get these sizes from `env.observation_space.n` and `env.action_space.n`, as shown below.


In [14]:
action_space_size = env.action_space.n
state_space_size = env.observation_space.n

q_table = np.loadtxt("L-07-Q_table.txt")
q_table

array([[0.50757154, 0.44019976, 0.45177627, 0.45741024],
       [0.2855845 , 0.31854987, 0.27639278, 0.42394095],
       [0.33069364, 0.26150264, 0.26573058, 0.27020972],
       [0.08498068, 0.15198439, 0.07541347, 0.10260243],
       [0.52518316, 0.3934814 , 0.34749928, 0.37584832],
       [0.        , 0.        , 0.        , 0.        ],
       [0.16623461, 0.1321614 , 0.31466261, 0.10396448],
       [0.        , 0.        , 0.        , 0.        ],
       [0.37000304, 0.42302033, 0.41291809, 0.55496024],
       [0.4862758 , 0.58800947, 0.42354015, 0.36065907],
       [0.5296818 , 0.32909486, 0.38978523, 0.24914056],
       [0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.        ],
       [0.30754057, 0.55994573, 0.74227529, 0.49746954],
       [0.73491478, 0.84302153, 0.76762264, 0.75333791],
       [0.        , 0.        , 0.        , 0.        ]])
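
One way to read the table printed above is to check its shape and then take the highest-valued action in each row. The sketch below copies the printed values, hard-codes 16 states and 4 actions instead of querying `env`, and assumes FrozenLake's default 4x4 map with the action encoding 0 = left, 1 = down, 2 = right, 3 = up.

```python
import numpy as np

# Assumed sizes, matching the printed table rather than read from env.
state_space_size, action_space_size = 16, 4

# Values copied from the output above.
q_table = np.array([
    [0.50757154, 0.44019976, 0.45177627, 0.45741024],
    [0.2855845, 0.31854987, 0.27639278, 0.42394095],
    [0.33069364, 0.26150264, 0.26573058, 0.27020972],
    [0.08498068, 0.15198439, 0.07541347, 0.10260243],
    [0.52518316, 0.3934814, 0.34749928, 0.37584832],
    [0.0, 0.0, 0.0, 0.0],
    [0.16623461, 0.1321614, 0.31466261, 0.10396448],
    [0.0, 0.0, 0.0, 0.0],
    [0.37000304, 0.42302033, 0.41291809, 0.55496024],
    [0.4862758, 0.58800947, 0.42354015, 0.36065907],
    [0.5296818, 0.32909486, 0.38978523, 0.24914056],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.30754057, 0.55994573, 0.74227529, 0.49746954],
    [0.73491478, 0.84302153, 0.76762264, 0.75333791],
    [0.0, 0.0, 0.0, 0.0],
])

# One row per state, one column per action.
assert q_table.shape == (state_space_size, action_space_size)

# Greedy policy: index of the largest Q-value in each row.
greedy_actions = np.argmax(q_table, axis=1)

# Lay the policy out on the assumed 4x4 grid using arrow-like symbols.
action_symbols = np.array(["<", "v", ">", "^"])
policy_grid = action_symbols[greedy_actions].reshape(4, 4)
print(policy_grid)
```

The all-zero rows belong to terminal states (the holes and the goal on the default map), which the agent never acts from. `np.argmax` returns index 0 for them, so the `<` shown in those cells carries no meaning.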

## Watch The Agent Play The Game
The following cell lets us watch the trained agent play Frozen Lake using the Q-table from training. At each step, the agent takes the action with the highest Q-value for its current state.

In [15]:
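max_steps_per_episode = 100

# Note: this cell uses the older Gym API, where reset() returns the state
# and step() returns (state, reward, done, info).
for episode in range(3):
    state = env.reset()
    done = False
    print("*****EPISODE ", episode+1, "*****\n\n\n\n")
    time.sleep(1)
    for step in range(max_steps_per_episode):
        # Redraw the grid in place so the output reads like an animation
        clear_output(wait=True)
        env.render()
        time.sleep(0.3)
        # Act greedily: pick the action with the highest Q-value for this state
        action = np.argmax(q_table[state,:])
        new_state, reward, done, info = env.step(action)
        if done:
            # Show the final position before reporting the outcome
            clear_output(wait=True)
            env.render()
            if reward == 1:
                print("****You reached the goal!****")
                time.sleep(3)
            else:
                # The episode ended with no reward; this is treated as a fall
                print("****You fell through a hole!****")
                time.sleep(3)
                clear_output(wait=True)
            break
        state = new_state
env.close()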

****You fell through a hole!****
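
The loop above relies on the older Gym return signatures, where `reset()` returns only the state and `step()` returns four values. From Gym 0.26 onward (and in Gymnasium), `reset()` returns `(observation, info)` and `step()` returns five values with separate `terminated` and `truncated` flags. A minimal sketch of two hypothetical helpers, `unpack_reset` and `unpack_step`, that accept either form:

```python
def unpack_reset(result):
    """Return just the state from either reset() convention."""
    # Newer API: (observation, info). Older API: observation only.
    # Assumes the observation itself is not a tuple, which holds for
    # FrozenLake's integer states.
    if isinstance(result, tuple):
        return result[0]
    return result


def unpack_step(result):
    """Return (state, reward, done, info) from a 4- or 5-value step result."""
    if len(result) == 5:
        state, reward, terminated, truncated, info = result
        # Treat either flag as the end of the episode.
        return state, reward, terminated or truncated, info
    state, reward, done, info = result
    return state, reward, done, info


# Stand-in tuples are used here in place of real environment calls.
print(unpack_reset(0))
print(unpack_reset((0, {"prob": 1})))
print(unpack_step((4, 0.0, False, {})))
print(unpack_step((15, 1.0, True, False, {})))
```

With these helpers, the loop would read `state = unpack_reset(env.reset())` and `new_state, reward, done, info = unpack_step(env.step(action))`, leaving the rest of the cell unchanged.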
