# Frozen Lake

**"Winter is here. You and your friends were tossing around a frisbee at the park when you made a wild throw that left the frisbee out in the middle of the lake. The water is mostly frozen, but there are a few holes where the ice has melted. If you step into one of those holes, you'll fall into the freezing water. At this time, there's an international frisbee shortage, so it's absolutely imperative that you navigate across the lake and retrieve the disc. However, the ice is slippery, so you won't always move in the direction you intend."**

Source: [OpenAI Gym](https://gym.openai.com/envs/FrozenLake-v0/)


In [1]:
# all imports
import gym
import numpy as np

In [2]:
# creating the MDP
wrapper = gym.Wrapper(gym.make("FrozenLake-v0"))

## The MDP has the following form:


![frozen](img/fl.png)




The wrapper exposes attributes that describe the MDP: its observation space, its action space, and its reward range.

In [3]:
print("observation space: {}".format(wrapper.observation_space))
print("Actions space: {}".format(wrapper.action_space))
print("reward range: {}".format(wrapper.reward_range))

observation space: Discrete(16)
Actions space: Discrete(4)
reward range: (-inf, inf)


## Using the `render` method, we can visualize the agent moving in the environment

- S is start

- F is frozen

- H is hole

- G is goal
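
To make the link between the 16 discrete observations and the map letters concrete, here is a minimal sketch. It assumes states are numbered row by row, so that `state = row * 4 + col`, and it copies the layout from the rendered map in this notebook. `LAKE_MAP`, `TILE_NAMES`, and `describe_state` are hypothetical names introduced for illustration; they are not part of gym.

```python
# Layout copied from the rendered map in this notebook.
LAKE_MAP = ["SFFF", "FHFH", "FFFH", "HFFG"]
TILE_NAMES = {"S": "start", "F": "frozen", "H": "hole", "G": "goal"}


def describe_state(state, lake_map=LAKE_MAP):
    """Return (row, col, tile name) for a state index (hypothetical helper)."""
    n_cols = len(lake_map[0])
    # Assumption: states are numbered row by row, left to right.
    row, col = divmod(state, n_cols)
    return row, col, TILE_NAMES[lake_map[row][col]]


print(describe_state(0))   # (0, 0, 'start')
print(describe_state(5))   # (1, 1, 'hole')
print(describe_state(15))  # (3, 3, 'goal')
```

Under this numbering, the observation returned by `step` can be read directly as a position on the grid.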

In [4]:
# actions: 0 = Left, 1 = Down, 2 = Right, 3 = Up
plan = [1, 1, 1, 2]

wrapper.reset()
wrapper.render()

# execute the plan one action at a time, rendering after each step
for action in plan:
    obs, reward, done, info = wrapper.step(action)
    wrapper.render()


[41mS[0mFFF
FHFH
FFFH
HFFG
  (Down)
[41mS[0mFFF
FHFH
FFFH
HFFG
  (Down)
S[41mF[0mFF
FHFH
FFFH
HFFG
  (Down)
SFFF
F[41mH[0mFH
FFFH
HFFG
  (Right)
SFFF
F[41mH[0mFH
FFFH
HFFG
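
The output above shows the agent staying put or moving sideways even though the chosen action was Down. The sketch below imitates that behaviour without gym, assuming the slippery dynamics pick the intended direction or one of the two perpendicular directions with equal probability, and that moves off the grid leave the agent in place. `slippery_step`, `MOVES`, and `LAKE_MAP` are hypothetical stand-ins written for this illustration, not gym's own code.

```python
import random

LAKE_MAP = ["SFFF", "FHFH", "FFFH", "HFFG"]
# Action encoding: 0 = Left, 1 = Down, 2 = Right, 3 = Up.
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}


def slippery_step(state, action, rng):
    """Hypothetical stand-in for one slippery move; returns the next state."""
    # Assumption: the intended action and its two perpendicular
    # neighbours are each chosen with probability 1/3.
    actual = rng.choice([(action - 1) % 4, action, (action + 1) % 4])
    n = len(LAKE_MAP)
    row, col = divmod(state, n)
    d_row, d_col = MOVES[actual]
    # Clamp to the grid so that walking into a wall leaves the agent in place.
    row = min(max(row + d_row, 0), n - 1)
    col = min(max(col + d_col, 0), n - 1)
    return row * n + col


rng = random.Random(0)
outcomes = {}
for _ in range(3000):
    nxt = slippery_step(0, 1, rng)  # try to move Down from the start
    outcomes[nxt] = outcomes.get(nxt, 0) + 1

# Tally of where the agent ended up after trying to move Down from state 0.
print(outcomes)
```

From the start state, Down can leave the agent at state 0 (slipped left into the wall), state 1 (slipped right), or state 4 (moved down), which matches the first two steps of the rendered run.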


## Playing 200 episodes with random actions

In [5]:
total_reward = 0
episodes = 200

for i in range(episodes):
    done = False
    wrapper.reset()
    # play until the agent falls into a hole, reaches the goal, or times out
    while not done:
        # pick one of the 4 actions uniformly at random
        action = np.random.randint(0, 4)
        obs, reward, done, _ = wrapper.step(action)
        total_reward += reward

print("Average reward = {}".format(total_reward / episodes))

Average reward = 0.005
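
Since an episode yields a reward of 1 only when the agent reaches the goal and 0 otherwise, the average reward is the fraction of successful episodes. As a rough sketch of how noisy this estimate is, the snippet below recovers the number of successes and a binomial standard error from the printed value. Treating the episodes as independent Bernoulli trials is an assumption made for illustration.

```python
import math

episodes = 200
average_reward = 0.005  # value printed by the cell above

# Number of episodes that reached the goal.
successes = round(average_reward * episodes)

# Standard error of a proportion estimated from independent trials.
std_error = math.sqrt(average_reward * (1 - average_reward) / episodes)

print("successes: {}".format(successes))
print("standard error: {:.4f}".format(std_error))
```

The standard error is about as large as the estimate itself, so many more episodes would be needed to pin down the success rate of the random policy.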
