## Toy Text

### FrozenLake (v1)

The agent moves a character through a grid world.
Some tiles are walkable; others are holes that drop the agent into the water.
By default, movement is also uncertain: the direction actually taken depends only partly on the direction chosen. The cells below turn this off with `is_slippery=False`, so movement there is deterministic.
The agent is rewarded for reaching the goal tile along a walkable path.
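
To make the movement uncertainty concrete, here is a minimal sketch of one slippery step. It assumes the rule used in the linked Gym source: the intended direction and the two perpendicular directions are each taken with probability 1/3, and moves off the grid are clamped at the border. The helper names `move` and `slippery_outcomes` are hypothetical and not part of Gym.

```python
# Action indices follow Gym's FrozenLake convention (assumed here).
LEFT, DOWN, RIGHT, UP = 0, 1, 2, 3
DELTAS = {LEFT: (0, -1), DOWN: (1, 0), RIGHT: (0, 1), UP: (-1, 0)}


def move(row, col, action, nrow=4, ncol=4):
    """Apply one action, clamping the result at the grid border."""
    d_row, d_col = DELTAS[action]
    new_row = min(max(row + d_row, 0), nrow - 1)
    new_col = min(max(col + d_col, 0), ncol - 1)
    return new_row, new_col


def slippery_outcomes(row, col, action):
    """Return {(row, col): probability} for one step on slippery ice.

    Assumption: the intended direction and its two perpendicular
    neighbours are each taken with probability 1/3.
    """
    outcomes = {}
    for actual in ((action - 1) % 4, action, (action + 1) % 4):
        target = move(row, col, actual)
        outcomes[target] = outcomes.get(target, 0.0) + 1.0 / 3.0
    return outcomes


# Choosing DOWN from the start tile can also slide left (into the wall) or right.
print(slippery_outcomes(0, 0, DOWN))
```

Under these assumptions, choosing `DOWN` from the start tile reaches the tile below only one time in three, which is why a fixed plan is reliable only when slipperiness is disabled.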

**Story:**
_Winter is here. You and your friends were tossing around a frisbee at the park when you made a wild throw that left the frisbee out in the middle of the lake._
_The water is mostly frozen, but there are a few holes where the ice has melted._
_If you step into one of those holes, you'll fall into the freezing water._

_At this time, there's an international frisbee shortage, so it's absolutely imperative that you navigate across the lake and retrieve the disc._
_However, the ice is slippery, so you won't always move in the direction you intend._

_The surface is described using a grid like the following:_

```
    SFFF
    FHFH
    FFFH
    HFFG

S : starting point, safe
F : frozen surface, safe
H : hole, fall to your doom
G : goal, where the frisbee is located
```
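
The grid above can be read programmatically. The following sketch groups tiles by letter and numbers them row by row, assuming the state index is `row * ncol + col` as in the linked Gym source. The function name `tiles_by_kind` is hypothetical.

```python
GRID = ["SFFF", "FHFH", "FFFH", "HFFG"]


def tiles_by_kind(grid):
    """Map each tile letter to the list of state indices that hold it."""
    ncol = len(grid[0])
    kinds = {}
    for row, line in enumerate(grid):
        for col, letter in enumerate(line):
            # Assumed numbering: states are counted row by row from 0.
            kinds.setdefault(letter, []).append(row * ncol + col)
    return kinds


tiles = tiles_by_kind(GRID)
print("start:", tiles["S"])
print("goal:", tiles["G"])
print("holes:", tiles["H"])
```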

[Environment Source code (GitHub)](https://github.com/openai/gym/blob/master/gym/envs/toy_text/frozen_lake.py)

In [1]:
from IPython.display import Markdown, display, clear_output

In [2]:
import engine
import envs

In [3]:
env = engine.instantiate(envs.FROZEN_LAKE, is_slippery=False)

print("Observation space type:", type(env.observation_space))
print("Observation space size:", env.observation_space.n, "*", env.observation_space.dtype)

display(Markdown('---'))

print("Action space type:", env.action_space)

display(Markdown('---'))

print("Reward range:", env.reward_range)

Observation space type: <class 'gym.spaces.discrete.Discrete'>
Observation space size: 16 * int64


---

Action space type: Discrete(4)


---

Reward range: (0, 1)
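
The 16 observations and 4 actions printed above are plain integers. As a rough sketch, they can be translated into grid positions and direction names as follows, assuming row-by-row state numbering and the action order Left, Down, Right, Up used by Gym's FrozenLake. The helpers `decode_observation` and `encode_position` are hypothetical and not part of `engine` or Gym.

```python
# Assumed action order, matching the labels in the rendered output.
ACTION_NAMES = ("Left", "Down", "Right", "Up")


def decode_observation(observation, ncol=4):
    """Convert a state index into a (row, col) grid position."""
    return divmod(observation, ncol)


def encode_position(row, col, ncol=4):
    """Convert a (row, col) grid position into a state index."""
    return row * ncol + col


for observation in (0, 5, 15):
    print(observation, "->", decode_observation(observation))
```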


In [4]:
display(Markdown("### Initial state:"))

desc = [[c.decode("utf-8") for c in line] for line in env.env.desc]
print("\n".join(" ".join(line) for line in desc))

### Initial state:

S F F F
F H F H
F F F H
H F F G


In [5]:
from agents import AStarAgent

agent = AStarAgent(env, (4, 4))

In [6]:
for i in range(1):
    clear_output(wait=True)
    reward = engine.run(env, agent, timeout=0.5)
    print("Iteration", i, "->", "Reward", reward)

  (Down)
SFFF
[F]HFH
FFFH
HFFG
  (Down)
SFFF
FHFH
[F]FFH
HFFG
  (Right)
SFFF
FHFH
F[F]FH
HFFG
  (Down)
SFFF
FHFH
FFFH
H[F]FG
  (Right)
SFFF
FHFH
FFFH
HF[F]G
  (Right)
SFFF
FHFH
FFFH
HFF[G]
Iteration 0 -> Reward 1.0
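
The implementation of `AStarAgent` is not shown here. As an illustration only, the sketch below runs a textbook A* search with a Manhattan-distance heuristic on the same grid, assuming deterministic moves (as with `is_slippery=False`) and a cost of 1 per step. The names `find`, `a_star`, and `MOVES` are hypothetical and do not come from the `agents` module.

```python
import heapq

GRID = ["SFFF", "FHFH", "FFFH", "HFFG"]
MOVES = {"Left": (0, -1), "Down": (1, 0), "Right": (0, 1), "Up": (-1, 0)}


def find(grid, letter):
    """Return the (row, col) of the first tile with the given letter."""
    for row, line in enumerate(grid):
        col = line.find(letter)
        if col != -1:
            return row, col
    raise ValueError(f"tile {letter!r} not in grid")


def a_star(grid):
    """Return a shortest list of action names from S to G, or None."""
    start, goal = find(grid, "S"), find(grid, "G")
    nrow, ncol = len(grid), len(grid[0])

    def heuristic(pos):
        # Manhattan distance never overestimates the remaining steps.
        return abs(pos[0] - goal[0]) + abs(pos[1] - goal[1])

    # Heap entries: (estimated total cost, cost so far, position, actions).
    frontier = [(heuristic(start), 0, start, [])]
    best_cost = {start: 0}
    while frontier:
        _, cost, pos, actions = heapq.heappop(frontier)
        if pos == goal:
            return actions
        for name, (d_row, d_col) in MOVES.items():
            nxt = (pos[0] + d_row, pos[1] + d_col)
            if not (0 <= nxt[0] < nrow and 0 <= nxt[1] < ncol):
                continue  # off the grid
            if grid[nxt[0]][nxt[1]] == "H":
                continue  # never step into a hole
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                entry = (new_cost + heuristic(nxt), new_cost, nxt, actions + [name])
                heapq.heappush(frontier, entry)
    return None  # goal unreachable


path = a_star(GRID)
print(len(path), "steps:", path)
```

This grid has more than one six-step route to the goal, so the sketch may return a different shortest path than the one in the rollout above, depending on how ties are broken.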
