# Q-Learning on FrozenLake (Gymnasium)
This notebook trains a tabular Q-learning agent on the FrozenLake-v1 environment from Gymnasium.


## Imports and Helpers
We import the libraries, fix a random seed, and define two helpers: a moving-average function and a mapping from action indices to arrow symbols.


In [2]:
import numpy as np
import gymnasium as gym
from collections import deque
import matplotlib.pyplot as plt

# For reproducibility (this seeds NumPy's global RNG only)
SEED = 0
np.random.seed(SEED)

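def moving_average(x, window=100):
    """Moving average of x over `window` points (output length: len(x) - window + 1)."""
    # Too few points for a single full window: return the raw values as floats
    if len(x) < window:
        return np.array(x, dtype=float)
    # Prepend 0 so that differences of the cumulative sum give window sums
    cumsum = np.cumsum(np.insert(x, 0, 0))
    return (cumsum[window:] - cumsum[:-window]) / float(window)

# FrozenLake action index -> arrow symbol (0=left, 1=down, 2=right, 3=up)
ARROWS = {0: "←", 1: "↓", 2: "→", 3: "↑"}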
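
As a quick check of the helper, the sketch below applies it to a short hand-made list. The reward values and the window size of 2 are made up for illustration; the function body is repeated only so the snippet runs on its own.

```python
import numpy as np

def moving_average(x, window=100):
    # Same helper as above, repeated so this snippet is self-contained
    if len(x) < window:
        return np.array(x, dtype=float)
    cumsum = np.cumsum(np.insert(x, 0, 0))
    return (cumsum[window:] - cumsum[:-window]) / float(window)

rewards = [0, 0, 1, 1]  # made-up per-episode rewards
smoothed = moving_average(rewards, window=2)
print(smoothed.tolist())  # [0.0, 0.5, 1.0]

# Fewer points than the window: the input comes back unchanged (as floats)
short = moving_average([1, 0], window=5)
print(short.tolist())  # [1.0, 0.0]
```

Note that the smoothed series is shorter than the input by `window - 1` points, which matters when lining it up against episode numbers on a plot.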

## Environment
We use the classic **4x4 FrozenLake** environment.  
- Start = top left  
- Goal = bottom right  
- Holes = fall in, episode ends  
- Slippery ice makes actions stochastic
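
To make "stochastic" concrete, here is a minimal sketch of the slip model, written in plain Python rather than read from the environment. It assumes Gymnasium's default behavior as we understand it: the agent moves in the intended direction with probability 1/3 and in each of the two perpendicular directions with probability 1/3, and moves into a wall leave the agent in place. The function name `slippery_outcomes` and the `DELTAS` table are hypothetical helpers for illustration.

```python
# Row/column offsets per action (0=left, 1=down, 2=right, 3=up)
DELTAS = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}

def slippery_outcomes(state, action, n=4):
    """Return {next_state: probability} for one move on an n x n slippery grid.

    Assumption: intended direction and both perpendicular directions
    are each taken with probability 1/3.
    """
    row, col = divmod(state, n)
    outcomes = {}
    for a in ((action - 1) % 4, action, (action + 1) % 4):
        d_row, d_col = DELTAS[a]
        # Clip to the grid so that walking into a wall keeps the agent in place
        new_row = min(max(row + d_row, 0), n - 1)
        new_col = min(max(col + d_col, 0), n - 1)
        next_state = new_row * n + new_col
        outcomes[next_state] = outcomes.get(next_state, 0.0) + 1 / 3
    return outcomes

# From the start state (0), try to move right (action 2)
probs = slippery_outcomes(0, 2)
print(sorted(probs))  # [0, 1, 4]
```

Choosing "right" from the start state can therefore end in state 1 (right), state 4 (down), or state 0 (an upward slip into the wall), each with equal probability under this assumption.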


In [None]:
# Classic 4x4 FrozenLake
# +1 for reaching the goal, 0 otherwise
env = gym.make("FrozenLake-v1", map_name="4x4", is_slippery=True)
n_states = env.observation_space.n
n_actions = env.action_space.n

print("Observation space:", n_states)
print("Action space:", n_actions)

Observation space: 16
Action space: 4


## Q-table and Hyperparameters
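- **Q-table**: stores an estimated return Q(s, a) for each state-action pair (16 × 4 entries here)  
- **α (alpha)**: learning rate, how far each update moves an estimate toward its new target  
- **γ (gamma)**: discount factor, how much future reward counts relative to immediate reward  
- **ε (epsilon)**: exploration rate, the probability of taking a random action instead of the greedy one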
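
The sketch below shows how these four pieces could fit together in the standard tabular Q-learning update with an ε-greedy policy. The hyperparameter values (`alpha=0.1`, `gamma=0.99`, `epsilon=0.1`) and the helper names `choose_action` and `q_update` are illustrative assumptions, not the settings or functions used in this notebook.

```python
import numpy as np

n_states, n_actions = 16, 4             # sizes of the 4x4 FrozenLake spaces
alpha, gamma, epsilon = 0.1, 0.99, 0.1  # illustrative values only

rng = np.random.default_rng(0)
Q = np.zeros((n_states, n_actions))     # one value per state-action pair

def choose_action(Q, state, epsilon, rng):
    """Epsilon-greedy: random action with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return int(rng.integers(Q.shape[1]))
    return int(np.argmax(Q[state]))

def q_update(Q, s, a, r, s_next, terminated, alpha, gamma):
    """Move Q[s, a] toward r + gamma * max_a' Q[s_next, a'] (no bootstrap at terminal states)."""
    target = r if terminated else r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])

# One hand-made transition: from state 14, moving right (2) reaches the goal (15) with reward 1
q_update(Q, 14, 2, 1.0, 15, True, alpha, gamma)
print(Q[14, 2])  # 0.1
```

Starting from an all-zero table, a single rewarded step moves the estimate by `alpha` times the error, which is why the value after one update is 0.1 rather than 1.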
