This project implements a Q-learning algorithm to solve the FrozenLake-v1 environment from Gymnasium (the maintained successor to OpenAI Gym). The agent is trained with a reinforcement learning approach to navigate a 4x4 grid of ice, reaching the goal while avoiding holes.
- Environment: FrozenLake-v1 (4x4 map, deterministic movements)
- Algorithm: Q-learning with epsilon-greedy exploration
- Training Output: A Q-table containing the learned action values, and thus the optimal action, for each state
- Visualization: Training and testing results are saved as image files (`training.png` and `test.png`)
The agent learns through multiple episodes by interacting with the environment, updating its Q-table based on rewards and future expectations. After training, it navigates the frozen lake efficiently by following the optimal learned policy.
The following images demonstrate the training and testing phases of the Q-learning algorithm:
- Training Progress: Shows how the agent learns to navigate the frozen lake over training episodes.
- Testing Performance: Shows how the agent performs after training for 1000 episodes.
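What "testing performance" measures can be sketched as a greedy rollout loop over many episodes. The function below is illustrative rather than the repository's code: `reset` and `step` stand in for Gym-style environment hooks, and the success criterion (any positive reward, since FrozenLake rewards only reaching the goal) is an assumption about the setup:

```python
import numpy as np

def evaluate(q, reset, step, episodes=1000, max_steps=100):
    """Success rate of the greedy policy over `episodes` rollouts.

    `reset()` -> initial state; `step(state, action)` -> (next_state,
    reward, done). These are assumed Gym-like hooks, not the repo's API.
    """
    successes = 0
    for _ in range(episodes):
        s, total = reset(), 0.0
        for _ in range(max_steps):
            # greedy: always take the action with the highest Q-value
            s, r, done = step(s, int(np.argmax(q[s])))
            total += r
            if done:
                break
        successes += total > 0  # FrozenLake pays reward 1 only at the goal
    return successes / episodes
```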
An interactive Streamlit app is available to visualize and interact with the Q-learning process:
- Training: Input the number of training episodes and click "Select" to train the agent. Higher numbers of episodes generally improve the agent's performance.
- Play: After training, click "Computer Plays" to watch the agent navigate the FrozenLake environment using the learned policy. You can see the agent's steps and its environment renderings.
The theoretical background and code for this repository are largely unchanged from their original source, thanks to its simplicity and clear explanations. The content is based on the following resources:
- Hugging Face Deep Reinforcement Learning Course (Theoretical Background)
- Hugging Face Deep Reinforcement Learning Course (Code)
Extending classical Q-learning, Deep Q-Learning (DQN) addresses environments with larger state spaces. DQN uses a neural network to approximate the Q-values, making it feasible to handle complex problems where a Q-table would be impractical.
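The idea of replacing the table with a learned function can be illustrated with a minimal linear Q-value approximator trained by semi-gradient TD updates. This is a deliberate simplification: real DQN uses a deep network plus experience replay and a target network, and all names below are illustrative:

```python
import numpy as np

class LinearQ:
    """Q(s, a) = w[a] @ phi(s): a tiny stand-in for DQN's neural network.

    Real DQN adds nonlinear layers, experience replay, and a target net;
    this sketch only shows the function-approximation idea.
    """

    def __init__(self, n_features, n_actions, lr=0.1, gamma=0.95, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(scale=0.01, size=(n_actions, n_features))
        self.lr, self.gamma = lr, gamma

    def q_values(self, phi):
        return self.w @ phi  # one Q-value per action, from features phi

    def update(self, phi, action, reward, phi_next, done):
        # TD target: r + gamma * max_a' Q(s', a'); just r at terminal states
        target = reward if done else reward + self.gamma * np.max(self.w @ phi_next)
        td_error = target - self.w[action] @ phi
        # Semi-gradient step on the squared TD error
        self.w[action] += self.lr * td_error * phi
        return td_error
```

Repeated updates on the same transition shrink the TD error toward zero, which is the approximation-based analogue of a Q-table entry converging.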
For more details, including the implementation code and additional documentation, please visit the Deep Q-Learning Repository.


