<div align="center" style="line-height: 1.7;">
    <h2 style="font-weight: 600;"><strong>Welcome to the Reinforcement Learning Project</strong></h2>
</div> 

&nbsp;

This notebook serves as a quick **guide and orientation** for anyone opening this repository for the first time.
It explains what the project is about, how the files are organized, and how to get started running experiments in **JupyterLab**.



By: Felipe Campoverde

---

# Project Overview

Reinforcement Learning (RL) is a branch of Machine Learning that focuses on **how agents learn to make decisions** through interacting with an environment.
The agent receives **rewards or penalties** based on its actions and gradually learns a policy that **maximizes cumulative reward** over time.
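
To make "cumulative reward" concrete, here is a minimal sketch of the discounted return for a single episode. The function name `discounted_return`, the reward values, and the discount factor are illustrative assumptions, not taken from the project code.

```python
def discounted_return(rewards, gamma=0.99):
    """Return sum over t of gamma**t * rewards[t], accumulated from the last step back."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical episode: three step penalties followed by a goal reward.
episode_rewards = [-1.0, -1.0, -1.0, 10.0]
print(round(discounted_return(episode_rewards, gamma=0.9), 2))  # 4.58
```

A smaller `gamma` makes the agent weigh immediate rewards more heavily than distant ones.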

In this project, the student implemented and compared several RL algorithms using a **custom GridWorld environment**:

- **Q-Learning** - model-free, off-policy learning.
- **SARSA** - model-free, on-policy learning.
- **Dyna-Q** - model-based RL that blends learning and planning.
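
The off-policy versus on-policy distinction comes down to which next-state value the update bootstraps from. The sketch below uses the standard tabular forms of both updates; the function names, the `ACTIONS` tuple, and the dictionary-based Q-table are assumptions for illustration and may differ from what `rl_algorithms.py` actually does.

```python
from collections import defaultdict

ACTIONS = ("up", "down", "left", "right")  # assumed action set


def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99, done=False):
    """Off-policy: bootstrap from the best action available in s_next."""
    best_next = 0.0 if done else max(Q[(s_next, b)] for b in ACTIONS)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99, done=False):
    """On-policy: bootstrap from the action the agent actually takes in s_next."""
    next_value = 0.0 if done else Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (r + gamma * next_value - Q[(s, a)])


# Same transition, two Q-tables that start out identical.
q_table = defaultdict(float, {((0, 1), "right"): 1.0})
sarsa_table = defaultdict(float, {((0, 1), "right"): 1.0})

q_learning_update(q_table, (0, 0), "right", -1.0, (0, 1))
sarsa_update(sarsa_table, (0, 0), "right", -1.0, (0, 1), a_next="up")

print(q_table[((0, 0), "right")], sarsa_table[((0, 0), "right")])
```

Here the SARSA estimate ends up lower because the agent's next action (`"up"`) is worth less than the greedy one (`"right"`).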

Each algorithm is trained to navigate a **stochastic** grid world with obstacles, pits, and a goal state.
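
As a rough picture of what "stochastic" can mean here, the sketch below shows one environment step in which the chosen action is occasionally replaced by a random one. Everything in it is hypothetical: the grid size, the wall, pit, and goal positions, the reward values, and the `slip` probability are placeholders, and the real `gridworld.py` may model randomness differently.

```python
import random

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action, rng, *, size=4, walls=frozenset({(1, 1)}),
         pits=frozenset({(2, 3)}), goal=(3, 3), slip=0.1):
    """Return (next_state, reward, done) for one move on a hypothetical grid."""
    if rng.random() < slip:
        action = rng.choice(list(MOVES))  # the agent "slips" to a random action
    d_row, d_col = MOVES[action]
    row, col = state[0] + d_row, state[1] + d_col
    if not (0 <= row < size and 0 <= col < size) or (row, col) in walls:
        row, col = state  # blocked by the border or an obstacle: stay put
    next_state = (row, col)
    if next_state == goal:
        return next_state, 10.0, True
    if next_state in pits:
        return next_state, -10.0, True
    return next_state, -1.0, False


rng = random.Random(0)  # seeded so runs are repeatable
print(step((0, 0), "right", rng))
```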

---

## Environment Setup

Before running notebooks, make sure your environment is active and functional:

```bash
source .venv/bin/activate
jupyter lab
```

If you have not yet installed the dependencies:

```bash
# Preferred
pip install -r requirements.txt

# In case 'requirements.txt' is not available
pip install numpy scipy matplotlib jupyterlab ipykernel pandas tqdm \
            black ruff pytest pytest-cov mypy gymnasium pygame

# Install the project in editable mode
pip install -e .

# Run Unit Tests
pytest -q
```
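
If an import fails later on, a quick check from a notebook cell can show which packages are missing from the active environment. This is only a sketch: the `REQUIRED` list is an assumption based on the runtime packages in the fallback install command, and `missing_packages` is a hypothetical helper, not part of the project.

```python
from importlib.util import find_spec

# Assumed import names for the runtime packages in the fallback install command.
REQUIRED = ["numpy", "scipy", "matplotlib", "pandas", "tqdm", "gymnasium", "pygame"]


def missing_packages(names):
    """Return the names that cannot be located in the active environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
print("All packages found." if not missing else f"Missing: {', '.join(missing)}")
```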
---

## Repository Structure

```text
reinforcement_learning/
├── Start_Here.ipynb     
├── notebooks/               ← main experiments
│   ├── 00_RL.ipynb          ← you are here!
│   ├── 01_q_learning.ipynb
│   ├── 02_sarsa.ipynb
│   ├── 03_dyna_q.ipynb
│   ├── 04_comparison_models.ipynb
│   ├── 05_k_sweep.ipynb
│   ├── 06_robustness.ipynb
│   └── 07_results.ipynb
│ 
├── src/rl_capstone/         ← core implementation
│   ├── gridworld.py
│   ├── rl_algorithms.py
│   └── utils.py
│ 
├── data/                    ← training logs and results
├── figs/                    ← generated plots
├── reports/                 ← milestone & final reports
├── tests/                   ← unit tests
└── README.md                ← setup and project overview
```

---

## How to Get Started

1. Open the **Q-Learning notebook**:
   - [Q-Learning](notebooks/01_q_learning.ipynb)

2. Run the cells from top to bottom to train the agent (**Shift + Enter** runs the current cell and moves to the next).

3. Explore the other algorithms:
   - [SARSA notebook](notebooks/02_sarsa.ipynb)
   - [Dyna-Q notebook](notebooks/03_dyna_q.ipynb)

4. Compare results across algorithms:
   - [Comparing Models](notebooks/04_comparison_models.ipynb)

5. Analyze how Dyna-Q behaves under different tests:
   - [K Sweeping Dyna-Q](notebooks/05_k_sweep.ipynb)
   - [Robustness & Generalization](notebooks/06_robustness.ipynb)

6. Review the results:
   - [Results](notebooks/07_results.ipynb)

---

## Notes

- All algorithm implementations live in the **src/rl_capstone/** folder.
- The GridWorld environment defines states, transitions, and rewards.
- You can modify hyperparameters (**alpha**, **gamma**, **epsilon**, etc.) directly in each notebook to experiment.
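
To see what **epsilon** controls, here is a minimal sketch of epsilon-greedy action selection, a common exploration strategy for tabular agents. Whether the notebooks use exactly this scheme is an assumption; the function name and the example Q-values are made up for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.choice(list(q_values))
    best = max(q_values.values())
    # Break ties between equally good actions at random.
    return rng.choice([action for action, q in q_values.items() if q == best])


rng = random.Random(42)
q_values = {"up": 0.2, "down": -0.5, "left": 0.0, "right": 0.7}  # hypothetical estimates
print(epsilon_greedy(q_values, epsilon=0.0, rng=rng))  # right
```

With `epsilon=0.0` the agent never explores; raising it trades short-term reward for more exploration of the grid.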

---

## Next Steps
- Start with **Q-Learning** to understand the training loop.
- Proceed to **SARSA** to compare on-policy learning.
- Explore **Dyna-Q** to see how planning accelerates learning.
- Run Dyna-Q under different scenarios to see how its behavior changes.
- Document your results and insights in the **Results** notebook.
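
One way to see why planning can speed up learning is to replay a remembered transition several times after each real step. The sketch below follows the textbook tabular Dyna-Q pattern with a simple "last outcome seen" model; the names `dyna_q_step` and `k`, the hyperparameter values, and the omission of terminal-state handling are simplifying assumptions, and the project's own implementation may differ.

```python
import random
from collections import defaultdict

ACTIONS = ("up", "down", "left", "right")  # assumed action set


def dyna_q_step(Q, model, s, a, r, s_next, rng, k=5, alpha=0.1, gamma=0.95):
    """One real update, one model update, then k simulated (planning) updates."""
    def update(s0, a0, r0, s1):
        best_next = max(Q[(s1, b)] for b in ACTIONS)
        Q[(s0, a0)] += alpha * (r0 + gamma * best_next - Q[(s0, a0)])

    update(s, a, r, s_next)      # learn from the real transition
    model[(s, a)] = (r, s_next)  # remember the last observed outcome
    for _ in range(k):           # planning: replay remembered transitions
        ps, pa = rng.choice(list(model))
        pr, ps_next = model[(ps, pa)]
        update(ps, pa, pr, ps_next)


# The same single real transition, without planning and with three planning steps.
for k in (0, 3):
    Q, model = defaultdict(float), {}
    dyna_q_step(Q, model, (0, 0), "right", 1.0, (0, 1), random.Random(0),
                k=k, alpha=0.5, gamma=0.9)
    print(k, Q[((0, 0), "right")])  # prints "0 0.5" then "3 0.9375"
```

In a stochastic environment a last-outcome model is only an approximation, which is one reason planning can behave differently as `k` grows.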

---

<style>
    .button {
        background-color: #3b3b3b;
        color: white;
        padding: 25px 60px;
        border: none;
        border-radius: 12px;
        cursor: pointer;
        font-size: 30px;
        transition: background-color 0.3s ease;
    }

    .button:hover {
        background-color: #45a049;
        transform: scale(1.05);
    }
    
</style>

<div style=" text-align: center; margin-top:20px;">
    
  <a href="../Start_Here.ipynb">
    <button class="button">
      ⬅️ Prev: Start Here
    </button>
  </a>
  <span style="display:inline-block; width:200px;"></span>
  <a href="01_q_learning.ipynb">
    <button class="button">
      Next: Q-Learning ➡️
    </button>
  </a>
  
</div>
