A from-scratch study of tabular Q-Learning and DQN on a custom Gymnasium environment, exploring how ε-greedy exploration, reward shaping, and wind dynamics affect learned policies.
| Agent | Final Avg Reward (20-episode mean) | Steps to Goal | Key Behavior |
|---|---|---|---|
| Q-Learning | ~ -15 | ~15 | Learns a wind-compensating policy |
| DQN | ~ -17 | ~17 | Comparable policy via neural function approximation |
| Random | ~ -200 | times out | No-learning baseline |
```
rl-environment-study/
├── src/
│   ├── environments.py       # WindyGridWorld (custom Gymnasium env)
│   ├── q_learning.py         # Tabular Q-Learning with ε-greedy
│   ├── dqn.py                # DQN with replay buffer + target network
│   └── visualization.py      # Reward curves, Q-table heatmaps, policy arrows
├── notebooks/
│   └── rl_study.ipynb        # Full study notebook with 9 sections
├── tests/
│   ├── test_environments.py  # 10 environment tests
│   └── test_agents.py        # 14 agent tests (Q-Learning + DQN)
├── evidence/                 # Exported PNG evidence from notebook runs
└── pyproject.toml
```
- Windy GridWorld: an N×M grid with column-dependent upward wind. The agent must learn to compensate for the wind to reach the goal efficiently
- Q-Learning: Off-policy TD control — Q(s,a) ← Q(s,a) + α[r + γ max_{a'} Q(s',a') − Q(s,a)] (see the minimal update sketch after this list)
- ε-greedy exploration: Balances exploration and exploitation with a decaying ε
- DQN: Neural function approximation stabilized by experience replay and a target network (see the second sketch below)
- Bellman optimality: Tabular Q-Learning converges to Q* when every state-action pair is visited infinitely often with appropriately decaying step sizes
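To make the TD update above concrete, here is a minimal, self-contained sketch of ε-greedy tabular Q-Learning. All names and sizes (`n_states`, `n_actions`, the hyperparameters) are illustrative assumptions, not the actual API of `src/q_learning.py`:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration; the repo's WindyGridWorld
# dimensions and action set may differ.
n_states, n_actions = 70, 4          # e.g. a 7x10 grid, 4 moves
alpha, gamma, epsilon = 0.1, 0.99, 0.1

Q = np.zeros((n_states, n_actions))

def select_action(s: int) -> int:
    """ε-greedy: random action with probability ε, else greedy w.r.t. Q."""
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(np.argmax(Q[s]))

def update(s: int, a: int, r: float, s_next: int, done: bool) -> None:
    """Off-policy TD(0): move Q(s,a) toward r + γ max_a' Q(s',a')."""
    target = r + (0.0 if done else gamma * np.max(Q[s_next]))
    Q[s, a] += alpha * (target - Q[s, a])

# One illustrative transition: from state 0, take an action, observe reward -1.
a = select_action(0)
update(s=0, a=a, r=-1.0, s_next=1, done=False)
```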
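Likewise, a sketch of the two DQN stabilizers named above: a frozen target network supplies the bootstrap target, and the batch is assumed to come from a replay buffer. The network shape, batch layout, and loss choice here are assumptions, not the implementation in `src/dqn.py`:

```python
import torch
import torch.nn as nn

# Hypothetical dimensions; the actual network in src/dqn.py may differ.
obs_dim, n_actions, gamma = 2, 4, 0.99

def make_net() -> nn.Module:
    return nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                         nn.Linear(64, n_actions))

online, target = make_net(), make_net()
target.load_state_dict(online.state_dict())  # periodic hard sync

def td_loss(batch):
    """Huber loss between Q(s,a) and the frozen target r + γ max_a' Q_target(s',a')."""
    s, a, r, s_next, done = batch  # tensors assumed sampled from the replay buffer
    q_sa = online(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_next = target(s_next).max(dim=1).values
        y = r + gamma * (1.0 - done) * q_next
    return nn.functional.smooth_l1_loss(q_sa, y)

# Example with a dummy batch of 32 transitions:
B = 32
batch = (torch.randn(B, obs_dim), torch.randint(n_actions, (B,)),
         torch.randn(B), torch.randn(B, obs_dim), torch.zeros(B))
td_loss(batch).backward()
```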
A fully executed notebook with all outputs is available as a PDF: `notebooks/rl_study.pdf`
```bash
# Install
pip install torch --index-url https://download.pytorch.org/whl/cpu
pip install gymnasium
pip install -e ".[dev]"

# Run tests
python -m pytest tests/ -v

# Open the study notebook
jupyter notebook notebooks/rl_study.ipynb
```

- Python ≥ 3.10
- gymnasium ≥ 0.29
- PyTorch ≥ 2.0
- numpy, matplotlib, pandas
Chris Schmidt — MS Applied Mathematics | AI Engineering MSE (JHU)
MIT