rl-environment-study

A from-scratch study of tabular Q-Learning and DQN on a custom Gymnasium environment, exploring how ε-greedy exploration, reward shaping, and wind dynamics affect learned policies.

Highlights

| Agent      | Final Avg Reward (20-ep) | Steps to Goal | Key Behavior                                 |
|------------|--------------------------|---------------|----------------------------------------------|
| Q-Learning | ~ -15                    | ~15           | Learns wind-compensating policy              |
| DQN        | ~ -17                    | ~17           | Comparable via neural function approximation |
| Random     | ~ -200                   | timeout       | No-learning baseline                         |

Project Structure

rl-environment-study/
├── src/
│   ├── environments.py     # WindyGridWorld (custom Gymnasium env)
│   ├── q_learning.py       # Tabular Q-Learning with ε-greedy
│   ├── dqn.py              # DQN with replay buffer + target network
│   └── visualization.py    # Reward curves, Q-table heatmaps, policy arrows
├── notebooks/
│   └── rl_study.ipynb      # Full study notebook with 9 sections
├── tests/
│   ├── test_environments.py # 10 environment tests
│   └── test_agents.py      # 14 agent tests (Q-Learning + DQN)
├── evidence/               # Exported PNG evidence from notebook runs
└── pyproject.toml

Key Concepts

  • Windy GridWorld: an N×M grid with column-dependent upward wind. The agent must learn to compensate for the wind to reach the goal efficiently
  • Q-Learning: off-policy TD control with the update Q(s,a) ← Q(s,a) + α[r + γ max_a' Q(s',a') − Q(s,a)]
  • ε-greedy exploration: balances exploration against exploitation with a decaying ε
  • DQN: neural function approximation with experience replay and a target network for stabilization
  • Bellman optimality: the Q-table converges to Q* when every state-action pair is visited infinitely often and the step sizes decay appropriately
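The tabular update above can be sketched in a few lines. This is a minimal stand-alone illustration, not the repository's `q_learning.py`: the toy state space, the hyperparameter values, and the dict-based Q-table are assumptions for the example.

```python
# Minimal sketch of epsilon-greedy tabular Q-learning (illustrative only).
import random

random.seed(0)

n_states, n_actions = 4, 2
Q = {(s, a): 0.0 for s in range(n_states) for a in range(n_actions)}
alpha, gamma, epsilon = 0.1, 0.99, 0.2  # example hyperparameters

def epsilon_greedy(state):
    """Explore with probability epsilon, otherwise exploit argmax_a Q(s, a)."""
    if random.random() < epsilon:
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: Q[(state, a)])

def td_update(s, a, r, s_next):
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = r + gamma * max(Q[(s_next, b)] for b in range(n_actions))
    Q[(s, a)] += alpha * (target - Q[(s, a)])

# One illustrative transition with the classic -1-per-step reward:
td_update(s=0, a=1, r=-1.0, s_next=1)
print(Q[(0, 1)])  # -0.1 after one update from a zero-initialized table
```

With the Q-table zero-initialized, the TD target is -1, so the first update moves Q(0,1) to α·(-1) = -0.1.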

PDF Report

A fully executed notebook with all outputs is available as a PDF:
notebooks/rl_study.pdf

Quick Start

# Install
pip install torch --index-url https://download.pytorch.org/whl/cpu
pip install gymnasium
pip install -e ".[dev]"

# Run tests
python -m pytest tests/ -v

# Open the study notebook
jupyter notebook notebooks/rl_study.ipynb
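The notebook's training loops presumably follow the standard Gymnasium interaction pattern. The sketch below shows that five-tuple `step`/`reset` loop with a tiny stand-in class; `ToyEnv`, its dynamics, and the fixed action are assumptions for illustration, not the repository's `WindyGridWorld`.

```python
# Stand-in environment following the Gymnasium >= 0.26 API shape:
# reset() -> (obs, info); step(a) -> (obs, reward, terminated, truncated, info).
class ToyEnv:
    def reset(self, seed=None):
        self.pos = 0
        return self.pos, {}

    def step(self, action):
        self.pos += 1                # pretend every action moves toward the goal
        terminated = self.pos >= 3   # goal reached after three steps
        return self.pos, -1.0, terminated, False, {}

env = ToyEnv()
obs, info = env.reset(seed=0)
total_reward, terminated, truncated = 0.0, False, False
while not (terminated or truncated):
    action = 0  # a real agent would choose epsilon-greedily here
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
print(total_reward)  # -3.0: reward of -1 per step for three steps
```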

Requirements

  • Python ≥ 3.10
  • gymnasium ≥ 0.29
  • PyTorch ≥ 2.0
  • numpy, matplotlib, pandas

Author

Chris Schmidt — MS Applied Mathematics | AI Engineering MSE (JHU)

License

MIT
