Reinforcement Learning Explained — Companion Code

This repository hosts chapter-wise companion code for the book Reinforcement Learning Explained.
It provides clean, minimal, and well-tested implementations of key reinforcement learning concepts.


📊 Chapter Progress

| Chapter | Title | Status | Notes |
|---------|-------|--------|-------|
| 1 | Introduction | ✅ Complete | Book only (no code needed) |
| 2 | The RL Problem Formulation | ✅ Complete | GridWorld, evaluation, policies, examples |
| 3 | Multi-Armed Bandits | ✅ Complete | Bandit envs, ε-greedy, UCB, Thompson |
| 4 | Dynamic Programming Approaches | ✅ Complete | Policy Iteration, Value Iteration |
| 5 | Monte Carlo Methods | ✅ Complete | Prediction, Control, On/Off-Policy |
| 6 | Temporal-Difference Learning | ✅ Complete | TD(0), n-step TD, prediction examples |
| 7 | TD Control | ✅ Complete | SARSA, Q-learning, Cliff-Walking, exploration |
| 8 | Eligibility Traces and TD(λ) | ✅ Complete | TD(λ), SARSA(λ), True Online TD(λ), gridworld demos |
| 9 | Model-Based RL and Planning | ✅ Complete | Dyna-Q, planning with rollouts, gridworld demos |
| 10 | Function Approximation Basics | ✅ Complete | Linear approx, tile coding, TD(0), SARSA, Mountain Car |
| 11 | Policy Gradient Fundamentals | ✅ Complete | REINFORCE, baselines, softmax & Gaussian policies |
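
To give a flavor of the code style, here is a minimal ε-greedy bandit sketch in the spirit of Chapter 3. It is illustrative only: the function name, signature, and defaults are assumptions, not the repository's actual API.

```python
# Minimal epsilon-greedy bandit sketch (illustrative; not the repo's actual API).
import numpy as np

def epsilon_greedy_bandit(true_means, epsilon=0.1, steps=1000, seed=0):
    """Run epsilon-greedy on a stationary Gaussian bandit; return value estimates."""
    rng = np.random.default_rng(seed)
    k = len(true_means)
    q = np.zeros(k)        # incremental action-value estimates
    counts = np.zeros(k)   # number of pulls per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            a = int(rng.integers(k))   # explore: pick a random arm
        else:
            a = int(np.argmax(q))      # exploit: pick the greedy arm
        reward = rng.normal(true_means[a], 1.0)
        counts[a] += 1
        q[a] += (reward - q[a]) / counts[a]  # sample-average update
    return q

print(epsilon_greedy_bandit([0.2, 0.5, 0.8]))  # estimates should rank arm 2 highest
```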

📂 Repository Structure

```
rl-fundamentals-code/
├─ ch2_rl_formulation/             # Chapter 2
├─ ch3_multi_armed_bandits/        # Chapter 3
├─ ch4_dynamic_programming/        # Chapter 4
├─ ch5_monte_carlo/                # Chapter 5
├─ ch6_td_learning/                # Chapter 6
├─ ch7_td_control/                 # Chapter 7
├─ ch8_td_lambda/                  # Chapter 8
├─ ch9_model_based_planning/       # Chapter 9
├─ ch10_function_approx/           # Chapter 10
├─ ch11_policy_gradient/           # Chapter 11
├─ utils/
└─ .github/workflows/
```

✅ Running Tests

To run all tests:

```bash
python -m pytest -q
```
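
To run the tests for a single chapter, point pytest at that chapter's directory (the path below is one example taken from the tree above):

```bash
python -m pytest ch3_multi_armed_bandits -q
```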

⚙️ Continuous Integration

- GitHub Actions workflows (`.github/workflows/*.yml`) automatically run the tests for each chapter on every push and pull request.
- This ensures correctness and reproducibility of the examples.
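
The workflow files themselves live in `.github/workflows/`. As a rough sketch of what such a workflow typically looks like (the file contents below, including the Python version and dependencies, are illustrative, not the repository's actual configuration):

```yaml
# Illustrative sketch of a CI test workflow; not the repository's actual file.
name: tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4          # fetch the repository
      - uses: actions/setup-python@v5      # install a Python toolchain
        with:
          python-version: "3.11"
      - run: pip install numpy pytest      # illustrative dependencies
      - run: python -m pytest -q           # run the full test suite
```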

📚 How to Cite

If you use this code or the accompanying book in your research or teaching, please cite:

Book (forthcoming):

```bibtex
@book{baride2025rlexplained,
  author    = {Srikanth Baride and Rodrigue Rizk and K. C. Santosh},
  title     = {Reinforcement Learning Explained},
  publisher = {CRC Press | Taylor \& Francis Group},
  year      = {2025},
  isbn      = {9781041252993},
  note      = {Accepted for publication; preprint available at \url{https://github.com/srikanthbaride/rl-explained-preprint}}
}
```

Companion Code (GitHub):

```bibtex
@misc{baride2025rlcode,
  author       = {Srikanth Baride},
  title        = {Reinforcement Learning Explained — Companion Code},
  year         = {2025},
  howpublished = {\url{https://github.com/srikanthbaride/Reinforcement-Learning-Explained-Code}},
  note         = {Accessed: YYYY-MM-DD}
}
```

📖 License

MIT License © 2025 — Srikanth Baride