A collection of hands-on RL projects built alongside my Reinforcement Learning From Scratch series on Medium. Each project is a self-contained implementation that accompanies an article — from tabular Q-Learning all the way to an LLM agent trained with RL to use tools.
| # | Project | Algorithm | Concepts |
|---|---|---|---|
| 01 | Grid World Navigator | Q-Learning | Custom Gym env, reward shaping, epsilon-greedy |
| 02 | Blackjack Strategy Learner | Monte Carlo | First-visit MC, model-free RL, value function |
| 03 | CliffWalking: TD vs MC | TD(0) vs MC | Temporal difference, bias-variance tradeoff |
| 04 | CliffWalking: SARSA vs Q-Learning | SARSA, Q-Learning | On-policy vs off-policy, path visualization |
| 05 | LunarLander with DQN | DQN | Deep RL, replay buffer, target network |
| 06 | MountainCar with A2C | A2C | Actor-Critic, continuous action space |
| 07 | BipedalWalker with PPO | PPO | Clip ratio, surrogate objective |
| 08 | LLM Tool-Use Agent | PPO + LLM | Agentic AI, tool selection, custom Gym env |
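
To make the tabular rows above (projects 01 and 04) concrete, here is a minimal sketch of epsilon-greedy action selection and of the one-line difference between the Q-Learning and SARSA targets. It is not taken from the project folders: the 5-state corridor, the `step` function, and every function name and hyperparameter below are hypothetical and chosen only for illustration.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """With probability epsilon pick a random action, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties randomly so an all-zero table still explores.
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def q_learning_target(reward, next_q_row, gamma, done):
    """Off-policy target: bootstrap from the best next action."""
    return reward if done else reward + gamma * max(next_q_row)


def sarsa_target(reward, next_q_row, next_action, gamma, done):
    """On-policy target: bootstrap from the action actually chosen next."""
    return reward if done else reward + gamma * next_q_row[next_action]


def td_update(q, state, action, target, alpha):
    """Move Q(state, action) a fraction alpha toward the target."""
    q[state][action] += alpha * (target - q[state][action])


def step(state, action, n_states=5):
    """Hypothetical 1-D corridor: action 0 moves left, action 1 moves right.

    Reaching the last state ends the episode with reward 1.
    """
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done


def train_q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run tabular Q-Learning on the toy corridor and return the Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(5)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = epsilon_greedy(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            target = q_learning_target(reward, q[next_state], gamma, done)
            td_update(q, state, action, target, alpha)
            state = next_state
    return q


if __name__ == "__main__":
    q_table = train_q_learning()
    greedy_policy = [row.index(max(row)) for row in q_table[:-1]]
    print("Greedy action per non-terminal state:", greedy_policy)
```

Swapping `q_learning_target` for `sarsa_target` (and choosing `next_action` with `epsilon_greedy` before the update) turns the same loop into SARSA, which is the on-policy versus off-policy contrast listed for project 04.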
Each folder is fully self-contained. Install that project's dependencies and run it independently.

    git clone https://github.com/yourusername/reinforcement-learning-projects.git
    cd reinforcement-learning-projects

    # Go into any project
    cd project-01-gridworld
    pip install -r requirements.txt
    python train.py

Python 3.10+ recommended.

- Gymnasium — RL environments
- Stable-Baselines3 — PPO, A2C implementations
- PyTorch — Neural networks for deep RL projects
- NumPy & Matplotlib — Computation and visualization
Ebad Sayed — Final year, IIT (ISM) Dhanbad, Co-founder of Voke AI