# Welcome!

Welcome to **Deep Reinforcement Learning: Zero to Hero!** 🚀 This is a hands-on course designed to
take you through the fascinating world of Deep Reinforcement Learning (DRL).

In this series, you will write some of the most popular recent _model-free_ RL algorithms from scratch.
We'll start with the fundamentals, replicating the
[2013 breakthrough](https://arxiv.org/abs/1312.5602) that taught an agent to play Atari games from
pixels. Then, we'll move on to landing a lunar module on the Moon using PPO, a state-of-the-art
algorithm recently applied to fine-tune large language models (LLMs) such as ChatGPT and Google
Gemini.

<div style="width: 50%">
  <img src="assets/00_Intro_DQN_Atari.png">
</div>

Later in the course, we'll dive into more advanced and cutting-edge topics to build even more
capable and intelligent agents, including exploration with random network distillation (which we'll
use to solve `MiniGrid-DoorKey-16x16` with no memory), multi-agent RL, AlphaZero and Monte Carlo
Tree Search, model-based RL, RLHF, and more!

**NOTE**: The algorithms implemented in these lectures are optimized for _learning_. They retain the
spirit and key details of the original algorithms but avoid unnecessary code complexity (e.g.,
manually moving tensors between GPU and CPU) to keep the focus on the core concepts.

## Why is Reinforcement Learning So Exciting?

Deep Reinforcement Learning can feel less accessible than other areas of AI, but it's also one of
the most intriguing. This is because it tackles a fundamentally different and more autonomous
learning problem: **how can an agent learn to make optimal decisions by interacting with an
environment?**

Unlike other machine learning paradigms, the agent isn't given a dataset of "correct" answers.
Instead, it learns through trial and error, guided only by a **reward signal** that is often sparse. This is much
closer to how humans and animals learn. This unique, interactive learning process is what allows RL
agents to achieve superhuman performance in complex games like Go and to solve challenging control
problems like robotic manipulation.
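
To make the trial-and-error loop concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the `Corridor` environment, its `reset`/`step` methods, and the hyperparameters are hypothetical, and the learning rule is simple tabular Q-learning rather than the deep RL algorithms built in this course. The agent is rewarded only when it reaches the rightmost cell, so it has to discover the goal by exploring.

```python
import random


class Corridor:
    """Hypothetical toy environment: start at cell 0, the goal is the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # Action 0 moves left, action 1 moves right (clamped to the corridor).
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0  # sparse: reward only at the goal
        return self.position, reward, done


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative values)."""
    rng = random.Random(seed)
    env = Corridor()
    q = [[0.0, 0.0] for _ in range(env.length)]  # q[state][action]
    for _ in range(episodes):
        state, done, steps = env.reset(), False, 0
        while not done and steps < 100:
            # Explore with probability epsilon, or when the agent has no preference yet.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = env.step(action)
            # Move the estimate toward reward + discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            steps += 1
    return q


q_table = train()
# Greedy action in every non-terminal cell (1 means "move right").
greedy_policy = [0 if left > right else 1 for left, right in q_table[:-1]]
print(greedy_policy)
```

Nobody tells the agent that "right" is the correct answer. It only ever sees a reward of 1 at the goal, yet after training the learned values favor moving right in every cell.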

<div style="width:30%">
  <img src="assets/00_Intro_robot.png">
  <br>
  <small>From <a href="https://openai.com/research/solving-rubiks-cube">OpenAI</a></small>
</div>

### Where Does RL Fit in AI?

AI visionary and Turing Award winner Yann LeCun proposed a powerful analogy to contextualize
different types of learning in AI, which has come to be known as "LeCun's Cake" or "LeCake."

He described the landscape of machine learning as a cake:

- **The Cake**: **Unsupervised (or Self-Supervised) Learning** is the bulk of the cake. It
  represents the vast majority of learning, where models learn the underlying structure of data
  without explicit labels.
- **The Icing**: **Supervised Learning (and Fine-Tuning)** is the icing on the cake. It's a thinner
  layer because it relies on labeled data, which is far less abundant than unlabeled data.
- **The Cherry**: **Reinforcement Learning** is the cherry on top. The reward signal in RL provides
  very little information compared to the rich, high-bandwidth data used in supervised and
  unsupervised learning. As LeCun puts it, the machine gets only "a few bits for some samples."

<br>
<div style="width: 25%">
  <img src="assets/00_Intro_lecake.png">
</div>
<br>

This analogy highlights both the power and the challenge of RL. While the learning signal is sparse,
the ability to learn from it is the key to creating truly intelligent, autonomous agents that can
navigate the complexities of the real world. This course aims to give you the tools to build that
"cherry."

## Prerequisites

To make the most out of this course, you should have the following:

- **Proficiency in Python**: You should be comfortable with Python programming, including concepts
  like classes, data structures (lists, dictionaries), and general control flow.
- **Familiarity with Deep Learning & PyTorch**: You should understand the fundamentals of deep
  learning, such as neural networks, activation functions, and gradient descent. This course uses
  **PyTorch**, so prior experience with it is highly beneficial.
- **Mathematical Foundations (for theory)**: To fully grasp the theory behind the algorithms (which
  is optional but recommended), a solid understanding of basic **calculus** (derivatives), **linear
  algebra** (vectors, matrices), **statistics**, and **probability theory** is required.

<div>
  <div style="width: 10%;display:inline-block;background-color:white;">
    <img src="assets/00_Intro_nn.svg">
  </div>
  <div style="margin-left:50px;width:10%;margin-top:20px;display:inline-block;vertical-align:center">
    <img src="assets/00_Intro_python.png">
  </div>
</div>

Don't worry if you need a refresher! There are fantastic resources available. For Python, consider
the [official tutorial](https://docs.python.org/3/tutorial/appetite.html) or
[W3Schools](https://www.w3schools.com/python/). For deep learning, a great resource is the
[Deep Learning Specialization](https://www.deeplearning.ai/courses/deep-learning-specialization/)
from DeepLearning.AI.

## Credits

This series wouldn't be possible without the many resources I learned from and actively consulted
while writing it. In fact, there is very little novelty in this repository! My only hope is that
this step-by-step guide, which builds up algorithms and theory from first principles, will help
more people approach Deep RL in a practical way. Below you'll find all the resources that
contributed to this material.

## Resources

[**Reinforcement Learning: An Introduction (2nd Edition) 2020**](http://incompleteideas.net/book/the-book-2nd.html):
The bible of Reinforcement Learning. Everybody should read this book to grasp the foundations.

**[Foundations of Deep RL Lectures](https://www.youtube.com/playlist?list=PLwRJQ4m4UJjNymuBM9RdmB3Z9N5-0IlY0)
by Pieter Abbeel**: A comprehensive and theoretically grounded introduction to Deep
Reinforcement Learning by one of the leading experts in the field. Parts of this course explicitly
reference these lectures.

**[Neural Networks: Zero to Hero](https://www.youtube.com/watch?v=VMj-3S1tku0&list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ)
by Andrej Karpathy**: An introduction to deep neural networks from their foundations to advanced
topics, starting from first principles the way only Andrej (one of my heroes!) can! I
shamelessly copied the title!

**[Deep Reinforcement Learning Udacity Nanodegree](https://www.udacity.com/course/deep-reinforcement-learning-nanodegree--nd893)**:
A Udacity Nanodegree that I took, which helped me significantly in gaining hands-on experience with
Deep RL. Some of the algorithms in this course are adaptations of algorithms I wrote for the exams
of that Nanodegree (and I'll link them on GitHub).

**[OpenAI Spinning Up](https://spinningup.openai.com/en/latest/index.html)**: An educational
resource produced by OpenAI that makes it easier to learn about deep reinforcement learning.

**[Hugging Face Deep RL Course](https://huggingface.co/learn/deep-rl-course/en/unit0/introduction)**:
Another incredibly easy-to-digest and pedagogical online resource about Deep RL! Thanks, Hugging Face!

**[CS-285 Berkeley](https://rail.eecs.berkeley.edu/deeprlcourse-fa22/)**: UC Berkeley's course on
Deep RL.

[**CleanRL**](https://docs.cleanrl.dev/): Single-file implementations of most Deep RL algorithms,
with clean, benchmarked code.

[**Deep RL Doesn't Work Yet**](https://www.alexirpan.com/2018/02/14/rl-hard.html): An incredible
blog post about the challenges that Deep Reinforcement Learning faces. A must-read.

[**Stable-Baselines3**](https://stable-baselines3.readthedocs.io/): Reliable implementations of RL
algorithms in PyTorch.
