# Next - A Bridge to Advanced Topics

Congratulations on making it through the foundations of Deep Reinforcement Learning! You've built
some of the most influential algorithms from scratch, from DQN to PPO, and have developed a solid
base for tackling even more complex challenges. The journey so far has equipped you with the core
principles needed to understand the current landscape of RL.

<br>
<div style="width:20%; margin:auto;">
  <img src="assets/07_Next_robot.png">
</div>
<br>

Now, you stand at a bridge to the frontiers of the field. The following chapters delve into
specialized and advanced topics. Unlike the foundational section, **there is no required order
here**. Feel free to jump into the topics that excite you the most. Each notebook is a
self-contained exploration of a fascinating subfield.


## What's Next? Your Path Forward

Here’s a look at the advanced topics waiting for you. Pick your passion and dive in!

- **Exploration and Curiosity (`08_EXPL.ipynb`)**: What happens when rewards are rare? This notebook
  tackles the challenge of sparse rewards with a curiosity-style _intrinsic reward_. You'll use
  **Random Network Distillation (RND)** to reward the agent for visiting novel states, allowing it
  to solve hard-exploration tasks such as navigating MiniGrid environments.

- **Multi-Agent Reinforcement Learning (`09_MARL.ipynb`)**: Move beyond single-agent scenarios and
  into a world of cooperation and competition. You'll explore the unique challenges of MARL, such as
  non-stationarity, and implement the **Multi-Agent DDPG (MADDPG)** algorithm using the paradigm of
  Centralized Training with Decentralized Execution (CTDE).

- **Imitation Learning (`10_IL.ipynb`)**: Sometimes, the easiest way to teach an agent is to show it
  how an expert behaves. In this notebook, you’ll learn how to train an agent by imitating expert
  demonstrations. You'll implement **Behavioral Cloning (BC)** by framing it as a supervised
  learning problem and examine its core challenges, such as covariate shift.

- **Monte Carlo Tree Search & AlphaZero (`11_MCTS.ipynb`)**: Uncover the secrets behind the
  algorithms that mastered Go and chess. You’ll start by building a **Monte Carlo Tree Search
  (MCTS)** agent from scratch to play classic games. Then, you'll level up by implementing the
  legendary **AlphaZero** algorithm, which combines MCTS with deep neural networks and self-play.

- **Productionizing RL (`12_PROD.ipynb`)**: Bridge the gap between theory and real-world
  application. This notebook equips you with the essential tools for making your agents
  production-ready. You’ll learn about experiment tracking with **TensorBoard**, parallelization
  with **Ray**, and automated hyperparameter tuning with **Optuna**.

- **Model-Based Reinforcement Learning (`13_MBRL.ipynb`)**: Why rely only on real experience when
  you can also learn from simulated rollouts? This notebook introduces you to **Model-Based RL**,
  where the agent learns a model of the environment's dynamics and uses it to generate additional
  training data. You'll build a version of the **Model-Based Policy Optimization (MBPO)** algorithm
  to see how a learned dynamics model can substantially improve sample efficiency.

- **Reinforcement Learning from Human Feedback (`14_RLHF.ipynb`)**: Step into the world of Large
  Language Models! This notebook demystifies the process of **RLHF**, the technique used to align
  models like ChatGPT and Gemini with human values. You'll implement a simplified RLHF pipeline to
  fine-tune a small LLM, guiding it to generate responses preferred by humans.
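
To make the novelty idea behind RND (the first topic in the list above) a little more concrete,
here is a minimal sketch. It is not the notebook's implementation: it assumes linear "networks"
written in pure Python for brevity, whereas a real RND setup would use nonlinear neural networks.
The class `RNDBonus` and its parameters `feat_dim`, `lr`, and `seed` are hypothetical names chosen
for illustration.

```python
import random


def make_linear(in_dim, out_dim, rng):
    """Random weight matrix (out_dim x in_dim) standing in for a network."""
    return [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def forward(weights, obs):
    """Linear map from an observation to a feature vector."""
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]


class RNDBonus:
    """Toy RND: the intrinsic reward is the predictor's error on a fixed random target."""

    def __init__(self, obs_dim, feat_dim=8, lr=0.05, seed=0):
        rng = random.Random(seed)
        self.target = make_linear(obs_dim, feat_dim, rng)     # fixed, never trained
        self.predictor = make_linear(obs_dim, feat_dim, rng)  # trained to match target
        self.lr = lr

    def bonus(self, obs):
        """Mean squared prediction error; large for observations rarely trained on."""
        t = forward(self.target, obs)
        p = forward(self.predictor, obs)
        return sum((pi - ti) ** 2 for pi, ti in zip(p, t)) / len(t)

    def update(self, obs):
        """One SGD step on 0.5 * squared error, applied to the predictor only."""
        t = forward(self.target, obs)
        p = forward(self.predictor, obs)
        for j, row in enumerate(self.predictor):
            err = p[j] - t[j]
            for i in range(len(row)):
                row[i] -= self.lr * err * obs[i]


rnd = RNDBonus(obs_dim=4)
familiar = [1.0, 0.0, 0.0, 0.0]  # an observation the agent keeps seeing
novel = [0.0, 0.0, 0.0, 1.0]     # an observation it has never trained on

for _ in range(200):
    rnd.update(familiar)

print("familiar bonus:", rnd.bonus(familiar))
print("novel bonus:   ", rnd.bonus(novel))
```

After training on the familiar observation, its bonus shrinks toward zero while the novel
observation keeps a much larger one. In an RL loop, a bonus like this would typically be added to
the environment reward to steer the agent toward unvisited states.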

---

Each of these notebooks offers a unique perspective on the power and breadth of Deep Reinforcement
Learning. They represent active areas of research and application that are shaping the future of AI.
Choose your adventure, and continue your journey from zero to hero. Happy learning!
