<a href="https://colab.research.google.com/github/suresh-venkate/Deep_Reinforcement_Learning/blob/main/Huggingface_DeepRL_Course/Deep_RL_Algorithms/DQN/DQN_Unit_3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# DQN - Atari - HFC_Unit 3

# Sample Illustration

In [None]:
# %%html
# <video controls autoplay><source src="https://huggingface.co/ThomasSimonini/ppo-SpaceInvadersNoFrameskip-v4/resolve/main/replay.mp4" type="video/mp4"></video>

# Preliminaries

## Create a virtual display

During the notebook, we'll need to generate a replay video. To do so in Colab, **we need a virtual screen to render the environment** (and thus record the frames). The following cells install the required libraries, then create and start a virtual screen.

In [None]:
%%capture
!apt install python-opengl
!apt install ffmpeg
!apt install xvfb
!pip3 install pyvirtualdisplay

In [None]:
# Additional dependencies for RL Baselines3 Zoo
!apt-get install swig cmake freeglut3-dev 

In [None]:
!pip install pyglet==1.5.1

In [None]:
# Virtual display
from pyvirtualdisplay import Display

virtual_display = Display(visible=0, size=(1400, 900))
virtual_display.start()

## Clone RL-Baselines3 Zoo Repo
You can now install directly from the Python package with `pip install rl_zoo3`, but since we want **the full installation with extra environments and dependencies**, we're going to clone the `RL-Baselines3-Zoo` repository and install from source.

In [None]:
!git clone https://github.com/DLR-RM/rl-baselines3-zoo

## Install dependencies
We can now install the dependencies RL-Baselines3 Zoo needs (this can take about 5 minutes).

In [None]:
%cd /content/rl-baselines3-zoo/
!git checkout 61a0d1349fa98ff3ee371c70cb4f1f45ec29f5b0

In [None]:
!pip install setuptools==65.5.0
!pip install -r requirements.txt
# Since colab uses Python 3.9 we need to add this installation
!pip install gym[atari,accept-rom-license]==0.21.0

# Train the agent to Play Space Invaders 👾

To train an agent with RL-Baselines3-Zoo, we just need to do two things:
1. We define the hyperparameters in `/content/rl-baselines3-zoo/hyperparams/dqn.yml`

<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/notebooks/unit3/hyperparameters.png" alt="DQN Hyperparameters">


2. We run `train.py` and save the models in the `logs` folder 📁

In [None]:
!python train.py --algo dqn  --env SpaceInvadersNoFrameskip-v4 -f logs/
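
Among the DQN hyperparameters, the exploration settings decide how quickly the agent moves from random actions to greedy ones. The sketch below illustrates a linear epsilon decay of the kind Stable-Baselines3's DQN describes for `exploration_fraction` and `exploration_final_eps`. The function name `linear_epsilon` and the numeric values are assumptions chosen for illustration; they are not read from `dqn.yml`, so check the file for the values actually used.

```python
def linear_epsilon(step, total_timesteps, exploration_fraction=0.1,
                   initial_eps=1.0, final_eps=0.01):
    """Return the exploration rate at a given training step.

    Epsilon decays linearly from initial_eps to final_eps over the first
    `exploration_fraction` of training, then stays at final_eps.
    All default values here are illustrative assumptions.
    """
    progress = step / total_timesteps
    if progress >= exploration_fraction:
        return final_eps
    return initial_eps + progress * (final_eps - initial_eps) / exploration_fraction


# Hypothetical training budget, for illustration only
TOTAL_TIMESTEPS = 1_000_000
for step in (0, 50_000, 100_000, 500_000):
    print(step, round(linear_epsilon(step, TOTAL_TIMESTEPS), 3))
```

With these assumed values, epsilon is halfway through its decay after 5% of training and stays at its final value for the remaining 90%, which is why a short `exploration_fraction` makes the agent mostly greedy for the bulk of the run.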

# Evaluate our agent 👀
- RL-Baselines3-Zoo provides `enjoy.py`, a python script to evaluate our agent. In most RL libraries, we call the evaluation script `enjoy.py`.
- Let's evaluate it for 5000 timesteps 🔥

In [None]:
!python enjoy.py  --algo dqn  --env SpaceInvadersNoFrameskip-v4  --no-render  --n-timesteps 5000  --folder logs/

# Publish the trained model on HF Hub

By using `rl_zoo3.push_to_hub`, **we evaluate the agent, record a replay, generate a model card, and push everything to the Hub**.


Three steps required for this:

1) Sign in to your HF account, then store the authentication token from the Hugging Face website.
- Create a new token (https://huggingface.co/settings/tokens) **with write role**
- Copy the token
- Run the cell below and paste the token


In [None]:
from huggingface_hub import notebook_login # To log to our Hugging Face account to be able to upload models to the Hub.
notebook_login()
!git config --global credential.helper store

In [None]:
!python -m rl_zoo3.push_to_hub  --algo dqn  --env SpaceInvadersNoFrameskip-v4  --repo-name dqn-SpaceInvadersNoFrameskip-v4  -orga svenkate  -f logs/

# Yet to be reviewed

## Load a powerful trained model 🔥
- The Stable-Baselines3 team uploaded **more than 150 trained Deep Reinforcement Learning agents on the Hub**.

You can find them here: 👉 https://huggingface.co/sb3

Some examples:
- Asteroids: https://huggingface.co/sb3/dqn-AsteroidsNoFrameskip-v4
- Beam Rider: https://huggingface.co/sb3/dqn-BeamRiderNoFrameskip-v4
- Breakout: https://huggingface.co/sb3/dqn-BreakoutNoFrameskip-v4
- Road Runner: https://huggingface.co/sb3/dqn-RoadRunnerNoFrameskip-v4

Let's load an agent playing Beam Rider: https://huggingface.co/sb3/dqn-BeamRiderNoFrameskip-v4

In [None]:
%%html
<video controls autoplay><source src="https://huggingface.co/sb3/dqn-BeamRiderNoFrameskip-v4/resolve/main/replay.mp4" type="video/mp4"></video>

1. We download the model using `rl_zoo3.load_from_hub`, and place it in a new folder that we can call `rl_trained`

In [None]:
# Download the model and save it into the rl_trained/ folder
!python -m rl_zoo3.load_from_hub --algo dqn --env BeamRiderNoFrameskip-v4 -orga sb3 -f rl_trained/

2. Let's evaluate it for 5000 timesteps

In [None]:
!python enjoy.py --algo dqn --env BeamRiderNoFrameskip-v4 -n 5000  -f rl_trained/

Why not try to train your own **Deep Q-Learning agent playing BeamRiderNoFrameskip-v4? 🏆**

If you want to try, the model card at https://huggingface.co/sb3/dqn-BeamRiderNoFrameskip-v4#hyperparameters **lists the hyperparameters of the trained agent.**

But finding hyperparameters can be a daunting task. Fortunately, we'll see in the next unit how we can **use Optuna to optimize the hyperparameters 🔥.**


## Some additional challenges 🏆
The best way to learn **is to try things on your own**!

In the [Leaderboard](https://huggingface.co/spaces/huggingface-projects/Deep-Reinforcement-Learning-Leaderboard) you will find your agents. Can you get to the top?

Here's a list of environments you can try to train your agent with:
- BeamRiderNoFrameskip-v4
- BreakoutNoFrameskip-v4 
- EnduroNoFrameskip-v4
- PongNoFrameskip-v4

Also, **if you want to learn to implement Deep Q-Learning by yourself**, you should definitely look at the CleanRL implementation: https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/dqn_atari.py
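
As a starting point before reading a full implementation, here is a minimal sketch of the one-step target at the core of Deep Q-Learning: `r + gamma * max_a Q_target(s', a)`, with no bootstrapping on terminal transitions. It is not taken from the CleanRL file; the function name `td_targets` and the plain-list inputs are hypothetical, chosen so the example runs without any deep learning library.

```python
def td_targets(rewards, next_q_values, dones, gamma=0.99):
    """Compute one-step Q-learning targets for a batch of transitions.

    rewards:       list of floats, one per transition
    next_q_values: list of lists, target-network Q-values for each next state
    dones:         list of bools, True if the transition ended the episode
    """
    targets = []
    for reward, q_next, done in zip(rewards, next_q_values, dones):
        # Terminal transitions have no future value to bootstrap from
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(reward + bootstrap)
    return targets


# Two hypothetical transitions: one ongoing, one terminal
print(td_targets([1.0, 0.0], [[0.5, 2.0], [3.0, 1.0]], [False, True], gamma=0.9))
```

In a real agent the Q-values come from a separate target network that is refreshed periodically, and the online network is trained to regress its prediction for the taken action toward these targets.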

<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit4/atari-envs.gif" alt="Environments"/>