# Hands-on

- https://huggingface.co/learn/deep-rl-course/unit3/hands-on

### 🎮 Environments:

- [SpaceInvadersNoFrameskip-v4](https://gymnasium.farama.org/environments/atari/space_invaders/)

You can see the differences between the Space Invaders versions here 👉
- https://gymnasium.farama.org/environments/atari/space_invaders/#variants
- https://ale.farama.org/environments/space_invaders/

### 📚 RL-Library:

- [RL-Baselines3-Zoo](https://github.com/DLR-RM/rl-baselines3-zoo)

```
pip install git+https://github.com/DLR-RM/rl-baselines3-zoo
```

## Train our Deep Q-Learning Agent to Play Space Invaders 👾

To train an agent with RL-Baselines3-Zoo, we just need to do two things:

1. Create a hyperparameter config file called `dqn.yml` that contains our training hyperparameters.

This is a template example:

```
SpaceInvadersNoFrameskip-v4:
  env_wrapper:
    - stable_baselines3.common.atari_wrappers.AtariWrapper
  frame_stack: 4
  policy: 'CnnPolicy'
  n_timesteps: !!float 1e6
  buffer_size: 100000
  learning_rate: !!float 1e-4
  batch_size: 32
  learning_starts: 100000
  target_update_interval: 1000
  train_freq: 4
  gradient_steps: 1
  exploration_fraction: 0.1
  exploration_final_eps: 0.01
  # If True, you need to deactivate handle_timeout_termination
  # in the replay_buffer_kwargs
  optimize_memory_usage: False
```

Here we see that:
- We use the `AtariWrapper`, which preprocesses the input (frame skipping, grayscale conversion, and resizing); `frame_stack: 4` then stacks 4 frames.
- We use `CnnPolicy`, since convolutional layers process the frames.
- We train for 1 million timesteps (`n_timesteps`).
- The replay memory (experience replay) holds 100,000 transitions (`buffer_size`), i.e. the number of past experience steps kept so the agent can train on them again.
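
To make the exploration settings in the config concrete, here is a minimal sketch of a linear epsilon schedule. It assumes epsilon starts at 1.0 (the config above does not set a starting value, so this is an assumption) and decays linearly to `exploration_final_eps` over the first `exploration_fraction` of training. The function name `epsilon_at` is hypothetical and only used for illustration.

```python
def epsilon_at(
    step: int,
    total_timesteps: int = 1_000_000,   # n_timesteps in dqn.yml
    exploration_fraction: float = 0.1,  # exploration_fraction in dqn.yml
    initial_eps: float = 1.0,           # assumption: not set in dqn.yml
    final_eps: float = 0.01,            # exploration_final_eps in dqn.yml
) -> float:
    """Linearly decay epsilon, then hold it at final_eps."""
    decay_steps = exploration_fraction * total_timesteps
    if step >= decay_steps:
        return final_eps
    return initial_eps + (final_eps - initial_eps) * step / decay_steps


# Print epsilon at a few points of the 1M-step run.
for step in (0, 50_000, 100_000, 500_000):
    print(f"step {step:>7}: epsilon = {epsilon_at(step):.3f}")
```

With these values, epsilon reaches 0.01 after 100,000 steps, which is the same number as `learning_starts` in the config.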

For hyperparameter optimization, my advice is to focus on these 3 hyperparameters:
- `learning_rate`
- `buffer_size` (experience memory size)
- `batch_size`
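
Since `buffer_size` drives memory use, a rough back-of-the-envelope estimate can help when tuning it. The sketch below assumes 84x84 grayscale frames stored as one byte per pixel with 4 stacked frames per observation (the 84x84 size is an assumption, it is not stated in the config), and it ignores actions, rewards, and done flags. The helper `replay_buffer_bytes` is hypothetical.

```python
def replay_buffer_bytes(
    buffer_size: int = 100_000,     # buffer_size in dqn.yml
    frame_height: int = 84,         # assumption: preprocessed frame height
    frame_width: int = 84,          # assumption: preprocessed frame width
    frame_stack: int = 4,           # frame_stack in dqn.yml
    bytes_per_pixel: int = 1,       # assumption: uint8 pixels
    store_next_obs: bool = True,    # keep a separate copy of next observations
) -> int:
    """Estimate the bytes needed for the observations in the replay buffer."""
    obs_bytes = frame_height * frame_width * frame_stack * bytes_per_pixel
    copies = 2 if store_next_obs else 1
    return buffer_size * obs_bytes * copies


for store_next in (True, False):
    gigabytes = replay_buffer_bytes(store_next_obs=store_next) / 1e9
    print(f"store_next_obs={store_next}: about {gigabytes:.2f} GB")
```

Under these assumptions, storing both the observation and the next observation takes roughly 5.6 GB, and about half of that when next observations are not stored separately, which is the saving `optimize_memory_usage` is aimed at.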

As a good practice, **check the documentation to understand what each hyperparameter does**: https://stable-baselines3.readthedocs.io/en/master/modules/dqn.html#parameters

2. We start the training and save the models in the `logs` folder 📁

- Specify the algorithm with `--algo`, the folder where the model is saved with `-f`, and the hyperparameter config file with `-c`.

```
python -m rl_zoo3.train --algo dqn --env SpaceInvadersNoFrameskip-v4 -f logs/ -c dqn.yml
```
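
The evaluation log later in this section reports "Loading latest experiment, id=1" and a path of the form `logs/dqn/SpaceInvadersNoFrameskip-v4_1/`. As a small sketch of how one might locate that folder after training, the helper below picks the run directory with the highest numeric suffix. It assumes the `<log folder>/<algo>/<env>_<id>` layout seen in that log; the function `latest_run_dir` is hypothetical and not part of RL-Baselines3-Zoo.

```python
import re
from pathlib import Path
from typing import Optional


def latest_run_dir(log_root: str, algo: str, env_id: str) -> Optional[Path]:
    """Return the run folder with the highest numeric suffix, or None."""
    algo_dir = Path(log_root) / algo
    if not algo_dir.is_dir():
        return None
    # Run folders are assumed to be named "<env_id>_<integer id>".
    pattern = re.compile(re.escape(env_id) + r"_(\d+)")
    runs = []
    for child in algo_dir.iterdir():
        match = pattern.fullmatch(child.name)
        if child.is_dir() and match:
            runs.append((int(match.group(1)), child))
    if not runs:
        return None
    # Compare ids numerically so that run 10 ranks above run 2.
    return max(runs, key=lambda item: item[0])[1]


run = latest_run_dir("logs", "dqn", "SpaceInvadersNoFrameskip-v4")
print(run if run is not None else "No run found yet; train the agent first.")
```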

## Let's evaluate our agent 👀
- RL-Baselines3-Zoo provides `enjoy.py`, a Python script to evaluate our agent. In most RL libraries, the evaluation script is called `enjoy.py`.
- Let's evaluate it for 5000 timesteps 🔥

```
python -m rl_zoo3.enjoy --algo dqn --env SpaceInvadersNoFrameskip-v4 --no-render --n-timesteps 5000 --folder logs/
```

```
Loading latest experiment, id=1
Loading logs/dqn/SpaceInvadersNoFrameskip-v4_1/SpaceInvadersNoFrameskip-v4.zip
A.L.E: Arcade Learning Environment (version 0.10.1+unknown)
[Powered by Stella]
Stacking 4 frames
Atari Episode Score: 315.00
Atari Episode Length 2573
Atari Episode Score: 605.00
Atari Episode Length 4439
Atari Episode Score: 800.00
Atari Episode Length 4201
Atari Episode Score: 575.00
Atari Episode Length 4274
```
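
To summarize a run like the one above, the episode scores can be pulled out of the log text and averaged. This is a minimal sketch that assumes the log lines keep the `Atari Episode Score: <number>` format shown in the output; the function `summarize_scores` is hypothetical.

```python
import re
from statistics import mean
from typing import Tuple

# Score lines copied from the evaluation output above.
log_text = """
Atari Episode Score: 315.00
Atari Episode Length 2573
Atari Episode Score: 605.00
Atari Episode Length 4439
Atari Episode Score: 800.00
Atari Episode Length 4201
Atari Episode Score: 575.00
Atari Episode Length 4274
"""


def summarize_scores(text: str) -> Tuple[int, float, float]:
    """Return (number of episodes, mean score, best score) found in the log."""
    scores = [float(s) for s in re.findall(r"Atari Episode Score: ([0-9.]+)", text)]
    if not scores:
        raise ValueError("no 'Atari Episode Score' lines found")
    return len(scores), mean(scores), max(scores)


episodes, mean_score, best_score = summarize_scores(log_text)
print(f"{episodes} episodes, mean score {mean_score:.2f}, best score {best_score:.2f}")
```

For the four episodes logged here, the mean score is 573.75 and the best score is 800.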


### Record a Video

```
python -m rl_zoo3.record_video --algo dqn --env SpaceInvadersNoFrameskip-v4 -f logs/ -n 1000
```

```
Loading latest experiment, id=1
Loading logs/dqn/SpaceInvadersNoFrameskip-v4_1/SpaceInvadersNoFrameskip-v4.zip
A.L.E: Arcade Learning Environment (version 0.10.1+unknown)
[Powered by Stella]
Stacking 4 frames
Loading logs/dqn/SpaceInvadersNoFrameskip-v4_1/SpaceInvadersNoFrameskip-v4.zip
Wrapping the env in a VecTransposeImage.
Saving video to /home/mgj/wsl/notebooks/Huggingface/course_deep_RL/unit3_deep_Q_learning/logs/dqn/SpaceInvadersNoFrameskip-v4_1/videos/final-model-dqn-SpaceInvadersNoFrameskip-v4-step-0-to-step-1000.mp4
MoviePy - Building video /home/mgj/wsl/notebooks/Huggingface/course_deep_RL/unit3_deep_Q_learning/logs/dqn/SpaceInvadersNoFrameskip-v4_1/videos/final-model-dqn-SpaceInvadersNoFrameskip-v4-step-0-to-step-1000.mp4.
MoviePy - Writing video /home/mgj/wsl/notebooks/Huggingface/course_deep_RL/unit3_deep_Q_learning/logs/dqn/SpaceInvadersNoFrameskip-v4_1/videos/final-model-dqn-SpaceInvadersNoFrameskip-v4-step-0-to-step-1000.mp4

MoviePy - Done !
```