# Unit 5: An Introduction to ML-Agents



## Clone the repository and install the dependencies 🔽


In [None]:
%%capture
# Clone the repository
!git clone --depth 1 https://github.com/Unity-Technologies/ml-agents

In [None]:

# Go inside the repository and install the package
%cd ml-agents
!pip3 install -e ./ml-agents-envs
!pip3 install -e ./ml-agents

## Worm


In [None]:
# Create the training-envs-executables directory and its linux subdirectory
!mkdir ./training-envs-executables
!mkdir ./training-envs-executables/linux

I downloaded the environment archive from https://drive.google.com/file/d/1QceC2ruHIsL11-YXMlLZTr7rRDH-4EpL/edit and moved it manually to `./training-envs-executables/linux/Worm.zip`.

The `gdown` command in the next cell does not work, so it is left commented out.

In [None]:
# !gdown "https://drive.google.com/file/d/1QceC2ruHIsL11-YXMlLZTr7rRDH-4EpL" -O ./training-envs-executables/linux/Worm.zip
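One thing that can trip up a download tool is being handed a `/file/d/<id>` viewer link instead of a direct-download URL. Whether that is the cause of the failure here is only a guess, and the file also has to be shared publicly for any automated download to work. As a minimal sketch under those assumptions, the snippet below pulls the file ID out of the share link and builds a `uc?id=` style URL. The helper names `drive_file_id` and `direct_download_url` are made up for this example.

```python
import re


def drive_file_id(share_url: str) -> str:
    """Extract the file ID from a Google Drive '/file/d/<id>' share link."""
    match = re.search(r"/file/d/([A-Za-z0-9_-]+)", share_url)
    if match is None:
        raise ValueError(f"No Drive file ID found in: {share_url}")
    return match.group(1)


def direct_download_url(share_url: str) -> str:
    """Build a 'uc?id=' style URL (assumed direct-download form)."""
    return f"https://drive.google.com/uc?id={drive_file_id(share_url)}"


share_url = "https://drive.google.com/file/d/1QceC2ruHIsL11-YXMlLZTr7rRDH-4EpL/edit"
print(direct_download_url(share_url))
```

If the resulting URL still fails, downloading the archive by hand and moving it into `./training-envs-executables/linux/` remains the reliable route.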

Unzip the `Worm.zip` archive:

In [None]:
%%capture
!unzip -d ./training-envs-executables/linux/ ./training-envs-executables/linux/Worm.zip

Make sure the extracted files are accessible and executable:

In [None]:
!chmod -R 755 ./training-envs-executables/linux/Worm
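If you prefer to stay in Python rather than shell out, the unzip and `chmod` steps can be sketched with the standard library. This is an illustrative equivalent, not part of the original notebook: `extract_and_make_executable` is a hypothetical helper, and unlike `chmod -R 755` on the `Worm` folder it sets mode 755 on every entry listed in the archive. The explicit `chmod` matters because Python's `zipfile` does not restore Unix permission bits on extraction.

```python
import os
import stat
import zipfile
from pathlib import Path

# rwxr-xr-x, i.e. the same bits as `chmod 755`
MODE_755 = stat.S_IRWXU | stat.S_IRGRP | stat.S_IXGRP | stat.S_IROTH | stat.S_IXOTH


def extract_and_make_executable(zip_path, dest_dir):
    """Unzip zip_path into dest_dir and set mode 755 on each extracted entry."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        names = archive.namelist()
    for name in names:
        os.chmod(dest / name, MODE_755)
    return names


zip_path = "./training-envs-executables/linux/Worm.zip"
if Path(zip_path).exists():  # only run when the archive is actually in place
    extract_and_make_executable(zip_path, "./training-envs-executables/linux/")
```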

### Define the Worm config file
- In ML-Agents, you define the **training hyperparameters in `config.yaml` files.**

There are many hyperparameters. To understand them better, read the explanation of each one in [the documentation](https://github.com/Unity-Technologies/ml-agents/blob/release_20_docs/docs/Training-Configuration-File.md).


So you need to create a `Worm.yaml` config file in `./config/ppo/` inside the cloned repository (`/content/ml-agents/config/ppo/` on Colab).

Here is a first version of this config (to copy and paste into your `Worm.yaml` file), **but you should modify it**.


As an experiment, you should also try modifying some other hyperparameters. Unity provides very [good documentation explaining each of them here](https://github.com/Unity-Technologies/ml-agents/blob/main/docs/Training-Configuration-File.md).

Once you've created the config file with the contents below and understand what most hyperparameters do, we're ready to train our agent 🔥.

```
behaviors:
  Worm:
    trainer_type: ppo
    hyperparameters:
      batch_size: 2024
      buffer_size: 20240
      learning_rate: 0.0003
      beta: 0.005
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
      learning_rate_schedule: linear
    network_settings:
      normalize: true
      hidden_units: 512
      num_layers: 3
      vis_encode_type: simple
    reward_signals:
      extrinsic:
        gamma: 0.9995
        strength: 1.0
    keep_checkpoints: 5
    max_steps: 5000000
    time_horizon: 1000
    summary_freq: 30000
```
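To get a feel for how these numbers relate, here is a small sanity check on the values above. It is a sketch under an assumption about how PPO uses them: the buffer is split into minibatches of `batch_size` and passed over `num_epoch` times per policy update, and a summary is written roughly every `summary_freq` steps. The variable names on the left-hand side are just for this example.

```python
# Values copied from the Worm.yaml config above
batch_size = 2024
buffer_size = 20240
num_epoch = 3
max_steps = 5_000_000
summary_freq = 30_000

# buffer_size is expected to be a multiple of batch_size
assert buffer_size % batch_size == 0, "buffer_size should be a multiple of batch_size"

# Assumed PPO update structure: minibatches per pass, times passes per update
minibatches_per_epoch = buffer_size // batch_size
gradient_steps_per_update = minibatches_per_epoch * num_epoch

# Rough number of training summaries printed over the whole run
summaries = max_steps // summary_freq

print(f"minibatches per epoch:     {minibatches_per_epoch}")
print(f"gradient steps per update: {gradient_steps_per_update}")
print(f"summaries over the run:    {summaries}")
```

Re-running this after you change `batch_size` or `buffer_size` is a quick way to catch a mismatched pair before starting a long training run.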

### Train the agent

To train our agent, we just need to **launch mlagents-learn and select the executable containing the environment.**

We define four parameters:

1. `mlagents-learn <config>`: the path to the hyperparameter config file.
2. `--env`: the path to the environment executable.
3. `--run-id`: the name you want to give to your training run.
4. `--no-graphics`: disables the visualization during training.

<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit7/mlagentslearn.png" alt="MlAgents learn"/>

Train the model. If training is interrupted, add the `--resume` flag to continue from where it stopped.

> The first time you run the command with `--resume`, it will fail. Run the cell again to get past the error.
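The parameters above can be assembled programmatically, which makes it easy to add `--resume` only when there is something to resume. The sketch below is one possible approach, not the course's method: `build_learn_command` is a hypothetical helper, and it assumes that an earlier run left a `results/<run-id>` folder behind.

```python
from pathlib import Path


def build_learn_command(config, env, run_id, no_graphics=True, results_dir="results"):
    """Return the mlagents-learn command as an argument list."""
    cmd = ["mlagents-learn", config, f"--env={env}", f"--run-id={run_id}"]
    if no_graphics:
        cmd.append("--no-graphics")
    # Assumption: an existing results/<run-id> folder means a previous run to resume
    if (Path(results_dir) / run_id).is_dir():
        cmd.append("--resume")
    return cmd


cmd = build_learn_command(
    "./config/ppo/Worm.yaml",
    "./training-envs-executables/linux/Worm/Worm",
    "Worm1",
)
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)`, or the printed line can be pasted into a notebook cell after a `!`.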



The training will take 10 to 35 minutes depending on your config. Go take a ☕️, you deserve it 🤗.

In [None]:
!mlagents-learn ./config/ppo/Worm.yaml --env=./training-envs-executables/linux/Worm/Worm --run-id="Worm1" --no-graphics

### Push the agent to the 🤗 Hub

- Now that we've trained our agent, we’re **ready to push it to the Hub so you can visualize it playing in your browser 🔥.**

To be able to share your model with the community there are three more steps to follow:

1️⃣ (If it's not already done) create a Hugging Face account ➡ https://huggingface.co/join

2️⃣ Sign in, then store your authentication token from the Hugging Face website.
- Create a new token (https://huggingface.co/settings/tokens) **with write role**

<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/notebooks/create-token.jpg" alt="Create HF Token">

- Copy the token
- Run the cell below and paste the token

In [None]:
from huggingface_hub import notebook_login
notebook_login()

If you don't want to use Google Colab or a Jupyter Notebook, use this command instead: `huggingface-cli login`

Then, we simply need to run `mlagents-push-to-hf`.

We define four parameters:

1. `--run-id`: the name of the training run.
2. `--local-dir`: where the agent was saved. It’s `results/<run-id>`, so `results/Worm1` for the run above.
3. `--repo-id`: the name of the Hugging Face repo you want to create or update. It’s always `<your huggingface username>/<the repo name>`. If the repo does not exist, **it will be created automatically**.
4. `--commit-message`: since HF repos are git repositories, you need to provide a commit message.
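The four parameters can be put together with a small helper that also catches a malformed repo ID before anything is uploaded. This is only a sketch: `build_push_command` is a hypothetical function, `your-username/ppo-Worm` is a placeholder repo ID, and the `./results` default simply mirrors the `results/<run-id>` layout described above.

```python
def build_push_command(run_id, repo_id, commit_message, results_dir="./results"):
    """Return the mlagents-push-to-hf command as an argument list."""
    user, sep, name = repo_id.partition("/")
    # repo_id must look like <username>/<repo name>
    if not (sep and user and name) or "/" in name:
        raise ValueError(f"repo_id must be '<username>/<repo name>', got: {repo_id!r}")
    local_dir = f"{results_dir}/{run_id}"
    return [
        "mlagents-push-to-hf",
        f"--run-id={run_id}",
        f"--local-dir={local_dir}",
        f"--repo-id={repo_id}",
        f"--commit-message={commit_message}",
    ]


# "your-username/ppo-Worm" is a placeholder; replace it with your own repo ID
cmd = build_push_command("Worm1", "your-username/ppo-Worm", "First Push")
print(cmd)
```

Keeping the command as a list avoids quoting problems when the commit message contains spaces, for example with `subprocess.run(cmd, check=True)`.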

<img src="https://huggingface.co/datasets/huggingface-deep-rl-course/course-images/resolve/main/en/unit7/mlagentspushtohub.png" alt="Push to Hub"/>

For instance:

`!mlagents-push-to-hf  --run-id="SnowballTarget1" --local-dir="./results/SnowballTarget1" --repo-id="ThomasSimonini/ppo-SnowballTarget"  --commit-message="First Push"`

In [None]:
!mlagents-push-to-hf --run-id="Worm1" --local-dir="./results/Worm1" --repo-id="chirbard/ppo-Worm" --commit-message="First Push"

But now comes the best part: **being able to visualize your agent online 👀.**
