# Enemy detection model - YOLOv5

This notebook will be used to train a convolutional neural network (CNN) model using [YOLOv5](https://github.com/ultralytics/yolov5).

The model generated by the YOLO algorithm will serve the purpose of detecting enemies in real-time in the first map of the game [Dusk's](https://store.steampowered.com/app/519860/DUSK/) Endless mode (The Farm). The model should be able to detect the five different types of enemies present on the map.

To keep up with Dusk's frantic pace, the model needs to be both accurate and fast. YOLOv5 was chosen because it offers a good trade-off between accuracy and speed compared to other YOLO versions, at least on consumer-grade GPUs.

## 1. First steps

### 1.1 Google Colab

This subsection will be helpful if you plan on running this notebook on Google Colab. Otherwise, skip to subsection 1.2.

First, upload this notebook under the desired directory of your Google Drive account.

After doing that, execute the following cell to connect your Drive account to Colab:

In [None]:
from google.colab import drive
drive.mount('/content/drive')

After that, insert the path where this notebook was saved in your Drive account:

In [None]:
import os

NOTEBOOK_PATH = 'dusk-aimbot/model training' # CAN BE CHANGED
GOOGLE_DRIVE_PATH = os.path.join('drive', 'My Drive', NOTEBOOK_PATH)

%cd ./$GOOGLE_DRIVE_PATH

### 1.2 Clone YOLOv5 repository and import libraries

To start, place this notebook under the desired directory and clone the YOLOv5 GitHub repository from Ultralytics:

In [None]:
!git clone https://github.com/ultralytics/yolov5

Install the packages required by YOLOv5 with the following command:

In [4]:
%pip install -qr yolov5/requirements.txt

Note: you may need to restart the kernel to use updated packages.


Change the current working directory to `yolov5`:

In [5]:
%cd yolov5

[Errno 2] No such file or directory: 'yolov5'
/home/leleo/dusk-aimbot/model training/yolov5


Don't forget to import all the needed libraries:

In [11]:
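# Standard library
import glob
import os
import random
import shutil
from pathlib import Path

# Third-party packages
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import torch
import yaml
from IPython import display
from IPython.display import clear_output
from sklearn.model_selection import train_test_split  # used to split the data in section 2.3
from yolov5 import utils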


%matplotlib inline

random.seed(108)

## 2. Data handling

### 2.1 Organize Directories

[According to the YOLOv5 wiki](https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data), it is recommended to organize the data inside a `/datasets` directory next to the `/yolov5` directory, as represented below:

![data_directory_structure](https://miro.medium.com/max/1400/1*J2UTo9Z2hJCeaTwB1d_aNw.png)

The following function will generate data subdirectories split into `train`, `valid`, and `test` inside `/datasets`:

In [None]:
def create_data_directories(data_name):
    # Create images/ and labels/ subdirectories for each of the three splits
    for kind in ("images", "labels"):
        for split in ("train", "valid", "test"):
            Path(f"../datasets/{data_name}/{kind}/{split}").mkdir(parents=True, exist_ok=True)

In [None]:
create_data_directories('dusk_enemies')

### 2.2 Organize data

Our data directories were generated, but we still need to split our data and move it there.
To prepare for that, create an `/image_data` directory next to the `/datasets` and `/yolov5` directories.

Inside this directory, create:
1. `/background_images` for images without YOLO labels;
2. `/labeled_images` for images that have YOLO labels;
3. `/labels` for the YOLO labels (`.txt` files) of the images in `/labeled_images`.

The resulting layout is shown below:
```bash
.
├── yolov5/
├── datasets/
└── image_data/
    ├── background_images/
    ├── labeled_images/
    └── labels/
```
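
For reference, each line of a YOLO label file describes one object as `class x_center y_center width height`, where the four box values are normalized to the 0-1 range relative to the image size. The following is a minimal sketch of reading such a line and converting it to pixel coordinates. The helper name `yolo_line_to_pixels`, the sample line, and the 1280x720 frame size are assumptions made for illustration only:

```python
def yolo_line_to_pixels(line, img_width, img_height):
    """Convert one YOLO label line to (class_id, x_min, y_min, x_max, y_max) in pixels."""
    class_id, x_c, y_c, w, h = line.split()
    # Scale the normalized center and size back to pixel units
    x_c, w = float(x_c) * img_width, float(w) * img_width
    y_c, h = float(y_c) * img_height, float(h) * img_height
    # YOLO stores the box center, so subtract/add half the size to get the corners
    return (int(class_id), x_c - w / 2, y_c - h / 2, x_c + w / 2, y_c + h / 2)


# Hypothetical label: class 2, centered in the frame, half the frame in each dimension
print(yolo_line_to_pixels("2 0.5 0.5 0.5 0.5", 1280, 720))
```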

Background images are separated from labeled images because they are split in a different manner.

However, if you have all your images in a single folder, put all of them inside `/labeled_images` and run the following cell to create `/background_images` automatically and move your background images inside it:

In [None]:
IMG_PATH = "../image_data/labeled_images"
LABEL_PATH = "../image_data/labels"
BACKGROUND_PATH = "../image_data/background_images"

# Create a directory for background images if it doesn't exist
Path(BACKGROUND_PATH).mkdir(parents=True, exist_ok=True)

bg_count = 0
for image in glob.glob(f"{IMG_PATH}/*.jpeg"):

    # Get the image name without its directory or extension
    name = Path(image).stem

    # Check whether a matching .txt label file exists for the image
    label_exists = Path(LABEL_PATH, f"{name}.txt").is_file()

    # An image without a label file is a background image
    if not label_exists:
        try:  # Move the background image to its own directory
            shutil.move(image, BACKGROUND_PATH)
            bg_count += 1  # Count only the images that were actually moved
        except (shutil.Error, OSError):
            # Report the offending file, then stop with the original error
            print(f"Could not move {image}")
            raise

print(f"\n{bg_count} background images moved from \"{IMG_PATH}\" to \"{BACKGROUND_PATH}\".\n")

### 2.3 Split data

After generating data directories, our data should be split and moved to their respective directories.

For this model, the train, validation, and test sets will contain 80%, 10%, and 10% of the data, respectively.

To make sure that each labeled image and its label end up in the same split, labeled and background images will be split separately.

First, the set of labeled images will be split:

In [None]:
## Code adapted from https://blog.paperspace.com/train-yolov5-custom-data/

IMG_PATH = "../image_data/labeled_images"
LABEL_PATH = "../image_data/labels"

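# Read images and labels
images = [os.path.join(IMG_PATH, x) for x in os.listdir(IMG_PATH)]
labels = [os.path.join(LABEL_PATH, x) for x in os.listdir(LABEL_PATH) if x.endswith(".txt")]

# Sort both lists so that each image lines up with its label
images.sort()
labels.sort()

# Every image must pair with a label of the same name before splitting
assert [Path(i).stem for i in images] == [Path(l).stem for l in labels], "images and labels do not match"

# Split the dataset into train / valid / test splits (80% train, then the remaining 20% is halved)
train_images, remain_images, train_labels, remain_labels = train_test_split(images, labels, train_size = 0.8, random_state = 1)
val_images, test_images, val_labels, test_labels = train_test_split(remain_images, remain_labels, test_size = 0.5, random_state = 1)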
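
To see why the two calls above yield an 80/10/10 split, the arithmetic can be reproduced with the standard library alone. This is only a sketch of the idea, not scikit-learn's implementation, and its rounding may differ from `train_test_split` by one item. The function name `two_stage_split` is hypothetical:

```python
import random


def two_stage_split(items, train_frac=0.8, seed=1):
    """Shuffle items, take train_frac for training, then halve the remainder."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Stage 1: the first 80% becomes the train set
    n_train = int(train_frac * len(shuffled))
    train, remain = shuffled[:n_train], shuffled[n_train:]
    # Stage 2: the remaining 20% is halved into validation and test sets
    n_val = len(remain) // 2
    return train, remain[:n_val], remain[n_val:]


train, val, test = two_stage_split(range(200))
print(len(train), len(val), len(test))  # 160 20 20
```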

In [None]:
## Code adapted from https://blog.paperspace.com/train-yolov5-custom-data/

# Utility function to move files to their respective directory
def move_files_to_directory(list_of_files, destination_dir):
    for f in list_of_files:
        try:
            shutil.move(f, destination_dir)
        except (shutil.Error, OSError):
            # Report the offending file, then stop with the original error
            print(f"Could not move {f}")
            raise

In [None]:
DATASET_IMG_PATH = '../datasets/dusk_enemies/images'
DATASET_LABEL_PATH = '../datasets/dusk_enemies/labels'

# Move the splits into their folders
move_files_to_directory(train_images, f'{DATASET_IMG_PATH}/train/')
move_files_to_directory(val_images, f'{DATASET_IMG_PATH}/valid/')
move_files_to_directory(test_images, f'{DATASET_IMG_PATH}/test/')
move_files_to_directory(train_labels, f'{DATASET_LABEL_PATH}/train/')
move_files_to_directory(val_labels, f'{DATASET_LABEL_PATH}/valid/')
move_files_to_directory(test_labels, f'{DATASET_LABEL_PATH}/test/')

This time, we will use slicing to split background images:

In [None]:
BG_IMG_PATH = "../image_data/background_images"

# Read background images (sorted first so that the seeded shuffle below is reproducible)
bg_images = sorted(os.path.join(BG_IMG_PATH, x) for x in os.listdir(BG_IMG_PATH))

random.shuffle(bg_images) # Shuffle background images' list to guarantee randomness

bg_80_percent = int(0.8 * len(bg_images))
bg_90_percent = int(0.9 * len(bg_images))

train_bg_images = bg_images[:bg_80_percent] # The first 80% of background images is the train set
val_bg_images = bg_images[bg_80_percent : bg_90_percent] # The 10% between 80~90% is the validation set
test_bg_images = bg_images[bg_90_percent:] # The last 10% is the test set

In [None]:
DATASET_BG_PATH = '../datasets/dusk_enemies/images'

# Move the splits into their folders
move_files_to_directory(train_bg_images, f'{DATASET_BG_PATH}/train/')
move_files_to_directory(val_bg_images, f'{DATASET_BG_PATH}/valid/')
move_files_to_directory(test_bg_images, f'{DATASET_BG_PATH}/test/')
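
After moving the files, it can be worth checking that each split received what was expected. The sketch below counts images, label files, and label-less (background) images per split. It assumes the `images/` and `labels/` layout created in section 2.1, and the helper name `summarize_split` is hypothetical:

```python
from pathlib import Path


def summarize_split(dataset_root, splits=("train", "valid", "test")):
    """Count images, labels, and label-less (background) images in each split."""
    summary = {}
    for split in splits:
        image_dir = Path(dataset_root) / "images" / split
        label_dir = Path(dataset_root) / "labels" / split
        images = [p for p in image_dir.iterdir() if p.is_file()]
        label_stems = {p.stem for p in label_dir.glob("*.txt")}
        # An image whose name has no matching .txt file counts as background
        backgrounds = [p for p in images if p.stem not in label_stems]
        summary[split] = {
            "images": len(images),
            "labels": len(label_stems),
            "backgrounds": len(backgrounds),
        }
    return summary
```

Calling `summarize_split('../datasets/dusk_enemies')` from inside `/yolov5` should then report roughly 80%, 10%, and 10% of the images under `train`, `valid`, and `test`.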

### 2.4 Create dataset.yaml

A dataset config file (YAML file) should also be created. It defines:

1. the dataset root directory `path` and relative paths to `train` / `val` / `test` image directories;
2. the number of classes `nc` that you want to detect;
3. and the names corresponding to those classes, represented by `names`.
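
As an illustration of these three parts, the sketch below writes a config file with that shape using only the standard library. The class names are placeholders (the real enemy names are not listed here), the helper name `write_dataset_yaml` and the output location are hypothetical, and `val` points at `images/valid` to match the directories created in section 2.1:

```python
import json
import tempfile
from pathlib import Path


def write_dataset_yaml(yaml_path, dataset_root, class_names):
    """Write a YOLOv5-style dataset config and return the text that was written."""
    lines = [
        f"path: {dataset_root}  # dataset root directory",
        "train: images/train  # image directories, relative to 'path'",
        "val: images/valid",
        "test: images/test",
        f"nc: {len(class_names)}  # number of classes",
        # A JSON list is also a valid YAML flow sequence
        f"names: {json.dumps(list(class_names))}",
    ]
    text = "\n".join(lines) + "\n"
    Path(yaml_path).write_text(text)
    return text


# Placeholder names (assumption): replace with the five enemy types, in label-index order
PLACEHOLDER_NAMES = [f"enemy_{i}" for i in range(5)]
example_path = Path(tempfile.gettempdir()) / "dusk_enemies_example.yaml"
print(write_dataset_yaml(example_path, "../datasets/dusk_enemies", PLACEHOLDER_NAMES))
```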

Dataset YAML files are usually kept inside `/yolov5/data`, and ours is no exception. Once the file has been created, its contents can be displayed with:

In [16]:
!cat data/dusk_enemies.yaml

## 3. Training

Because our dataset is quite small, we will use the YOLOv5s model to start training from pretrained weights.

The training parameters may be changed as desired.

In [None]:
!python train.py --img 1280 --batch 16 --epochs 150\
    --data 'data/dusk_enemies.yaml' --weights 'yolov5s.pt'\
    --project 'dusk_runs' --name 'enemy_detection_train'

clear_output()

If the training stops for whatever reason, run `train.py` with the `--resume` flag to resume it:

In [None]:
!python train.py --resume

clear_output()

If Google Colab is used, the `plot_results` function can render the training results CSV file as a PNG image:

In [None]:
from utils.plots import plot_results

# CSV to PNG
CSV_RESULTS_PATH = ""
plot_results(CSV_RESULTS_PATH)

# Shows the resulting plots
PNG_RESULTS_PATH = ""
display.Image(PNG_RESULTS_PATH, width=1000)
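
Besides plotting, the results CSV can be inspected directly, for example to find the epoch with the best score. The sketch below assumes the file has an `epoch` column and a `metrics/mAP_0.5` column whose headers may be padded with spaces. Both column names are assumptions that should be checked against your own file, and the helper name `best_epoch` is hypothetical:

```python
import csv


def best_epoch(csv_path, metric="metrics/mAP_0.5"):
    """Return (epoch, value) for the row where `metric` is highest."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        # Strip the headers, since they may be padded with spaces
        header = [name.strip() for name in next(reader)]
        epoch_col, metric_col = header.index("epoch"), header.index(metric)
        rows = [row for row in reader if row]
    best = max(rows, key=lambda row: float(row[metric_col]))
    return int(float(best[epoch_col])), float(best[metric_col])
```

If the column names in your file differ, pass the right one through the `metric` argument.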

## 4. Validation

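The validation script will be used to evaluate the trained model. The `--task` flag controls which dataset partition is used in the validation process.

Below, the model performance is evaluated over the test partition. Adjust the `--weights` path to the run directory created by your own training: YOLOv5 appends a number to the run name when the directory already exists (e.g. `enemy_detection_train2`).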

In [None]:
!python val.py --weights './dusk_runs/enemy_detection_train2/weights/best.pt' --batch 32\
    --data 'data/dusk_enemies.yaml' --task test\
    --project 'dusk_runs' --name 'validation_on_test_data'

clear_output()

## 5. Inference

Finally, the model can be used for inference on different input sources, such as images or videos.

Below, the inference occurs with a video and confidence threshold of 0.7:

In [None]:
PATH_TO_VIDEO = ""
PATH_TO_MODEL = ""

!python detect.py --weights "$PATH_TO_MODEL" --img 1280 --conf 0.7\
    --source "$PATH_TO_VIDEO"\
    --project 'dusk_runs' --name 'detect_test'

clear_output()
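
Conceptually, the confidence threshold discards every detection whose score does not exceed it. The sketch below illustrates that idea on made-up detections. It is not how `detect.py` is implemented internally, and the dictionary layout and the helper name `filter_by_confidence` are assumptions for illustration:

```python
def filter_by_confidence(detections, conf_threshold=0.7):
    """Keep only the detections whose confidence exceeds conf_threshold."""
    return [det for det in detections if det["conf"] > conf_threshold]


# Hypothetical detections, made up for illustration only
detections = [
    {"cls": 0, "conf": 0.92, "box": (100, 120, 180, 260)},
    {"cls": 3, "conf": 0.55, "box": (400, 300, 450, 380)},
    {"cls": 1, "conf": 0.81, "box": (640, 200, 700, 330)},
]
print(filter_by_confidence(detections))
```

A higher threshold reduces false positives at the cost of missing less confident detections, so the value of 0.7 may need tuning for your own footage.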