### YOLOv8 demo for CAST

<b>Run block for required dependencies:</b>

In [None]:
# use this block to download requirements into virtual environment
%pip install -r requirements.txt

<b>Import statements:</b>

In [None]:
import os
from ultralytics import YOLO
import cv2
import torch
import requests
import zipfile
import shutil
import subprocess

### <u>Importing training dataset for demo</u>

For this demo, we will be training a YOLOv8 model on African wildlife imagery pulled from https://ultralytics.com/assets/african-wildlife.zip. From Ultralytics: "This dataset showcases four common animal classes typically found in South African nature reserves. It includes images of African wildlife such as buffalo, elephant, rhino, and zebra, providing valuable insights into their characteristics. Essential for training computer vision algorithms, this dataset aids in identifying animals in various habitats, from zoos to forests, and supports wildlife research.... 

The African wildlife objects detection dataset is split into three subsets:

Training set: Contains 1052 images, each with corresponding annotations.
Validation set: Includes 225 images, each with paired annotations.
Testing set: Comprises 227 images, each with paired annotations."

In [None]:
# using requests, download zip file and extract contents

# url of zipfile
url_of_zip = 'https://ultralytics.com/assets/african-wildlife.zip'
url_of_yaml = 'https://raw.githubusercontent.com/ultralytics/ultralytics/main/ultralytics/cfg/datasets/african-wildlife.yaml'

# directory to be saved into
wd = os.getcwd()
zip_directory = os.path.join(wd, 'animals.zip')
yaml_directory = os.path.join(wd, 'data.yaml')

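# download requests; stop with an error if either download fails
response = requests.get(url_of_zip, timeout=60)
response.raise_for_status()
with open(zip_directory, 'wb') as f:
    f.write(response.content)

response2 = requests.get(url_of_yaml, timeout=60)
response2.raise_for_status()
with open(yaml_directory, 'wb') as f:
    f.write(response2.content)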

# use the working directory as the extraction target
extract_to = wd  # Since we're extracting to the working directory

# extract the zip file
with zipfile.ZipFile(zip_directory, 'r') as zip_ref:
    zip_ref.extractall(extract_to)


<b>Clear directory of training files if needed:</b>

In [None]:
# initialize file name
file_name = 'animals.zip'
file_name2 = 'data.yaml'

# construct the file path
file_path = os.path.join(os.getcwd(), file_name)
file_path2 = os.path.join(os.getcwd(), file_name2)

# delete the zip file
if os.path.exists(file_path):
    os.remove(file_path)
    print(f"The file {file_name} has been deleted.")
else:
    print(f"The file {file_name} does not exist.")

# delete the yaml file
if os.path.exists(file_path2):
    os.remove(file_path2)
    print(f"The file {file_name2} has been deleted.")
else:
    print(f"The file {file_name2} does not exist.")

# list of directory paths you want to delete
directories = ["./train", "./test", "./valid"]

# loop through each directory in the list
for directory_path in directories:

    # check if the directory exists
    if os.path.exists(directory_path) and os.path.isdir(directory_path):
        
        # use shutil.rmtree() to delete the directory
        shutil.rmtree(directory_path)
        print(f"The directory {directory_path} has been deleted.")

    else:
        print(f"The directory {directory_path} does not exist.")


### <u>Training the detection model on local machine</u>

This block will train the YOLO model using your local machine. Depending on the size of the training/val data and model size, this could be too computationally intensive for your hardware; we will discuss using UARK's high performance computing center if this is the case.

<b>Hyperparameters</b>

YOLOv8 has a handful of customizable hyperparameters you can read about here: https://docs.ultralytics.com/usage/cfg/#train-settings. Hyperparameters affect how YOLO is trained and can subsequently affect accuracy and time of convergence.

This code block will prompt you for a few commonly customized hyperparameters.

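A note about batch size and optimizer: setting batch size = -1 asks Ultralytics to estimate a batch size that fits your GPU memory (this automatic mode applies when training on a CUDA GPU). Setting optimizer = auto similarly lets Ultralytics choose the optimizer and its starting learning rate for you.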
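
Typed-in values are easy to get wrong, so it can help to validate them before training starts. The following is a minimal sketch, assuming the optimizer names listed in the prompt below are the accepted spellings; `parse_bool`, `parse_batch_size`, and `parse_optimizer` are hypothetical helper names, not part of Ultralytics.

```python
# Hypothetical validation helpers for the hyperparameter prompts (not part of Ultralytics).

# Canonical spellings taken from the prompt text in this notebook.
_OPTIMIZERS = {name.lower(): name for name in
               ("SGD", "Adam", "AdamW", "NAdam", "RAdam", "RMSProp", "auto")}


def parse_bool(text):
    """Interpret common spellings of true/false; raise on anything else."""
    value = text.strip().lower()
    if value in {"true", "t", "yes", "y", "1"}:
        return True
    if value in {"false", "f", "no", "n", "0"}:
        return False
    raise ValueError(f"Expected True or False, got {text!r}")


def parse_batch_size(text):
    """Accept a positive integer, or -1 to request automatic batch sizing."""
    value = int(text)
    if value == -1 or value > 0:
        return value
    raise ValueError(f"Batch size must be positive or -1, got {value}")


def parse_optimizer(text):
    """Return the canonical spelling of a known optimizer name."""
    key = text.strip().lower()
    if key not in _OPTIMIZERS:
        raise ValueError(f"Unknown optimizer {text!r}")
    return _OPTIMIZERS[key]


print(parse_bool("False"), parse_batch_size("-1"), parse_optimizer("adamw"))
```

Each helper raises `ValueError` on bad input, so a typo stops the cell immediately instead of surfacing later as a training failure.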

<b>Configuration</b>

In your working directory, there should now be train, test, and valid folders. These are the image sets and associated annotations YOLO will be trained on. There should also be a data.yaml file; this yaml file tells the model the configuration of your directory, i.e. where to find the training data it needs.
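
Before training, it can be worth confirming that the extracted folders contain what you expect. The sketch below counts image and label files per split. It assumes each of `train`, `valid`, and `test` holds an `images` and a `labels` subfolder, which may not match every dataset layout; `summarize_splits` is a hypothetical helper name.

```python
from pathlib import Path

# File extensions treated as images (an assumption; extend as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def summarize_splits(root, splits=("train", "valid", "test")):
    """Count image and label files per split.

    Assumes the layout <root>/<split>/images and <root>/<split>/labels.
    Missing folders are reported as zero rather than raising an error.
    """
    summary = {}
    for split in splits:
        images_dir = Path(root) / split / "images"
        labels_dir = Path(root) / split / "labels"
        n_images = 0
        if images_dir.is_dir():
            n_images = sum(1 for p in images_dir.iterdir()
                           if p.suffix.lower() in IMAGE_SUFFIXES)
        n_labels = 0
        if labels_dir.is_dir():
            n_labels = sum(1 for p in labels_dir.iterdir()
                           if p.suffix.lower() == ".txt")
        summary[split] = {"images": n_images, "labels": n_labels}
    return summary


# Summarize the current working directory.
print(summarize_splits("."))
```

If the image and label counts for a split differ, some images are probably missing annotations (or the layout assumption above does not hold).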

<b>TensorBoard Updates</b>

TensorBoard offers an enhanced visualization experience for monitoring the YOLO (You Only Look Once) training process, providing deep insights into various metrics such as loss components, learning rate, and performance indicators like precision, recall, and mean Average Precision (mAP). Its integration with YOLO allows for real-time tracking of these metrics, enabling the identification and resolution of training challenges swiftly. The visualization of weight distributions, feature maps, and prediction outcomes further aids in understanding the model's learning behavior. 

TensorBoard will also provide a training time for each epoch.


In [None]:
# check if cuda GPU training is available. if not, set to CPU training.
if torch.cuda.is_available():
    device_name = torch.device("cuda")
else:
    device_name = torch.device('cpu')
print("Using {}.".format(device_name))

# load a model
model = YOLO('yolov8n.pt')
model.to(device_name)

# prompt for hyperparameters
print("\033[1m" + "Hyperparameter initialization" + "\033[0m")

# epochs
print('Enter the number of epochs: ')
epochs = int(input())
print(epochs)

# batch size
print('Enter the batch size: ')
batch_size = int(input())
print(batch_size)

# optimizer
print('Enter the optimizer (SGD, Adam, AdamW, NAdam, RAdam, RMSProp, or auto): ')
optimizer = input()
print(optimizer)

# cos_lr
# note: bool(input()) is True for any non-empty text, including "False", so compare the text instead
print('Enter status of cos_lr (False or True): ')
cos_lr = input().strip().lower() == 'true'
print(cos_lr)

# train model
results = model.train(data="data.yaml", epochs=epochs, batch=batch_size, optimizer=optimizer, cos_lr=cos_lr)

### <u>Training the detection model on UARK's high performance computing servers</u>

As noted above, training machine learning models on a laptop or PC is feasible with a few thousand images, but model tuning is demanding for most PC and laptop processors. Industrial-scale GPUs, like NVIDIA's V100, are designed to execute many more operations in parallel and are well suited to math-heavy machine learning algorithms. These GPUs make it possible to fit models with more complicated architectures to larger datasets.

The Arkansas HPC (high performance computing) server has 20 industrial-scale GPUs for computationally heavy programs. The AHPCC is available to faculty, staff, and students at all Arkansas public universities.

For our purposes, we will train the same model on a larger set of training data using these GPUs. This section walks through how to run these files on the AHPC server.

#### Connecting to the University of Arkansas HPC Server

<u> Setup </u>

You will need an account to use the GPUs on the AHPC server.
U of A students can request an account to use the AHPC here: https://hpc.uark.edu/hpc-support/user-account-requests/internal.php

You must be connected to university wifi or connected to the university VPN. Connections outside of the university network will fail on the server side.

Test your connection by signing in to a login node. Open a bash terminal and log in with this command:

In [None]:

# <username> should be replaced by your uark username
ssh <username>@hpc-portal2.hpc.uark.edu 
# This will prompt you to enter your UARK password to finish connecting.

You can now upload files to the server to run on the AHPC GPU nodes. Next, we need to prepare the files to send. A compute node needs the code to run, the files that the script accesses, and the packages necessary to run the script.

1) Convert the notebook code to a Python script
2) Collect all build files and training data in one location
3) Send the Python environment needed to train the model as a container

#### Creating the Python Script

Copy and paste the next code chunk into a new Python script. Call it "train_model.py".

In [None]:
"""
train_model.py
~~~~~~~~~~~~
Used to train YOLO models on UARK hpc.
"""

# Import statements and environment variable setup
import os
import torch
from ultralytics import YOLO

# Specify CUDA GPU memory allocation. Required in Google Colab to deal with memory overrun; used here as a safeguard for HPC
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'max_split_size_mb:256'

# Initialize GPU use if available for training.
if torch.cuda.is_available():
    device_name = torch.device("cuda")
else:
    device_name = torch.device('cpu')
print(f"Using {device_name} for training.")

# load a model
model = YOLO('yolov8n.pt')
model.to(device_name)

# prompt for hyperparameters
print("\033[1m" + "Hyperparameter initialization" + "\033[0m")

# epochs
print('Enter the number of epochs: ')
epochs = int(input())
print(epochs)

# batch size
print('Enter the batch size: ')
batch_size = int(input())
print(batch_size)

# optimizer
print('Enter the optimizer (SGD, Adam, AdamW, NAdam, RAdam, RMSProp, or auto): ')
optimizer = input()
print(optimizer)

# cos_lr
# note: bool(input()) is True for any non-empty text, including "False", so compare the text instead
print('Enter status of cos_lr (False or True): ')
cos_lr = input().strip().lower() == 'true'
print(cos_lr)

# train model
results = model.train(data="data.yaml", epochs=epochs, batch=batch_size, optimizer=optimizer, cos_lr=cos_lr)
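
On a cluster, training scripts are often submitted as batch jobs with no terminal attached, in which case `input()` prompts like the ones above have nothing to read from. One possible alternative, sketched here under that assumption, is to take the hyperparameters as command-line arguments with `argparse`. The flag names and default values are illustrative choices, not something the AHPC or Ultralytics requires.

```python
import argparse


def parse_args(argv=None):
    """Parse training hyperparameters from the command line.

    Flag names and defaults are illustrative assumptions.
    """
    parser = argparse.ArgumentParser(
        description="Train a YOLO model without interactive prompts.")
    parser.add_argument("--data", default="data.yaml",
                        help="path to the dataset yaml file")
    parser.add_argument("--epochs", type=int, default=100)
    parser.add_argument("--batch", type=int, default=16,
                        help="batch size, or -1 for automatic sizing")
    parser.add_argument("--optimizer", default="auto")
    parser.add_argument("--cos-lr", action="store_true",
                        help="use a cosine learning rate schedule")
    return parser.parse_args(argv)


# Example: the list stands in for what would be typed after the script name.
args = parse_args(["--epochs", "50", "--batch", "32", "--cos-lr"])
print(args.epochs, args.batch, args.optimizer, args.cos_lr)

# The parsed values could then replace the prompted ones, e.g.:
# model.train(data=args.data, epochs=args.epochs, batch=args.batch,
#             optimizer=args.optimizer, cos_lr=args.cos_lr)
```

Inside the script itself you would call `parse_args()` with no list, so the values come from the command line of the submitted job.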

#### Collect all build files and training data in one folder

Make a copy of this directory and remove the python notebook from the new directory. The script, yaml file, and training data should be the only files in this folder.

In [None]:
cp -r . ../upload_folder                # bash command to copy the current directory to a new folder called 'upload_folder' in the parent folder
cd ..
cd upload_folder                        # Enter the upload folder
rm *.ipynb
ls -a                                   # print all files in the current directory

Add a folder called "images" that contains two folders of images and labels: one for training and one for validation. Both training and validation folders should have two folders inside: one for all of the images and one for all labels. 
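
The folder layout described above can also be created from Python. This is a minimal sketch that assumes the split folders are named `train` and `valid` and the inner folders `images` and `labels`; the text does not fix these names, so adjust them to match your data.yaml. `make_upload_layout` is a hypothetical helper name.

```python
from pathlib import Path


def make_upload_layout(root, splits=("train", "valid")):
    """Create <root>/images/<split>/{images,labels} and return the folders.

    Folder names are assumptions; existing folders are left untouched.
    """
    created = []
    for split in splits:
        for kind in ("images", "labels"):
            folder = Path(root) / "images" / split / kind
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created


# Example (commented out so that running this cell creates nothing by accident):
# for folder in make_upload_layout("."):
#     print(folder)
```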

### <u> Output of training </u>

By the end of training, YOLO models provide several key outputs that give insights into the model's performance and its capability to detect objects in images.

<b> Loss Function </b>

The loss function is a critical output of the training process, as it quantifies how well the YOLO model is performing. YOLO's loss function is composed of several components:

- **Localization Loss**: Measures how accurately the model predicts the location of bounding boxes for each detected object.
- **Confidence Loss**: Represents the error in the confidence scores for the bounding boxes, including those boxes that do not contain objects (background).
- **Classification Loss**: Calculates the error in predicting the class of the detected objects.

Monitoring the loss function during training helps in understanding how well the model learns to detect objects and classify them. A decreasing loss over epochs indicates that the model is learning effectively.
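
One way to check the loss trend without opening a dashboard is to read the per-epoch metrics file from a run folder. The sketch below assumes the run wrote a `results.csv` file with a column such as `train/box_loss`; both the file name and the column name should be checked against your own run. `read_metric` and `is_decreasing` are hypothetical helper names.

```python
import csv


def read_metric(csv_path, column):
    """Return one column of a metrics CSV as floats.

    Header names are stripped because some files pad them with spaces.
    """
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        header = [name.strip() for name in next(reader)]
        index = header.index(column)
        return [float(row[index]) for row in reader if row]


def is_decreasing(values, window=3):
    """Compare the mean of the first `window` values with the last `window`."""
    if len(values) < 2 * window:
        raise ValueError("not enough epochs to compare")
    first = sum(values[:window]) / window
    last = sum(values[-window:]) / window
    return last < first


# Example with made-up loss values:
print(is_decreasing([3.0, 2.5, 2.0, 1.5, 1.0, 0.8], window=2))
```

Comparing window averages rather than single epochs keeps one noisy epoch from changing the answer.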

<b> Precision and Recall </b>

After training, evaluating the model's performance involves looking at precision and recall metrics:

- **Precision**: Indicates the accuracy of the predictions, i.e., the percentage of correct positive predictions out of all positive predictions made.
- **Recall**: Measures the model's ability to detect all relevant instances, i.e., the percentage of correct positive predictions out of all actual positives.

These metrics are crucial for understanding the trade-off between correctly detecting objects (recall) and minimizing false positives (precision).
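
As a small worked example of these two definitions, the sketch below computes precision and recall from counts of true positives, false positives, and false negatives. The counts are made up for illustration.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from detection counts.

    precision = TP / (TP + FP), recall = TP / (TP + FN).
    A zero denominator yields 0.0 rather than an error.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# 8 correct detections, 2 spurious detections, 4 missed animals
p, r = precision_recall(8, 2, 4)
print(f"precision={p:.2f}, recall={r:.2f}")  # precision=0.80, recall=0.67
```

Here the model is right about 80% of the boxes it draws but finds only two thirds of the animals actually present.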

<b> mAP (Mean Average Precision) </b>

mAP is a comprehensive metric used to evaluate the accuracy of object detectors like YOLO. For each class, average precision (AP) summarizes the precision-recall curve as a single value (the area under the curve), and mAP is the mean of these AP values across all classes. It is commonly reported either at a single IoU (Intersection over Union) threshold of 0.5 (mAP50) or averaged over thresholds from 0.5 to 0.95 (mAP50-95). High mAP values indicate a model that detects and classifies objects accurately across classes.
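
To make the pieces of this metric concrete, the sketch below computes IoU for two boxes and a simple average precision for one class. It is a simplified illustration, not necessarily the exact procedure Ultralytics uses. Boxes are assumed to be `(x1, y1, x2, y2)` corners, and each detection is a `(confidence, is_true_positive)` pair that has already been matched against ground truth at some IoU threshold.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    intersection = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def average_precision(detections, num_ground_truth):
    """Area under the precision-recall curve for one class."""
    if num_ground_truth == 0 or not detections:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)
    # Make precision non-increasing from left to right (the usual envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, previous_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - previous_recall) * p
        previous_recall = r
    return ap


print(f"{iou((0, 0, 2, 2), (1, 1, 3, 3)):.3f}")  # 0.143
ap_per_class = [average_precision([(0.9, True), (0.8, False), (0.7, True)], 2)]
print(sum(ap_per_class) / len(ap_per_class))  # mAP over the classes listed
```

With more classes, `ap_per_class` would hold one AP value per class, and the final line gives their mean.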

<b> Detection Speed </b>

YOLO is designed for real-time object detection, and its detection speed (usually measured in FPS, frames per second) is a crucial output. This metric tells us how fast the model can process images to detect objects, which is vital for applications requiring real-time analysis, such as video surveillance and autonomous driving.
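
A rough way to estimate FPS is to time repeated calls to the prediction function. In the sketch below, `predict_fn` stands in for whatever callable runs the model on one frame (an assumption made for illustration), and `measure_fps` is a hypothetical helper name. A dummy function is used so the example runs without a trained model.

```python
import time


def measure_fps(predict_fn, frames, warmup=1):
    """Return frames processed per second by `predict_fn`.

    A few warm-up calls are made first, since the first call to a model
    often includes one-time setup cost.
    """
    frames = list(frames)
    for frame in frames[:warmup]:
        predict_fn(frame)
    start = time.perf_counter()
    for frame in frames:
        predict_fn(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")


def dummy_predict(frame):
    """Placeholder for a real model call; just burns a little CPU time."""
    return sum(range(10_000))


print(f"{measure_fps(dummy_predict, range(50)):.1f} frames per second")
```

Measured values depend heavily on hardware, image size, and batch size, so compare numbers only between runs made under the same conditions.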

<b> Visualization of Detections </b>

Finally, visualizing the detections made by the YOLO model on test images or videos is an intuitive way to understand the model's performance. These visualizations typically include bounding boxes around detected objects, class labels, and confidence scores. They provide immediate visual feedback on how well the model can detect and classify objects in various conditions.

<b> Trained Model Directory Structure </b>

After successfully training a YOLO model, the next steps involve accessing the trained model for inference or further fine-tuning. The trained model weights are typically saved in a specific directory structure, often under a `weights` folder and within `runs` folders for different experiments. Understanding how to navigate these folders and use the saved weights is crucial for applying your YOLO model to real-world tasks.

- **Weights Folder**: This folder contains the saved weights of your model after training. YOLO saves weights at regular intervals during training, as well as the final weights once training is complete. The saved weights include:
  - `best.pt`: The weights of the model that achieved the best performance on the validation set during training.
  - `last.pt`: The weights of the model at the last training epoch. These may not be the best-performing weights but are useful for resuming training.
  
- **Runs Folder**: The `runs` folder is organized by training experiments. Each experiment (or training run) has its own subfolder; with Ultralytics detection models these are typically created under `runs/detect` and named `train`, `train2`, and so on, or given a custom name you specify. Within each experiment's folder, you'll find:
  - The `weights` subfolder described above, along with logs and outputs for that run (for example, a `results.csv` file of per-epoch metrics).
  - TensorBoard logs, if TensorBoard was used during training, allowing you to visually monitor the training process.
  - Any additional outputs specified during training, such as plots of the loss function over time, precision-recall curves, and example predictions.
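
Since each run gets its own folder, a small helper can locate the most recent weights file. This sketch assumes weights are stored at `<runs>/.../weights/best.pt`, as described above; `find_latest_weights` is a hypothetical helper name.

```python
from pathlib import Path


def find_latest_weights(runs_dir="runs", filename="best.pt"):
    """Return the most recently modified weights file under runs_dir.

    Returns None if the folder is missing or contains no matching file.
    """
    root = Path(runs_dir)
    if not root.is_dir():
        return None
    candidates = list(root.glob(f"**/weights/{filename}"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


# Prints None if no training run has been made in this directory yet.
print(find_latest_weights())
```

The returned path can be passed to `YOLO(...)` to load the trained model for inference or further training.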


<b>Clear runs and trained model if needed:</b>

In [None]:
# initialize file name
file_name = 'yolov8n.pt'

# construct the file path
file_path = os.path.join(os.getcwd(), file_name)

# delete the pretrained weights file
if os.path.exists(file_path):
    os.remove(file_path)
    print(f"The file {file_name} has been deleted.")
else:
    print(f"The file {file_name} does not exist.")

# directory path you want to delete
directory_path = "./runs"

# check if the directory exists
if os.path.exists(directory_path) and os.path.isdir(directory_path):
    
    # use shutil.rmtree() to delete the directory
    shutil.rmtree(directory_path)
    print(f"The directory {directory_path} has been deleted.")

else:
    print(f"The directory {directory_path} does not exist.")


### <u>Using the model for inference</u>

Once a model is trained, it can be used for inference on 