# Retrain the YOLO model

To retrain the YOLO model, we need a prepared dataset of car images labeled as moderate or severe accidents.  We have such a dataset (from Roboflow), with annotated images already split into training and validation sets.  We will use the training set to retrain our current YOLO model.

1. The encoded classes of objects we want to teach our model to detect are 0-'moderate' and 1-'severe'.
2. We have created a folder for the dataset (data) with 2 subfolders in it: 'train' and 'valid'.  Within each of these we have created 2 subfolders:  'images' and 'labels'.
3. Each image has an annotation text file in the 'labels' subfolder. The annotation text files have the same names as the image files.
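
The layout described in the list above can be checked before training. The following is a minimal sketch, assuming the dataset sits under a `datasets/` folder (where the download cells later in this notebook unzip it) and that images use common extensions such as .jpg or .png. The helper names `find_unpaired_images` and `check_dataset` are hypothetical and not part of ultralytics.

```python
from pathlib import Path

# Image extensions to look for; this set is an assumption, adjust as needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_unpaired_images(split_dir):
    """Return names of images in <split_dir>/images with no matching .txt in <split_dir>/labels."""
    split_dir = Path(split_dir)
    images_dir = split_dir / "images"
    labels_dir = split_dir / "labels"
    missing = []
    for image_path in sorted(images_dir.iterdir()):
        if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        # The annotation file shares the image's base name, with a .txt extension.
        if not (labels_dir / (image_path.stem + ".txt")).exists():
            missing.append(image_path.name)
    return missing


def check_dataset(root="datasets"):
    """Print a short report for the 'train' and 'valid' splits, skipping absent folders."""
    for split in ("train", "valid"):
        split_dir = Path(root) / split
        if not (split_dir / "images").is_dir():
            print(f"{split}: no images folder found under {split_dir}")
            continue
        missing = find_unpaired_images(split_dir)
        print(f"{split}: {len(missing)} image(s) without a label file")


if __name__ == "__main__":
    check_dataset()
```

An empty list from `find_unpaired_images` means every image in that split has an annotation file with the same base name.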

Once the images and associated annotations are ready, we create a dataset descriptor YAML file (data.yaml) that points to the created datasets and describes the object classes in them.  This YAML file is passed to the 'train' method of the model to start the training process.
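
For reference, a descriptor for the folder layout above might look like the sketch below. The keys `train`, `val`, and `names` follow the Ultralytics dataset YAML format, but the actual data.yaml shipped in the workshop zip files may differ, and how relative paths are resolved can depend on the Ultralytics version and settings. The output file name `data.example.yaml` and the helper `write_example_descriptor` are hypothetical, chosen so that the real descriptor is not overwritten.

```python
from pathlib import Path

# Illustrative descriptor only; the data.yaml in the workshop zip may differ.
EXAMPLE_DESCRIPTOR = """\
train: train/images   # training images (labels are looked up in train/labels)
val: valid/images     # validation images (labels are looked up in valid/labels)
names:
  0: moderate
  1: severe
"""


def write_example_descriptor(target="data.example.yaml"):
    """Write the example descriptor to `target` and return its path."""
    target = Path(target)
    target.write_text(EXAMPLE_DESCRIPTOR)
    return target


if __name__ == "__main__":
    print(write_example_descriptor().read_text())
```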

Let's get started by installing ultralytics!

In [1]:
!pip install ultralytics 
from ultralytics import YOLO


[notice] A new release of pip is available: 23.0.1 -> 23.3.2
[notice] To update, run: pip install --upgrade pip


Next, let's load the pretrained YOLO model 'yolov8m.pt'.

In [2]:
# Load model
model = YOLO('yolov8m.pt')  # load a pretrained model (recommended for training)

Once we've loaded our model, we are going to start a training loop.  'data' is the only required option; you pass the YAML descriptor file to it.
Each training cycle (epoch) has a training phase and a validation phase.  

## Get the training data

We have provided the following 2 training data sets, available as 'zip files', and located in an S3 bucket:  
1) accident-full.zip   - to be used to fully re-train the model.
2) accident-sample.zip - to be used to partially re-train the model when we don't have the time to fully re-train the model.

Your instructor will let you know which data set 'zip file' you will be using in this workshop.

In [None]:
# *************************************************************************************************
# ********************                     VERY IMPORTANT!!!                      *****************
# ********************  ONLY EXECUTE below cell for FULLY RE-TRAINING the model   *****************
# *************************************************************************************************

In [None]:
%%bash

# Check if the directory exists, if not, create it
if [ ! -d "./datasets/" ]; then
    mkdir -p ./datasets/
fi

cd ./datasets/

URL="https://rhods-public.s3.amazonaws.com/sample-data/accident-data/accident-full.zip" 

# Check if the file exists, if not, download it
if [ ! -e "accident-full.zip" ]; then
    # curl $URL -o accident.zip
    echo "Downloading file"
    time curl -L -O -J \
        --retry 3 \
        --retry-delay 5 \
        --retry-max-time 30 \
        "$URL"
    ls -alh accident-full.zip    

    echo "unzipping file"
    time unzip -q accident-full.zip 
fi

In [None]:
# *****************************************************************************************************
# ********************                     VERY IMPORTANT!!!                          *****************
# ********************  ONLY EXECUTE below cell for PARTIALLY RE-TRAINING the model   *****************
# *****************************************************************************************************

In [3]:
%%bash

# Check if the directory exists, if not, create it
if [ ! -d "./datasets/" ]; then
    mkdir -p ./datasets/
fi

cd ./datasets/

URL="https://rhods-public.s3.amazonaws.com/sample-data/accident-data/accident-sample.zip" 

# Check if the file exists, if not, download it
if [ ! -e "accident-sample.zip" ]; then
    # curl $URL -o accident.zip
    echo "Downloading file"
    time curl -L -O -J \
        --retry 3 \
        --retry-delay 5 \
        --retry-max-time 30 \
        "$URL"
    ls -alh accident-sample.zip    

    echo "unzipping file"
    time unzip -q accident-sample.zip 
fi

Downloading file


  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 5309k  100 5309k    0     0  14.4M      0 --:--:-- --:--:-- --:--:-- 14.3M

real	0m0.380s
user	0m0.019s
sys	0m0.017s


-rw-r--r--. 1 1003670000 1003670000 5.2M Jan  5 02:48 accident-sample.zip
unzipping file



real	0m0.055s
user	0m0.044s
sys	0m0.008s


## Re-training our YOLO model

Let's start by understanding what an 'epoch' is.  Machine learning models are trained by passing a dataset through the training algorithm. Each time the entire training dataset has passed through the algorithm, one <b>epoch</b> is complete. An <b>epoch</b>, in machine learning, is therefore one full pass of the training data through the algorithm.
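
To make this concrete: within one epoch the images are processed in batches, so the number of iterations per epoch is the dataset size divided by the batch size, rounded up. The sketch below uses the 22 training images and the batch size of 16 reported in the sample-data training log later in this notebook; the helper names are hypothetical.

```python
import math


def iterations_per_epoch(num_images, batch_size):
    """Number of batches needed to pass every image through the model once."""
    return math.ceil(num_images / batch_size)


def total_iterations(num_images, batch_size, epochs):
    """Number of batches processed over the whole training run."""
    return epochs * iterations_per_epoch(num_images, batch_size)


print(iterations_per_epoch(22, 16))        # 2
print(total_iterations(22, 16, epochs=7))  # 14
```

The value 2 matches the `0/2` progress bar shown in the training output for the sample dataset.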

In the training run below you will see as many <b>epochs</b> as you set with the 'epochs' argument in the following code snippet:  
<b>results = model.train(data='datasets/data.yaml', epochs=1, imgsz=640) </b>

In your training run, each <b>epoch</b> will show a summary for both the training and validation phases: lines 1 and 2 show results of the training phase and lines 3 and 4 show the results of the validation phase for each epoch.  

Execute the following cell to start re-training the model!

In [None]:
# Train model

#results = model.train(data='data.yaml', epochs=7, imgsz=640)
results = model.train(data='datasets/data.yaml', epochs=1, imgsz=640)


Ultralytics YOLOv8.0.235 🚀 Python-3.8.6 torch-1.13.1+cpu CPU (Intel Xeon Platinum 8175M 2.50GHz)
[34m[1mengine/trainer: [0mtask=detect, mode=train, model=yolov8m.pt, data=datasets/data.yaml, epochs=1, time=None, patience=50, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=None, workers=8, project=None, name=train22, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_c

[34m[1mtrain: [0mScanning /opt/app-root/src/insurance-claim-processing/lab-materials/04/datasets/train/labels... 22 images, 0 backgrounds, 0 corrupt: 100%|██████████| 22/22 [00:00<00:00, 1071.29it/s]

[34m[1mtrain: [0mNew cache created: /opt/app-root/src/insurance-claim-processing/lab-materials/04/datasets/train/labels.cache



[34m[1mval: [0mScanning /opt/app-root/src/insurance-claim-processing/lab-materials/04/datasets/valid/labels... 22 images, 0 backgrounds, 0 corrupt: 100%|██████████| 22/22 [00:00<00:00, 1644.83it/s]

[34m[1mval: [0mNew cache created: /opt/app-root/src/insurance-claim-processing/lab-materials/04/datasets/valid/labels.cache





Plotting labels to runs/detect/train22/labels.jpg... 
[34m[1moptimizer:[0m 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
[34m[1moptimizer:[0m AdamW(lr=0.001667, momentum=0.9) with parameter groups 77 weight(decay=0.0), 84 weight(decay=0.0005), 83 bias(decay=0.0)
1 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


  0%|          | 0/2 [00:00<?, ?it/s]

In [None]:
#Note:  if we are happy with our model training results, we would export our model to ONNX format. 
#ObjDetOXModel = YOLO("runs/detect/train/weights/best.pt").export(format="onnx")

## Interpreting our Training Results

The notes below explain how to interpret the training results in more detail.


Each epoch shows a summary for both the training and validation phases: lines 1 and 2 show results of the training phase and lines 3 and 4 show the results of the validation phase for each epoch.

The training phase measures the model's error with a loss function, so the most valuable metrics here are box_loss and cls_loss.

box_loss shows the amount of error in detected bounding boxes.
cls_loss shows the amount of error in detected object classes.

If the model really learns something from the data, then you should see that these values decrease from epoch to epoch. 
In a previous screenshot the box_loss decreased: 1.271, 1.113, 0.8679 and the cls_loss decreased too: 1.893, 1.404, 0.9703.
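
As a quick illustration, the loss values quoted above can be checked for a downward trend in a few lines. This is a minimal sketch; the helper name `is_strictly_decreasing` is hypothetical.

```python
def is_strictly_decreasing(values):
    """True if every value is lower than the one before it."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


# Loss values quoted in the text above.
box_loss = [1.271, 1.113, 0.8679]
cls_loss = [1.893, 1.404, 0.9703]

print(is_strictly_decreasing(box_loss))  # True
print(is_strictly_decreasing(cls_loss))  # True
```

In practice a loss can rise slightly between two neighbouring epochs and still trend downward overall, so a strict check like this one is best treated as a rough signal.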

The most valuable quality metric is mAP50-95, the mean Average Precision averaged over IoU (Intersection over Union) thresholds from 0.50 to 0.95. If the model learns and improves, this value should grow from epoch to epoch.  In a previous screenshot mAP50-95 increased: 0.314 (epoch 1), 0.663 (epoch 4), 0.882 (epoch 7).
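
The "50-95" in the metric name refers to the range of IoU (Intersection over Union) thresholds, 0.50 to 0.95 in steps of 0.05, at which a predicted box is compared with the annotated box. The sketch below shows how IoU is computed for two boxes and which of those thresholds a given overlap satisfies. It is an illustration of the idea only, not the ultralytics implementation, and the function names are hypothetical.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlapping rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


# IoU thresholds 0.50, 0.55, ..., 0.95
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_matched(iou):
    """Thresholds at which a prediction with this IoU would count as a match."""
    return [t for t in IOU_THRESHOLDS if iou >= t]


predicted = (0, 0, 10, 8)
annotated = (0, 0, 10, 10)
iou = box_iou(predicted, annotated)
print(iou, thresholds_matched(iou))
```

A prediction that overlaps the annotation only loosely counts as a match at the low thresholds but not at the strict ones, which is why mAP50-95 is harder to push up than a metric measured at a single threshold of 0.50.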

If after the last epoch you did not get acceptable precision, you can increase the number of epochs and run the training again. Also, you can tune other parameters like batch, lr0, lrf or change the optimizer you're using.
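
A rerun with adjusted settings could be prepared as in the sketch below. The argument names (`epochs`, `batch`, `lr0`, `lrf`, `optimizer`) are training arguments accepted by `model.train`, but the values are placeholders rather than recommendations, and the helper `build_train_overrides` is hypothetical. Note that the training log earlier in this notebook reports that `lr0` is ignored when `optimizer=auto`, so this sketch sets an explicit optimizer.

```python
def build_train_overrides(epochs=7, batch=16, lr0=0.01, lrf=0.01, optimizer="SGD"):
    """Collect keyword arguments for model.train(); all values are placeholders."""
    if epochs < 1 or batch < 1:
        raise ValueError("epochs and batch must be positive integers")
    return {
        "data": "datasets/data.yaml",  # descriptor path used in this notebook
        "imgsz": 640,
        "epochs": epochs,
        "batch": batch,
        "lr0": lr0,              # initial learning rate
        "lrf": lrf,              # final learning rate, as a fraction of lr0
        "optimizer": optimizer,  # explicit choice so that lr0 is not overridden
    }


overrides = build_train_overrides(epochs=7, batch=8, lr0=0.005)
print(overrides)

# With the model loaded earlier in this notebook:
# results = model.train(**overrides)
```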

During training, the model is saved after each epoch to the runs/detect/train/weights/last.pt file, and the model with the highest precision is saved to the runs/detect/train/weights/best.pt file. The run folder name may carry a numeric suffix; the training log above shows 'train22'. So, after training is finished, you can take the best.pt file to use in production.
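
Because each training run gets its own folder, it can help to look up the newest best.pt rather than hard-coding the path. The following is a minimal sketch, assuming the default `runs/detect` output folder seen in the training log; the helper name `latest_best_weights` is hypothetical.

```python
from pathlib import Path


def latest_best_weights(runs_root="runs/detect"):
    """Return the most recently modified best.pt under runs_root, or None if there is none."""
    candidates = list(Path(runs_root).glob("*/weights/best.pt"))
    if not candidates:
        return None
    # Pick the file with the newest modification time.
    return max(candidates, key=lambda path: path.stat().st_mtime)


if __name__ == "__main__":
    print(latest_best_weights())
```

The returned path could then be passed to `YOLO(...)` in place of the hard-coded path in the commented-out export cell above.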

Note:  In real-world problems, you need to run many more epochs (than we have shown here) and be prepared to wait hours or days (like we did!) until training finishes.




Now that we have retrained our model, let's test it against images of car accidents!   &nbsp; <B> Please go to notebook '04-04-accident-recog.ipynb'</B>