# Transfer Learning for Multiwavelength Drone Images

## Spring Season - May

The objective of this project is to evaluate the performance of pretrained Convolutional Neural Networks (CNNs) on different sets of data. Here we train the newly upgraded YOLOv8 on drone landmine images taken during the spring season, in May 2024. The model is trained on this data and then trained on data from the 2024 winter season. We then compare the performance of the model on images taken in the visible band and in the infrared band.

The steps below show how the results are produced.

First, let's import the libraries that we are going to use for this task.

In [40]:
import csv
import os
import shutil
import random
import yaml
import numpy as np
import pandas as pd
from ultralytics import YOLO
import xml.etree.ElementTree as ET

Next, we define the directory that holds our images.

In [2]:
image_dir = "../data/raw/may"   
os.makedirs(image_dir, exist_ok = True)

### Pre-Preprocessing

We want the images and the labels to follow a standard naming format, so that the name tells you the period and the image type.

For example: "may_afternoon_0_lwir_89.xml"
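As a rough illustration of why such a format is useful, a name can be split back into its parts. The sketch below assumes the pattern month, period, index, band, frame, separated by underscores; the meaning given to the two numeric fields ("index" and "frame") is a guess made for illustration, and *parse_name* is a hypothetical helper that is not part of the notebook.

```python
import os


def parse_name(filename):
    """Split a name such as 'may_afternoon_0_lwir_89.xml' into its parts."""
    stem, _ = os.path.splitext(filename)
    # Assumed layout: month_period_index_band_frame
    month, period, index, band, frame = stem.split("_")
    return {"month": month, "period": period, "index": int(index),
            "band": band, "frame": int(frame)}


info = parse_name("may_afternoon_0_lwir_89.xml")
print(info["period"], info["band"])
```

A name that does not have exactly five underscore-separated fields raises a ValueError here, which can help catch files that were never renamed.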

<div class="alert alert-block alert-warning">

<b>!! Attention !!</b> Do NOT run these cells twice. Although the code skips files that already carry the prefix, it is still recommended to avoid running it twice.

</div>

First, rename the images.

In [5]:
prefix = "may_noon_"

for filename in os.listdir(image_dir):

    if filename.endswith(".jpg") and not filename.startswith(prefix):
        
        old_path = os.path.join(image_dir, filename)
        new_filename = prefix + filename
        new_path = os.path.join(image_dir, new_filename)

        os.rename(old_path, new_path)

Next, rename the labels.

In [6]:
for filename in os.listdir(image_dir):

    if filename.endswith(".xml") and not filename.startswith(prefix):
        
        old_path = os.path.join(image_dir, filename)
        new_filename = prefix + filename
        new_path = os.path.join(image_dir, new_filename)

        os.rename(old_path, new_path)
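The two renaming cells differ only in the file extension, so they could be folded into one helper. The following is a minimal sketch under that assumption; *add_prefix* is a hypothetical name, and the demonstration runs on a temporary folder with made-up file names rather than on the real data.

```python
import os
import tempfile


def add_prefix(directory, prefix, extensions=(".jpg", ".xml")):
    """Prefix matching files once; files already carrying the prefix are skipped."""
    renamed = []
    # sorted() takes a snapshot of the listing before any file is renamed
    for filename in sorted(os.listdir(directory)):
        if filename.endswith(extensions) and not filename.startswith(prefix):
            new_name = prefix + filename
            os.rename(os.path.join(directory, filename),
                      os.path.join(directory, new_name))
            renamed.append(new_name)
    return renamed


# Try it on a throwaway folder rather than on the real data
with tempfile.TemporaryDirectory() as tmp:
    for name in ["0_lwir_1.jpg", "0_lwir_1.xml"]:
        open(os.path.join(tmp, name), "w").close()
    first_run = add_prefix(tmp, "may_noon_")
    second_run = add_prefix(tmp, "may_noon_")  # nothing left to rename
    final_names = sorted(os.listdir(tmp))
```

The guard only recognises the current prefix. A file that already carries a different period prefix would still be renamed, which is one more reason to run the renaming step only once per batch.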

Each object in the images is labelled as one of *'ap_metal'*, *'ap_plastic'*, *'at_metal'* or *'at_plastic'*. Let us write code that iterates through all the labels and extracts these unique classes.

In [3]:
unique_classes = set()

for filename in os.listdir(image_dir):
    
    if filename.endswith(".xml"):
        
        filepath = os.path.join(image_dir, filename)
        tree = ET.parse(filepath)
        root = tree.getroot()
        
        # Iterate over each object tag
        for obj in root.findall("object"):
            
            class_name = obj.find("name").text
            unique_classes.add(class_name)

# Convert to a sorted list so that each class keeps a stable index
class_list = sorted(unique_classes)

print("Unique classes found:", class_list)

Unique classes found: ['ap_metal', 'ap_plastic', 'at_metal', 'at_plastic']
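Besides the class names, it can be useful to know how many objects each class has. The sketch below counts objects per class with *collections.Counter*; the annotation is a made-up inline example so that the snippet runs without the dataset, and on the real data the same counting would sit inside the loop over the .xml files.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical annotation, written inline so the example runs without the dataset
sample_xml = """
<annotation>
  <object><name>at_plastic</name></object>
  <object><name>at_plastic</name></object>
  <object><name>ap_metal</name></object>
</annotation>
"""

root = ET.fromstring(sample_xml)
# One count per <object> tag, keyed by its class name
class_counts = Counter(obj.find("name").text for obj in root.findall("object"))
print(class_counts.most_common())
```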


Next, we need to convert the labels into a format that YOLO accepts. We do this with a function that takes a ".xml" annotation file, extracts the image width and height, and loops over each labelled object. For every object it reads the class name, skips the object if the name is not in the class list from the previous cell, and converts the name into its index in that list, since YOLO works with class IDs rather than names.

The function then extracts the bounding box corner coordinates from the .xml file and converts them to YOLO format: the box centre, width and height, each normalized to the range 0 to 1 by dividing by the image width or height.

In [8]:
def convert_voc_to_yolo(xml_file):
    """
    This function reads .xml annotation file, 
    extracts bounding boxes and class names 
    before converting them to YOLO format of 
    one string per object.

    Parameters
    ----------
    xml_file : string
        The path to a Pascal VOC-style XML 
        annotation label file.

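    Returns
    -------
    yolo_lines : list of str
        One "class_id x_center y_center width height"
        string per object, with the coordinates
        normalized to the range 0 to 1.
    """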
    tree = ET.parse(xml_file)
    root = tree.getroot()
    w = int(root.find("size/width").text)
    h = int(root.find("size/height").text)
    
    yolo_lines = []
    for obj in root.findall("object"):
        
        cls = obj.find("name").text
        if cls not in class_list:
            
            continue
        cls_id = class_list.index(cls)
        xmlbox = obj.find("bndbox")
        xmin = int(xmlbox.find("xmin").text)
        ymin = int(xmlbox.find("ymin").text)
        xmax = int(xmlbox.find("xmax").text)
        ymax = int(xmlbox.find("ymax").text)

        # Convert to YOLO format
        x_center = ((xmin + xmax) / 2) / w
        y_center = ((ymin + ymax) / 2) / h
        bw = (xmax - xmin) / w
        bh = (ymax - ymin) / h
        yolo_lines.append(f"{cls_id} {x_center} {y_center} {bw} {bh}")
        
    return yolo_lines
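To make the normalization concrete, here is the same arithmetic applied to a single box. The image size and box corners are made-up numbers, and *voc_box_to_yolo* is a hypothetical stand-alone helper that mirrors the formulas in the function above.

```python
def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Return (x_center, y_center, width, height), each normalized to 0-1."""
    x_center = ((xmin + xmax) / 2) / img_w
    y_center = ((ymin + ymax) / 2) / img_h
    box_w = (xmax - xmin) / img_w
    box_h = (ymax - ymin) / img_h
    return x_center, y_center, box_w, box_h


# Made-up box on a made-up 640x480 image
yolo_box = voc_box_to_yolo(100, 120, 300, 360, 640, 480)
print(yolo_box)  # (0.3125, 0.5, 0.3125, 0.5)
```

The box spans 200 of 640 pixels horizontally and 240 of 480 pixels vertically, which gives the width 0.3125 and the height 0.5.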

We loop over the .xml files in the data folder, convert the annotations from VOC format to YOLO format using our function, and save them as text files.

In [9]:
for xml_file in os.listdir(image_dir):
    
    if not xml_file.endswith(".xml"):
        
        continue
        
    xml_path = os.path.join(image_dir, xml_file)
    txt_path = os.path.join(image_dir, xml_file.replace(".xml", ".txt"))
    
    yolo_data = convert_voc_to_yolo(xml_path)
    with open(txt_path, "w") as f:
        
        f.write("\n".join(yolo_data))
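Before training, a quick sanity check on the generated lines can catch malformed annotations, for example a box that extends past the image edge. The sketch below is one possible check, not part of the original workflow; *check_yolo_line* is a hypothetical helper.

```python
def check_yolo_line(line, num_classes):
    """Return True if a label line has a valid class ID and coordinates in [0, 1]."""
    parts = line.split()
    if len(parts) != 5:
        return False
    try:
        cls_id = int(parts[0])
        coords = [float(p) for p in parts[1:]]
    except ValueError:
        return False
    return 0 <= cls_id < num_classes and all(0.0 <= c <= 1.0 for c in coords)


good = check_yolo_line("3 0.3125 0.5 0.3125 0.5", num_classes=4)
bad = check_yolo_line("3 1.2 0.5 0.3125 0.5", num_classes=4)  # x_center > 1
```

On the real data this check could be applied to every line of every .txt file, with *num_classes* set to the length of the class list.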

#### Preparing the data before training

Now that we have processed the data, our next task is to split it into "train", "validation" and "test" sets.

We train the YOLO model on 75% of the data, validate it on 20%, and test it on the remaining 5%. Before splitting the data into three portions, we shuffle the images as shown in the next cell.

In [4]:
output_base = "../results/may/dataset"
train_ratio, val_ratio, test_ratio = 0.75, 0.2, 0.05

# Shuffle the original images
images = [f for f in os.listdir(image_dir) if f.endswith((".jpg", ".png"))]
random.shuffle(images)

# Compute split indices
total = len(images)
train_end = int(total * train_ratio)
val_end = train_end + int(total * val_ratio)

# Split image filenames
split_data = {"train": images[:train_end], "val": images[train_end:val_end],
    "test": images[val_end:]}

In [5]:
total

830
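The shuffle above is unseeded, so each run of the notebook produces a different split. If a repeatable split is wanted, one option is to shuffle with a seeded generator. This is a sketch under that assumption: *split_filenames* and the seed value are hypothetical, and the file names are placeholders standing in for the 830 images.

```python
import random


def split_filenames(names, train_ratio=0.75, val_ratio=0.2, seed=0):
    """Shuffle a copy of the names with a fixed seed and cut it into three parts."""
    shuffled = sorted(names)               # fixed starting order
    random.Random(seed).shuffle(shuffled)  # seeded, so reruns give the same split
    train_end = int(len(shuffled) * train_ratio)
    val_end = train_end + int(len(shuffled) * val_ratio)
    return {"train": shuffled[:train_end],
            "val": shuffled[train_end:val_end],
            "test": shuffled[val_end:]}


# Placeholder names standing in for the 830 images
fake_names = [f"img_{i}.jpg" for i in range(830)]
sizes = {k: len(v) for k, v in split_filenames(fake_names).items()}
print(sizes)  # {'train': 622, 'val': 166, 'test': 42}
```

With 830 images the split indices give 622 training, 166 validation and 42 test images; the test set simply receives whatever remains after the first two cuts.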

We then dynamically create the "images" and "labels" folders to store the training, validation and test data.

<div class="alert alert-block alert-info">
    
<b>Note:</b> This cell will likely print warnings about missing labels for some images. There is no need to worry about this, since some images did not contain any of the target objects.

</div>

In [6]:
# Create folder structure and copy files
for split in ["train", "val", "test"]:
    img_out_dir = os.path.join(output_base, "images", split)
    lbl_out_dir = os.path.join(output_base, "labels", split)
    os.makedirs(img_out_dir, exist_ok=True)
    os.makedirs(lbl_out_dir, exist_ok=True)

    for img_file in split_data[split]:
        # Copy image
        shutil.copy(os.path.join(image_dir, img_file), os.path.join(img_out_dir, img_file))

        # Copy corresponding label
        txt_file = os.path.splitext(img_file)[0] + ".txt"
        src_lbl = os.path.join(image_dir, txt_file)
        if os.path.exists(src_lbl):
            shutil.copy(src_lbl, os.path.join(lbl_out_dir, txt_file))
        else:
            print(f"⚠️ Label not found for image: {img_file}")

⚠️ Label not found for image: may_morning_2_lwir_27.jpg
⚠️ Label not found for image: may_afternoon_1_lwir_11.jpg
⚠️ Label not found for image: may_afternoon_0_lwir_134.jpg
⚠️ Label not found for image: may_noon_0_lwir_10.jpg
⚠️ Label not found for image: may_afternoon_1_lwir_10.jpg
⚠️ Label not found for image: may_noon_0_lwir_11.jpg
⚠️ Label not found for image: may_afternoon_0_lwir_11.jpg
⚠️ Label not found for image: may_afternoon_0_lwir_10.jpg
⚠️ Label not found for image: may_morning_0_lwir_7.jpg
⚠️ Label not found for image: may_afternoon_0_lwir_116.jpg


Next, we need to create a .yaml file that tells the YOLOv8 model where our dataset is and which classes we are using.

<div class="alert alert-block alert-info">
    
<b>Note:</b> A YAML file is a plain-text configuration format commonly used to store structured data in a human-readable way, especially for configuring machine learning models and describing metadata.

</div>

In [7]:
data = {
    # Absolute dataset root, so the YAML does not depend on the working directory
    "path": os.path.abspath(output_base),
    # Split folders, given relative to "path"
    "train": "images/train",
    "val": "images/val",
    "test": "images/test",
    "nc": len(class_list),
    "names": class_list,
}

yaml_path = os.path.join(output_base, "data.yaml")
with open(yaml_path, "w") as f:
    
    yaml.dump(data, f, default_flow_style = False)

Now we load the pretrained YOLOv8 nano model using the ultralytics library.

In [8]:
model = YOLO("yolov8n.pt") 

Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolov8n.pt to 'yolov8n.pt': 100% ━━━━━━━━━━━━ 6.2/6.2MB 16.1MB/s 0.4s


We then train the model on our data.

In [9]:
yaml_file = os.path.join(output_base, "data.yaml")
model.train(data = yaml_file, epochs = 100, patience = 50, imgsz = 640, batch = 16)

New https://pypi.org/project/ultralytics/8.3.187 available 😃 Update with 'pip install -U ultralytics'
Ultralytics 8.3.186 🚀 Python-3.11.11 torch-2.8.0 CPU (Apple M1 Pro)
[34m[1mengine/trainer: [0magnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=../results/may/dataset/data.yaml, degrees=0.0, deterministic=True, device=cpu, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=100, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8n.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=train, nbs=64, nms=False, opset=None, optimize=Fa

ultralytics.utils.metrics.DetMetrics object with attributes:

ap_class_index: array([0, 1, 2, 3])
box: ultralytics.utils.metrics.Metric object
confusion_matrix: <ultralytics.utils.metrics.ConfusionMatrix object at 0x17bb87710>
curves: ['Precision-Recall(B)', 'F1-Confidence(B)', 'Precision-Confidence(B)', 'Recall-Confidence(B)']
curves_results: [[array([          0,    0.001001,    0.002002,    0.003003,    0.004004,    0.005005,    0.006006,    0.007007,    0.008008,    0.009009,     0.01001,    0.011011,    0.012012,    0.013013,    0.014014,    0.015015,    0.016016,    0.017017,    0.018018,    0.019019,     0.02002,    0.021021,    0.022022,    0.023023,
          0.024024,    0.025025,    0.026026,    0.027027,    0.028028,    0.029029,     0.03003,    0.031031,    0.032032,    0.033033,    0.034034,    0.035035,    0.036036,    0.037037,    0.038038,    0.039039,     0.04004,    0.041041,    0.042042,    0.043043,    0.044044,    0.045045,    0.046046,    0.047047,
          0.04

### Model Testing

Now that we have trained our model, we can test it on the 5% images that we set aside earlier.

<div class="alert alert-block alert-info">
    
<b>Note:</b> YOLOv8 automatically saves the model during training. The saved weights can be found relative to the directory where the training script is run, under *runs/detect/train/weights/* (repeated runs are saved as *train2*, *train3*, and so on).

The best-performing weights are automatically saved as *"best.pt"*.

</div>
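Because each training run gets its own folder, it is easy to point at stale weights by accident. One way to avoid that is to pick the most recently modified *best.pt*. The sketch below assumes the *runs/detect/&lt;run&gt;/weights/best.pt* layout described in the note; *find_latest_weights* is a hypothetical helper, and the demonstration uses a temporary folder with empty placeholder files rather than real weights.

```python
import glob
import os
import tempfile


def find_latest_weights(runs_dir="runs/detect"):
    """Return the most recently modified best.pt under runs_dir, or None."""
    pattern = os.path.join(runs_dir, "*", "weights", "best.pt")
    candidates = glob.glob(pattern)
    return max(candidates, key=os.path.getmtime) if candidates else None


# Demonstration on a temporary folder with two fake training runs
with tempfile.TemporaryDirectory() as tmp:
    for i, run in enumerate(["train", "train2"]):
        weights_dir = os.path.join(tmp, run, "weights")
        os.makedirs(weights_dir)
        path = os.path.join(weights_dir, "best.pt")
        open(path, "w").close()
        os.utime(path, (1000 + i, 1000 + i))  # make "train2" the newer file
    latest = find_latest_weights(tmp)
    latest_run = os.path.basename(os.path.dirname(os.path.dirname(latest)))
    missing = find_latest_weights(os.path.join(tmp, "no_such_dir"))
```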

We first define the location of the saved model that we have just trained.

In [14]:
model_path = os.path.join("/Users/bosoro/Documents/GitHub/flight/scripts/runs/detect/train/weights", "best.pt")

We also define the path of the test images.

In [24]:
test_images = "../results/may/dataset/images/test"

We now load the saved model into the notebook.

In [26]:
may_model = YOLO(model_path)

And make predictions on the test images.

In [None]:
results = may_model.predict(source = test_images, save = True, imgsz = 640)

Now we quantitatively evaluate the model on the test data to understand its performance on data that it has not seen before.

In [43]:
metrics = may_model.val(data = yaml_file, split = "test") 

Ultralytics 8.3.186 🚀 Python-3.11.11 torch-2.8.0 CPU (Apple M1 Pro)
[34m[1mval: [0mFast image access ✅ (ping: 0.2±0.1 ms, read: 386.6±64.7 MB/s, size: 186.7 KB)
[K[34m[1mval: [0mScanning /Users/bosoro/Documents/GitHub/flight/results/may/dataset/labels/test.cache... 42 images, 0 backgrounds, 0 corrupt: 100% ━━━━━━━━━━━━ 42/42 101650.8it/s 0.0s0s
[K                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 3/3 0.58it/s 5.1s
                   all         42        360      0.872      0.902      0.921      0.547
              ap_metal         22         38      0.799      0.868      0.875      0.361
            ap_plastic         26         52      0.749      0.769      0.835      0.431
              at_metal         30         45      0.961      0.978      0.979      0.662
            at_plastic         41        225      0.977      0.991      0.994      0.735
Speed: 0.8ms preprocess, 105.6ms inference, 0.0ms loss, 0.4ms postproces
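In the table above, the "all" row appears to be the unweighted mean of the four per-class rows, so the 225 *at_plastic* instances count no more than the 38 *ap_metal* ones. The sketch below checks this with the printed values; the key names are chosen for illustration, and agreement can only be expected to within the three-decimal rounding of the table.

```python
from statistics import mean

# Per-class values copied from the validation table above
per_class = {
    "ap_metal":   {"precision": 0.799, "recall": 0.868, "map50": 0.875, "map50_95": 0.361},
    "ap_plastic": {"precision": 0.749, "recall": 0.769, "map50": 0.835, "map50_95": 0.431},
    "at_metal":   {"precision": 0.961, "recall": 0.978, "map50": 0.979, "map50_95": 0.662},
    "at_plastic": {"precision": 0.977, "recall": 0.991, "map50": 0.994, "map50_95": 0.735},
}

# Unweighted (macro) average of each metric over the four classes
macro = {metric: mean(row[metric] for row in per_class.values())
         for metric in ["precision", "recall", "map50", "map50_95"]}
print({k: round(v, 3) for k, v in macro.items()})
```

The averages land on the "all" row (0.872, 0.902, 0.921, 0.547) up to rounding, which suggests that the two anti-personnel classes pull the overall score down more than their instance counts alone would.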

Next, we save the per-class performance metrics of the model for future comparison.

In [82]:
class_names = may_model.names
rows = []
for i, name in class_names.items():
    
    p, r, ap50, ap = metrics.box.class_result(i)
    rows.append({"class_id": i, "class_name": name, "precision": p,
        "recall": r, "ap50": ap50, "ap50_95": ap,
        "season": "spring", "band": "optical"})

df = pd.DataFrame(rows)
RESULTS = "../results"
output_file = os.path.join(RESULTS, "may_images_metrics.csv")
df.to_csv(output_file, index=False)