# Aircraft Detection with YOLOv8

In this notebook we will demonstrate how to use and fine-tune the YOLOv8 model to detect aircraft on the ground.

The [YOLOv8](https://github.com/ultralytics/ultralytics) architecture is developed by Ultralytics, and you can easily install all the required tools by running:
```console
pip install ultralytics
```

Now let's check that everything has been installed correctly.

In [None]:
!yolo check

In [None]:
import ast
import cv2
import numpy as np
import os
import pandas as pd
import random
import torch

from collections import Counter
from tqdm.notebook import tqdm
from ultralytics import YOLO

from matplotlib import pyplot as plt
plt.rcParams['figure.figsize'] = [20, 15]

### Dataset

We will be working with the [Airbus](https://www.kaggle.com/datasets/airbusgeo/airbus-aircrafts-sample-dataset) satellite dataset developed by Airbus Defense and Space Intelligence to detect grounded aircraft.

The dataset contains 103 extracts of satellite imagery at roughly 50 cm resolution. Each image is stored as a JPEG file of size 2560 x 2560 pixels (i.e. 1280 meters on the ground). The locations are various airports worldwide.

All aircraft have been annotated with bounding boxes on the provided imagery. The annotations are provided in the form of closed GeoJSON polygons.

In [None]:
def imread(filename):
    return cv2.cvtColor(cv2.imread(filename), cv2.COLOR_BGR2RGB)

data_folder = '/media/janko/DATA3/data/datasets/airbus/'
imfiles = os.listdir(os.path.join(data_folder, 'images'))
imfiles = [os.path.join(data_folder, 'images', f) for f in imfiles if os.path.splitext(f)[-1] == '.jpg']

sample = random.choice(imfiles)
image = imread(sample)
rows, cols, channels = image.shape

plt.imshow(image)

print('Number of samples:', len(imfiles))
print('Image shape:      ', image.shape)

Check that all images have the same size.

In [None]:
assert all(cv2.imread(imfile).shape == (rows, cols, channels) for imfile in tqdm(imfiles))

Now let's load the file containing the annotations. When you open the CSV file, you will see that the geometry information is stored as a string. We need to convert it to a more "ML friendly" format, which can be done by passing a custom converter function to `pd.read_csv`.

In [None]:
def geo_to_ndarray(x): 
    return np.array(ast.literal_eval(x))

annotations = pd.read_csv(os.path.join(data_folder, 'annotations.csv'),
                          converters={'geometry': geo_to_ndarray})
annotations
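
To make the conversion concrete, here is a minimal sketch of what `ast.literal_eval` does with one geometry string. The string below is made up for illustration (it is not taken from the dataset) and assumes that the CSV stores each polygon as a Python-style list of `(x, y)` tuples.

```python
import ast

# Hypothetical geometry string, in the style assumed for the CSV column
geometry_str = "[(10, 20), (110, 20), (110, 80), (10, 80), (10, 20)]"

# literal_eval only parses Python literals, so no arbitrary code is executed
points = ast.literal_eval(geometry_str)

print(len(points))              # number of vertices, including the closing one
print(points[0] == points[-1])  # True when the polygon is closed
```

In the notebook, the parsed list is additionally wrapped in `np.array` so that the coordinates can be sliced by column.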

We see a total of 3425 annotated aircraft. Each aircraft is annotated with a bounding box in closed format. Let's now check that there are no objects other than aircraft.

In [None]:
Counter(annotations['class'])

Let's also make sure that all geometry objects are closed bounding boxes. This means that they contain 5 points and that the first point (the origin) and the last point have the same coordinates.

In [None]:
assert all(len(geo) == 5 for geo in annotations['geometry'])
assert all(np.array_equal(geo[0, :], geo[-1, :]) for geo in annotations.geometry)

It is always important to visualize the data and annotations to make sure that we properly understand and handle the format and that the annotations are, in fact, correct.

In [None]:
sample = random.choice(imfiles)
image = imread(sample)

labels = annotations[annotations.image_id == os.path.basename(sample)]
# cv2.polylines expects int32 points of shape (N, 1, 2)
points = [geo.reshape((-1, 1, 2)).astype(np.int32) for geo in labels.geometry]
cv2.polylines(image, points, isClosed=True, color=(0, 255, 0), thickness=5)

plt.imshow(image)

### Data Format Conversion

In order to use the YOLOv8 training tools efficiently, we need to use the [YOLOv8 data format](https://docs.ultralytics.com/datasets/detect/#ultralytics-yolo-format).
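
In this format, each line of a label file holds a class index followed by the box center and the box size, all normalized by the image width and height. A minimal sketch of the conversion from pixel corners, using a hypothetical helper name `to_yolo` and made-up numbers:

```python
def to_yolo(box, img_width, img_height, class_id=0):
    """Convert (x_min, y_min, x_max, y_max) in pixels to a YOLO label tuple."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2.0 / img_width
    y_center = (y_min + y_max) / 2.0 / img_height
    width = (x_max - x_min) / img_width
    height = (y_max - y_min) / img_height
    return (class_id, x_center, y_center, width, height)

label = to_yolo((128, 256, 384, 512), img_width=512, img_height=512)
print(' '.join(str(v) for v in label))  # one line of a YOLO label file
```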

In addition, we will split each image into 512x512 crops. This is commonly done so that the model can be trained without running into memory problems. We therefore create a separate dataset of image crops and adjust the bounding boxes accordingly.

In [None]:
def recompute_box_coors(box, x_origin, y_origin, width, height, threshold):
    """Recompute box coordinates to new origin
    
    Args:
        box (np.ndarray): Bounding box coordinates in form (x_min, y_min, x_max, y_max).
        x_origin (int): X coordinate of the origin of the new coordinate system.
        y_origin (int): Y coordinate of the origin of the new coordinate system.
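        width (int): Width of the new coordinate system. Recomputed box coordinates
            that would fall beyond will be truncated.
        height (int): Height of the new coordinate system. Recomputed box coordinates
            that would fall beyond will be truncated.
        threshold (float): Minimum fraction of the box extent that must remain after
            truncation. Boxes that keep less than this fraction along either axis
            are discarded.

    Returns:
        (tuple): Recomputed bounding box in YOLOv8 data format, or None if the box is rejected.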
        
    """
    # Recompute bounds coordinates to new reference
    x_min, y_min, x_max, y_max = box
    x_min, y_min, x_max, y_max = x_min - x_origin, y_min - y_origin, x_max - x_origin, y_max - y_origin

    # Return None if the box does not lie within image crop
    if (x_min > width) or (x_max < 0.0) or (y_min > height) or (y_max < 0.0):
        return None
    
    # Truncate box x coordinates if necessary
    x_max_trunc = min(x_max, width)
    x_min_trunc = max(x_min, 0)
    # Skip if truncated too much
    if (x_max_trunc - x_min_trunc) / (x_max - x_min) < threshold:
        return None

    # Repeat for y coordinates, this time truncating against the crop height
    y_max_trunc = min(y_max, height)
    y_min_trunc = max(y_min, 0)
    if (y_max_trunc - y_min_trunc) / (y_max - y_min) < threshold:
        return None
        
    # Convert to YOLOv8 format
    x_center = (x_min_trunc + x_max_trunc) / 2.0 / width
    y_center = (y_min_trunc + y_max_trunc) / 2.0 / height
    x_extend = (x_max_trunc - x_min_trunc) / width
    y_extend = (y_max_trunc - y_min_trunc) / height
    
    return (0, x_center, y_center, x_extend, y_extend)
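
The role of `threshold` is easiest to see on a single axis. The sketch below (the helper name `visible_fraction` is hypothetical) computes how much of a box's extent survives clipping to a crop of a given size. A box is discarded when this fraction falls below the threshold along either axis.

```python
def visible_fraction(v_min, v_max, size):
    """Fraction of the interval [v_min, v_max] that lies inside [0, size]."""
    clipped_min = max(v_min, 0)
    clipped_max = min(v_max, size)
    # An interval entirely outside the crop has no visible part
    if clipped_max <= clipped_min:
        return 0.0
    return (clipped_max - clipped_min) / (v_max - v_min)

# A box spanning x = 462..562 in a 512-pixel crop keeps half of its width
print(visible_fraction(462, 562, 512))
# A box spanning x = 500..600 keeps 12 % and would be dropped at a threshold of 0.3
print(visible_fraction(500, 600, 512))
```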

In [None]:
def get_boxes(geometry):
    return np.min(geometry[:, 0]), np.min(geometry[:, 1]), np.max(geometry[:, 0]), np.max(geometry[:, 1])

annotations.loc[:,'boxes'] = annotations.loc[:,'geometry'].apply(get_boxes)
annotations.head(10)

#### Data Splitting

Split the data into training and validation sets and prepare the folders for storing the images and annotations.

In [None]:
fnames = list(annotations['image_id'].unique())
np.random.shuffle(fnames)
train_split = fnames[0:int(len(fnames)*0.8)]

print('Num samples', len(fnames))
print('Train split', len(train_split))
print('Val split  ', len(fnames) - len(train_split))
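
The split is made per source image rather than per crop, so all crops of one image end up in the same subset. A reproducible variant of the same idea can be sketched with the standard library. The helper name, the file names, and the seed value are arbitrary choices for illustration:

```python
import random

def split_by_image(image_ids, train_fraction=0.8, seed=0):
    """Shuffle unique image ids with a fixed seed and split them into two sets."""
    ids = sorted(set(image_ids))          # deterministic starting order
    random.Random(seed).shuffle(ids)      # local RNG, global state untouched
    n_train = int(len(ids) * train_fraction)
    return set(ids[:n_train]), set(ids[n_train:])

# Hypothetical ids; several annotations may share one image
image_ids = [f"img_{i:03d}.jpg" for i in range(10)] * 3
train_ids, val_ids = split_by_image(image_ids)
print(len(train_ids), len(val_ids))
```

Returning sets also makes the later membership test (`image_id in train_ids`) a constant-time lookup.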

In [None]:
folder_crops = {'train': '/media/janko/DATA3/data/datasets/airbus/train/images/',
                'val': '/media/janko/DATA3/data/datasets/airbus/val/images/'}

folder_labels = {'train': '/media/janko/DATA3/data/datasets/airbus/train/labels/',
                 'val': '/media/janko/DATA3/data/datasets/airbus/val/labels/'}

for folders in [folder_crops, folder_labels]:
    for _, folder in folders.items():
        if not os.path.isdir(folder):
            os.makedirs(folder)

Create image crops and adjust the corresponding labels. The labels will be stored in separate txt files (one for each image), as required by the YOLOv8 format.

In [None]:
crop_size = 512
crop_overlap = 64
trunc_th = 0.3
step = crop_size - crop_overlap


for imfile in (imfiles):    
    image = cv2.imread(imfile)
    folder = 'train' if os.path.basename(imfile) in train_split else 'val'

    # Get annotations for image
    labels = annotations[annotations['image_id'] == os.path.basename(imfile)]
    img_id = os.path.splitext(os.path.basename(imfile))[0]    
 
    # Extract crops
    for x_start in tqdm(np.arange(0, cols - crop_size, step)):
        for y_start in np.arange(0, rows - crop_size, step):

            x_end = x_start + crop_size
            y_end = y_start + crop_size
            
            filename_crop = os.path.join(folder_crops[folder],
                                         img_id + '_' + str(x_start) + '_' + str(y_start) + '.jpg')
            filename_label = os.path.join(folder_labels[folder],
                                          img_id + '_' + str(x_start) + '_' + str(y_start) + '.txt')
                                        
            crop = image[y_start:y_end, x_start:x_end, :]
            assert crop.shape == (crop_size, crop_size, channels)                
            cv2.imwrite(filename_crop, crop)

            # Shift each box into the crop's coordinate system and drop rejected boxes
            boxes = [recompute_box_coors(box, x_start, y_start, crop_size, crop_size, trunc_th)
                     for box in labels['boxes']]
            boxes = [box for box in boxes if box is not None]

            # save labels
            with open(filename_label, 'w+') as f:
                for box in boxes:
                    f.write(' '.join(str(x) for x in box) + '\n')
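
With `crop_size = 512` and `crop_overlap = 64`, the loops above move through the image in steps of 448 pixels. The sketch below lists the resulting crop origins along one axis for a 2560-pixel side, using plain `range` (which yields the same values as `np.arange` for these integer arguments). It also shows that the last crop ends at pixel 2304, so a strip along the right and bottom borders is not covered. The helper `origins_with_border` is a hypothetical alternative, not part of the notebook, that appends one extra origin flush with the border.

```python
def grid_origins(size, crop_size, step):
    """Origins produced by the loop bounds used in the cropping cell."""
    return list(range(0, size - crop_size, step))

def origins_with_border(size, crop_size, step):
    """Hypothetical variant: add a final origin so the last crop touches the border."""
    origins = list(range(0, size - crop_size + 1, step))
    if origins[-1] + crop_size < size:
        origins.append(size - crop_size)
    return origins

origins = grid_origins(2560, 512, 448)
print(origins)                    # crop origins along one axis
print(origins[-1] + 512)          # end of the last crop along this axis
print(origins_with_border(2560, 512, 448))
```

The extra border crop overlaps its neighbor by more than 64 pixels, which is harmless for training but slightly increases the number of crops per image.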

Let us visualize the crops and the corresponding labels to check that the cropping has worked properly.

In [None]:
for idx, sample in enumerate(np.random.choice(os.listdir(folder_crops['train']), 4)):
    
    # Load image and corresponding labels
    image = imread(os.path.join(folder_crops['train'], sample))
    with open(os.path.join(folder_labels['train'], sample.replace('.jpg', '.txt')), 'r') as f:
        labels = f.readlines()

    for box in labels:
        box = np.array([d for d in box.split(' ')], dtype=np.float32)
        
        # Undo coordinate normalization
        x_center = box[1] * crop_size
        y_center = box[2] * crop_size

        width = box[3] * crop_size
        height = box[4] * crop_size

        # Convert from YOLOv8 format to OpenCV rectangle format
        x_start, y_start = int(x_center - width/2), int(y_center - height/2)
        x_end, y_end = int(x_center + width/2), int(y_center + height/2)

        cv2.rectangle(image, (x_start, y_start), (x_end, y_end), color=(0, 255, 0), thickness=2)

    plt.subplot(1,4,idx+1), plt.imshow(image)
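
The inverse conversion used for plotting can be checked on a single label line. The helper name `yolo_to_corners` and the sample line are illustrative only:

```python
def yolo_to_corners(line, img_width, img_height):
    """Parse one YOLO label line into (class_id, x_min, y_min, x_max, y_max) in pixels."""
    class_id, x_c, y_c, w, h = line.split()
    # Undo the normalization
    x_c, w = float(x_c) * img_width, float(w) * img_width
    y_c, h = float(y_c) * img_height, float(h) * img_height
    # Convert center and size to integer corner coordinates
    return (int(class_id),
            int(x_c - w / 2), int(y_c - h / 2),
            int(x_c + w / 2), int(y_c + h / 2))

print(yolo_to_corners("0 0.5 0.75 0.5 0.5", 512, 512))
```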

### YOLOv8

Let's now load the detection model. There are different model [sizes](https://github.com/ultralytics/ultralytics) pretrained on [COCO](https://docs.ultralytics.com/datasets/detect/coco/) that you can choose from. We will use the small model here.

In [None]:
model = YOLO("yolov8s.pt")
model.info()

In [None]:
model

In [None]:
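# Run the pretrained model on one image to inspect the structure of the returned boxes
results = model.predict(cv2.imread(random.choice(imfiles)), conf=0.2)
results[0].boxes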

In [None]:
for idx, sample in enumerate(np.random.choice(imfiles, 4)):
    image = cv2.imread(sample)
    image = image[1500:, 1500:, :]

    result = model.predict(image, conf=0.2)[0]
    boxes = result.boxes.cpu().numpy().xyxy.astype(np.int16)

    for box_idx, box in enumerate(boxes):
        start, stop = box[0:2], box[2:]
        cv2.rectangle(image, start, stop, color=(0, 255, 0), thickness=5)
        # Label each box with the predicted class name
        image = cv2.putText(image, result.names[int(result.boxes.cls[box_idx].item())], (box[0], box[1]),
                            cv2.FONT_HERSHEY_SIMPLEX, 3, (255, 0, 0), 6, cv2.LINE_AA)

    plt.subplot(1,4,idx+1), plt.imshow(image)
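
The pretrained model predicts all COCO classes, so the plots may contain labels other than aircraft. One way to restrict the output is the `classes` argument of `predict`, which takes a list of class indices. A minimal sketch of looking up those indices by name follows. The `names` dictionary is a hand-written excerpt standing in for `model.names`, and the helper name is hypothetical:

```python
def class_ids_for(names, wanted):
    """Return the sorted class indices whose label is in `wanted`."""
    wanted = set(wanted)
    return sorted(idx for idx, label in names.items() if label in wanted)

# Excerpt of the COCO label map, standing in for `model.names`
names = {0: 'person', 1: 'bicycle', 2: 'car', 3: 'motorcycle', 4: 'airplane'}

airplane_ids = class_ids_for(names, ['airplane'])
print(airplane_ids)
# Assumed usage: model.predict(image, conf=0.2, classes=airplane_ids)
```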

### Train YOLOv8 on Custom Dataset

In [None]:
config = """
# train and val datasets (image directory or *.txt file with image paths)
train: /media/janko/DATA3/data/datasets/airbus/train
val: /media/janko/DATA3/data/datasets/airbus/val

# number of classes
nc: 1

# class names
names: ['Aircraft']
"""

with open("data.yaml", "w") as f:
    f.write(config)
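
Because the dataset paths are hard-coded in the string above, the file has to be edited by hand on another machine. A small sketch that builds the same fields from a root folder is shown below. The function name and the example root are hypothetical:

```python
def make_data_yaml(root, class_names):
    """Build the contents of a dataset config for train/val folders under `root`."""
    root = root.rstrip('/')
    lines = [
        f"train: {root}/train",
        f"val: {root}/val",
        f"nc: {len(class_names)}",      # number of classes
        f"names: {class_names}",        # a Python list repr is a valid YAML flow sequence here
    ]
    return "\n".join(lines) + "\n"

print(make_data_yaml("/path/to/airbus/", ["Aircraft"]))
```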

The available training settings are described in the [Ultralytics documentation](https://docs.ultralytics.com/modes/train/#augmentation-settings-and-hyperparameters).

In [None]:
root = "/home/janko/data/projects/"
!yolo task=detect mode=train model=yolov8s.pt data={root}/data.yaml epochs=10 imgsz=512 mosaic=0.0 flipud=0.5 scale=0.0
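
The settings of this run can also be written once as a dictionary, which helps to keep the command line and any Python call in sync. The sketch below only builds the CLI string. The dictionary keys are the arguments used in the command above, while the helper name and the relative `data.yaml` path are assumptions. Ultralytics also exposes a Python `train` method that accepts these settings as keyword arguments.

```python
overrides = {
    "data": "data.yaml",   # assumed path; adjust to where the config was written
    "epochs": 10,
    "imgsz": 512,
    "mosaic": 0.0,         # disable mosaic augmentation
    "flipud": 0.5,         # probability of a vertical flip
    "scale": 0.0,          # disable random scaling
}

def yolo_train_command(model, overrides):
    """Render a `yolo` CLI training command from a dict of overrides."""
    args = " ".join(f"{key}={value}" for key, value in overrides.items())
    return f"yolo task=detect mode=train model={model} {args}"

print(yolo_train_command("yolov8s.pt", overrides))
# Assumed Python equivalent: YOLO("yolov8s.pt").train(**overrides)
```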

In [None]:
retrained = YOLO('/home/janko/data/projects/runs/detect/train4/weights/best.pt')

In [None]:
for idx, sample in enumerate(np.random.choice(imfiles, 4)):
    image = cv2.imread(sample)
    image = image[1800:, 1800:, :]

    # Run both models on the clean crop before any boxes are drawn on it
    result = model.predict(image, conf=0.2)[0]
    result_retrained = retrained.predict(image, conf=0.2)[0]

    # Pretrained model: boxes drawn with color (0, 255, 0)
    boxes = result.boxes.cpu().numpy().xyxy.astype(np.int16)
    for box_idx, box in enumerate(boxes):
        start, stop = box[0:2], box[2:]
        cv2.rectangle(image, start, stop, color=(0, 255, 0), thickness=5)
        image = cv2.putText(image, result.names[int(result.boxes.cls[box_idx].item())], (box[0], box[1]),
                            cv2.FONT_HERSHEY_SIMPLEX, 3, (255, 0, 0), 6, cv2.LINE_AA)

    # Fine-tuned model: boxes and labels drawn with color (0, 0, 255)
    boxes = result_retrained.boxes.cpu().numpy().xyxy.astype(np.int16)
    for box_idx, box in enumerate(boxes):
        start, stop = box[0:2], box[2:]
        cv2.rectangle(image, start, stop, color=(0, 0, 255), thickness=5)
        image = cv2.putText(image, result_retrained.names[int(result_retrained.boxes.cls[box_idx].item())],
                            (box[0], box[1]),
                            cv2.FONT_HERSHEY_SIMPLEX, 3, (0, 0, 255), 6, cv2.LINE_AA)

    plt.subplot(1,4,idx+1), plt.imshow(image)