# Aircraft detection using YOLO11

This notebook is a step-by-step guide to building an aircraft detection program locally with YOLO11 models. It is intended for academic and learning purposes only, so we claim no originality in the methods and techniques used. It is heavily inspired by the guides from EJ Technology Consultants and the YOLO11 documentation.

<!-- ## License
This notebook is licensed under the MIT License. See the [LICENSE](LICENSE) file for details. -->

## References
1. EJ Technology Consultants (2024). How to Train YOLO 11 Object Detection Models Locally with NVIDIA. Retrieved from [here](https://www.ejtech.io/learn/train-yolo-models#:~:text=This%20guide%20provides%20step-by-step%20instructions%20for%20training%20a,on%20a%20local%20PC%20using%20an%20NVIDIA%20GPU.).
2. Ultralytics. (2024). YOLO Documentation. https://docs.ultralytics.com/


## Contents
1. [Step 1: Preliminaries, environment and required packages](#Step-1:-Preliminaries,-environment-and-required-packages)

2. [Step 2: The dataset and labelings](#Step-2:-The-dataset-and-labelings)

3. [Step 3: Setting up labels](#Step-3:-Setting-up-labels)

4. [Step 4: Training configuration](#Step-4:-Training-Configuration)

5. [Step 5: Train the model](#Step-5:-Train-the-model)



## Step 1: Preliminaries, environment and required packages

We start by creating a virtual environment in our project folder, where we will install all the required packages. To do so, navigate to the project directory and create the environment (named `env` here):

```sh
cd "c:\Users\user\project_directory"
python -m venv env
```

Next, activate the virtual environment and install the packages using the `requirements.txt` file: 

```sh
.\env\Scripts\activate
pip install -r requirements.txt 
```

When your environment is active, you will see the name of the environment in parentheses at the left of the command line. 


Now, we will install the `ultralytics` library (if not yet installed from the `requirements.txt` file):

In [None]:
%pip install ultralytics;

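Since we will train on our local GPU, we need the GPU-enabled (CUDA) build of PyTorch. The exact install command depends on your operating system and CUDA version; you can generate it [here](https://pytorch.org/get-started/locally/). The command below targets CUDA 11.8 (`cu118`).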

In [None]:
%pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118;

We can confirm that the GPU build of PyTorch is correctly installed by running the following lines:

In [None]:
import torch

# Check if CUDA is available
cuda_available = torch.cuda.is_available()

# Print the result
print(f"CUDA available: {cuda_available}")

# If CUDA is available, print the GPU name
print(f"GPU Name: {torch.cuda.get_device_name(0)}" if cuda_available else "No GPU available.") 

# device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

## Step 2: The dataset and labelings

Before training we need to preprocess our data. In this notebook we will use Kaggle's [Airbus Aircraft dataset](https://www.kaggle.com/datasets/airbusgeo/airbus-aircrafts-sample-dataset), built from Airbus high-resolution satellite imagery. We can download the dataset using `kagglehub`.

Note: We may need to set up our Kaggle API keys first. For this, visit https://www.kaggle.com/docs/api#authentication.

First, let us install kagglehub:

```bash
%pip install kagglehub;
```

Now we download the latest version of the dataset.

```python
import kagglehub

# Download latest version
download_path = kagglehub.dataset_download("airbusgeo/airbus-aircrafts-sample-dataset")

print("Path to dataset files:", download_path)
```

We can then copy the downloaded dataset from `download_path` to a folder named `dataset` inside our project folder.

```python
import shutil
import os

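# dirs_exist_ok=True (Python 3.8+) lets this cell be re-run without failing if "dataset" already exists
shutil.copytree(download_path, os.path.join(os.getcwd(), "dataset"), dirs_exist_ok=True)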
```

__Note:__ Of course, you can also download the dataset manually from Kaggle and then copy it to your project folder.

According to the description of the Airbus Aircraft dataset we have the following:

### Imagery for training

The `images` folder contains 103 extracts of Pleiades imagery at roughly 50 cm resolution. Each image is stored as a JPEG file of 2560 x 2560 pixels (i.e. about 1280 meters on the ground).

We will use this set of images to train our model. Before doing so we have to split our image dataset into train and validation sets.
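As an illustration of such a split, the sketch below shuffles a list of image file names reproducibly and cuts it into two parts. The helper name `split_image_ids`, the 80/20 ratio and the seed are assumptions made for this example, and the file names are placeholders rather than the real dataset ids. In this notebook the actual split is carried out later by the conversion script in Step 3.

```python
import random

def split_image_ids(image_ids, train_fraction=0.8, seed=42):
    """Shuffle image ids reproducibly and split them into train and validation lists."""
    ids = sorted(set(image_ids))      # de-duplicate and fix the starting order
    rng = random.Random(seed)         # local RNG, so the global random state is untouched
    rng.shuffle(ids)
    n_train = int(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# Placeholder file names, only to show the resulting sizes
example_ids = [f"img_{i:03d}.jpg" for i in range(103)]
train_ids, val_ids = split_image_ids(example_ids)
print(len(train_ids), len(val_ids))  # 82 21
```

Splitting by image rather than by annotation row keeps all the boxes of one image in the same set, which avoids leaking validation images into training.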

### Annotations (labels)

All aircraft have been annotated with bounding boxes on the provided imagery. These annotations are provided in the form of closed GeoJSON polygons. A CSV file named `annotations.csv` lists all annotations, one per line, with the filename of the corresponding image as `image_id` and the class of the annotation: `Aircraft`, or `Truncated_Aircraft` for aircraft located at the border of the image.

Sometimes a dataset does not come with annotations for the objects we want to detect. In that case one can create them with labeling tools such as [LabelImg](https://github.com/HumanSignal/labelImg) or [Label Studio](https://labelstud.io/). Our dataset does come with annotations, but in the form of closed GeoJSON polygons, so we will have to convert them into the YOLO annotation format. We will see this in Step 3.

### Extra imagery

A folder named `extras` contains 6 extra images which are not annotated but can be used to test a model on new, previously unseen images. We will use these images to make predictions once we have trained the YOLO model.

## Step 3: Setting up labels

### Understanding our dataset

As mentioned before, our dataset comes with annotations in the form of closed GeoJSON polygons. Recall that a GeoJSON polygon feature has the following format:

```json
{
  "type": "Feature",
  "geometry": 
  {
    "type": "Polygon",
    "coordinates": [
      [
        [x1, y1], [x2, y2], [x3, y3], [x4, y4], [x1, y1]
      ]
    ]
  },
  "properties": 
  {
    "class": "object_name"
  }
}
```
The `annotations.csv` file has two key columns: `image_id` and `geometry`. The `image_id` column identifies the image each annotation belongs to. Since several aircraft can appear in the same image, `image_id` values may repeat across multiple rows.
The `geometry` column contains the location of each aircraft within the corresponding image: each entry holds the polygon coordinates, in the form of the `coordinates` ring above. A single image can therefore have several associated geometries, each in a different row but linked by the same `image_id`. Let us use the geometries of the first `image_id` in our dataset to draw the corresponding polygons on the image.

In [None]:
#Load the dataset into a pandas dataframe and print first 19 entries
import pandas
annotations = pandas.read_csv("dataset/annotations.csv")
annotations.head(19)

As you can see, the first 18 rows have the same image_id: `4f833867-273e-4d73-8bc3-cb2d9ceb54ef.jpg`. Let us first observe this image. 

In [None]:
from IPython.display import Image, display

pic_1_path = "dataset/images/4f833867-273e-4d73-8bc3-cb2d9ceb54ef.jpg"
pic_1_id = "4f833867-273e-4d73-8bc3-cb2d9ceb54ef.jpg"
# Display the image directly from its file path
display(Image(filename=pic_1_path))

The image above is partly covered by fog, but counting by eye we can indeed find 18 aircraft. Let us extract the aircraft geometries associated with this image. We will refer to the image above as `pic_1`.

In [None]:
import ast

# Group annotations by image_id and collect geometries
grouped_geometries = annotations.groupby("image_id")["geometry"]

# Apply the conversion function to parse string geometries into actual objects. Recall that the geometries are stored as strings in the CSV file.
all_geometries = grouped_geometries.apply(lambda x: [ast.literal_eval(geom) for geom in x])

# Extract geometries for our specific image
pic_1_geometries = all_geometries[pic_1_id]
pic_1_geometries


Now we can plot each of the polygons that enclose the aircrafts:

In [None]:
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon
from PIL import Image as PILImage

# Reload pic_1 with PIL
pic_1 = PILImage.open(pic_1_path)

# Create a plot 
fig, ax = plt.subplots(1, figsize=(10,10))

# Display the image
ax.imshow(pic_1) 

# Plot each polygon
for geom in pic_1_geometries:
    patch = Polygon(geom, closed=True, edgecolor='red', facecolor='none', linewidth=2)
    ax.add_patch(patch)

# Show the plot
plt.show()


### GeoJSON to YOLO11 annotations

As remarked before, to train our model we need to convert the GeoJSON annotations into the standard YOLO annotation format, which consists of the following:

1. Text files (.txt) for each image. 
2. One line per object in each text file.
3. Each line follows this structure: class_id x_center y_center width height.

Where:

- `class_id` is an integer representing the object class (starting from 0). Our annotations contain only two classes: 0 for `Aircraft` and 1 for `Truncated_Aircraft`.
- `x_center`, `y_center` are the coordinates of the bounding box center, normalized to the range 0-1 by the image width and height.
- `width`, `height` are the dimensions of the bounding box, normalized in the same way.

We can then write a Python script to generate the YOLO annotations from the GeoJSON annotations contained in `annotations.csv`. Since we will also need a specific folder structure for the data, we will come back to the script after describing that structure.
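To make the format concrete, here is a minimal sketch of the conversion for a single polygon. The function name `polygon_to_yolo` is hypothetical and the example polygon is made up; the sketch assumes the polygon is given as a list of `(x, y)` vertices in pixels, takes its axis-aligned bounding box, clips it to the image, and normalizes by the image size.

```python
def polygon_to_yolo(polygon, class_id, image_width, image_height):
    """Convert a list of (x, y) pixel vertices into one YOLO label line."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    # Axis-aligned bounding box of the polygon, clipped to the image borders
    x_min, x_max = max(min(xs), 0), min(max(xs), image_width)
    y_min, y_max = max(min(ys), 0), min(max(ys), image_height)
    # Normalize center and size to the range 0-1
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# Made-up closed polygon on a 2560 x 2560 image, class 0
example_polygon = [(256, 512), (768, 512), (768, 1024), (256, 1024), (256, 512)]
print(polygon_to_yolo(example_polygon, 0, 2560, 2560))
# 0 0.200000 0.300000 0.200000 0.200000
```

Each image then gets one `.txt` file containing one such line per annotated aircraft.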

### Structuring the dataset

We are almost ready to start training the YOLO model. Ultralytics expects the training data in a particular folder structure. Here the root folder is named `data` and contains two main folders, one for the training data and one for the validation data:

```
data/
├── train/
│   ├── images/
│   │   └── img_01.jpg
│   └── labels/
│       └── img_01.txt
├── validation/
│   ├── images/
│   │   └── img_02.jpg
│   └── labels/
│       └── img_02.txt
└── classes.txt
```

Consequently, we need a script that both generates the YOLO annotations from `annotations.csv` and stores the images and labels in the structure above. The Python script `Geojson_to_YOLO.py` does this: it creates the root folder `data` with the required layout and writes the YOLO annotations into it.

In [None]:
from Geojson_to_YOLO import YOLOConverter

# Create an instance of YOLOConverter
converter = YOLOConverter(
    image_width=2560,  # Width of the dataset images in pixels
    image_height=2560, # Height of the dataset images in pixels
    output_dir="data"  # Root folder for the structure shown in the diagram above
)

# Convert annotations and organize dataset
converter.convert_csv_to_yolo(
    csv_file="dataset/annotations.csv", #path to the annotations file
    images_dir="dataset/images/", #path to the images directory
    train_split=0.8  # 80% training, 20% validation
)

We can now check that we have a directory called `data` in which we have stored the train and validation sets in the required manner. 
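One way to run this check is sketched below. The helper `check_yolo_dataset` is hypothetical; it assumes the `data/train` and `data/validation` layout shown above with `.jpg` images and `.txt` labels, counts the files in each split, and lists images that have no label file.

```python
from pathlib import Path

def check_yolo_dataset(root="data", splits=("train", "validation")):
    """Count images and labels per split and list images without a label file."""
    report = {}
    for split in splits:
        images_dir = Path(root) / split / "images"
        labels_dir = Path(root) / split / "labels"
        # Missing folders are reported as empty instead of raising an error
        images = sorted(images_dir.glob("*.jpg")) if images_dir.is_dir() else []
        labels = {p.stem for p in labels_dir.glob("*.txt")} if labels_dir.is_dir() else set()
        missing = [p.name for p in images if p.stem not in labels]
        report[split] = {"images": len(images), "labels": len(labels), "missing_labels": missing}
    return report

for split, stats in check_yolo_dataset("data").items():
    print(split, stats)
```

An image listed under `missing_labels` is not necessarily an error, since an image without annotated objects may have no label file, but it is worth a quick look.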

## Step 4: Training Configuration

Before training we have to create the dataset configuration file in YAML format. An example of such a file can be found in the Ultralytics documentation (see [here](https://docs.ultralytics.com/datasets/detect/objects365/#dataset-yaml)). We can use our preferred text editor to create a new `.yaml` file (named `data.yaml` here, the name used in Step 5) with the following contents:

![image-2.png](attachment:image-2.png)

Notice that in `path` we have to write the path to the `data` folder we built in the last step. Pay attention to the spacing in lines 4 and 6 of the file, since YAML is sensitive to whitespace and incorrect spacing will cause an error.
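Since the file contents above are only shown as an image, the following sketch generates a comparable file from Python. It is an assumption of what a minimal configuration looks like, based on the keys used in Ultralytics detection dataset files (`path`, `train`, `val`, `names`), and not a copy of the file in the image. The helper `write_data_yaml` is hypothetical, and the class order is assumed to match the one used when the labels were created.

```python
from pathlib import Path

def write_data_yaml(data_root="data", output_file="data.yaml",
                    class_names=("Aircraft", "Truncated_Aircraft")):
    """Write a minimal dataset YAML for the folder layout built in Step 3."""
    root = Path(data_root).resolve().as_posix()  # absolute path with forward slashes
    lines = [
        f'path: "{root}"',           # dataset root folder
        "train: train/images",       # training images, relative to path
        "val: validation/images",    # validation images, relative to path
        "names:",
    ]
    # One "id: name" entry per class, indented by two spaces
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    Path(output_file).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return Path(output_file)

print(write_data_yaml().read_text(encoding="utf-8"))
```

Generating the file this way avoids indentation mistakes, but a hand-written file with the same keys works equally well.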

## Step 5: Train the model

There are several YOLO11 models you can use, which differ in size and accuracy. Their metrics can be found on the [Ultralytics website](https://docs.ultralytics.com/models/yolo11/#overview). We will use YOLO11n, the smallest (nano) variant.

In [None]:
from ultralytics import YOLO

# Load the pretrained model from the Ultralytics repository
model = YOLO("yolo11n.pt", task="detect")

# Train the model on our dataset
results = model.train(
    data="data.yaml",
    epochs=60,
    device="cuda",
)

# Evaluate the model's performance on the validation set
metrics = model.val()


If everything worked, you should now have a folder called `runs` in the working directory. It contains one subfolder per training run, each holding the weights obtained from training. We will use these weights to make predictions.
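Because the run folder name changes with every training run, it can be handy to look up the newest weights instead of typing the path by hand. The sketch below assumes that runs are stored under `runs/detect/<run_name>/weights/best.pt`; the helper `latest_best_weights` is hypothetical and simply picks the most recently modified `best.pt`.

```python
from pathlib import Path

def latest_best_weights(runs_dir="runs/detect"):
    """Return the most recently modified best.pt under runs_dir, or None if there is none."""
    runs_path = Path(runs_dir)
    if not runs_path.is_dir():
        return None
    candidates = list(runs_path.glob("*/weights/best.pt"))
    if not candidates:
        return None
    # The newest file corresponds to the latest finished training run
    return max(candidates, key=lambda p: p.stat().st_mtime)

print(latest_best_weights())
```

The returned path can be passed to `YOLO(...)` in place of a hard-coded run folder.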

In [None]:
from ultralytics import YOLO
from PIL import Image
import matplotlib.pyplot as plt
from pathlib import Path

def show_predictions(image_pred_dir, img):
    """
    Display an image with its predictions using PIL and matplotlib.
    Args:
        image_pred_dir: Path to the directory with the images to predict on
        img: File name of the image inside image_pred_dir
    """
    image_pred_path = Path(image_pred_dir)
    image = Image.open(image_pred_path / img)

    # Load the best weights from training; adjust "train3" to match your run folder
    model = YOLO('runs/detect/train3/weights/best.pt')

    # Run inference on a single image
    result = model(image)[0]

    # result.plot() returns a BGR numpy array with the predictions drawn;
    # reverse the channel order to RGB so matplotlib shows the right colors
    plt.figure(figsize=(12, 8))
    plt.imshow(result.plot()[..., ::-1])
    plt.axis('off')  # Hide axes
    plt.show()  # Display to screen

image_pred_dir = "dataset/extras/"
show_predictions(image_pred_dir, "65825eef-f8a1-41b3-ac87-4a0a7d482a0e.jpg")