
FlightScope

Official implementation of the paper "FlightScope: A Deep Comprehensive Assessment of Aircraft Detection Algorithms in Satellite Imagery"

Paper: arXiv

Summary

This study compares multiple deep learning algorithms, including Faster RCNN, DETR, SSD, RTMdet, RetinaNet, CenterNet, YOLOv5, and YOLOv8, trained and evaluated on aerial images for the detection and localization of aircraft. A graphical summary of the work is presented in the following figure:

The following video shows inference of the trained algorithms on footage of Barcelona Airport, with a detection threshold of 70%. You can find the original video at ShutterStock.

concatenated_video.webm
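That 70% threshold is simply a confidence cut-off applied to the raw detections before drawing boxes. A minimal sketch of such post-filtering, assuming a simple list-of-dicts detection layout (not the repository's actual output format):

```python
def filter_detections(detections, threshold=0.70):
    """Keep only detections whose confidence score reaches the threshold."""
    return [d for d in detections if d["score"] >= threshold]

# Illustrative input: two boxes, one below the 70% cut-off
raw = [{"box": (120, 80, 310, 240), "score": 0.91},
       {"box": (400, 50, 520, 160), "score": 0.42}]
kept = filter_detections(raw)  # only the 0.91 detection remains
```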

Datasets:

  • HRPlanesv2 Dataset (for the training and evaluation):

The HRPlanesv2 dataset contains 2120 VHR Google Earth images. To further improve experiment results, images of airports from many different regions with various uses (civil/military/joint) were selected and labeled. A total of 14,335 aircraft have been labeled. Each image is stored as a ".jpg" file of size 4800 x 2703 pixels, and each label is stored in YOLO ".txt" format. The dataset has been split into three parts: 70% train, 20% validation, and the remainder test. Aircraft in the train and validation images have at least 80% of their size visible. Link
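Each line of a YOLO ".txt" label holds class_id x_center y_center width height, all normalized to [0, 1]. A small sketch of denormalizing one such line to pixel coordinates (the function name is ours, for illustration; the default size matches the 4800 x 2703 images above):

```python
def yolo_line_to_pixels(line, img_w=4800, img_h=2703):
    """Convert one YOLO label line to (class_id, x_min, y_min, w, h) in pixels."""
    cls, xc, yc, w, h = line.split()
    xc, yc = float(xc) * img_w, float(yc) * img_h
    w, h = float(w) * img_w, float(h) * img_h
    # move from the box center to its top-left corner
    return int(cls), xc - w / 2, yc - h / 2, w, h
```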

  • GDIT Dataset (for evaluation):

The GDIT Aerial Airport dataset consists of aerial (satellite/remote sensing) images containing instances of parked airplanes. All plane types are grouped into a single class named "airplane". The 338 images in total are distributed across train, test, and validation subsets. All annotations are in YOLO format as well. Link

Algorithms Brief Description

The following table groups the 8 tested models:

Model Description
SSD SSD is a real-time object detection algorithm that predicts bounding boxes and class scores for multiple fixed-size anchor boxes at different scales. It efficiently utilizes convolutional feature maps to achieve fast and accurate detection.
Faster RCNN Faster-RCNN is a two-stage object detection framework. It employs a region proposal network (RPN) to generate potential bounding box proposals, combining them with deep feature maps for accurate object detection.
CenterNet CenterNet is a single-stage object detection approach that focuses on predicting object centers and regressing bounding boxes. It achieves high accuracy through keypoint estimation for precise object localization.
RetinaNet RetinaNet is recognized for its focal loss, addressing the class imbalance issue in one-stage detectors. By combining a feature pyramid network with focal loss, RetinaNet excels in detecting objects at various scales with improved accuracy.
DETR DETR is a transformer-based object detection model that replaces traditional anchor-based methods with a set-based approach. It utilizes the transformer architecture to comprehend global context and achieve precise object localization.
RTMdet RTMDet is an efficient one-stage detector designed for real-time performance, with variants that support rotated bounding boxes. This makes it effective for objects with varying orientations, such as aircraft seen from above, though its larger configurations trade some speed for accuracy compared to other state-of-the-art models.
YOLOv5 YOLOv5 combines a CSPNet backbone, a PANet feature pyramid network, and anchor box refinement to detect targets at different scales, improving both accuracy and efficiency in object detection tasks.
YOLOv8 YOLOv8 introduces advancements in object detection by refining the architecture, incorporating feature pyramid networks, and optimizing the training pipeline. It enhances accuracy and speed in detecting objects.

Instructions

Clone Repository

sudo apt-get update && sudo apt-get upgrade -y
git clone https://github.com/toelt-llc/FlightScope_Bench.git
cd FlightScope_Bench/

Setup Conda Environment

To create the conda environment, run the following commands:

conda create --name flightscope python=3.8 -y
conda activate flightscope

Then proceed by following the instructions in the next step.

Set Up Workflows

The study utilizes two popular deep learning frameworks for object detection: Detectron2 and MMDetection.

Annotation Conversion

As the HRPlanesv2 dataset is provided with YOLO annotation (txt file with bounding boxes), conversion to JSON COCO annotation is necessary for detectron2 and mmdetection compatibility. The conversion process is detailed in "__data_collection.ipynb" using the Yolo-to-COCO-format-converter repository.
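At its core the conversion denormalizes each box and switches from YOLO's center-based coordinates to COCO's corner-based [x_min, y_min, width, height] layout. A hedged sketch of that idea (the notebook itself relies on the Yolo-to-COCO-format-converter repository; only the COCO field names below are standard, the rest is illustrative):

```python
def yolo_to_coco(label_files, img_w, img_h):
    """Build a minimal COCO-style dict from {file_name: [yolo label lines]}."""
    coco = {"images": [], "annotations": [],
            "categories": [{"id": 0, "name": "airplane"}]}
    ann_id = 0
    for img_id, (name, lines) in enumerate(label_files.items()):
        coco["images"].append({"id": img_id, "file_name": name,
                               "width": img_w, "height": img_h})
        for line in lines:
            cls, xc, yc, w, h = map(float, line.split())
            coco["annotations"].append({
                "id": ann_id, "image_id": img_id, "category_id": int(cls),
                # COCO bbox: top-left corner plus size, in pixels
                "bbox": [(xc - w / 2) * img_w, (yc - h / 2) * img_h,
                         w * img_w, h * img_h],
                "area": w * img_w * h * img_h, "iscrowd": 0})
            ann_id += 1
    return coco
```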

Weights Download

The weights of the trained deep learning models are publicly available on Google-Drive. These steps are also available at the beginning of result_vizualiser.ipynb.

You can download and extract the files using the following commands:

# Make sure the gdown utility is installed and up to date
pip install --upgrade gdown

# Download the file (--fuzzy lets gdown accept a full share URL)
gdown --fuzzy https://drive.google.com/file/d/13aXBJcxKXjqyq7ycAg4LIe8TEmrX-kxa/view?usp=sharing

# Unzip it in the home folder
unzip output_tensor.zip

Usage

(Train from scratch)

All the training and inference procedures are given in the notebooks:

SSD, YOLO v5, YOLO v8, Faster RCNN, CenterNet, RetinaNet, DETR, RTMDet.

All the obtained results are given in the notebooks:

9_result_vizualiser.ipynb, 10_evaluation_metrics.ipynb and inference_tester.ipynb.

Evaluation Metrics

The evaluation metrics used for this project are IoU (intersection over union), recall, and average precision. The histogram figure from the paper (output_histogram) summarizes the obtained metrics.
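For reference, IoU, which the other metrics build on, is the overlap between a predicted and a ground-truth box divided by the area of their union. A minimal sketch for axis-aligned [x_min, y_min, x_max, y_max] boxes:

```python
def iou(box_a, box_b):
    """Intersection over union of two [x_min, y_min, x_max, y_max] boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```

A detection is typically counted as a true positive when its IoU with a ground-truth box exceeds a chosen threshold (0.5 is common), and recall and average precision are then computed from those matches.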

Some previews of the results:

  • Preview samples of detection results of CenterNet and DETR

Citation

If our work is useful for your research, please consider citing us:

@misc{ghazouali2024flightscope,
      title={FlightScope: A Deep Comprehensive Assessment of Aircraft Detection Algorithms in Satellite Imagery}, 
      author={Safouane El Ghazouali and Arnaud Gucciardi and Nicola Venturi and Michael Rueegsegger and Umberto Michelucci},
      year={2024},
      eprint={2404.02877},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Copyright Notice

The provided code is based on open-source libraries, including Detectron2, MMDetection, Ultralytics YOLO, and the Yolo-to-COCO-format-converter.

This code is free to use for research and non-commercial purposes.


This code is provided by Safouane El Ghazouali, PhD, senior researcher, and Arnaud Gucciardi. For personal contact: safouane.elghazouali@gmail.com.