<a href="https://colab.research.google.com/github/wsimpso1/recent_projects/blob/main/Satellite_Imagery_Analysis/SatDeV_data_preparation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# SatDeV: Satellite and Aerial Detection of Vehicles

William Simpson \
DATA 690: Applied Artificial Intelligence \
Spring 2022

## Project Description:
Use transfer learning to retrain an object detection model to identify five classes of vehicles in satellite and aerial imagery: small vehicles, large vehicles, ships, planes, and helicopters.

## References:
1. Xia, G. S., Bai, X., Ding, J., Zhu, Z., Belongie, S., Luo, J., ... & Zhang, L. (2018). DOTA: A large-scale dataset for object detection in aerial images. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3974-3983). https://doi.org/10.48550/arXiv.1711.10398
2. Solawetz, J., & Nelson, J. (2020, May 21). How to Train YOLOv4 on a Custom Dataset. Roboflow Blog. https://blog.roboflow.com/training-yolov4-on-a-custom-dataset/
3. Ivan Goncharov. (2019, July 18). Set Up YOLOv3 & Darknet on Google Colab. https://www.youtube.com/watch?v=USdaipqgZR8
4. Techzizou. (2021, February 24). TRAIN A CUSTOM YOLOv4-tiny OBJECT DETECTOR USING GOOGLE COLAB. Analytics Vidhya. https://medium.com/analytics-vidhya/train-a-custom-yolov4-tiny-object-detector-using-google-colab-b58be08c9593#d4cc
5. Radio Free Europe/Radio Liberty. (2021, April 21). Satellite Images Show Military Buildup In Russia, Ukraine. Radio Free Europe/Radio Liberty. https://www.rferl.org/a/russia-ukraine-military-buildup-satellite-images/31214867.html
6. Reuters. (2022, January 20). Satellite images show Russian troop build-up near Ukraine border. https://www.youtube.com/watch?v=u06ePMYR3IU
7. Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You Only Look Once: Unified, Real-Time Object Detection (arXiv:1506.02640). arXiv. https://doi.org/10.48550/arXiv.1506.02640

---

# Notebook 1: Data Preparation

In [None]:
import os
import re
import shutil

In [None]:
# mount Google Drive in Colab
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
# scripts for transforming DOTA data to a format recognized by YOLO/Darknet
# Source: https://github.com/ringringyi/DOTA_YOLOv2
!git clone https://github.com/ringringyi/DOTA_YOLOv2.git

Cloning into 'DOTA_YOLOv2'...
remote: Enumerating objects: 1060, done.
remote: Total 1060 (delta 0), reused 0 (delta 0), pack-reused 1060
Receiving objects: 100% (1060/1060), 32.42 MiB | 20.81 MiB/s, done.
Resolving deltas: 100% (198/198), done.


In [None]:
%cd /content/DOTA_YOLOv2/data_transform/

/content/DOTA_YOLOv2/data_transform


In [None]:
from YOLO_Transform import dota2darknet

In [None]:
%cd ../../drive/MyDrive/data690-ai/data/train

/content/drive/MyDrive/data690-ai/data/train


In [None]:
!mkdir labels

In [None]:
# transform data from DOTA format to a format recognized by YOLO
dota2darknet('images',
             'labelTxt',
             'labels',
             ['small-vehicle','large-vehicle','plane','ship','helicopter'])
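
The conversion above is handled by `dota2darknet` from the cloned repository. As a rough illustration of what such a conversion involves, the sketch below turns one DOTA annotation line (eight polygon coordinates, a category, and a difficulty flag) into a YOLO label line (class id plus a normalized, axis-aligned box). The function name `dota_line_to_yolo` is hypothetical, and the assumption that class ids follow the order of the class list is mine; this is not the repository's implementation.

```python
def dota_line_to_yolo(line, class_names, img_width, img_height):
    """Convert one DOTA annotation line to a YOLO label line.

    Returns None for header lines and for categories outside class_names.
    Assumption: the class id is the index of the category in class_names.
    """
    parts = line.split()
    if len(parts) < 9:
        # header lines such as 'imagesource:...' or 'gsd:...' carry no box
        return None

    category = parts[8]
    if category not in class_names:
        return None

    # DOTA stores four corner points: x1 y1 x2 y2 x3 y3 x4 y4
    coords = [float(value) for value in parts[:8]]
    xs, ys = coords[0::2], coords[1::2]

    # enclose the (possibly rotated) quadrilateral in an axis-aligned box
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    # YOLO expects center and size, normalized by the image dimensions
    x_center = (x_min + x_max) / 2 / img_width
    y_center = (y_min + y_max) / 2 / img_height
    width = (x_max - x_min) / img_width
    height = (y_max - y_min) / img_height

    class_id = class_names.index(category)
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
```

Calling it with a line such as `'100 200 300 200 300 400 100 400 plane 0'`, the five-class list used above, and the image's pixel dimensions yields a single label line. Lines whose category is not in the list yield `None`, which is why images without any selected class end up with empty label files.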

In [None]:
# file paths

label_path = '/content/drive/MyDrive/data690-ai/data/train/labels'
image_path = '/content/drive/MyDrive/data690-ai/data/train/images'

In [None]:
# count the number of label files

count_txt = len(os.listdir(label_path))

print('Total number of files:', count_txt)

Total number of files: 1411


In [None]:
!mkdir labels_not_for_use

In [None]:
# move label files that do not contain objects from the selected classes

file_destination = '/content/drive/MyDrive/data690-ai/data/train/labels_not_for_use'

for filename in os.listdir(label_path):
  # label file paths
  label_file = os.path.join(label_path, filename)
  # check if label is empty -- meaning no objects in selected classes were found
  with open(label_file, 'r') as f:
    objects = f.read()
  # move the file only after it has been closed;
  # strip() also catches files that contain only whitespace
  if objects.strip() == '':
    shutil.move(label_file, file_destination)
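
Before moving anything, it can help to list the affected files first. The following is a minimal sketch of such a dry run, assuming the labels are plain `.txt` files in a single directory; `find_empty_labels` is a hypothetical helper, not part of the DOTA_YOLOv2 scripts.

```python
from pathlib import Path


def find_empty_labels(label_dir):
    """Return the label files that are empty or contain only whitespace."""
    return sorted(
        path
        for path in Path(label_dir).glob("*.txt")
        if path.read_text().strip() == ""
    )
```

Passing `label_path` to this helper returns the list of files the cell above would move, so the count can be checked before any file leaves the `labels` folder.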

In [None]:
# confirm that all empty label files have been removed
# and count the number of remaining non-empty files

empty_files = []
count_txt = 0

for filename in os.listdir(label_path):
  label_file = os.path.join(label_path, filename)
  with open(label_file, 'r') as f:
    objects = f.read()
  if objects.strip() == '':
    print(filename)
    empty_files.append(filename)
  else:
    count_txt += 1

print('Number of empty files:', len(empty_files))
print('Number of non-empty files:', count_txt)

Number of empty files: 0
Number of non-empty files: 944


In [None]:
!mkdir images_not_for_use

In [None]:
# move images that do not contain objects of the selected classes

file_destination = '/content/drive/MyDrive/data690-ai/data/train/images_not_for_use'

# list the label directory once instead of on every iteration
label_files = set(os.listdir(label_path))

for image in os.listdir(image_path):
  # swap the image extension for .txt to get the expected label filename
  image_to_filename = os.path.splitext(image)[0] + '.txt'
  if image_to_filename not in label_files:
    shutil.move(os.path.join(image_path, image), file_destination)

In [None]:
# confirm number of images matches the number of label files

count_image = len(os.listdir(image_path))
  
print('Number of images:', count_image)
print('Number of images equals number of annotation files:', count_image==count_txt)

Number of images: 944
Number of images equals number of annotation files: True
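
Equal counts do not guarantee that every image has a label with the same base name. A stricter check compares the two sets of names directly. The sketch below assumes `.png` images and `.txt` labels; `unpaired_stems` is a hypothetical helper introduced only for illustration.

```python
from pathlib import Path


def unpaired_stems(image_dir, label_dir, image_ext=".png", label_ext=".txt"):
    """Return (images without a label, labels without an image) as sorted name lists."""
    image_stems = {
        path.stem for path in Path(image_dir).iterdir()
        if path.suffix.lower() == image_ext
    }
    label_stems = {
        path.stem for path in Path(label_dir).iterdir()
        if path.suffix.lower() == label_ext
    }
    # set differences give the names present on one side only
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)
```

If both returned lists are empty for `image_path` and `label_path`, every image has a matching annotation file and vice versa.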


### Reorganize File Structure for Darknet

In [None]:
!mkdir combined

In [None]:
# copy all images into new folder that will contain both images and annotations
!cp -R /content/drive/MyDrive/data690-ai/data/train/images/* /content/drive/MyDrive/data690-ai/data/train/combined

In [None]:
# copy all label files into the new combined folder
!cp -R /content/drive/MyDrive/data690-ai/data/train/labels/* /content/drive/MyDrive/data690-ai/data/train/combined
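
The two `cp` commands above could also be written in Python, which makes the step easier to rerun. This is a minimal sketch, assuming flat source folders with no subdirectories; `combine_folders` is a hypothetical helper, and skipping files that already exist in the target is a design choice of this sketch rather than the behavior of `cp -R`.

```python
import shutil
from pathlib import Path


def combine_folders(source_dirs, combined_dir):
    """Copy every file from source_dirs into combined_dir; return how many were copied."""
    combined = Path(combined_dir)
    combined.mkdir(parents=True, exist_ok=True)

    copied = 0
    for source in source_dirs:
        for path in Path(source).iterdir():
            if not path.is_file():
                continue  # this sketch ignores subdirectories
            target = combined / path.name
            if target.exists():
                continue  # already copied on a previous run
            shutil.copy2(path, target)
            copied += 1
    return copied
```

Calling `combine_folders([image_path, label_path], ...)` with the `combined` directory as the target would place each image next to its label file, and a second call would copy nothing.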