# Low power person detection on UAVs

This is the notebook with holds the complete pipeline for our project in TinyML!

Our goal is to deploy optimized person detection models on edge devices with regards to trade-offs between power-consumption, inference speed and accuracy. 

We load, retrain, optimize, benchmark and deploy these models in this jupyternotebook, which is a compressed version of the source code in this repository


In [12]:
from ultralytics import YOLO
import tensorflow as tf
import tensorflow_hub as hub
import torch
import os


## We start with loading different models 
At the start of the project we used models from EfficientDet, Fomo, Yolo and mobilenet_ssd. After comparison we decided to only move forward with the YOLO model, therefore later code is written only for the yolo architecture.

For EfficientDet and Mobilenet-ssd we use tensorflow-hub to get the models, while FOMO is only available via manual download from Edge-Impulse. The usage of YOLO is greatly simplified by using the ultralytics library for YOLO, which handles download and provides a training framework

In [13]:
# load_model.py

def load_yolo(model_name : str, model_name_ext: str):
    """
    Loads a YOLO modle
    """
    os.makedirs("models", exist_ok=True)

    model = YOLO(model_name_ext)
    exported_path = model.export(format="saved_model")


    return exported_path


def load_mobilenet_ssd(model_name: str, model_url: str = "https://tfhub.dev/tensorflow/ssd_mobilenet_v2/2"):
    """Loads MobileNet SSD from TensorFlow Hub"""
    os.makedirs("models", exist_ok=True)
    
    model = hub.load(model_url)
    saved_model_path = f"{model_name}_saved_model"
    tf.saved_model.save(model, saved_model_path)
    return saved_model_path


def load_efficientdet(model_name: str, model_url: str = "https://tfhub.dev/tensorflow/efficientdet/d0/1"):
    """Loads EfficientDet from TensorFlow Hub"""
    os.makedirs("models", exist_ok=True)

    model = hub.load(model_url)
    saved_model_path = f"{model_name}_saved_model"
    tf.saved_model.save(model, saved_model_path)
    return saved_model_path

In [14]:

# download the yolo model
model_name = "yolo11n"
model_name_ext = "yolo11n.pt"
yolo_saved_model_path = load_yolo(model_name, model_name_ext)




Ultralytics 8.3.208 🚀 Python-3.12.3 torch-2.8.0+cu128 CPU (AMD Ryzen 7 7735U with Radeon Graphics)
YOLO11n summary (fused): 100 layers, 2,616,248 parameters, 0 gradients, 6.5 GFLOPs

[34m[1mPyTorch:[0m starting from 'yolo11n.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 84, 8400) (5.4 MB)

[34m[1mTensorFlow SavedModel:[0m starting export with tensorflow 2.19.0...

[34m[1mONNX:[0m starting export with onnx 1.19.0 opset 22...
[34m[1mONNX:[0m slimming with onnxslim 0.1.71...
[34m[1mONNX:[0m export success ✅ 1.0s, saved as 'yolo11n.onnx' (10.2 MB)
[34m[1mTensorFlow SavedModel:[0m starting TFLite export with onnx2tf 1.28.2...
Saved artifact at 'yolo11n_saved_model'. The following endpoints are available:

* Endpoint 'serving_default'
  inputs_0 (POSITIONAL_ONLY): TensorSpec(shape=(1, 640, 640, 3), dtype=tf.float32, name='images')
Output Type:
  TensorSpec(shape=(1, 84, 8400), dtype=tf.float32, name=None)
Captures:
  131192950550992: TensorSpec(shape=(4

I0000 00:00:1761037356.341521    9204 devices.cc:67] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
I0000 00:00:1761037356.341645    9204 single_machine.cc:374] Starting new session
W0000 00:00:1761037357.312253    9204 tf_tfl_flatbuffer_helpers.cc:365] Ignored output_format.
W0000 00:00:1761037357.312278    9204 tf_tfl_flatbuffer_helpers.cc:368] Ignored drop_control_dependency.
I0000 00:00:1761037357.775859    9204 devices.cc:67] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
I0000 00:00:1761037357.775983    9204 single_machine.cc:374] Starting new session
W0000 00:00:1761037358.264533    9204 tf_tfl_flatbuffer_helpers.cc:365] Ignored output_format.
W0000 00:00:1761037358.264560    9204 tf_tfl_flatbuffer_helpers.cc:368] Ignored drop_control_dependency.


[34m[1mTensorFlow SavedModel:[0m export success ✅ 10.9s, saved as 'yolo11n_saved_model' (25.7 MB)

Export complete (11.2s)
Results saved to [1m/home/jakub/Documents/semester5/TinyML/uav-person/low-power-person-detection-uav-sar/src[0m
Predict:         yolo predict task=detect model=yolo11n_saved_model imgsz=640  
Validate:        yolo val task=detect model=yolo11n_saved_model imgsz=640 data=/usr/src/ultralytics/ultralytics/cfg/datasets/coco.yaml  
Visualize:       https://netron.app


We now downloaded Yolo11n.pt and will continue by using one of two domain-specific datasets to retrain and with that finetune these models for the deployment of person-detection on UAVs

In [15]:
#train.py

# Sys info:


# print(torch.__version__)
# print(torch.version.cuda)

print("PyTorch CUDA: ", torch.cuda.is_available())
# print(torch.cuda.device_count())
print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else "No GPU")
# print("Ultralytics CUDA: ", YOLO("yolo11n.pt").device)

yolo_retrained_model_path = "some path"


def train():
    # Load a model
    model = YOLO("yolo11n.pt")  # load a pretrained model 


    # Train the model using VisDrone dataset
    # results = model.train(data="VisDrone.yaml", device=0, epochs=100, imgsz=640, batch=16, plots=True)

    # OR

    # Train the model using C2A dataset
    results = model.train(data="c2a-yolo.yaml", device=0, epochs=100, imgsz=640, batch=16, plots=True)

def export():
    best_model = YOLO("runs/detect/train/weights/best.pt")
    best_model.export(format="onnx")
    # best_model.export(format="tflite", imgsz=640, half=False, int8=False)
    # best_model.export(format="tflite", imgsz=640, int8=True, data="VisDrone.yaml")



from multiprocessing import freeze_support
freeze_support()  # optional, but safe on Windows
train()
# export()





PyTorch CUDA:  False
No GPU
New https://pypi.org/project/ultralytics/8.3.218 available 😃 Update with 'pip install -U ultralytics'
Ultralytics 8.3.208 🚀 Python-3.12.3 torch-2.8.0+cu128 


ValueError: Invalid CUDA 'device=0' requested. Use 'device=cpu' or pass valid CUDA device(s) if available, i.e. 'device=0' or 'device=0,1,2,3' for Multi-GPU.

torch.cuda.is_available(): False
torch.cuda.device_count(): 0
os.environ['CUDA_VISIBLE_DEVICES']: -1
See https://pytorch.org/get-started/locally/ for up-to-date torch install instructions if no CUDA devices are seen by torch.


### Optimization
After retraining the model on our domain specific dataset, we want to optimize the model for Edge-deployment. 

For this, we wrote a script with some optimization methods from the lecture and some other options from tensorflow-lite.

**We optimize for:**
- Size
- Latency
- Trade-off (.DEFAULT)

**We quantisize:**
- f32 ( no quantization)
- f16
- dynamic int 8


**With following settings:**
- restrict to tensorflow builtin ops
- match inference in-/output types to quanitzation types
- Experimental Converter = true ( may support more features and improved conversion)
- allow_custom_ops = Tru - allows custom TensorFlow operations, not neccessary but no harm either


In [None]:


def optimize_model(model_name: str, model_path: str, output_dir: str = "models/optimized_models"):
    """Generate different variants of optimized models"""
    

    os.makedirs(output_dir, exist_ok=True)
    
    # Size optimized variants 
    _convert(model_name, model_path, output_dir, "size_float32", tf.lite.Optimize.OPTIMIZE_FOR_SIZE, None)
    _convert(model_name, model_path, output_dir, "size_float16", tf.lite.Optimize.OPTIMIZE_FOR_SIZE, tf.float16)
    _convert(model_name, model_path, output_dir, "size_dynamic", tf.lite.Optimize.OPTIMIZE_FOR_SIZE, "dynamic")
    


    # Latency optimized variants
    _convert(model_name, model_path, output_dir, "latency_float32", tf.lite.Optimize.OPTIMIZE_FOR_LATENCY, None)
    _convert(model_name, model_path, output_dir, "latency_float16", tf.lite.Optimize.OPTIMIZE_FOR_LATENCY, tf.float16)
    _convert(model_name, model_path, output_dir, "latency_dynamic", tf.lite.Optimize.OPTIMIZE_FOR_LATENCY, "dynamic")




def _convert(model_name: str, model_path: str, output_dir: str, suffix: str, optimization, quant_type):
    """Helper function to convert model with specific settings"""


    converter = tf.lite.TFLiteConverter.from_saved_model(model_path)
    converter.optimizations = [optimization]

    # restrict to tensorflow builtin ops
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS]

    # float16 quantization
    if quant_type == tf.float16:
        converter.target_spec.supported_types = [tf.float16]
        converter.inference_input_type = tf.float16
        converter.inference_output_type = tf.float16
    # try dynamic range quantization
    elif quant_type == "dynamic":
        converter.optimizations = [tf.lite.Optimize.DEFAULT]
        converter.inference_input_type = tf.float32
        converter.inference_output_type = tf.float32
    else:
        converter.inference_input_type = tf.float32
        converter.inference_output_type = tf.float32

    # try with experimental converter
    converter.experimental_new_converter = True

    # custom ops (not neccessary i think)
    converter.allow_custom_ops = True


    try:
        tflite_model = converter.convert()

    except ValueError as e:

        print(f"Skipping {suffix} for {model_name}: {e}")

        return None

    output_path = os.path.join(output_dir, f"{model_name}_{suffix}.tflite")

    with open(output_path, 'wb') as f:

        f.write(tflite_model)


    print(f"{suffix}: {output_path} ({len(tflite_model) / (1024*1024):.2f} MB)")


    return output_path




In [None]:
optimize_model(model_name, yolo_retrained_model_path)

NameError: name 'yolo_retrained_model_path' is not defined

Choosing a optimized model :( SOME MODEL )
we can now run some inference on example pictures to see where stengths and weaknesses of our detection lie

In [None]:
# inference

To gain actual knowledge we run different benchmarks on our models to compare them performance and power wise

In [None]:
# benchamrking