# DI 725: Transformers and Attention-Based Deep Networks

## Assignment 2: Object Detection

This notebook walks you through using **auair_yolos.py**.

### Author:
* Ebru Kültür Başaran

## Requirements
Install the requirements for your environment once, then comment out the install commands for later runs.

Dependencies:
- Python >=3.8
- pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117
- pip install transformers[torch] albumentations opencv-python pycocotools torchmetrics wandb pillow
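
Before running the later cells, it can help to confirm that the dependencies listed above are importable. The snippet below is a minimal sketch, not part of the assignment scripts: the helper name `missing_modules` is hypothetical, and the module list uses the import names of the packages above (for example `cv2` for opencv-python and `PIL` for pillow).

```python
import sys
from importlib.util import find_spec

# Import names for the packages listed under "Dependencies".
REQUIRED_MODULES = [
    "torch", "torchvision", "torchaudio", "transformers", "albumentations",
    "cv2", "pycocotools", "torchmetrics", "wandb", "PIL",
]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# The notebook states Python >= 3.8 as a requirement.
if sys.version_info < (3, 8):
    print("Python 3.8 or newer is required, found", sys.version.split()[0])

missing = missing_modules(REQUIRED_MODULES)
print("Missing modules:", ", ".join(missing) if missing else "none")
```

If any module is reported as missing, rerun the matching `pip install` line from the list above.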

In [1]:
# Load the Drive helper and mount
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
!pip install torchmetrics

Collecting torchmetrics
  Downloading torchmetrics-1.7.1-py3-none-any.whl.metadata (21 kB)
Collecting lightning-utilities>=0.8.0 (from torchmetrics)
  Downloading lightning_utilities-0.14.3-py3-none-any.whl.metadata (5.6 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch>=2.0.0->torchmetrics)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch>=2.0.0->torchmetrics)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch>=2.0.0->torchmetrics)
  Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch>=2.0.0->torchmetrics)
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.4.5.8 (from torch>=2.0.0->torchmetrics)
  D

In [3]:
!wandb login

[34m[1mwandb[0m: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
[34m[1mwandb[0m: You can find your API key in your browser here: https://wandb.ai/authorize
[34m[1mwandb[0m: Paste an API key from your profile and hit enter, or press ctrl+c to quit: 
[34m[1mwandb[0m: No netrc file found, creating one.
[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc
[34m[1mwandb[0m: Currently logged in as: [33mtrial[0m to [32mhttps://api.wandb.ai[0m. Use [1m`wandb login --relogin`[0m to force relogin


## 1. Convert annotations to COCO format
We convert to COCO format because the Hugging Face AutoImageProcessor and Trainer workflow used here expects COCO-style annotations. Converting once at the start avoids custom parsing logic later and lets us rely on standard COCO tooling, such as pycocotools, across data loading, augmentation, and evaluation.

In [None]:
!python "/content/drive/MyDrive/Assignment 2/auair_yolos.py" convert \
    --ann "/content/drive/MyDrive/Assignment 2/data/AU-AIR/auair_coco.json" \
    --img-root "/content/drive/MyDrive/Assignment 2/data/AU-AIR/images" \
    --out "/content/drive/MyDrive/Assignment 2/yolos-auair"
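
For readers unfamiliar with the target layout, the following is a minimal sketch of a COCO-style detection dictionary. It is not the conversion code from auair_yolos.py: the helper names `corners_to_coco_bbox` and `make_coco` are hypothetical, and the image, class names, and box values are made up for illustration. The key point is that COCO stores each box as `[x, y, width, height]` measured from the top-left corner.

```python
import json


def corners_to_coco_bbox(x_min, y_min, x_max, y_max):
    """Convert corner coordinates to COCO's [x, y, width, height] layout."""
    return [x_min, y_min, x_max - x_min, y_max - y_min]


def make_coco(images, boxes, class_names):
    """Build a minimal COCO detection dictionary.

    images: list of (file_name, width, height)
    boxes:  list of (image_index, class_index, x_min, y_min, x_max, y_max)
    """
    coco = {
        "images": [
            {"id": i, "file_name": name, "width": w, "height": h}
            for i, (name, w, h) in enumerate(images)
        ],
        "categories": [{"id": i, "name": n} for i, n in enumerate(class_names)],
        "annotations": [],
    }
    for ann_id, (img_idx, cls_idx, x0, y0, x1, y1) in enumerate(boxes):
        bbox = corners_to_coco_bbox(x0, y0, x1, y1)
        coco["annotations"].append({
            "id": ann_id,
            "image_id": img_idx,       # links the box to an entry in "images"
            "category_id": cls_idx,    # links the box to an entry in "categories"
            "bbox": bbox,
            "area": bbox[2] * bbox[3],
            "iscrowd": 0,
        })
    return coco


# Made-up example values, not taken from the AU-AIR dataset.
example = make_coco(
    images=[("frame_000.jpg", 640, 480)],
    boxes=[(0, 1, 100, 200, 180, 260)],
    class_names=["class_a", "class_b"],
)
print(json.dumps(example["annotations"][0]))
```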

## 2. Train YOLOS-Tiny Model
To train the model, we specify the image root path, the COCO-format annotation file, the image size, the number of epochs, the batch size, the number of data-loading workers (to keep the GPU busy), and the output directory where results are saved. To allow sufficient training time, the epoch count is set to 10.

In [26]:
!python "/content/drive/MyDrive/Assignment 2/auair_yolos.py" train \
  --img-root   "/content/drive/MyDrive/Assignment 2/data/AU-AIR/images" \
  --ann        "/content/drive/MyDrive/Assignment 2/data/AU-AIR/auair_coco.json" \
  --img-size 384 \
  --epochs   10 \
  --batch    8 \
  --workers  8 \
  --outdir   "/content/drive/MyDrive/Assignment 2/yolos-auair"

2025-04-21 23:42:15.376871: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-04-21 23:42:15.394950: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1745278935.417335   34528 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1745278935.424036   34528 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-04-21 23:42:15.446448: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instr
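
The `convert` and `train` invocations above follow a subcommand pattern. As a rough illustration of how such a command line could be parsed, here is a hypothetical sketch using `argparse`. The flag names are copied from the commands in this notebook, but the parser itself, the `build_parser` name, the defaults, and the example paths are assumptions; the real auair_yolos.py may be organized differently.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used in this notebook."""
    parser = argparse.ArgumentParser(prog="auair_yolos.py")
    sub = parser.add_subparsers(dest="command", required=True)

    convert = sub.add_parser("convert", help="write COCO-format annotations")
    convert.add_argument("--ann", required=True)
    convert.add_argument("--img-root", required=True)
    convert.add_argument("--out", required=True)

    train = sub.add_parser("train", help="fine-tune the detector")
    train.add_argument("--img-root", required=True)
    train.add_argument("--ann", required=True)
    # Defaults below are assumptions chosen to match the command above.
    train.add_argument("--img-size", type=int, default=384)
    train.add_argument("--epochs", type=int, default=10)
    train.add_argument("--batch", type=int, default=8)
    train.add_argument("--workers", type=int, default=8)
    train.add_argument("--outdir", required=True)
    return parser


# Placeholder paths for illustration only.
args = build_parser().parse_args([
    "train",
    "--img-root", "data/images",
    "--ann", "data/annotations.json",
    "--epochs", "10",
    "--outdir", "runs/example",
])
print(args.command, args.img_size, args.epochs, args.batch, args.workers)
```

Note that `argparse` turns dashes in flag names into underscores, so `--img-size` is read back as `args.img_size`.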

## 3. Evaluate the Model
Finally, we evaluate the model's performance on the test set. The fine-tuned model is loaded from --checkpoint, the evaluation dataset is built from the held-out samples, and the Hugging Face Trainer.evaluate method is used to compute per-class AP@0.5 and overall mAP. The result is a dictionary of metrics.
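
To make the AP@0.5 metric concrete, here is a minimal pure-Python sketch for one class on one image. It is not the evaluation script's implementation: the function names are hypothetical, boxes are assumed to be in COCO `[x, y, width, height]` form, and the area under the precision-recall curve uses all-point interpolation, which may differ slightly from the interpolation used by pycocotools or torchmetrics. A detection counts as a true positive when its IoU with a not-yet-matched ground-truth box is at least 0.5.

```python
def iou_xywh(a, b):
    """Intersection over union of two [x, y, width, height] boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_thr=0.5):
    """AP for one class. detections: list of (score, box); ground_truth: list of boxes."""
    if not ground_truth:
        return 0.0
    matched, hits = set(), []
    # Visit detections from most to least confident.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou_xywh(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_thr:
            matched.add(best_idx)
            hits.append(1)   # true positive
        else:
            hits.append(0)   # false positive
    precisions, recalls, tp = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / rank)
        recalls.append(tp / len(ground_truth))
    # Make precision non-increasing from right to left (the PR envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Made-up boxes: two objects, three detections (one is a false positive).
gt_boxes = [[0, 0, 10, 10], [20, 20, 10, 10]]
dets = [(0.9, [0, 0, 10, 10]), (0.8, [50, 50, 10, 10]), (0.7, [20, 20, 10, 10])]
ap = average_precision(dets, gt_boxes)
print(round(ap, 4))
```

Overall mAP is then the mean of the per-class AP values.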

In [28]:
!python "/content/drive/MyDrive/Assignment 2/auair_yolos_evaluation.py" \
  --img-root   "/content/drive/MyDrive/Assignment 2/data/AU-AIR/images" \
  --ann        "/content/drive/MyDrive/Assignment 2/yolos-auair/val.json" \
  --img-size   384 \
  --batch      8 \
  --workers    8 \
  --checkpoint "/content/drive/MyDrive/Assignment 2/yolos-auair/"

2025-04-22 00:37:28.494498: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-04-22 00:37:28.512173: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1745282248.533901   50016 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1745282248.540482   50016 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-04-22 00:37:28.562299: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instr