# Deep Learning Project: Dangerous Object Detection

## Introduction

Before training a YOLO model, the dataset must be organized in the right format and optionally enriched with data augmentations.  
This notebook guides you through the process of preparing your annotated images and labels for training.  

You will:  
- Arrange your dataset into the YOLO folder structure (`images/train`, `labels/train`, etc.).  
- Apply data augmentations (e.g., flips, brightness/contrast changes) to improve model robustness.  
- Validate that images and annotations are properly aligned and ready for training.  

By the end, you will have a clean and well-structured dataset that can be directly used in the fine-tuning stage.

### Downloading Out-of-the-Box YOLOv10 Model Weights
In this section, we download pre-trained YOLOv10 model weights.  
These weights allow us to test predictions with the base model before fine-tuning,  
so we can compare performance improvements later.

In [1]:
!python --version

Python 3.12.6


In [3]:
!git clone https://github.com/THU-MIG/yolov10.git

fatal: destination path 'yolov10' already exists and is not an empty directory.


In [4]:
cd yolov10

/Users/kowsi/Documents/A22 DSTI CLASS/DeepLearning/object_detection_dl_project/object_detect_dl/yolov10




In [5]:
!pip install .

Processing /Users/kowsi/Documents/A22 DSTI CLASS/DeepLearning/object_detection_dl_project/object_detect_dl/yolov10
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: ultralytics
  Building wheel for ultralytics (pyproject.toml) ... done
  Created wheel for ultralytics: filename=ultralytics-8.1.34-py3-none-any.whl size=731463 sha256=da96e1482c2c05b8dd432bc5b4a2b0722f1f54a5a53f9468f0d34b8f59d73c3d
  Stored in directory: /private/var/folders/bc/5yqxxc8n0w9d8wlpd4j86kv00000gn/T/pip-ephem-wheel-cache-9nuvfyw2/wheels/a8/45/60/2207492f18f2b08afa3351d7d93bd7d7515366e26c3ed571ea
Successfully built ultralytics
Installing collected packages: ultralytics
  Attempting uninstall: ultralytics
    Found existing installation: ultralytics 8.3.3
    Uninstalling ultralytics-8.3.3:
      Successfully uninstalled ultralytics-8.3.3
Successfull

**Note:** Download weights (e.g., `yolov10n.pt`) and place them in a `weights/` folder. Then update any `weights=`/`model=` arguments to point to your file.

In [None]:
import os
import urllib.request

# Create a directory for the weights in the current working directory
weights_dir = os.path.join(os.getcwd(), "weights")
os.makedirs(weights_dir, exist_ok=True)

# URLs of the weight files
urls = [
    "https://github.com/jameslahm/yolov10/releases/download/v1.0/yolov10n.pt",
    "https://github.com/jameslahm/yolov10/releases/download/v1.0/yolov10s.pt",
    "https://github.com/jameslahm/yolov10/releases/download/v1.0/yolov10m.pt",
    "https://github.com/jameslahm/yolov10/releases/download/v1.0/yolov10b.pt",
    "https://github.com/jameslahm/yolov10/releases/download/v1.0/yolov10x.pt",
    "https://github.com/jameslahm/yolov10/releases/download/v1.0/yolov10l.pt"
]

# Download each file
for url in urls:
  file_name = os.path.join(weights_dir, os.path.basename(url))
  urllib.request.urlretrieve(url, file_name)
  print(f"Downloaded {file_name}")

Downloaded (file_name)
Downloaded (file_name)
Downloaded (file_name)
Downloaded (file_name)


KeyboardInterrupt: 
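
The download loop above was interrupted partway through, which can leave a truncated `.pt` file on disk. A minimal sketch of a more forgiving helper follows, assuming it is acceptable to skip files that already exist. The name `download_if_missing`, the `.part` suffix, and the `fetch` parameter are illustrative choices, not part of the original notebook.

```python
import os
import urllib.request


def download_if_missing(url, dest_dir, fetch=urllib.request.urlretrieve):
    """Download url into dest_dir unless the file is already present.

    Returns (path, downloaded); downloaded is False when the file was skipped.
    The fetch argument is a hypothetical hook so the helper can be tested offline.
    """
    os.makedirs(dest_dir, exist_ok=True)
    final_path = os.path.join(dest_dir, os.path.basename(url))
    if os.path.exists(final_path):
        return final_path, False

    # Write to a temporary name first so an interrupted run never
    # leaves a partial file under the final name.
    partial_path = final_path + ".part"
    fetch(url, partial_path)
    os.replace(partial_path, final_path)
    return final_path, True
```

Calling `download_if_missing(url, weights_dir)` for each entry in `urls` would let the cell be re-run after an interruption without fetching the completed files again.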

#### Test prediction with base YOLOv10n Model

In [None]:
# Update the values below to point to YOUR files:
#  - source=<path to your image/video/folder>
#  - weights/model=<path to your .pt weights> (e.g., ../weights/yolov10n.pt)
!yolo task=detect mode=predict conf=0.25 save=True model=../weights/yolov10n.pt source=test.mp4

  ckpt = torch.load(file, map_location="cpu")
Ultralytics YOLOv8.1.34 🚀 Python-3.10.12 torch-2.4.1+cu121 CUDA:0 (Tesla T4, 15102MiB)
YOLOv10n summary (fused): 285 layers, 2762608 parameters, 63840 gradients, 8.6 GFLOPs

image 1/1 /content/yolov10/test_images/IMG_8455.jpg: 480x640 1 0, 1 63, 1 76, 41.3ms
Speed: 14.2ms preprocess, 41.3ms inference, 396.8ms postprocess per image at shape (1, 3, 480, 640)
Results saved to runs/detect/predict
💡 Learn more at https://docs.ultralytics.com/modes/predict
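
The prediction log above reports detections as `count class_id` pairs (`1 0, 1 63, 1 76`) rather than class names. The sketch below parses such a summary and maps the IDs to names, assuming the base weights use the standard 80-class COCO ordering. The function `parse_summary` and the three-entry `COCO_NAMES` dictionary are illustrative and cover only the IDs seen in this log.

```python
import re

# Subset of the standard COCO class index (assumption: the base weights
# follow the usual 80-class COCO ordering). Only IDs from the log are listed.
COCO_NAMES = {0: "person", 63: "laptop", 76: "scissors"}


def parse_summary(summary):
    """Turn a summary such as '1 0, 1 63, 1 76' into {class_id: count}."""
    counts = {}
    for count, class_id in re.findall(r"(\d+) (\d+)", summary):
        counts[int(class_id)] = int(count)
    return counts


counts = parse_summary("1 0, 1 63, 1 76")
for class_id, count in counts.items():
    name = COCO_NAMES.get(class_id, "unknown")
    print(f"{count} x {name} (class {class_id})")
```

Under that assumption, the base model found one person, one laptop, and one pair of scissors in the test image.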


### Data Augmentation
Here we apply data augmentation techniques (such as flips, brightness/contrast changes, or noise).  
The goal is to increase dataset variety, improve robustness, and help the model generalize better  
to different real-world conditions.

In [4]:
pip install albumentations

Note: you may need to restart the kernel to use updated packages.


In [None]:
# Install tqdm to display progress bars while processing images.
pip install tqdm

Note: you may need to restart the kernel to use updated packages.


Starter cell for a **simple data augmentation** pipeline (flips, brightness/contrast, motion blur, and hue/saturation shifts).

In [None]:
# First simple data augmentation pipeline
import albumentations as A

augmentations = A.Compose([
    A.HorizontalFlip(p=0.5),  # Flip the image horizontally 50% of the time
    A.VerticalFlip(p=0.5),    # Flip the image vertically 50% of the time
    A.RandomBrightnessContrast(p=1),  # Always adjust brightness and contrast
    A.MotionBlur(p=1),        # Always apply motion blur
    A.HueSaturationValue(p=1),  # Always adjust hue, saturation, and value
], bbox_params=A.BboxParams(format='yolo', label_fields=['category_ids']))
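
Geometric transforms such as flips must move the bounding boxes along with the pixels, which is what `bbox_params` asks the library to do. As a worked illustration of the arithmetic only (not the library's implementation), the sketch below mirrors a YOLO box, which stores normalized `(x_center, y_center, width, height)`. The helper names are hypothetical.

```python
def hflip_yolo_bbox(bbox):
    """Mirror a YOLO box left-to-right: only x_center changes."""
    x_center, y_center, width, height = bbox
    return (1.0 - x_center, y_center, width, height)


def vflip_yolo_bbox(bbox):
    """Mirror a YOLO box top-to-bottom: only y_center changes."""
    x_center, y_center, width, height = bbox
    return (x_center, 1.0 - y_center, width, height)


box = (0.25, 0.75, 0.1, 0.2)
print(hflip_yolo_bbox(box))  # (0.75, 0.75, 0.1, 0.2)
print(vflip_yolo_bbox(box))  # (0.25, 0.25, 0.1, 0.2)
```

Because the coordinates are normalized, the same formulas hold for any image size, and the box width and height are unchanged by a flip.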

Template for a **richer augmentation pipeline** (e.g., geometric transforms, cutout, blur).

In [None]:
# More complex data augmentation pipeline
import albumentations as A

augmentations_2 = A.Compose([
    # Rotation for object angles
    A.Rotate(limit=45, p=1),
    # Scale, translation, and shear changes
    A.Affine(scale=(0.8, 1.2), translate_percent=(0.1, 0.1), rotate=0, shear=15, p=0.5),
    # Light adjustments
    A.RandomBrightnessContrast(brightness_limit=0.3, contrast_limit=0.3, p=1),
    # Light effects
    A.RandomShadow(shadow_roi=(0, 0.5, 1, 1), shadow_dimension=5, p=0.5),
    A.RandomRain(slant_lower=-10, slant_upper=10, p=0.2),  # Rain streaks
    # Motion effects
    A.MotionBlur(blur_limit=(3, 7), p=0.5), 
    # Image noise
    A.GaussNoise(var_limit=(10.0, 50.0), p=0.5),
    # Color changes
    A.HueSaturationValue(hue_shift_limit=20, sat_shift_limit=30, val_shift_limit=20, p=1),
    # Simulate blocked views
    A.CoarseDropout(max_holes=8, max_height=16, max_width=16, min_holes=1, min_height=8, min_width=8, fill_value=0, p=0.5),
], bbox_params=A.BboxParams(format='yolo', label_fields=['category_ids']))

In [9]:
import os
import cv2
import albumentations as A
from tqdm import tqdm

def load_yolo_annotation(annotation_file):
    """Load YOLO format annotations from a file."""
    with open(annotation_file, 'r') as f:
        # Skip blank lines, which would otherwise raise an IndexError below
        rows = [line.split() for line in f if line.strip()]
    # Extract bounding boxes and category IDs
    bboxes = [list(map(float, row[1:])) for row in rows]
    category_ids = [int(row[0]) for row in rows]
    return bboxes, category_ids

def save_augmented_data(image, bboxes, category_ids, output_img_path, output_label_path):
    """Save the augmented image and its corresponding YOLO labels."""
    cv2.imwrite(output_img_path, image)
    with open(output_label_path, 'w') as f:
        for bbox, class_id in zip(bboxes, category_ids):
            f.write(f"{class_id} {' '.join(map(str, bbox))}\n")

def augment_data(image_dir, label_dir, output_image_dir, output_label_dir):
    """Main function to augment data."""
    # Create output directories if they don't exist
    os.makedirs(output_image_dir, exist_ok=True)
    os.makedirs(output_label_dir, exist_ok=True)

    # Get all image files
    image_files = [f for f in os.listdir(image_dir) if f.lower().endswith((".jpeg", ".jpg"))]

    # Count the files actually written so the final total stays accurate
    total_images = 0

    # Process each image
    for img_file in tqdm(image_files, desc="Augmenting Images"):
        img_path = os.path.join(image_dir, img_file)
        label_path = os.path.join(label_dir, os.path.splitext(img_file)[0] + ".txt")

        # Read the image and its annotations; skip files that cannot be used
        image = cv2.imread(img_path)
        if image is None or not os.path.exists(label_path):
            print(f"Skipping {img_file}: unreadable image or missing label file")
            continue
        bboxes, category_ids = load_yolo_annotation(label_path)

        # Save the original image and labels
        save_augmented_data(image, bboxes, category_ids,
                            os.path.join(output_image_dir, img_file),
                            os.path.join(output_label_dir, os.path.splitext(img_file)[0] + ".txt"))
        total_images += 1

        # Generate 10 augmented versions of each image
        for i in range(10):  # Change the number of versions if needed
            # Apply augmentations -> Select the appropriate augmentation
            augmented = augmentations(image=image, bboxes=bboxes, category_ids=category_ids)

            # Create unique filenames for augmented data
            output_img_path = os.path.join(output_image_dir, f"{os.path.splitext(img_file)[0]}_aug_{i+1}.jpg")
            output_label_path = os.path.join(output_label_dir, f"{os.path.splitext(img_file)[0]}_aug_{i+1}.txt")

            # Save augmented image and updated labels
            save_augmented_data(augmented['image'], augmented['bboxes'], augmented['category_ids'],
                                output_img_path, output_label_path)
            total_images += 1

    # Print the total number of images written (originals + augmented versions)
    print(f"Total images generated: {total_images}")

if __name__ == "__main__":
    # Set these paths to match your file structure
    image_dir = "original_data/first_approach_multi_env_data/multi_env_original_images"
    label_dir = "original_data/first_approach_multi_env_data/multi_env_original_labels"
    output_image_dir = "final_augmented_base_images"
    output_label_dir = "final_augmented_base_labels"

    # Run the augmentation process
    augment_data(image_dir, label_dir, output_image_dir, output_label_dir)

ValidationError: 1 validation error for InitSchema
  Value error, If 'border_mode' is set to 'BORDER_CONSTANT', 'value' must be provided. [type=value_error, input_value={'min_height': 512, 'min_...e, 'always_apply': None}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.9/v/value_error
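
Since augmentation rewrites every label file, it can help to sanity-check the results before splitting the dataset. The following is a minimal sketch of such a check, assuming each label line holds a class ID followed by four normalized values. The function name `validate_yolo_label` and the wording of its messages are illustrative.

```python
def validate_yolo_label(text):
    """Return a list of problems found in the contents of one YOLO label file."""
    problems = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # blank lines are ignored
        if len(fields) != 5:
            problems.append(f"line {line_no}: expected 5 fields, got {len(fields)}")
            continue
        try:
            class_id = int(fields[0])
            x, y, w, h = map(float, fields[1:])
        except ValueError:
            problems.append(f"line {line_no}: could not parse class ID or coordinates")
            continue
        if class_id < 0:
            problems.append(f"line {line_no}: negative class ID")
        if not all(0.0 <= v <= 1.0 for v in (x, y, w, h)):
            problems.append(f"line {line_no}: values outside [0, 1]")
        elif w == 0.0 or h == 0.0:
            problems.append(f"line {line_no}: zero-area box")
    return problems


print(validate_yolo_label("0 0.5 0.5 0.2 0.2\n1 0.5 0.5 1.2 0.2\n"))
```

Running the check over every file in the output label directory and printing only the non-empty results gives a quick list of labels to inspect by hand.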

## Rearrange dataset structure for fine-tuning
In this step, we organize the dataset into the YOLO format (`images/train`, `images/val`, `images/test`, and the matching `labels/` folders), splitting the image-label pairs 80/10/10.  
This structure is required for training and ensures that YOLO can properly read both images and annotations  
during the fine-tuning process.

In [2]:
import os
import random
import shutil
from pathlib import Path

# Set seed for reproducibility
random.seed(42)

# Define paths for images and labels
images_path = 'final_augmented_base_images'
labels_path = 'final_augmented_base_labels'

# Destination directories for images and labels
train_dir = 'datasets/images/train'
val_dir = 'datasets/images/val'
test_dir = 'datasets/images/test'
train_label_dir = 'datasets/labels/train'
val_label_dir = 'datasets/labels/val'
test_label_dir = 'datasets/labels/test'

# Create the directories if they don't exist
Path(train_dir).mkdir(parents=True, exist_ok=True)
Path(val_dir).mkdir(parents=True, exist_ok=True)
Path(test_dir).mkdir(parents=True, exist_ok=True)
Path(train_label_dir).mkdir(parents=True, exist_ok=True)
Path(val_label_dir).mkdir(parents=True, exist_ok=True)
Path(test_label_dir).mkdir(parents=True, exist_ok=True)

# List all images (allowing for different cases in extensions)
images = [f for f in os.listdir(images_path) if f.lower().endswith(('.jpg', '.jpeg'))]

# List all labels
labels = [f for f in os.listdir(labels_path) if f.endswith('.txt')]

# Ensure each image has a corresponding label
# We match by base name (e.g., 'image1.jpg' or 'image1.JPG' pairs with 'image1.txt')
images = sorted(images)
labels = sorted(labels)
# Filter the data to only include pairs where both image and label exist
data = []
for image_file in images:
    image_base = os.path.splitext(image_file)[0]
    label_file = f"{image_base}.txt"
    if label_file in labels:
        data.append((image_file, label_file))
# Check that we have valid image-label pairs
assert len(data) > 0, "No matching image-label pairs found!"
print(f"Total data pairs found: {len(data)}")

# Shuffle the data
random.shuffle(data)

# Split the dataset (80% train, 10% val, 10% test)
train_split = int(0.8 * len(data))
val_split = int(0.9 * len(data))
train_data = data[:train_split]
val_data = data[train_split:val_split]
test_data = data[val_split:]

# Function to copy files
def copy_files(data, image_dest, label_dest):
    for image_file, label_file in data:
        shutil.copy(os.path.join(images_path, image_file), os.path.join(image_dest, image_file))
        shutil.copy(os.path.join(labels_path, label_file), os.path.join(label_dest, label_file))
        
# Copy train, val, and test files
copy_files(train_data, train_dir, train_label_dir)
copy_files(val_data, val_dir, val_label_dir)
copy_files(test_data, test_dir, test_label_dir)
print("Dataset has been split and copied successfully!")

Total data pairs found: 252
Dataset has been split and copied successfully!
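
The split indices above are computed with `int(...)`, which truncates, so the three subsets are not exactly 80/10/10. A short worked example for the 252 pairs reported in the output follows. The helper `split_sizes` is illustrative and mirrors the two cut points used in the cell.

```python
def split_sizes(n, train_end_frac=0.8, val_end_frac=0.9):
    """Return (train, val, test) sizes for the truncating split used above."""
    train_end = int(train_end_frac * n)
    val_end = int(val_end_frac * n)
    return train_end, val_end - train_end, n - val_end


print(split_sizes(252))  # (201, 25, 26)
```

The test set absorbs the rounding remainder, which is why it ends up one pair larger than the validation set here.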


## Conclusion

This notebook prepared the dataset for fine-tuning our YOLOv10 model.  
We applied augmentations to improve robustness, paired every image with its label file by base name,  
and split the pairs into train, validation, and test folders in the YOLO layout.  

With the dataset now ready, the next step is to use it in the training pipeline (`finetuning_yolo.ipynb`)  
to fine-tune the model and evaluate its performance on our custom task.
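
Ultralytics-style training reads the dataset location and class names from a small YAML file. A minimal sketch that generates one for the `datasets/` folder created above is shown here. The class names `object_a` and `object_b` are placeholders, since this notebook does not list the project's actual classes, and the function name `build_dataset_yaml` is hypothetical.

```python
from pathlib import Path


def build_dataset_yaml(root, class_names):
    """Build the text of a dataset YAML pointing at the train/val/test folders."""
    lines = [
        f"path: {Path(root).resolve().as_posix()}",
        "train: images/train",
        "val: images/val",
        "test: images/test",
        "names:",
    ]
    # Class IDs must match the integers used in the label files
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


# Placeholder class names: replace with the project's real classes, in ID order
yaml_text = build_dataset_yaml("datasets", ["object_a", "object_b"])
print(yaml_text)
```

Saving this text to a file such as `data.yaml` and passing that path as the `data=` argument is the usual way to point the trainer at the split produced in this notebook.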