# Large Rocks Detection Project

Welcome to the **Large Rocks Detection Project**! This notebook implements our machine learning pipeline for detecting large rocks. The steps we follow are outlined below:

## Outline

1. **Dataset Preparation**  
   Organize and adapt the training, validation, and test datasets using `dataset.py`.
2. **Data Augmentation**  
   Apply geometric and visual transformations to improve generalization, using `dataset.py` and `augmentation.py`.
3. **Model Training**  
   Train the model using `model.py`.
4. **Regularization to Combat Overfitting**  
   Employ validation strategies to minimize overfitting.
5. **Evaluation on Test Data**  
   Test the model on the final dataset and visualize the results.
6. **Accuracy Metrics**  
   Calculate and report accuracy metrics for a comprehensive performance evaluation.

Let’s dive into each step and build a robust solution for detecting large rocks!

In [6]:
from tifffile import tifffile 
from torch.utils.data import Dataset, DataLoader
from torchvision.datasets import ImageFolder
from PIL import Image

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import torch
import torchvision.transforms as T
import os
import shutil
import dataset as dt
import augmentation as aug

## 1. Dataset Preparation

Download the provided data and place it in a folder named `Data`.

In [2]:
# Load dataset from JSON
json_file_path = 'Data/large_rock_dataset.json'
data, dataset = dt.load_dataset_from_json(json_file_path)

# Define which type of images you want to use (here we use RGB and hillshade)
img_folders = ['Data/swissImage_50cm_patches/', 'Data/swissSURFACE3D_hillshade_patches/']

for i in img_folders:
    # Extract folder name from the path
    folder_name = os.path.basename(os.path.normpath(i))

    # Define base directory name where the images and labels will be stored
    base_dir_name = f'dataset_{folder_name}'

    # Split and organize the dataset
    train_images, test = dt.split_train_from_json(dataset, i, base_dir_name)
    train_labels = dt.save_train_annotations(dataset, base_dir_name)

    # Create a validation set
    val_images = dt.create_validation_set_images(train_images, base_dir_name)
    val_labels = dt.create_validation_set_labels(train_labels, base_dir_name)

    # Convert .tif to .jpg for YOLOv8
    dt.convert_tif_to_jpg(train_images)
    dt.convert_tif_to_jpg(val_images)

    # Write the labels in YOLOv8 format
    # Every label gets the same bbox size: 10 px on a 640 px image
    bbox_width = 10 / 640  # Normalized width
    bbox_height = 10 / 640  # Normalized height

    dt.convert_labels_to_yolo_format(
        train_labels,
        base_dir_name,
        bbox_width=bbox_width,
        bbox_height=bbox_height,
        type='train'
    )
    
    dt.convert_labels_to_yolo_format(
        val_labels,
        base_dir_name,
        bbox_width=bbox_width,
        bbox_height=bbox_height,
        type='val'
    )

Dataset split completed.
All train annotations have been saved to the 'train_labels' folder.
Moved 64 files to 'dataset_swissImage_50cm_patches\val_images'.
Matching label files moved to 'val_labels' folder.
All .tif files have been converted to .jpg and replaced.
All .tif files have been converted to .jpg and replaced.
Conversion to YOLOv8 format completed.
Conversion to YOLOv8 format completed.
Dataset split completed.
All train annotations have been saved to the 'train_labels' folder.
Moved 64 files to 'dataset_swissSURFACE3D_hillshade_patches\val_images'.
Matching label files moved to 'val_labels' folder.
All .tif files have been converted to .jpg and replaced.
All .tif files have been converted to .jpg and replaced.
Conversion to YOLOv8 format completed.
Conversion to YOLOv8 format completed.
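
The conversion above gives every rock the same normalized box size, 10 / 640 = 0.015625. As a rough illustration of the resulting label layout (`class x_center y_center width height`), here is a minimal sketch that assumes each annotation is already a normalized center point. `to_yolo_line` is a hypothetical helper written for this note, not the function in `dataset.py`.

```python
def to_yolo_line(x_center, y_center, bbox_width=10 / 640, bbox_height=10 / 640, class_id=0):
    """Format one annotation as a YOLO label line.

    Assumes x_center and y_center are already normalized to [0, 1].
    """
    return f"{class_id} {x_center:.6f} {y_center:.6f} {bbox_width:.6f} {bbox_height:.6f}"


# One rock centered at (0.44, 0.27) in normalized image coordinates.
demo_line = to_yolo_line(0.44, 0.27)
print(demo_line)  # 0 0.440000 0.270000 0.015625 0.015625
```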


## 2. Data Augmentation

### For RGB images

In [4]:
# Initialize and inspect the dataset

# Set paths for training images and YOLO-format labels.
image_folder = "dataset_swissImage_50cm_patches/train_images"
label_folder = "dataset_swissImage_50cm_patches/yolo_train_labels"

# Calculate mean and standard deviation for normalization.
mean, std = dt.calculate_mean_std(image_folder)

# Create the RockDetectionDataset with normalization (no augmentation for now).
dataset = dt.RockDetectionDataset(image_folder, label_folder, mean, std, augment=False)

# Iterate through the dataset to:
#  - Print the image name, tensor shape, and associated labels.
#  - Break after the first iteration for quick inspection.
for idx, (aug_images, aug_labels) in enumerate(dataset):
    image_name = dataset.image_files[idx]  # Get the name of the current image
    print(f"Image Name: {image_name}")
    print(f"Image Shape: {aug_images.size()}")
    print(f"Labels: {aug_labels}")
    break

Image Name: 2581_1126_0_0.jpg
Image Shape: torch.Size([3, 640, 640])
Labels: tensor([[0.0000, 0.4400, 0.2700, 0.0156, 0.0156],
        [0.0000, 0.5100, 0.3900, 0.0156, 0.0156],
        [0.0000, 0.5700, 0.4500, 0.0156, 0.0156],
        [0.0000, 0.5700, 0.3800, 0.0156, 0.0156],
        [0.0000, 0.3700, 0.7600, 0.0156, 0.0156],
        [0.0000, 0.3000, 0.7100, 0.0156, 0.0156],
        [0.0000, 0.3900, 0.9200, 0.0156, 0.0156]])
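
The `mean` and `std` passed to `RockDetectionDataset` are per-channel normalization statistics. The sketch below shows one way such statistics could be computed, assuming the images are H×W×3 `uint8` arrays scaled to [0, 1]. `channel_mean_std` is a hypothetical stand-in: the real `dt.calculate_mean_std` takes a folder path and may work differently.

```python
import numpy as np


def channel_mean_std(images):
    """Return per-channel mean and std over a list of HxWx3 uint8 images."""
    # Stack all pixels into one (N, 3) array and scale to [0, 1].
    pixels = np.concatenate([img.reshape(-1, 3) for img in images]).astype(np.float64) / 255.0
    return pixels.mean(axis=0), pixels.std(axis=0)


# Two synthetic 4x4 images stand in for the real training patches.
rng = np.random.default_rng(0)
demo_images = [rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8) for _ in range(2)]
demo_mean, demo_std = channel_mean_std(demo_images)

# Normalizing with these statistics gives zero mean and unit std per channel.
demo_pixels = np.concatenate([img.reshape(-1, 3) for img in demo_images]) / 255.0
demo_normalized = (demo_pixels - demo_mean) / demo_std
```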


In [5]:
# Data augmentation workflow: geometric, brightness, and obstruction.
# Each augmentation type runs in its own cell and saves its output
# to a separate folder. This cell runs the geometric augmentation.

# Parameters
batch_size = 16 
rgb_folder = 'dataset_swissImage_50cm_patches'

# Geometric
output_image_folder_g = os.path.join(rgb_folder, "augmented_train_images_geom")
output_label_folder_g = os.path.join(rgb_folder, "augmented_train_labels_geom")

aug.aug_pipeline_geom(dataset, mean, std, batch_size, output_image_folder_g, output_label_folder_g)

Augmentation 1 for 2581_1126_0_0.jpg:
Labels: [[0.0, 0.4399999976158142, 0.7300000190734863, 0.015625, 0.015625], [0.0, 0.5099999904632568, 0.6100000143051147, 0.015625, 0.015625], [0.0, 0.5699999928474426, 0.550000011920929, 0.015625, 0.015625], [0.0, 0.5699999928474426, 0.6200000047683716, 0.015625, 0.015625], [0.0, 0.3700000047683716, 0.24000000953674316, 0.015625, 0.015625], [0.0, 0.30000001192092896, 0.2900000214576721, 0.015625, 0.015625], [0.0, 0.38999998569488525, 0.07999998331069946, 0.015625, 0.015625]]
Augmentation 2 for 2581_1126_0_0.jpg:
Labels: [[0.0, 0.5600000023841858, 0.27000001072883606, 0.015625, 0.015625], [0.0, 0.49000000953674316, 0.38999998569488525, 0.015625, 0.015625], [0.0, 0.4300000071525574, 0.44999998807907104, 0.015625, 0.015625], [0.0, 0.4300000071525574, 0.3799999952316284, 0.015625, 0.015625], [0.0, 0.6299999952316284, 0.7599999904632568, 0.015625, 0.015625], [0.0, 0.699999988079071, 0.7099999785423279, 0.015625, 0.015625], [0.0, 0.6100000143051147, 0.9
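
The printed labels are consistent with a vertical flip in augmentation 1 (y becomes 1 − y, e.g. 0.27 → 0.73) and a horizontal flip in augmentation 2 (x becomes 1 − x, e.g. 0.44 → 0.56). A minimal sketch of how labels can follow a flipped image is given below. `flip_labels` is a hypothetical helper for illustration, not the code in `augmentation.py`.

```python
def flip_labels(labels, horizontal=False, vertical=False):
    """Mirror YOLO labels [class, x_center, y_center, width, height] with the image."""
    flipped = []
    for cls, x, y, w, h in labels:
        if horizontal:
            x = 1.0 - x  # mirror left-right
        if vertical:
            y = 1.0 - y  # mirror top-bottom
        flipped.append([cls, x, y, w, h])
    return flipped


# First label of 2581_1126_0_0.jpg from the inspection cell.
demo_labels = [[0, 0.44, 0.27, 0.015625, 0.015625]]
demo_vertical = flip_labels(demo_labels, vertical=True)      # y: 0.27 -> 0.73
demo_horizontal = flip_labels(demo_labels, horizontal=True)  # x: 0.44 -> 0.56
```

Box width and height are unchanged by a flip, so only the center coordinates need updating.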

In [6]:
# Brightness
output_image_folder_b = os.path.join(rgb_folder, "augmented_train_images_brightning")
output_label_folder_b = os.path.join(rgb_folder, "augmented_train_labels_brightning")

aug.aug_pipeline_brightning(dataset, mean, std, batch_size, output_image_folder_b, output_label_folder_b)

In [7]:
# Obstruction
output_image_folder_o = os.path.join(rgb_folder, "augmented_train_images_obstruction")
output_label_folder_o = os.path.join(rgb_folder, "augmented_train_labels_obstruction")

aug.aug_pipeline_obstruction(dataset, mean, std, batch_size, output_image_folder_o, output_label_folder_o)
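
Brightness and obstruction augmentations change pixel values without moving them, so the bounding-box labels would normally be copied as-is. The sketch below illustrates both ideas on a float image in [0, 1]. `adjust_brightness` and `add_obstruction` are hypothetical stand-ins, and the operations and parameters actually used in `augmentation.py` may differ.

```python
import numpy as np


def adjust_brightness(image, factor):
    """Scale pixel intensities by `factor` and clip back to [0, 1]."""
    return np.clip(image * factor, 0.0, 1.0)


def add_obstruction(image, top, left, height, width, fill=0.0):
    """Return a copy of `image` with a rectangular patch overwritten by `fill`."""
    out = image.copy()
    out[top:top + height, left:left + width] = fill
    return out


# A uniform gray 8x8 RGB image stands in for a real training patch.
demo_image = np.full((8, 8, 3), 0.5)
demo_brighter = adjust_brightness(demo_image, 1.5)      # every pixel becomes 0.75
demo_occluded = add_obstruction(demo_image, 2, 2, 3, 3)  # 3x3 patch set to 0
```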

### For hillshade images

In [8]:
# Initialize and inspect the dataset

# Set paths for training images and YOLO-format labels.
image_folder = "dataset_swissSURFACE3D_hillshade_patches/train_images"
label_folder = "dataset_swissSURFACE3D_hillshade_patches/yolo_train_labels"

# Calculate mean and standard deviation for normalization.
mean, std = dt.calculate_mean_std(image_folder)

# Create the RockDetectionDataset with normalization (no augmentation for now).
dataset = dt.RockDetectionDataset(image_folder, label_folder, mean, std, augment=False)

# Iterate through the dataset to:
#  - Print the image name, tensor shape, and associated labels.
#  - Break after the first iteration for quick inspection.
for idx, (aug_images, aug_labels) in enumerate(dataset):
    image_name = dataset.image_files[idx]  # Get the name of the current image
    print(f"Image Name: {image_name}")
    print(f"Image Shape: {aug_images.size()}")
    print(f"Labels: {aug_labels}")
    break

Image Name: 2581_1126_0_0.jpg
Image Shape: torch.Size([3, 640, 640])
Labels: tensor([[0.0000, 0.4400, 0.2700, 0.0156, 0.0156],
        [0.0000, 0.5100, 0.3900, 0.0156, 0.0156],
        [0.0000, 0.5700, 0.4500, 0.0156, 0.0156],
        [0.0000, 0.5700, 0.3800, 0.0156, 0.0156],
        [0.0000, 0.3700, 0.7600, 0.0156, 0.0156],
        [0.0000, 0.3000, 0.7100, 0.0156, 0.0156],
        [0.0000, 0.3900, 0.9200, 0.0156, 0.0156]])


In [9]:
# Parameters
batch_size = 16 
hillshade_folder = 'dataset_swissSURFACE3D_hillshade_patches'

# Geometric
output_image_folder_g = os.path.join(hillshade_folder, "augmented_train_images_geom")
output_label_folder_g = os.path.join(hillshade_folder, "augmented_train_labels_geom")

aug.aug_pipeline_geom(dataset, mean, std, batch_size, output_image_folder_g, output_label_folder_g)

Augmentation 1 for 2581_1126_0_0.jpg:
Labels: [[0.0, 0.4399999976158142, 0.7300000190734863, 0.015625, 0.015625], [0.0, 0.5099999904632568, 0.6100000143051147, 0.015625, 0.015625], [0.0, 0.5699999928474426, 0.550000011920929, 0.015625, 0.015625], [0.0, 0.5699999928474426, 0.6200000047683716, 0.015625, 0.015625], [0.0, 0.3700000047683716, 0.24000000953674316, 0.015625, 0.015625], [0.0, 0.30000001192092896, 0.2900000214576721, 0.015625, 0.015625], [0.0, 0.38999998569488525, 0.07999998331069946, 0.015625, 0.015625]]
Augmentation 2 for 2581_1126_0_0.jpg:
Labels: [[0.0, 0.5600000023841858, 0.27000001072883606, 0.015625, 0.015625], [0.0, 0.49000000953674316, 0.38999998569488525, 0.015625, 0.015625], [0.0, 0.4300000071525574, 0.44999998807907104, 0.015625, 0.015625], [0.0, 0.4300000071525574, 0.3799999952316284, 0.015625, 0.015625], [0.0, 0.6299999952316284, 0.7599999904632568, 0.015625, 0.015625], [0.0, 0.699999988079071, 0.7099999785423279, 0.015625, 0.015625], [0.0, 0.6100000143051147, 0.9

In [10]:
# Brightness
output_image_folder_b = os.path.join(hillshade_folder, "augmented_train_images_brightning")
output_label_folder_b = os.path.join(hillshade_folder, "augmented_train_labels_brightning")

aug.aug_pipeline_brightning(dataset, mean, std, batch_size, output_image_folder_b, output_label_folder_b)

In [11]:
# Obstruction
output_image_folder_o = os.path.join(hillshade_folder, "augmented_train_images_obstruction")
output_label_folder_o = os.path.join(hillshade_folder, "augmented_train_labels_obstruction")

aug.aug_pipeline_obstruction(dataset, mean, std, batch_size, output_image_folder_o, output_label_folder_o)

### Combine all datasets into one, organized for the YOLOv8 model

In [None]:
aug.organize_yolo_dataset(rgb_folder, hillshade_folder)


YOLO dataset organized in 'yolo_dataset'


## 3. Model Training