<a target="_blank" href="https://colab.research.google.com/github/umanitoba-meagher-projects/public-experiments/blob/main/jupyter-notebooks/Object%20Classification%20and%20Localization/yolov5-localization-feature.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

In [None]:
"""
Author: Ryleigh J. Bruce
Date: June 18, 2024

Purpose: To load and implement the object localization feature of the YOLOv5 model.


Note: The author generated this text in part with GPT-4,
OpenAI’s large-scale language-generation model. Upon generating
draft code, the author reviewed, edited, and revised the code
to their own liking and takes ultimate responsibility for
the content of this code.

"""

## Introduction
This notebook demonstrates the implementation of object localization using the YOLOv5 model. It begins with dataset preparation, including downloading a dataset from the FiftyOne Data Zoo. The notebook then installs the necessary libraries, clones the YOLOv5 repository, and loads a pre-trained YOLOv5 model. Key functionalities include defining custom functions for predictions, annotating images, and visualizing results. Each step is kept in its own module so that dataset paths, annotation settings, and the model variant can be changed for other use cases.

## Critical Uses & Adaptability

### What the Notebook Can Be Used For:
**Dataset Exploration:** This notebook enables users to explore datasets by downloading and preparing them for object detection tasks. The dataset preparation steps in the code blocks for mounting Google Drive and downloading datasets provide a structured approach to handling datasets.

**Educational Purposes & Demonstrations:** The notebook provides a step-by-step guide to using Python scripts and machine learning libraries for image processing. It educates readers on the practical application of YOLOv5 for object detection, with detailed instructions in the code blocks for installing libraries, cloning the YOLOv5 repository, and loading the model.

### How the Notebook Can Be Adapted:
**Integration with Spatial Design & Architectural Studies:** The object detection capabilities of this notebook can be applied to site analysis in architectural or spatial datasets. The code blocks for processing and annotating images are particularly relevant for such applications.

**Variables & Customization:** Users can modify variables such as input and output directories, font sizes, and model parameters to tailor the workflow to their specific needs. The code blocks defining `input_folder_path`, `output_folder_path`, and `process_images_in_folder` provide clear examples of how to customize these variables.

**Swapping Datasets:** The notebook allows users to replace the dataset with custom datasets by modifying the dataset loading and preparation steps. The code block for loading datasets using `fiftyone.zoo.load_zoo_dataset` demonstrates how to swap datasets effectively.

## Module: Mount the Notebook to Google Drive

Here we import the `drive` module, which links the Colab environment to our Google Drive, where the desired dataset is stored. Once the drive is mounted, the notebook can access any files located within Google Drive and interact with them directly.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

# Downloading a Dataset from FiftyOne Data Zoo

First, the FiftyOne library must be installed in the Colab environment. This is done using the `!pip install` command.

In [None]:
!pip install fiftyone

In [None]:
import fiftyone
import fiftyone.zoo

In [None]:
export_dir = "/content/drive/MyDrive/shared-data/Notebook datafiles/Street-view-datasets/Geo-dataset"

In [None]:
try:
    # Load the "quickstart-geo" dataset from the FiftyOne Dataset Zoo
    dataset = fiftyone.zoo.load_zoo_dataset("quickstart-geo")
    print("Dataset loaded successfully.")

    # Export the dataset to the specified directory on Google Drive
    dataset.export(
        export_dir=export_dir,
        dataset_type=fiftyone.types.FiftyOneDataset  # You can specify other supported formats as needed
    )
    print(f"Dataset exported to Google Drive at: {export_dir}")

except Exception as e:
    print(f"Error loading or exporting dataset: {e}")

In [None]:
# Export the dataset in YOLO format
try:
  dataset.export(
    export_dir=export_dir,
    dataset_type=fiftyone.types.YOLOv5Dataset,
    label_field="detections", # This field specifies where the relevant detection labels are stored
    split='train'  # This line is optional unless specifically handling splits differently
  )
  print("Dataset exported successfully.")
except Exception as e:
  print(f"Error exporting dataset: {e}")
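For reference, a YOLO-format label file stores one object per line as `class x_center y_center width height`, with the four box values normalized by the image dimensions. The sketch below shows how a pixel-space box could be converted into such a line. The helper names `xyxy_to_yolo` and `yolo_label_line` are hypothetical and are not part of FiftyOne or YOLOv5; the export above handles this conversion internally.

```python
def xyxy_to_yolo(bbox, img_width, img_height):
    """Convert a pixel-space [xmin, ymin, xmax, ymax] box to normalized YOLO values."""
    xmin, ymin, xmax, ymax = bbox
    x_center = (xmin + xmax) / 2 / img_width
    y_center = (ymin + ymax) / 2 / img_height
    width = (xmax - xmin) / img_width
    height = (ymax - ymin) / img_height
    return x_center, y_center, width, height


def yolo_label_line(class_id, bbox, img_width, img_height):
    """Format one object as a label line: class x_center y_center width height."""
    values = xyxy_to_yolo(bbox, img_width, img_height)
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


# Hypothetical example: a 100x50 pixel box centered in a 200x100 pixel image
line = yolo_label_line(0, [50, 25, 150, 75], 200, 100)
print(line)
```
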

# Using the Pre-trained YOLOv5 Model

## Module: Loading the YOLOv5 Model

Before the pre-trained YOLOv5 model is loaded, the `tensorflow` and `pillow` (PIL) libraries are installed using `!pip install`. `tensorflow` is an open-source machine learning library, and Pillow allows scripts to open, manipulate, and save various image formats. Note that YOLOv5 itself runs on PyTorch, which is installed in a later step.

In [None]:
!pip install tensorflow pillow

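The `os` and `shutil` modules provide file-system operations: `os` handles paths and directories, while `shutil` offers higher-level operations such as removing an entire directory tree.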

In [None]:
import os
import shutil

The ‘if’ block begins by using the `os` module to check whether the `/content/yolov5` path already exists on the file system. If the condition evaluates to True, the `shutil` module deletes the directory, since `git clone` fails when the destination directory already exists and is not empty.

In [None]:
# If the yolov5 directory already exists, remove it to avoid conflicts
if os.path.exists('/content/yolov5'):
    shutil.rmtree('/content/yolov5')

Here the `!git clone` command from the Git version control system is used to copy the YOLOv5 repository from the GitHub server into the specified directory on the local machine (in this case the path for the directory is `/content/yolov5`).

In [None]:
# Clone YOLOv5 repository
!git clone https://github.com/ultralytics/yolov5 /content/yolov5

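This line of code uses a `%` magic command, a feature of the IPython kernel used by Jupyter notebooks. `%cd` changes the current working directory in the notebook environment to the directory specified in the file path. Unlike `!cd`, which runs in a temporary subshell, the change made by `%cd` persists for subsequent cells.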

In [None]:
# Change directory to the cloned YOLOv5 repo
%cd /content/yolov5

`git pull` updates the local repository with the latest changes from the remote repository.

In [None]:
# Pull the latest changes
!git pull

Here the `!pip` installer is used to install several required packages. The first line uses `-r` to tell pip to install all of the packages listed in the `requirements.txt` file. The `ultralytics`, `torch`, `torchvision`, and `torchaudio` packages are then installed with the `--quiet` flag so that only essential information is displayed.

In [None]:
# Install the required packages
!pip install -r requirements.txt
!pip install ultralytics --quiet
!pip install torch torchvision torchaudio --quiet

Next the `torch`, `pillow`, `os`, and `pandas` modules must be imported for later use.

In [None]:
# Import necessary libraries
import torch
from PIL import Image, ImageDraw, ImageFont
import os
import pandas as pd

Here the `torch.hub.load` function is used to load the YOLOv5 model from the `ultralytics/yolov5` GitHub repository via Torch Hub. Since there are multiple versions of the YOLOv5 model, the second argument specifies that the YOLOv5s variant should be loaded. `pretrained=True` indicates that a pre-trained version of the YOLOv5s model should be loaded into the `model` variable.

The `torch.cuda.is_available()` function returns True if a CUDA-compatible GPU is available on the machine and False otherwise. The conditional expression uses this result to select ‘cuda’ when a GPU is available and ‘cpu’ when it is not, and `model.to(...)` moves the model to that device. Calling `.eval()` on the model sets it to evaluation mode to ensure that it behaves correctly when making predictions.

In [None]:
# Load the YOLOv5 model from Torch Hub
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
model.to('cuda' if torch.cuda.is_available() else 'cpu').eval()

This line of code determines the names of the classes available for detection through the loaded YOLOv5s model and stores them in the `class_names` variable.

In [None]:
# Get class names directly from the model
class_names = model.names

## Module: Implementing the Object Localization Feature on a Dataset of Images

Several functions must be defined before the YOLOv5 model can be used to generate a series of predictions.

The `get_predictions` function defined below begins with a ‘try’ block that opens an image, converts it to the RGB color space, and stores the result in the `img` variable. If the file cannot be opened, the ‘except’ block prints a corresponding error message and returns an empty list.

`results = model(img)` runs the model on the image stored in `img` and stores the raw detection results in the `results` variable for further processing. The results are then converted into a pandas DataFrame in which each row holds the bounding box coordinates (`xmin`, `ymin`, `xmax`, `ymax`), confidence, and class of one detection.

`predictions = []` initializes an empty list to store the analyzed prediction results. The script then iterates through each row of the DataFrame to retrieve the class name, detection confidence, and bounding box coordinates, and adds a tuple containing this information to the `predictions` list.

The final line of the code block returns the completed `predictions` list, which holds one tuple per object detected in the image.

In [None]:
def get_predictions(img_path):
    try:
        img = Image.open(img_path).convert('RGB')
    except IOError:
        print(f"Error: Could not open {img_path}")
        return []
    results = model(img)
    results = results.pandas().xyxy[0]
    predictions = []
    for _, row in results.iterrows():
        class_name = class_names[int(row['class'])]
        conf = row['confidence']
        bbox = [int(row['xmin']), int(row['ymin']), int(row['xmax']), int(row['ymax'])]
        predictions.append((class_name, conf, bbox))
    return predictions
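Because `get_predictions` returns plain `(class_name, conf, bbox)` tuples, the list can be filtered before annotation. The following is a minimal sketch of one way to do this; `filter_predictions`, `min_conf`, and `keep_classes` are hypothetical names introduced for illustration, and the sample values are made up rather than taken from a model run.

```python
def filter_predictions(predictions, min_conf=0.5, keep_classes=None):
    """Keep predictions at or above min_conf and, optionally, in keep_classes."""
    kept = []
    for class_name, conf, bbox in predictions:
        if conf < min_conf:
            continue  # Drop low-confidence detections
        if keep_classes is not None and class_name not in keep_classes:
            continue  # Drop classes the user is not interested in
        kept.append((class_name, conf, bbox))
    return kept


# Made-up sample in the same shape that get_predictions returns
sample = [
    ("car", 0.91, [10, 20, 110, 90]),
    ("person", 0.32, [5, 5, 40, 80]),
    ("person", 0.77, [60, 10, 95, 85]),
]
confident = filter_predictions(sample, min_conf=0.5)
people = filter_predictions(sample, min_conf=0.5, keep_classes={"person"})
print(confident)
print(people)
```
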

Next the `draw_text_with_outline` function must be defined to ensure the images have legible prediction labels.

The function contains several parameters: `draw` (used to draw on images), `position` (a tuple containing the x, y coordinates of where the text should be drawn), `text` (the string to be drawn), `font` (the font used for the label), `text_color` (the color of the main text), `outline_color` (the color of the text outline), and `outline_width` (the width of the outline in pixels). `x, y = position` decomposes the position tuple into x and y coordinates.

The ‘for’ loop draws the text at slightly offset positions to create an outline effect around the main text, making it more legible in different image lighting and environmental conditions.

The final `draw.text((x, y), ...)` call draws the main text at the specified `x, y` coordinates on top of the outline using `text_color`.

In [None]:
def draw_text_with_outline(draw, position, text, font, text_color, outline_color, outline_width):
    x, y = position
    # Draw the outline by drawing text in the outline_color slightly offset in each direction
    for offset in range(-outline_width, outline_width + 1):
        draw.text((x + offset, y), text, font=font, fill=outline_color)
        draw.text((x - offset, y), text, font=font, fill=outline_color)
        draw.text((x, y + offset), text, font=font, fill=outline_color)
        draw.text((x, y - offset), text, font=font, fill=outline_color)

    # Draw the text in the center in text_color
    draw.text((x, y), text, font=font, fill=text_color)
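The loop above offsets the text only along the horizontal and vertical axes, so the diagonal corners of the outline are not filled in. As a sketch of a denser alternative, the hypothetical helper `outline_offsets` below enumerates every `(dx, dy)` offset in a square around the text position; it is an illustration only and is not used elsewhere in this notebook.

```python
def outline_offsets(outline_width):
    """Return every (dx, dy) offset within outline_width pixels, excluding (0, 0)."""
    return [
        (dx, dy)
        for dx in range(-outline_width, outline_width + 1)
        for dy in range(-outline_width, outline_width + 1)
        if (dx, dy) != (0, 0)
    ]


offsets = outline_offsets(1)
print(offsets)
```

Each pair could then be applied as `draw.text((x + dx, y + dy), text, font=font, fill=outline_color)` before the main text is drawn. Recent Pillow releases also accept `stroke_width` and `stroke_fill` arguments on `ImageDraw.text`, which may make a hand-written outline unnecessary.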

The final function that must be defined before making predictions with the YOLO model is the `process_images_in_folder` function. It processes all the images in a given directory while gracefully handling any potential errors.

The function begins by listing the acceptable image extensions in the `img_exts` list. Then the script checks if the output directory already exists, and uses the `os` module to create one if it doesn’t.

A ‘for’ loop iterates over each file in the input folder and checks that each image has a valid extension (one of the extensions listed in the `img_exts` variable). If it encounters a file with an invalid extension then the file is skipped over. The function then uses the `os` module to check the file paths of the images with valid extensions to ensure they are indeed files. The previously defined `get_predictions()` function is then called to get predictions for each image. If predictions are found it proceeds to the next step.

The ‘try’ block begins by opening the image and converting it to the RGB color space. A specific font in the specified size is then loaded into the `font` variable, with the script falling back to the default font should any errors occur. Another ‘for’ loop then iterates through each prediction, drawing a rectangle around the detected object and adding a text annotation. The function then saves the annotated image to the output folder with a modified filename prefixed with ‘annotated-’.

The `except IOError` block catches and handles any potential issues with opening or saving images.

The function then alerts the user when it has finished processing and saving all of the images.

In [None]:
def process_images_in_folder(input_folder_path, output_folder_path):
    # List of acceptable image extensions
    img_exts = ['.jpg', '.jpeg', '.png', '.bmp', '.tiff', '.gif', '.ico']

    if not os.path.exists(output_folder_path):
        os.makedirs(output_folder_path)

    for filename in os.listdir(input_folder_path):
        if not any(filename.lower().endswith(ext) for ext in img_exts):  # Check if the file is an image.
            continue  # Skip the current iteration if the file is not an image.
        img_path = os.path.join(input_folder_path, filename)
        if os.path.isfile(img_path):
            predictions = get_predictions(img_path)
            if predictions:
                try:
                    img = Image.open(img_path).convert('RGB')
                    draw = ImageDraw.Draw(img)

                    # Define font size and font
                    font_size = 12  # Adjust this value as needed
                    try:
                        font = ImageFont.truetype("arial.ttf", font_size)
                    except IOError:
                        font = ImageFont.load_default()  # Fallback to default font in case of error

                    for class_name, conf, bbox in predictions:
                        draw.rectangle(bbox, outline='red', width=2)
                        draw_text_with_outline(draw, (bbox[0], bbox[1]), f'{class_name} {conf:.2f}', font, 'white', 'red', 2)

                    output_path = os.path.join(output_folder_path, f'annotated-{filename}')
                    img.save(output_path)
                    print(f'Processed {filename}. Annotated image saved.')
                except IOError:
                    print(f"Error: Could not process {img_path}")

    print(f"All images processed. Annotations saved in {output_folder_path}")
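The extension check inside `process_images_in_folder` can also be read as a small standalone test. The sketch below isolates it using `os.path.splitext`; `is_image_file` and `IMG_EXTS` are hypothetical names, and the file names are made up for illustration.

```python
import os

# Same extensions as the img_exts list used in process_images_in_folder
IMG_EXTS = {'.jpg', '.jpeg', '.png', '.bmp', '.tiff', '.gif', '.ico'}


def is_image_file(filename, extensions=IMG_EXTS):
    """Return True if the file name ends in one of the accepted extensions."""
    _, ext = os.path.splitext(filename)
    return ext.lower() in extensions  # Lowercase so 'photo.JPG' is accepted


names = ["street.JPG", "notes.txt", "map.png", "archive.tar.gz", "README"]
image_names = [name for name in names if is_image_file(name)]
print(image_names)
```
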

The following code block defines the input and output directories to be used in the script.

In [None]:
# Paths to your folders
input_folder_path = '/content/drive/MyDrive/shared-data/Notebook datafiles/Street-view-datasets/Geo-dataset/data'  # Path where images are stored
output_folder_path = '/content/drive/MyDrive/shared-data/Notebook datafiles/Street-view-datasets/Geo-dataset/pre-trained-anno-2'  # Path where to save images

This final line calls the `process_images_in_folder()` function with the input and output folders as its arguments. This results in the script getting predictions and annotating each image in the input directory, then saving the annotated images to the output directory.

In [None]:
# Process images
process_images_in_folder(input_folder_path, output_folder_path)
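When reviewing results, it can help to see which classes were detected most often. The sketch below assumes the lists returned by `get_predictions` have been collected in a dictionary keyed by filename, which this notebook does not do by itself; `count_classes` is a hypothetical helper and the sample data is made up.

```python
from collections import Counter


def count_classes(predictions_by_image):
    """Tally how many times each class name appears across all images."""
    counts = Counter()
    for predictions in predictions_by_image.values():
        counts.update(class_name for class_name, _conf, _bbox in predictions)
    return counts


# Made-up predictions in the (class_name, conf, bbox) shape used above
sample = {
    "a.jpg": [("car", 0.90, [0, 0, 10, 10]), ("person", 0.80, [5, 5, 20, 40])],
    "b.jpg": [("car", 0.70, [3, 3, 30, 30])],
    "c.jpg": [],
}
counts = count_classes(sample)
print(counts.most_common())
```
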

## Module: Visualizing a Subset of the Annotated Images for Review

Importing the following libraries will enable the script to display a random subset of the newly annotated images. The Plotly Express (`plotly.express`) module is particularly useful for creating interactive visualizations such as the ones generated in this script.

In [None]:
import plotly.express as px
import os
from PIL import Image
import numpy as np
import random

The `load_image` function loads an image from the specified file path and returns it as a NumPy array. The image is opened and assigned to the `img` variable using a context manager (`with`), which ensures that the file is properly closed after the block of code is executed, even if an error occurs. For an RGB image, the resulting array is three-dimensional (height, width, and color channels).

In [None]:
def load_image(image_path):
    """Load an image from a file path."""
    with Image.open(image_path) as img:
        return np.array(img)

The `interactive_visualization` function begins by listing the file names in the specified directory and filtering them to include only image files ending in .png, .jpg, or .jpeg. The ‘if’ block checks that the directory contains at least one image and returns early with a message if it does not. The `random` module is then used to select a subset of images and store them in the `selected_files_subset` variable. The subset of images is then displayed with Plotly, using the image names as the plot titles.

In [None]:
def interactive_visualization(directory, subset_size=4):  # subset_size can be changed to display more or fewer images
    # Get the list of image file names that end with the specified extensions (case-insensitive)
    image_files = [file for file in os.listdir(directory) if file.lower().endswith(('.png', '.jpg', '.jpeg'))]

    if not image_files:
        print("No images found in the directory.")
        return

    # Select a subset of images from the list of image files
    selected_files_subset = random.sample(image_files, min(subset_size, len(image_files)))

    # Display selected images using Plotly
    for image_file in selected_files_subset:
        image_path = os.path.join(directory, image_file)
        image = load_image(image_path)
        fig = px.imshow(image)
        fig.update_layout(title_text=f'Selected Image: {image_file}', margin=dict(l=10, r=10, t=40, b=10))
        fig.show()

The `interactive_visualization` function is then called with the annotated image directory as the argument.

In [None]:
interactive_visualization('/content/drive/MyDrive/shared-data/Notebook datafiles/Street-view-datasets/Geo-dataset/pre-trained-anno-2')
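Because `random.sample` picks a different subset on every run, the images shown for review will vary between sessions. If a repeatable selection is preferred, a seeded generator is one option. The sketch below uses a hypothetical helper, `pick_subset`, with made-up file names; it is not part of the notebook's workflow.

```python
import random


def pick_subset(items, subset_size=4, seed=None):
    """Sample up to subset_size items; the same seed gives the same selection."""
    rng = random.Random(seed)  # Local generator, leaves the global random state untouched
    return rng.sample(items, min(subset_size, len(items)))


files = ["annotated-a.jpg", "annotated-b.jpg", "annotated-c.jpg"]
first = pick_subset(files, subset_size=2, seed=42)
second = pick_subset(files, subset_size=2, seed=42)
print(first == second)
```
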