<!DOCTYPE html>
<html>
<head>
<title>Image Grouping with Object Detection</title>
</head>
<body>
<h1>Image Grouping with Object Detection</h1>
<p>This HTML page provides an overview of the Python script that uses the Ultralytics YOLO library to detect objects in images and group them based on detected entities.</p>

<ol>
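    <li>Import the necessary libraries: The script begins by importing the Ultralytics YOLO class (<code>ultralytics.YOLO</code>), the operating system module (<code>os</code>), NumPy (<code>numpy</code>), the CSV module (<code>csv</code>), the file-copy utilities (<code>shutil</code>), the <code>Image</code> module from Pillow (<code>PIL</code>), and <code>structural_similarity</code> from scikit-image (<code>skimage.metrics</code>).</li>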
    <li>Load YOLO Model: The YOLO model is loaded using the pretrained model file named <code>'yolov8n.pt'</code>.</li>
    <li>Specify Folder Paths: The paths to the input folder (<code>folder_path</code>) where images are stored and the output folder (<code>output_folder</code>) where grouped images will be saved are defined.</li>
    <li>Retrieve File List: The <code>getFolder_Path(folder_path)</code> function is used to retrieve a list of file names from the specified input folder.</li>
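    <li>Detect and Group Images: The <code>detect_Segments()</code> function runs the YOLO model on each image in the input folder, counts the detected entities, groups the image by those entities, and writes the counts to a CSV file. Despite its name, it performs object detection, not segmentation.</li>
    <li>Count Detected Entities: The <code>count_Entities(results)</code> function returns a dictionary that maps each detected class ID to its count, together with the total number of detected objects.</li>
    <li>Group Images: The <code>group_images(img_name, counts)</code> function creates one folder per detected entity, named from the entity ID and name, and copies the image into each of those folders.</li>
    <li>Write CSV: The <code>write_to_CSV(img_name, counts, total_objects)</code> function saves the per-entity counts and the total for each image to a CSV file in the prediction output folder.</li>
    <li>Image Similarity: The <code>calculate_image_similarity(image1_path, image2_path)</code> function computes the structural similarity index (SSIM) between two grayscale images. It is defined in the script but not called by <code>detect_Segments()</code>.</li>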
    <li>Main Module Execution: The script checks if it is being run as the main module, and if so, it calls the <code>detect_Segments()</code> function to initiate the detection and grouping process.</li>
</ol>

<p>Grouped images are saved under <code>output_folder</code>, in one folder per detected entity. An image that contains several entity types is therefore copied into several folders, and an image with no detections is not copied at all.</p>
</body>
</html>
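
The counting and folder-naming steps described in the list can be tried without running the model. The following is a minimal sketch, not the script's own code: `count_class_ids`, `group_folder_name`, and the two-entry `NAMES` mapping are hypothetical stand-ins for `count_Entities`, the folder-name logic in `group_images`, and `model.names`.

```python
from collections import Counter

# Hypothetical subset of the id -> name mapping (the script reads model.names).
NAMES = {0: "person", 57: "couch"}

def count_class_ids(class_ids):
    """Return (per-class counts, total) for a flat list of integer class IDs."""
    counts = Counter(class_ids)
    return dict(counts), sum(counts.values())

def group_folder_name(entity_id, names):
    """Build the folder name used for one detected class."""
    return f"Folder-grp{entity_id}_{names[entity_id]}"

# Assumed class IDs for an image with four persons and one couch.
counts, total = count_class_ids([0, 0, 57, 0, 0])
folders = [group_folder_name(cid, NAMES) for cid in counts]
print(counts, total, folders)
```

Because the folder name is derived from each key in `counts`, the number of folders an image lands in equals the number of distinct classes detected in it.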


In [3]:
!pip install scikit-image

Defaulting to user installation because normal site-packages is not writeable
Collecting scikit-image
  Obtaining dependency information for scikit-image from https://files.pythonhosted.org/packages/08/c0/8085c5fd2f7f7514a0c5031b666171d5828ac5b3c9cf5d0ecd19688d5407/scikit_image-0.21.0-cp311-cp311-win_amd64.whl.metadata
  Downloading scikit_image-0.21.0-cp311-cp311-win_amd64.whl.metadata (14 kB)
Collecting tifffile>=2022.8.12 (from scikit-image)
  Obtaining dependency information for tifffile>=2022.8.12 from https://files.pythonhosted.org/packages/74/68/19989a1009f68ed777ea5d2624c2996bab0890a31ce7d4b2a7ae4e1c0cfe/tifffile-2023.8.12-py3-none-any.whl.metadata
  Downloading tifffile-2023.8.12-py3-none-any.whl.metadata (30 kB)
Collecting PyWavelets>=1.1.1 (from scikit-image)
  Downloading PyWavelets-1.4.1-cp311-cp311-win_amd64.whl (4.2 MB)
    

In [5]:
from ultralytics import YOLO
import os
import numpy as np
import csv
import shutil
from PIL import Image
from skimage.metrics import structural_similarity as ssim

# Load YOLO model
model = YOLO('yolov8n.pt')  # Using yolov8n pretrained model
folder_path = './All_Images'  # Path to folder where images are stored
output_folder = './output/problem3/'  # Path to folder where grouped images will be saved

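def getFolder_Path(folder_path):
    # Return the names of the files directly inside folder_path.
    # Subfolders are skipped: callers join each name onto folder_path,
    # so a name found deeper in the tree would resolve to a missing file.
    file_list = []

    for file_name in os.listdir(folder_path):
        if os.path.isfile(os.path.join(folder_path, file_name)):
            file_list.append(file_name)
    return file_list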

def detect_Segments():
    # Run YOLO detection on every image, then count, group, and log the results
    files_in_folder = getFolder_Path(folder_path)
    for file_name in files_in_folder:
        path = os.path.join(folder_path, file_name)  # use the configured folder, not a hardcoded one
        results = model.predict(path, save=True)
        counts, total_objects = count_Entities(results)
        group_images(file_name, counts)
        write_to_CSV(file_name, counts, total_objects)

def count_Entities(results):
    # Counting entities detected in the results
    counts = {}
    total_objects = 0

    for result in results:
        boxes = result.boxes.cpu().numpy()
        total_objects += len(boxes)
        for box in boxes:
            cls = int(box.cls[0])
            counts[cls] = counts.get(cls, 0) + 1

    return counts, total_objects

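def write_to_CSV(img_name, counts, total_objects):
    # Writing the per-image counts to a CSV file next to the YOLO predictions
    name = os.path.splitext(img_name)[0]  # drop only the final extension
    # Assumes predictions were saved to runs/detect/predict, as in the logged run
    csv_path = f"./runs/detect/predict/{name}.csv"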

    with open(csv_path, mode='w', newline='') as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(['Entity', 'Count'])
        for key in counts:
            writer.writerow([model.names[key], counts[key]])
        writer.writerow(['Total Objects', total_objects])

    print(f"Data saved to {csv_path}")

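def calculate_image_similarity(image1_path, image2_path):
    # Calculate the structural similarity index (SSIM) between two images.
    # SSIM needs arrays of identical shape, so the second image is resized
    # to the first image's dimensions before comparison.
    image1 = Image.open(image1_path).convert("L")
    image2 = Image.open(image2_path).convert("L").resize(image1.size)
    similarity = ssim(np.array(image1), np.array(image2))
    return similarity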

def group_images(img_name, counts):
    # Create one folder per detected entity and copy the image into each
    for entity_id in counts:
        entity_name = model.names[entity_id]
        group_folder = os.path.join(output_folder, f"Folder-grp{entity_id}_{entity_name}")
        os.makedirs(group_folder, exist_ok=True)

        src_path = os.path.join(folder_path, img_name)
        dst_path = os.path.join(group_folder, img_name)
        shutil.copy(src_path, dst_path)

    print(f"Images grouped for {img_name}")

if __name__ == "__main__":
    detect_Segments()



image 1/1 d:\Adobe\Aithon\Aithon\All_Images\1.jpg: 384x640 4 persons, 1 couch, 654.2ms
Speed: 14.6ms preprocess, 654.2ms inference, 30.9ms postprocess per image at shape (1, 3, 384, 640)
Results saved to runs\detect\predict



Images grouped for 1.jpg
Data saved to ./runs/detect/predict/1.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\2.jpg: 448x640 3 persons, 1 bed, 452.0ms
Speed: 10.5ms preprocess, 452.0ms inference, 4.0ms postprocess per image at shape (1, 3, 448, 640)
Results saved to runs\detect\predict



Images grouped for 2.jpg
Data saved to ./runs/detect/predict/2.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\3.jpg: 448x640 3 persons, 702.3ms
Speed: 7.6ms preprocess, 702.3ms inference, 9.2ms postprocess per image at shape (1, 3, 448, 640)
Results saved to runs\detect\predict



Images grouped for 3.jpg
Data saved to ./runs/detect/predict/3.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_112814949.jpeg: 448x640 1 spoon, 1 bowl, 1 sandwich, 475.8ms
Speed: 8.0ms preprocess, 475.8ms inference, 3.5ms postprocess per image at shape (1, 3, 448, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_112814949.jpeg
Data saved to ./runs/detect/predict/AdobeStock_112814949.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_119085612.jpeg: 480x640 1 cup, 1 toothbrush, 267.7ms
Speed: 6.6ms preprocess, 267.7ms inference, 4.0ms postprocess per image at shape (1, 3, 480, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_119085612.jpeg
Data saved to ./runs/detect/predict/AdobeStock_119085612.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_189740072.jpeg: 448x640 1 person, 199.1ms
Speed: 4.5ms preprocess, 199.1ms inference, 4.4ms postprocess per image at shape (1, 3, 448, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_189740072.jpeg
Data saved to ./runs/detect/predict/AdobeStock_189740072.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_254640691.jpeg: 448x640 1 spoon, 4 bowls, 3 oranges, 1 dining table, 289.5ms
Speed: 10.1ms preprocess, 289.5ms inference, 3.0ms postprocess per image at shape (1, 3, 448, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_254640691.jpeg
Data saved to ./runs/detect/predict/AdobeStock_254640691.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_255497590.jpeg: 352x640 1 bowl, 186.7ms
Speed: 6.0ms preprocess, 186.7ms inference, 7.1ms postprocess per image at shape (1, 3, 352, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_255497590.jpeg
Data saved to ./runs/detect/predict/AdobeStock_255497590.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_257024669.jpeg: 448x640 1 suitcase, 318.2ms
Speed: 5.6ms preprocess, 318.2ms inference, 4.0ms postprocess per image at shape (1, 3, 448, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_257024669.jpeg
Data saved to ./runs/detect/predict/AdobeStock_257024669.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_286178925.jpeg: 448x640 5 bowls, 6 donuts, 1 cake, 1 dining table, 365.3ms
Speed: 7.2ms preprocess, 365.3ms inference, 7.6ms postprocess per image at shape (1, 3, 448, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_286178925.jpeg
Data saved to ./runs/detect/predict/AdobeStock_286178925.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_297274410.jpeg: 448x640 2 persons, 3 bowls, 1 dining table, 332.9ms
Speed: 9.1ms preprocess, 332.9ms inference, 3.5ms postprocess per image at shape (1, 3, 448, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_297274410.jpeg
Data saved to ./runs/detect/predict/AdobeStock_297274410.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_315333860.jpeg: 448x640 3 persons, 1 wine glass, 4 cups, 1 spoon, 1 bowl, 4 pizzas, 1 dining table, 262.6ms
Speed: 5.5ms preprocess, 262.6ms inference, 3.0ms postprocess per image at shape (1, 3, 448, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_315333860.jpeg
Data saved to ./runs/detect/predict/AdobeStock_315333860.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_321810820.jpeg: 384x640 5 persons, 1 skateboard, 216.7ms
Speed: 10.2ms preprocess, 216.7ms inference, 4.0ms postprocess per image at shape (1, 3, 384, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_321810820.jpeg
Data saved to ./runs/detect/predict/AdobeStock_321810820.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_323628715.jpeg: 448x640 7 persons, 2 cups, 1 bowl, 4 pizzas, 1 dining table, 302.2ms
Speed: 8.1ms preprocess, 302.2ms inference, 4.6ms postprocess per image at shape (1, 3, 448, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_323628715.jpeg
Data saved to ./runs/detect/predict/AdobeStock_323628715.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_349162891.jpeg: 608x640 1 sandwich, 311.4ms
Speed: 9.4ms preprocess, 311.4ms inference, 4.5ms postprocess per image at shape (1, 3, 608, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_349162891.jpeg
Data saved to ./runs/detect/predict/AdobeStock_349162891.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_425653713.jpeg: 448x640 1 handbag, 1 bowl, 1 potted plant, 1 bed, 1 laptop, 1 keyboard, 268.0ms
Speed: 3.0ms preprocess, 268.0ms inference, 4.6ms postprocess per image at shape (1, 3, 448, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_425653713.jpeg
Data saved to ./runs/detect/predict/AdobeStock_425653713.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_57849082.jpeg: 448x640 1 person, 2 pizzas, 1 dining table, 297.5ms
Speed: 4.0ms preprocess, 297.5ms inference, 3.5ms postprocess per image at shape (1, 3, 448, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_57849082.jpeg
Data saved to ./runs/detect/predict/AdobeStock_57849082.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_58143201.jpeg: 480x640 (no detections), 275.0ms
Speed: 6.6ms preprocess, 275.0ms inference, 4.0ms postprocess per image at shape (1, 3, 480, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_58143201.jpeg
Data saved to ./runs/detect/predict/AdobeStock_58143201.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_584637362.jpeg: 448x640 2 airplanes, 3 suitcases, 176.7ms
Speed: 4.1ms preprocess, 176.7ms inference, 3.6ms postprocess per image at shape (1, 3, 448, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_584637362.jpeg
Data saved to ./runs/detect/predict/AdobeStock_584637362.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_594853635.jpeg: 224x640 8 sandwichs, 123.0ms
Speed: 2.0ms preprocess, 123.0ms inference, 1.5ms postprocess per image at shape (1, 3, 224, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_594853635.jpeg
Data saved to ./runs/detect/predict/AdobeStock_594853635.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_608957878.jpeg: 448x640 2 persons, 2 toothbrushs, 348.6ms
Speed: 7.6ms preprocess, 348.6ms inference, 2.6ms postprocess per image at shape (1, 3, 448, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_608957878.jpeg
Data saved to ./runs/detect/predict/AdobeStock_608957878.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_618914535.jpeg: 480x640 2 sandwichs, 241.2ms
Speed: 6.6ms preprocess, 241.2ms inference, 4.5ms postprocess per image at shape (1, 3, 480, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_618914535.jpeg
Data saved to ./runs/detect/predict/AdobeStock_618914535.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_92669497.jpeg: 448x640 7 persons, 1 suitcase, 223.9ms
Speed: 5.1ms preprocess, 223.9ms inference, 4.5ms postprocess per image at shape (1, 3, 448, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_92669497.jpeg
Data saved to ./runs/detect/predict/AdobeStock_92669497.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\AdobeStock_96284524.jpeg: 448x640 12 donuts, 233.9ms
Speed: 9.3ms preprocess, 233.9ms inference, 4.0ms postprocess per image at shape (1, 3, 448, 640)
Results saved to runs\detect\predict



Images grouped for AdobeStock_96284524.jpeg
Data saved to ./runs/detect/predict/AdobeStock_96284524.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\DL.jpeg: 384x640 1 bottle, 207.0ms
Speed: 5.0ms preprocess, 207.0ms inference, 4.0ms postprocess per image at shape (1, 3, 384, 640)
Results saved to runs\detect\predict



Images grouped for DL.jpeg
Data saved to ./runs/detect/predict/DL.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\maxresdefault.jpeg: 384x640 3 persons, 197.3ms
Speed: 5.1ms preprocess, 197.3ms inference, 4.4ms postprocess per image at shape (1, 3, 384, 640)
Results saved to runs\detect\predict



Images grouped for maxresdefault.jpeg
Data saved to ./runs/detect/predict/maxresdefault.csv


image 1/1 d:\Adobe\Aithon\Aithon\All_Images\Mist.jpeg: 448x640 1 tie, 246.2ms
Speed: 5.3ms preprocess, 246.2ms inference, 2.7ms postprocess per image at shape (1, 3, 448, 640)
Results saved to runs\detect\predict


Images grouped for Mist.jpeg
Data saved to ./runs/detect/predict/Mist.csv
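
The log above prints "Images grouped for ..." for every file, including AdobeStock_58143201.jpeg, which had no detections. The sketch below is a dry run of the grouping step on a placeholder file, showing that an image with two detected classes is copied twice and an image with none is not copied. It is an illustration under assumptions, not the script itself: `group_image`, the `names` mapping, and the placeholder file are hypothetical.

```python
import os
import shutil
import tempfile

def group_image(src_path, counts, names, output_folder):
    """Copy one image into a folder per detected class; return the copies made."""
    copied = []
    for entity_id in counts:
        group_folder = os.path.join(
            output_folder, f"Folder-grp{entity_id}_{names[entity_id]}"
        )
        os.makedirs(group_folder, exist_ok=True)
        # shutil.copy returns the path of the newly created file.
        copied.append(shutil.copy(src_path, group_folder))
    return copied

with tempfile.TemporaryDirectory() as tmp:
    # Placeholder file standing in for an image; its content is never decoded.
    src = os.path.join(tmp, "example.jpg")
    with open(src, "wb") as fh:
        fh.write(b"placeholder bytes, not a real image")

    out = os.path.join(tmp, "output")
    names = {0: "person", 57: "couch"}  # assumed id -> name subset

    two_classes = group_image(src, {0: 4, 57: 1}, names, out)
    no_detections = group_image(src, {}, names, out)
    created = sorted(os.listdir(out))

print(len(two_classes), no_detections, created)
```

Returning the list of copies lets a caller print the grouping message only when the list is non-empty, which would avoid the misleading line for images without detections.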
