# **YOLO Image Object Detection with Indexify**

This guide demonstrates how to build an image object detection pipeline using Indexify and the `tensorlake/yolo-extractor` extractor. By the end, you will have a pipeline that ingests image files and detects the objects in them with a YOLO (You Only Look Once) model.

## **Table of Contents**

1. [Introduction](#introduction)
2. [Prerequisites](#prerequisites)
3. [Setup](#setup)
   - [Install Indexify](#install-indexify)
   - [Install Required Extractor](#install-required-extractor)
4. [Creating the Extraction Graph](#creating-the-extraction-graph)
5. [Implementing the Object Detection Pipeline](#implementing-the-object-detection-pipeline)
6. [Running the Object Detection Process](#running-the-object-detection-process)
7. [Customization and Advanced Usage](#customization-and-advanced-usage)
8. [Conclusion](#conclusion)

## **Introduction**

This pipeline uses the `tensorlake/yolo-extractor` to process images and identify objects, providing bounding boxes, class names, and confidence scores for each detected object.
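
As a rough illustration of what one detection looks like, the sketch below parses a JSON payload with the `bbox`, `class`, and `score` fields used later in this guide. The sample values and the `Detection` dataclass are hypothetical, and the meaning of the four `bbox` numbers is an assumption; check the extractor's actual output for the exact layout.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    """Hypothetical container for one detected object."""
    bbox: List[float]   # four numbers describing the box (ordering assumed)
    class_name: str
    score: float


def parse_detection(payload: str) -> Detection:
    """Parse a JSON string with 'bbox', 'class', and 'score' keys."""
    data = json.loads(payload)
    return Detection(
        bbox=[float(v) for v in data["bbox"]],
        class_name=str(data["class"]),
        score=float(data["score"]),
    )


# Made-up sample payload, for illustration only
sample = '{"bbox": [12.0, 30.5, 200.0, 180.25], "class": "Men", "score": 0.87}'
detection = parse_detection(sample)
print(detection.class_name, f"{detection.score:.2f}", detection.bbox)
```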

## **Prerequisites**

Ensure you have the following before starting:



- A virtual environment with Python 3.9 or later
- `pip` (Python package manager)
- Basic familiarity with Python and command-line interfaces
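
A quick way to confirm the first two prerequisites from inside the notebook is sketched below. It only uses the standard library; the variable names are illustrative, and the `pip` check is a hint rather than a guarantee, since `pip` can also be run as `python -m pip`.

```python
import shutil
import sys

# Python 3.9 or later is required by this guide
python_ok = sys.version_info >= (3, 9)
# Look for a pip executable on PATH (pip may also be run as `python -m pip`)
pip_on_path = shutil.which("pip") is not None or shutil.which("pip3") is not None

print(f"Python {sys.version_info.major}.{sys.version_info.minor} OK: {python_ok}")
print(f"pip found on PATH: {pip_on_path}")
```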


In [None]:
!python3.9 -m venv ve
!source ve/bin/activate

## **Training the Model on the Natural-Images Dataset**

In [None]:
!pip install -U ipywidgets

In [None]:
!pip install ultralytics

In [None]:
with open('fashion_images.yaml', 'w') as yaml_handle:
    yaml_handle.write("""
    # fashion_images.yaml

    # Paths to the dataset
    train: /workspace/image-object-detection-reverse-search/data
    val: /workspace/image-object-detection-reverse-search/data

    # Number of classes
    nc: 4  # number of classes
    names: ["Boys", "Girls", "Men", "Women"]

    # Model architecture (illustrative only; Ultralytics takes the architecture from the model file, not the dataset YAML)
    backbone:
    - [type, model, pretrained, stride]
    - [Conv, 32, 3, 1]       # Convolutional layer
    - [Conv, 64, 3, 2]       # Convolutional layer with stride 2
    - [CSP, 64, 1, 1]        # CSP block
    - [CSP, 128, 3, 2]       # CSP block with stride 2
    - [CSP, 256, 3, 2]       # CSP block with stride 2
    - [CSP, 512, 1, 1]       # CSP block

    head:
    - [type, model, pretrained, stride]
    - [CSP, 512, 1, 1]       # CSP block
    - [SPPF, 512, 1, 1]      # SPPF block
    - [Conv, 256, 1, 1]      # Convolutional layer
    - [Conv, 128, 1, 1]      # Convolutional layer
    - [Detect, 4, 1, 1]      # Detection layer for 4 classes
    """)

# train: /kaggle/input/fashion-images/data
#     val: /kaggle/input/fashion-images/data

#     nc: 4  # number of classes
#     names: ["Boys", "Girls", "Men", "Women"]

In [None]:
from ultralytics import YOLO

# Load a pretrained detection model (recommended starting point for training)
model = YOLO("yolov8n.pt", task='detect')
# model = YOLO("/workspace/image-object-detection-reverse-search/notebooks/fashion_images.yaml", task='detect')  # build a new model from scratch

# Train the model on the dataset described in fashion_images.yaml
model.train(data="fashion_images.yaml", epochs=3)
# metrics = model.val()  # evaluate model performance on the validation set
# results = model("https://ultralytics.com/images/bus.jpg")  # predict on an image
# path = model.export(format="onnx")  # export the model to ONNX (or another supported format)

In [None]:
model.save("yolov8n_trained.pt")

## **Setup**

### **Install Indexify**

First, install Indexify using the official installation script and start the server:

In [None]:
!curl https://getindexify.ai | sh
!./indexify server -d

This starts a long-running server that exposes ingestion and retrieval APIs to applications.

### **Install Required Extractor**

Next, install the YOLO extractor in a new terminal and start it:

In [None]:
!pip install indexify-extractor-sdk
!indexify-extractor download tensorlake/yolo-extractor
!indexify-extractor join-server

In [None]:
!pip install indexify

## **Creating the Extraction Graph**

The extraction graph defines how ingested content flows through extractors. Here the graph, `yolo_detector`, has a single extraction policy, `image_object_detection`, that runs the `tensorlake/yolo-extractor` on every uploaded image.

In [None]:
"""
This script demonstrates how to define and create an extraction graph for image object detection using Indexify and the YOLO model.

1. Import necessary modules from the Indexify library.
2. Initialize the Indexify client.
3. Define the extraction graph specification in YAML format.
4. Create an ExtractionGraph object from the YAML specification.
5. Use the Indexify client to create the extraction graph in the Indexify system.
"""

from indexify import IndexifyClient, ExtractionGraph  # Import Indexify classes

# Step 2: Initialize the Indexify client
client = IndexifyClient()

# Step 3: Define the extraction graph specification using YAML
extraction_graph_spec = """
name: 'yolo_detector'  # Name of the extraction graph
extraction_policies:
  - extractor: 'tensorlake/yolo-extractor'  # Use YOLO extractor for object detection
    name: 'image_object_detection'  # Name of the extraction policy
    input_params:
      model_name: 'yolov8n_trained.pt'  # Specify the model to be used
      conf: 0.25  # Confidence threshold for object detection
      iou: 0.7  # Intersection over Union threshold for non-max suppression
"""

# Step 4: Create an ExtractionGraph object from the YAML specification
extraction_graph = ExtractionGraph.from_yaml(extraction_graph_spec)

# Step 5: Create the extraction graph in the Indexify system
client.create_extraction_graph(extraction_graph)
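
The `conf` and `iou` values in the graph specification control which boxes survive post-processing. The sketch below shows the general idea in plain Python: drop boxes scored below `conf`, then apply greedy non-max suppression using `iou` as the overlap limit. It is a simplified, class-agnostic illustration, not the extractor's implementation; the `[x1, y1, x2, y2]` box layout, the function names, and the sample boxes are all assumptions.

```python
from typing import List, Sequence, Tuple

Box = Sequence[float]  # assumed layout: [x1, y1, x2, y2]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(
    detections: List[Tuple[Box, float]], conf: float = 0.25, iou_thr: float = 0.7
) -> List[Tuple[Box, float]]:
    """Drop low-confidence boxes, then apply greedy non-max suppression."""
    candidates = sorted(
        (d for d in detections if d[1] >= conf), key=lambda d: d[1], reverse=True
    )
    kept: List[Tuple[Box, float]] = []
    for box, score in candidates:
        # Keep a box only if its overlap with every higher-scoring kept box is within the limit
        if all(iou(box, kept_box) <= iou_thr for kept_box, _ in kept):
            kept.append((box, score))
    return kept


# Made-up boxes, for illustration only
boxes = [
    ([0, 0, 100, 100], 0.90),
    ([5, 5, 100, 100], 0.80),      # heavy overlap with the first box
    ([200, 200, 300, 300], 0.60),
    ([10, 10, 50, 50], 0.10),      # below the confidence threshold
]
kept = filter_detections(boxes)
print(kept)
```

Raising `conf` trades recall for fewer false positives, while lowering `iou` suppresses overlapping boxes more aggressively.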

## **Implementing the Object Detection Pipeline**
Now that the extraction graph is set up, we can upload images and run object detection on them.

### **Code Explanation**

1. **detect_objects Function**:
   - Uses the Indexify service to detect objects in a single image.
   - **Initialization**: The Indexify client is initialized.
   - **Image Upload**: The image file is uploaded to the `yolo_detector` extraction graph.
   - **Extraction Process**: The function waits for Indexify to finish the extraction.
   - **Retrieve Detections**: The detected objects are retrieved from the extraction graph, each with its bounding box, class, and confidence score.

In [None]:
import json
import os
from indexify import IndexifyClient

def detect_objects(image_path):
    """
    Detect objects in an image using the Indexify service.

    Parameters:
    image_path (str): The path to the image file to be processed.

    Returns:
    list: A list of detected objects with their bounding boxes, classes, and confidence scores.
    """
    # Initialize the Indexify client
    client = IndexifyClient()

    # Upload the image file to the extraction graph "yolo_detector"
    # This sends the image to the Indexify service for object detection
    content_id = client.upload_file("yolo_detector", image_path)
    print(f"Uploaded image, content ID: {content_id}")

    # Wait for the extraction process to complete
    # This ensures the service has finished processing the image
    client.wait_for_extraction(content_id)
    print(f"Extraction completed for content ID: {content_id}")

    # Retrieve the detected objects from the extraction graph
    # This gets the results of the object detection process
    detections = client.get_extracted_content(
        content_id=content_id,
        graph_name="yolo_detector",
        policy_name="image_object_detection"
    )
    print(f"Detections retrieved for content ID: {content_id}")

    # Return the detections
    # Uncomment the following line to format the detections into a more readable format
    # return [{'bbox': det['bbox'], 'class': det['class'], 'confidence': det['features'][0]['metadata']['score']} for det in detections]

    return detections

#### **Code Example to Detect Objects in a Single Image**

In [None]:
detections = detect_objects('/workspace/image-object-detection-reverse-search/sample.jpg')

for detection in detections:
    content = json.loads(detection['content'])
    bbox = content['bbox']
    object_class = content['class']
    score = content['score']
    print(f"ID: {detection['id']}")
    print(f"Class: {object_class}")
    print(f"Score: {score:.2f}")
    print(f"Bounding Box: {bbox}")
    print("="*40)

2. **process_dataset Function**:
   - Walks through the dataset directory, which holds one subfolder per category.
   - Iterates over the image files in each category folder.
   - Calls `detect_objects` for each image and prints the detection results.

In [None]:
def process_dataset(dataset_folder):
    """
    Process a dataset of images, detect objects in each image, and print the details of detected objects.

    Parameters:
    dataset_folder (str): The path to the folder containing the dataset of images.
    """
    # Traverse the dataset folder recursively
    for root, subdirs, files in os.walk(dataset_folder):
        for subdir in subdirs:
            subdir_path = os.path.join(root, subdir)
            print(f"Processing folder: {subdir_path}")
            # Iterate through each file in the subdirectory
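            for filename in os.listdir(subdir_path):
                file_path = os.path.join(subdir_path, filename)
                # Skip nested directories; os.walk visits them on a later iteration
                if not os.path.isfile(file_path):
                    continue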
                try:
                    # Detect objects in the current image
                    detections = detect_objects(file_path)
                    print(f"Image: {filename}")
                    print(f"Number of objects detected: {len(detections)}")
                    # Print details of each detected object
                    for detection in detections:
                        content = json.loads(detection['content'])
                        bbox = content['bbox']
                        object_class = content['class']
                        score = content['score']
                        print(f"ID: {detection['id']}")
                        print(f"Class: {object_class}")
                        print(f"Score: {score:.2f}")
                        print(f"Bounding Box: {bbox}")
                        print("="*40)
                    print("\n")
                except Exception as e:
                    # Print an error message if the image processing fails
                    print(f"Failed to process {file_path}: {e}")

if __name__ == "__main__":
    # Define the path to the dataset folder
    dataset_folder = '/workspace/image-object-detection-reverse-search/data'
    # Process the dataset
    process_dataset(dataset_folder)
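
When processing many images, a per-class summary is often easier to read than one printout per detection. The following sketch counts detections per class and can ignore low-scoring ones. The `count_classes` helper and the sample records are hypothetical; the records only mimic the `content` JSON shape (`bbox`, `class`, `score`) used in this guide.

```python
import json
from collections import Counter
from typing import Dict, Iterable


def count_classes(detections: Iterable[Dict], min_score: float = 0.0) -> Counter:
    """Count detections per class, ignoring those scored below min_score."""
    counts: Counter = Counter()
    for detection in detections:
        content = json.loads(detection["content"])
        if content["score"] >= min_score:
            counts[content["class"]] += 1
    return counts


# Made-up records, for illustration only
sample_detections = [
    {"id": "a1", "content": json.dumps({"bbox": [0, 0, 10, 10], "class": "Men", "score": 0.91})},
    {"id": "a2", "content": json.dumps({"bbox": [5, 5, 20, 20], "class": "Women", "score": 0.55})},
    {"id": "a3", "content": json.dumps({"bbox": [1, 1, 8, 8], "class": "Men", "score": 0.30})},
]
class_counts = count_classes(sample_detections, min_score=0.5)
print(class_counts)
```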

## **Reverse Image Search Pipeline**

In [None]:
!pip install qdrant_client

1. **Initialize the Qdrant client and create a collection named "image_search" in Qdrant**

- First, we need to set up the connection to the Qdrant server and create a collection that will store our image vectors. This collection will be used for efficient image search.

In [None]:
import os
import json
import uuid
from qdrant_client import QdrantClient
from qdrant_client.http import models
from indexify import IndexifyClient

# Initialize the Qdrant client, connecting to the local Qdrant server running on port 6333
client = QdrantClient('http://localhost:6333')

# Create a collection named "image_search" in Qdrant
# Each vector has 5 dimensions: four bounding-box values plus the confidence score
# Similarity is measured with the dot product; change the `distance` argument to use another metric
client.create_collection(
    collection_name="image_search",
    vectors_config=models.VectorParams(size=5, distance=models.Distance.DOT),
)
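
The collection size of 5 corresponds to the way this guide builds its vectors: the four bounding-box numbers followed by the confidence score. A minimal sketch of that mapping, with a length check so mismatched vectors fail early, might look like this. The `detection_to_vector` helper, the `VECTOR_SIZE` constant, and the sample values are hypothetical.

```python
from typing import Dict, List

VECTOR_SIZE = 5  # must match the `size` given to the Qdrant collection


def detection_to_vector(feature_data: Dict) -> List[float]:
    """Concatenate the four bbox numbers and the score into one vector."""
    vector = [float(v) for v in feature_data["bbox"]] + [float(feature_data["score"])]
    if len(vector) != VECTOR_SIZE:
        raise ValueError(f"expected {VECTOR_SIZE} values, got {len(vector)}")
    return vector


# Made-up detection, for illustration only
example_vector = detection_to_vector(
    {"bbox": [12.0, 30.5, 200.0, 180.25], "class": "Men", "score": 0.87}
)
print(example_vector)
```

The class name is not part of the vector, so it is carried in the point's payload instead.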


2. **Extract features from an image using the Indexify service.**

- This step uses the Indexify service to extract features from an image. With the YOLO extraction policy defined earlier, each feature is one detected object with its bounding box, class, and score.

In [None]:
def extract_image_features(image_path):
    """
    Extract features from an image using the Indexify service.

    Parameters:
    image_path (str): The path to the image file to be processed.

    Returns:
    tuple: A tuple containing the extracted features and the content ID of the image.
    """
    # Initialize the Indexify client
    client = IndexifyClient()

    # Upload the image file to the extraction graph "yolo_detector"
    # This sends the image to the Indexify service for feature extraction
    content_id = client.upload_file("yolo_detector", image_path)
    print(f"Waiting for extraction to complete for content id: {content_id}")

    # Wait for the extraction process to complete
    # This ensures the features are fully extracted before proceeding
    client.wait_for_extraction(content_id)
    print(f"Extraction completed for content id: {content_id}")

    # Retrieve the extracted features from the extraction graph
    features = client.get_extracted_content(
        content_id=content_id,
        graph_name="yolo_detector",
        policy_name="image_object_detection"
    )

    # Return the extracted features and the content ID
    return features, content_id

3. **Store extracted image features in the Qdrant vector store.**
- After extracting features from an image, the next step is to store them in the Qdrant vector store so they can be queried for similarity searches and **reverse image lookups**.

In [None]:
def store_features_in_vector_store(features, content_id):
    """
    Store extracted image features in the Qdrant vector store.

    Parameters:
    features (list): A list of extracted features from an image.
    content_id (str): The content ID of the image from which features were extracted.
    """
    # Initialize the Qdrant client, connecting to the local Qdrant server
    client = QdrantClient("http://localhost:6333")
    points = []

    for feature in features:
        # Extract the feature content and parse it as JSON
        # json.loads accepts both str and bytes, so no explicit decode is needed
        feature_data = json.loads(feature['content'])

        # Extract the bounding box, class name, and score from the feature data
        bbox = feature_data['bbox']
        class_name = feature_data['class']
        score = feature_data['score']

        # Create a vector representation combining bbox coordinates and score
        # Assuming bbox is a list of numbers (e.g., [x, y, width, height])
        vector = bbox + [score]

        # Generate a new UUID for each point
        point_id = str(uuid.uuid4())
        
        # Add the point with payload including the original content_id and class name
        points.append(models.PointStruct(
            id=point_id, 
            vector=vector, 
            payload={"class": class_name, "content_id": content_id}
        ))
    
    if points:
        print(f"Upserting {len(points)} points to Qdrant.")
        # Upsert the points into the Qdrant collection named "image_search"
        client.upsert(
            collection_name="image_search",
            points=points
        )
    else:
        print("No points to upsert, skipping request.")

4. **Process a dataset of images and store their features in the Qdrant vector store.**
- This step iterates through a dataset of images, extracts features from each image with Indexify, and stores them in the Qdrant vector store so they can later be used for similarity searches.

In [None]:
def process_dataset(dataset_folder):
    """
    Process a dataset of images and store their features in the Qdrant vector store.

    Parameters:
    dataset_folder (str): The path to the dataset folder containing image files.
    """
    # Walk through the dataset folder and its subdirectories
    for root, subdirs, files in os.walk(dataset_folder):
        for subdir in subdirs:
            subdir_path = os.path.join(root, subdir)
            print(f"Processing folder: {subdir_path}")

            # Iterate through each file in the subdirectory
            for filename in os.listdir(subdir_path):
                file_path = os.path.join(subdir_path, filename)
                # Skip nested directories; os.walk visits them on a later iteration
                if not os.path.isfile(file_path):
                    continue
                try:
                    # Extract features from the image file
                    features, content_id = extract_image_features(file_path)
                    if not features:
                        print(f"No features extracted for {file_path}")
                        continue

                    # Store the extracted features in the Qdrant vector store
                    store_features_in_vector_store(features, content_id)
                except Exception as e:
                    print(f"Failed to process {file_path}: {e}")

In [None]:
# Example usage
dataset_folder = "/workspace/ml-projects/object-detection-reverse-image-search-indexify/data/natural_images"
process_dataset(dataset_folder)

### **Reverse Image Search Function**

#### **Perform reverse image search using Qdrant and retrieve original contents from Indexify.**

- The function below extracts features from the query image, searches Qdrant for the most similar stored vectors, and then fetches the matching detections from Indexify using the `content_id` stored in each point's payload.

In [None]:
def reverse_image_search(query_image_path):
    """
    Perform reverse image search using Qdrant and retrieve original contents from Indexify.

    Parameters:
    query_image_path (str): Path to the query image.

    Returns:
    list: List of original contents corresponding to similar images found.
    """
    features, query_content_id = extract_image_features(query_image_path)

    # Extract the vector representation from the features
    vectors = []
    for feature in features:
        # json.loads accepts both str and bytes, so no explicit decode is needed
        feature_data = json.loads(feature['content'])
        bbox = feature_data['bbox']
        score = feature_data['score']

        # Create a vector representation combining bbox and score
        vector = bbox + [score]  # Assuming bbox is a list of numbers
        vectors.append(vector)

    # Use the first detection's vector as the query; return early if nothing was detected
    if not vectors:
        return []
    query_vector = vectors[0]

    # Query the vector store
    qdrant_client = QdrantClient("http://localhost:6333")
    search_result = qdrant_client.search(
        collection_name="image_search",
        query_vector=query_vector,
        limit=5  # Number of similar images to retrieve
    )

    # Extract content_id from the search results
    content_ids = [point.payload['content_id'] for point in search_result]

    # Retrieve the original content from Indexify using content_id
    indexify_client = IndexifyClient()
    original_contents = []
    for content_id in content_ids:
        content = indexify_client.get_extracted_content(
            content_id=content_id,
            graph_name="yolo_detector",
            policy_name="image_object_detection"
        )
        original_contents.append(content)

    return original_contents
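
To make the vector search step less opaque, here is a brute-force sketch of what a dot-product nearest-neighbour query does conceptually: score every stored vector against the query and keep the top `k`. This is only an illustration, assuming dot-product similarity as described for the collection; Qdrant uses its own indexing internally, and the `top_k` helper, ids, and vectors here are made up.

```python
from typing import Dict, List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k(
    query: Sequence[float], stored: Dict[str, Sequence[float]], k: int = 5
) -> List[Tuple[str, float]]:
    """Return the k stored ids with the highest dot-product score."""
    scored = [(point_id, dot(query, vector)) for point_id, vector in stored.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Made-up 5-dimensional vectors (four bbox numbers plus a score)
stored_vectors = {
    "img-a": [0.1, 0.1, 0.5, 0.5, 0.9],
    "img-b": [0.4, 0.4, 0.9, 0.9, 0.8],
    "img-c": [0.0, 0.0, 0.2, 0.2, 0.3],
}
matches = top_k([0.4, 0.4, 0.9, 0.9, 0.8], stored_vectors, k=2)
print(matches)
```

Because each image can contribute several points (one per detection), the same `content_id` may appear more than once in real search results.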

In [None]:
if __name__ == "__main__":
    query_image_path = 'query_image.jpg'
    results = reverse_image_search(query_image_path)
    # Print the results in a neat format
    for result_set in results:
        for result in result_set:
            print(f"ID: {result['id']}")
            content_data = json.loads(result['content'])  # accepts both str and bytes
            bbox = content_data['bbox']
            class_name = content_data['class']
            score = content_data['score']
            print(f"Class: {class_name}, BBox: {bbox}, Score: {score}")
        print()  # Empty line between result sets