#Object and Sub-Object Detection
###Approach:
Object Detection: We can use a single-stage object detection model such as YOLO (You Only Look Once) or SSD (Single Shot MultiBox Detector), both of which are widely used for real-time detection. These models can be fine-tuned or trained from scratch to detect the required object classes (e.g., "Person," "Car").

Sub-Object Detection: For detecting sub-objects like "Helmet" or "Tire," we will need to either:

1.Use a multi-class detector that includes both objects and sub-objects in the same model, or
2.Use a two-stage pipeline where the primary object is first detected, and a secondary sub-object detection model focuses on detecting sub-objects within the bounding box of the primary object.

Hierarchical Association:
After detecting an object and sub-object, a hierarchical structure will be created where each detected object has a unique ID. Each sub-object will be linked to its corresponding parent object.

The hierarchical structure will be maintained as follows:

Each main object (e.g., Person, Car) is assigned a unique identifier.
Each sub-object is linked to its main object through that identifier, so the system can establish the relationship between them.
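
When a single multi-class detector returns objects and sub-objects side by side, one possible way to link them is by box containment: a sub-object is assigned to the parent whose box covers most of it. The sketch below assumes [x1, y1, x2, y2] boxes; the helper names `overlap_fraction` and `assign_parent` and the 0.5 threshold are illustrative assumptions, not part of the original design.

```python
def overlap_fraction(sub_bbox, parent_bbox):
    """Fraction of the sub-object box area that lies inside the parent box."""
    sx1, sy1, sx2, sy2 = sub_bbox
    px1, py1, px2, py2 = parent_bbox
    inter_w = max(0, min(sx2, px2) - max(sx1, px1))
    inter_h = max(0, min(sy2, py2) - max(sy1, py1))
    sub_area = (sx2 - sx1) * (sy2 - sy1)
    return (inter_w * inter_h) / sub_area if sub_area > 0 else 0.0


def assign_parent(sub_bbox, parents, min_fraction=0.5):
    """Return the id of the parent that best contains sub_bbox, or None."""
    best_id, best_fraction = None, 0.0
    for parent in parents:
        fraction = overlap_fraction(sub_bbox, parent["bbox"])
        # Keep the parent with the largest overlap above the threshold
        if fraction >= min_fraction and fraction > best_fraction:
            best_id, best_fraction = parent["id"], fraction
    return best_id


parents = [
    {"object": "Person", "id": 1, "bbox": [100, 200, 300, 400]},
    {"object": "Car", "id": 2, "bbox": [400, 200, 700, 400]},
]
helmet_parent = assign_parent([120, 220, 180, 280], parents)
orphan_parent = assign_parent([0, 0, 10, 10], parents)
```

Here the helmet box lies entirely inside the Person box, so it is linked to ID 1, while a box that overlaps no parent is left unassigned.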

In [None]:
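import itertools

# Simple incrementing counters stand in for the ID generators
object_ids = itertools.count(1)
sub_object_ids = itertools.count(1)

# Filled by the detector; each item exposes .name, .bounding_box and .detected_sub_objects
detected_objects = []
results = []

# Main object detection
for obj in detected_objects:
    main_object = {
        "object": obj.name,
        "id": next(object_ids),
        "bbox": obj.bounding_box
    }

    # Sub-object detection within the bounding box
    for sub in obj.detected_sub_objects:
        sub_object = {
            "object": sub.name,
            "id": next(sub_object_ids),
            "bbox": sub.bounding_box
        }
        # The output format holds one "subobject" per object,
        # so a later sub-object overwrites an earlier one
        main_object["subobject"] = sub_object  # Link sub-object to main object

    # Store the result
    results.append(main_object)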
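
In the two-stage pipeline, the secondary detector runs on a crop of the parent object, so its boxes are relative to the crop rather than the frame. A minimal sketch of the coordinate translation, assuming [x1, y1, x2, y2] boxes and a hypothetical helper name `to_frame_coords`:

```python
def to_frame_coords(sub_bbox_in_crop, parent_bbox):
    """Translate a box found inside a parent crop back to full-frame coordinates."""
    parent_x1, parent_y1, _, _ = parent_bbox
    x1, y1, x2, y2 = sub_bbox_in_crop
    # Shift by the top-left corner of the parent box
    return [x1 + parent_x1, y1 + parent_y1, x2 + parent_x1, y2 + parent_y1]


parent_bbox = [100, 200, 300, 400]
helmet_in_crop = [20, 20, 80, 80]
helmet_in_frame = to_frame_coords(helmet_in_crop, parent_bbox)
```

Storing the translated box keeps every `bbox` in the output in the same frame-level coordinate system, which is what the cropping step needs later.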


#JSON Output Format
The output should adhere to the hierarchical JSON format as described, capturing both the object and its sub-objects:

{
  "object": "Person",
  "id": 1,
  "bbox": [100, 200, 300, 400],
  "subobject": {
    "object": "Helmet",
    "id": 1,
    "bbox": [120, 220, 180, 280]
  }
}

object: The name of the detected object (e.g., "Person").
id: Unique identifier for the object.
bbox: Bounding box of the object, represented as [x1, y1, x2, y2].
subobject: Contains information about the associated sub-object, including its name, ID, and bounding box.
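
A quick structural check can catch malformed records before they are written out. The sketch below is one possible validator for the fields listed above; the function name `validate_record` and the exact rules (for example, requiring x1 < x2 and y1 < y2) are assumptions made for illustration.

```python
def validate_record(record):
    """Return a list of problems found in one detection record (empty if none)."""
    errors = []
    if not isinstance(record.get("object"), str):
        errors.append("object must be a string")
    if "id" not in record:
        errors.append("id is missing")
    bbox = record.get("bbox")
    if not (isinstance(bbox, list) and len(bbox) == 4):
        errors.append("bbox must be a list of four numbers")
    elif not (bbox[0] < bbox[2] and bbox[1] < bbox[3]):
        errors.append("bbox must satisfy x1 < x2 and y1 < y2")
    # The sub-object uses the same fields, so validate it recursively
    sub = record.get("subobject")
    if sub is not None:
        errors.extend(f"subobject: {e}" for e in validate_record(sub))
    return errors


example = {
    "object": "Person",
    "id": 1,
    "bbox": [100, 200, 300, 400],
    "subobject": {"object": "Helmet", "id": 1, "bbox": [120, 220, 180, 280]},
}
problems = validate_record(example)
```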

#Sub-Object Image Retrieval
To retrieve and save cropped images of specific sub-objects, we need to implement a function that uses the bounding box of the detected sub-object and crops the image from the original frame.

##Implementation Steps:
1.Use the bounding box from the detection to crop the sub-object region.
2.Save the cropped image of the sub-object for later retrieval.

In [None]:
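import cv2

def crop_subobject(image, bbox):
    # bbox is [x1, y1, x2, y2] in pixel coordinates of the original frame
    height, width = image.shape[:2]
    x1, y1, x2, y2 = (int(v) for v in bbox)
    # Clamp to the frame so negative or out-of-range coordinates cannot wrap around
    x1, y1 = max(0, x1), max(0, y1)
    x2, y2 = min(width, x2), min(height, y2)
    return image[y1:y2, x1:x2]

def save_subobject_image(cropped_image, sub_object_id):
    file_path = f"subobject_{sub_object_id}.png"
    # cv2.imwrite returns False when the image could not be written
    if cropped_image.size == 0 or not cv2.imwrite(file_path, cropped_image):
        raise ValueError(f"Could not save sub-object image to {file_path}")
    return file_path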


#Inference Speed Optimization
To meet the real-time processing requirement of 10-30 FPS on CPU:

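1.Model Optimization: Use a lightweight architecture such as Tiny YOLO, or a detector with a MobileNetV2 backbone, to reduce computational load.
2.Batch Processing: Process multiple frames in batches where possible. This raises throughput for recorded video but adds latency on a live stream.
3.Framework Optimization: Export the model to ONNX (Open Neural Network Exchange) and run it with a CPU-oriented runtime such as ONNX Runtime or OpenVINO. TensorRT targets NVIDIA GPUs, so it only applies if a GPU becomes available.
4.Multithreading/Concurrency: Split frame capture and inference across threads so that inference does not wait on video I/O.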
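
The multithreading idea in point 4 can be sketched with a bounded queue: one thread reads frames while the calling thread runs detection. This is a minimal illustration using only the standard library; `run_pipeline`, `frame_source`, and `detect` are hypothetical names, and any iterable of frames and any callable can be plugged in.

```python
import queue
import threading


def run_pipeline(frame_source, detect, max_queue=4):
    """Read frames on a background thread and run detection on the calling thread."""
    frames = queue.Queue(maxsize=max_queue)
    sentinel = object()  # marks the end of the stream

    def reader():
        try:
            for frame in frame_source:
                frames.put(frame)  # blocks while the queue is full
        finally:
            frames.put(sentinel)  # always signal the end, even on error

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    results = []
    while True:
        frame = frames.get()
        if frame is sentinel:
            break
        results.append(detect(frame))
    thread.join()
    return results


# Stand-in data: integers play the role of frames, doubling plays the role of detection
outputs = run_pipeline(range(5), lambda frame: frame * 2)
```

The bounded queue keeps memory use fixed when detection is slower than capture. For a live camera, dropping stale frames instead of blocking may be the better choice.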

In [None]:
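import time
import cv2

cap = cv2.VideoCapture('sample_video.mp4')
frame_count = 0
total_time = 0.0

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Time only the detection call; detect_objects is defined in the YOLOv5 section
    start_time = time.perf_counter()
    detect_objects(frame)
    total_time += time.perf_counter() - start_time
    frame_count += 1

cap.release()

# Report the average over the whole video instead of a noisy per-frame value
if frame_count and total_time > 0:
    fps = frame_count / total_time
    print(f"Average FPS: {fps:.2f}")
    print("Meets the 10 FPS minimum:", fps >= 10)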


#Modularity and Extensibility
To ensure that the system is modular:

1.Object Detection Module: Create separate classes or functions for detecting different objects.
2.Sub-Object Detection Module: Implement independent sub-object detectors that can be swapped or extended easily.
3.Configuration Files: Use configuration files (e.g., JSON, YAML) to specify which object-sub-object pairs should be detected, making it easier to add new detections.
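
As an illustration of the configuration-file idea, the sketch below loads object/sub-object pairs from JSON and checks whether a given pair is enabled. The config layout (`pairs`, `object`, `subobjects`) and the helper names `load_pairs` and `is_valid_pair` are assumptions chosen for this example.

```python
import json

# Hypothetical config content; in practice this would be read from a file
CONFIG_TEXT = """
{
  "pairs": [
    {"object": "Person", "subobjects": ["Helmet"]},
    {"object": "Car", "subobjects": ["Tire"]}
  ]
}
"""


def load_pairs(config_text):
    """Map each object name to the set of sub-object names allowed for it."""
    config = json.loads(config_text)
    return {entry["object"]: set(entry["subobjects"]) for entry in config["pairs"]}


def is_valid_pair(pairs, object_name, sub_object_name):
    """True if the sub-object is configured for the given object."""
    return sub_object_name in pairs.get(object_name, set())


pairs = load_pairs(CONFIG_TEXT)
```

Adding a new detection then only requires a new entry in the config, with no change to the detection code.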

#YOLOv5 Implementation

In [None]:
%pip install opencv-python torch torchvision pyyaml
%pip install yolov5  # For YOLOv5 model


In [None]:
# Import the required libraries
import cv2
import torch
import json
import uuid
import time

In [None]:

# Load pre-trained YOLOv5 model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')

In [None]:
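# Define a function to detect objects and sub-objects
def detect_objects(frame):
    # OpenCV frames are BGR, while the YOLOv5 hub model expects RGB arrays
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    results = model(rgb_frame)  # Perform inference on the frame
    return results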

In [None]:
# Create a unique ID generator function
def generate_unique_id():
    return str(uuid.uuid4())

In [None]:
# Convert the detection results into the required JSON format
def generate_json_output(detected_objects):
    results_json = []
    for obj in detected_objects:
        main_object = {
            "object": obj['name'],
            "id": generate_unique_id(),
            "bbox": obj['bbox']
        }

        # Assuming sub-object is a part of the object (e.g., helmet for person)
        if 'subobject' in obj:
            sub_object = {
                "object": obj['subobject']['name'],
                "id": generate_unique_id(),
                "bbox": obj['subobject']['bbox']
            }
            main_object["subobject"] = sub_object

        results_json.append(main_object)
    return json.dumps(results_json, indent=4)

In [None]:
# Process video frames and detect objects
def process_video(video_path):
    cap = cv2.VideoCapture(video_path)
    frame_count = 0

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break

        start_time = time.perf_counter()
        results = detect_objects(frame)
        detected_objects = []

        # Parse results into detected objects.
        # results.xyxy[0] holds one [x1, y1, x2, y2, conf, cls] row per detection.
        # The stock yolov5s classes contain no sub-objects such as helmets,
        # so no 'subobject' key is set here.
        for *xyxy, conf, cls in results.xyxy[0].tolist():
            name = model.names[int(cls)]
            bbox = [int(x) for x in xyxy]
            detected_objects.append({
                'name': name,
                'bbox': bbox
            })

        # Generate hierarchical JSON output for this frame
        json_output = generate_json_output(detected_objects)
        print(json_output)

        # Calculate FPS for this frame and report it every 10 frames
        elapsed = time.perf_counter() - start_time
        frame_count += 1
        if frame_count % 10 == 0 and elapsed > 0:
            print(f"FPS: {1 / elapsed:.2f}")

    cap.release()

In [None]:
# Run the video processing function with a sample video
process_video('sample_video.mp4')


#Sub-Object Image Retrieval
The functions below crop images of detected sub-objects (for example, helmets for people) and save them to disk.

In [None]:
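def crop_subobject_image(frame, bbox):
    # Clamp the box to the frame so out-of-range coordinates cannot wrap around
    height, width = frame.shape[:2]
    x1, y1, x2, y2 = bbox
    x1, y1 = max(0, x1), max(0, y1)
    x2, y2 = min(width, x2), min(height, y2)
    cropped_image = frame[y1:y2, x1:x2]
    return cropped_image

def save_subobject_image(cropped_image, sub_object_id):
    filename = f"subobject_{sub_object_id}.png"
    # cv2.imwrite returns False when the file could not be written
    if cv2.imwrite(filename, cropped_image):
        print(f"Saved image as {filename}")
    else:
        print(f"Failed to save {filename}")

# Sample function to demonstrate cropping and saving sub-object images
def process_and_save_subobject_images(video_path):
    cap = cv2.VideoCapture(video_path)

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break

        # Detect objects and sub-objects in the frame
        results = detect_objects(frame)
        # results.xyxy[0] holds one [x1, y1, x2, y2, conf, cls] row per detection
        for *xyxy, conf, cls in results.xyxy[0].tolist():
            name = model.names[int(cls)]
            bbox = [int(x) for x in xyxy]

            # Every detection is treated as a sub-object here;
            # filter on `name` to keep only specific classes
            cropped_image = crop_subobject_image(frame, bbox)
            if cropped_image.size == 0:
                continue  # skip boxes that collapse to nothing after clamping
            sub_object_id = generate_unique_id()
            save_subobject_image(cropped_image, sub_object_id)

    cap.release()

# Run the function to save sub-object images
process_and_save_subobject_images('sample_video.mp4')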
