## Pixel Per Second Feature
Status: ***Completed***

### Purpose
We want to detect the speed of a dog. Ideally the feature would report km/h, but as an initial step we will develop a pixels-per-second feature. A speed feature will eventually let us:
- Track the average speed of a dog
- Track acceleration, particularly at the start of the race and on the final straight
- Notice checks more easily (dogs slow down when they are checked)
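
As a rough sketch of how a pixels-per-second reading might later be turned into km/h: the conversion needs a calibration constant, here called `pixels_per_metre`. That name and the sample values are hypothetical; nothing in this notebook measures them, and with a real camera the scale would likely vary across the frame because of perspective.

```python
def px_per_s_to_kmh(speed_px_s: float, pixels_per_metre: float) -> float:
    """Convert px/s to km/h using an assumed, constant calibration."""
    if pixels_per_metre <= 0:
        raise ValueError("pixels_per_metre must be positive")
    metres_per_second = speed_px_s / pixels_per_metre
    return metres_per_second * 3.6  # 1 m/s = 3.6 km/h


# Illustrative values only: 500 px/s at an assumed 30 px per metre.
print(round(px_per_s_to_kmh(500.0, 30.0), 2))
```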

### How?
We will take the Euclidean distance between the centre of a dog's bounding box in consecutive frames and multiply it by the video's frames per second.
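
As a worked illustration of this calculation, here is a minimal standard-library sketch. The function name `pixels_per_second` and the sample coordinates are made up for illustration, and the two positions are assumed to be box centres exactly one frame apart.

```python
import math


def pixels_per_second(prev_center, curr_center, fps):
    """Distance moved between two consecutive frames, scaled to one second."""
    dx = curr_center[0] - prev_center[0]
    dy = curr_center[1] - prev_center[1]
    return math.hypot(dx, dy) * fps


# A centre that moves 3 px right and 4 px down travels 5 px in one frame;
# at 25 frames per second that is 125 px/s.
print(pixels_per_second((100, 200), (103, 204), fps=25))  # 125.0
```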

### Limitation
The quality of our pixels-per-second reading depends entirely on our ability to track the dogs. If a dog is lost for a few moments (flickering of the box), then we lose the dog's speed. This has a knock-on effect on the rest of our features.
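
One possible mitigation, sketched under assumptions: if positions are only recorded on frames where the dog was actually seen, storing the frame index alongside each position lets the distance be divided by the number of frames that elapsed, so a short gap does not inflate the reading. The function `speed_with_gap` and its parameters are hypothetical and are not used elsewhere in this notebook.

```python
import math


def speed_with_gap(prev_center, prev_frame, curr_center, curr_frame, fps):
    """Pixels per second between two sightings that may be several frames apart.

    Returns None when the frame indices do not move forward.
    """
    frames_elapsed = curr_frame - prev_frame
    if frames_elapsed <= 0:
        return None
    distance = math.dist(prev_center, curr_center)
    # Elapsed time in seconds is frames_elapsed / fps.
    return distance * fps / frames_elapsed


# Illustrative values: the dog was last seen 5 frames ago and has moved 50 px.
print(speed_with_gap((0, 0), 10, (30, 40), 15, fps=25))  # 250.0
```

This averages the speed over the gap, so any acceleration that happened while the dog was lost is still invisible.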

#### Imports

In [1]:
import cv2
import os
import requests
import numpy as np
from ultralytics import YOLO
import supervision as sv
from scipy.spatial.distance import euclidean
from deep_sort_realtime.deepsort_tracker import DeepSort


#### Downloading Inputs
We are grabbing a video from our public bucket. Possible values:
- 20240720WENG09_V.mp4
- 20240720WENG04_V.mp4
- 20240722NOTG06_V.mp4
- 20240722NOTG08_V.mp4
- 20240722SANG03_V.mp4
- 20240724THMG05_V.mp4
- 20240724THMG08_V.mp4
- 20240724THMG10_V.mp4
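
To switch between the videos listed above without editing two strings by hand, a small helper along these lines could build both the URL and the local path from the file name. The bucket prefix is taken from the download cell; the helper name `video_paths` is hypothetical.

```python
import os

# Prefix copied from the URL used in the download cell.
BUCKET_PREFIX = "https://storage.googleapis.com/greyhound-vision-data/raw_videos/"


def video_paths(filename, directory=None):
    """Return (remote URL, local path) for one of the listed video file names."""
    directory = directory or os.getcwd()
    return BUCKET_PREFIX + filename, os.path.join(directory, filename)


video_url, local_path = video_paths("20240722NOTG06_V.mp4")
print(video_url)
```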


In [2]:
video_url = "https://storage.googleapis.com/greyhound-vision-data/raw_videos/20240720WENG04_V.mp4"
local_path = os.path.join(os.getcwd(), "20240720WENG04_V.mp4")

# Download the video; fail loudly on an HTTP error instead of continuing without a file
response = requests.get(video_url, timeout=60)
response.raise_for_status()

with open(local_path, 'wb') as file:
    file.write(response.content)
    

#### Initialising Model & Set up
The default model is YOLOv8 (`yolov8n.pt`). It can easily be swapped for the model produced by the training notebook: replace the argument passed to the `YOLO` initialiser with the path of the saved model.

In [3]:
model = YOLO("yolov8n.pt")

# Initialize the DeepSort tracker
tracker = DeepSort(max_age=30)

# Open the video file
cap = cv2.VideoCapture(local_path)

# Verify the output directory and permissions
output_dir = "./output"
if not os.path.exists(output_dir):
    os.makedirs(output_dir)

if not os.access(output_dir, os.W_OK):
    raise PermissionError(f"Write permission denied for the directory {output_dir}")

# Define the output video path
output_path = os.path.join(output_dir, "pixel_per_second.mp4")

assert cap.isOpened(), "Error reading video file"

# Get video properties (fps stays a float: int() would truncate rates such as 29.97)
w, h = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT))
fps = cap.get(cv2.CAP_PROP_FPS)

# Initialize VideoWriter with the mp4v FourCC code
fourcc_code = cv2.VideoWriter_fourcc(*"mp4v")
video_writer = cv2.VideoWriter(output_path, fourcc_code, fps, (w, h))

if not video_writer.isOpened():
    raise IOError(f"Error initializing video writer with path {output_path}")



#### Creation of calculate speed function 

In [4]:
# Track previous positions of dogs to calculate speed
previous_positions = {}
box_annotator = sv.BoxAnnotator(thickness=4)

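def calculate_speed(previous_position, current_position, fps):
    """Return the speed in pixels per second, assuming the positions are one frame apart."""
    distance = euclidean(previous_position, current_position)
    speed = distance * fps  # Pixels per frame times frames per second
    return speed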
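
Frame-to-frame readings of this kind tend to jump around as the bounding box jitters. The sketch below shows one way the readings might be smoothed with a moving average and differenced to estimate acceleration, which is one of the goals listed under Purpose. The class `SpeedHistory` and its window size are assumptions for illustration and are not part of the processing loop in this notebook.

```python
from collections import deque


class SpeedHistory:
    """Keeps recent px/s readings for one dog and derives smoothed values."""

    def __init__(self, window=5):
        self.recent = deque(maxlen=window)  # Oldest reading drops off automatically
        self.previous_smoothed = None

    def update(self, speed_px_s, fps):
        """Add one per-frame reading; return (smoothed speed, acceleration or None)."""
        self.recent.append(speed_px_s)
        smoothed = sum(self.recent) / len(self.recent)
        if self.previous_smoothed is None:
            acceleration = None  # Needs two smoothed values
        else:
            # Change per frame times frames per second gives px/s^2.
            acceleration = (smoothed - self.previous_smoothed) * fps
        self.previous_smoothed = smoothed
        return smoothed, acceleration


history = SpeedHistory(window=2)
print(history.update(100.0, fps=25))  # (100.0, None)
print(history.update(200.0, fps=25))  # (150.0, 1250.0)
```

One `SpeedHistory` per track ID would be needed, for example in a dictionary keyed the same way as `previous_positions`.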


#### Main processing loop

In [5]:
# Process video frames
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        print("Video frame is empty or video processing has been successfully completed.")
        break

    # Perform object detection
    results = model(frame, imgsz=1280)

    # Extract bounding boxes, confidences, and class IDs
    boxes = results[0].boxes.xyxy.cpu().numpy()
    confidences = results[0].boxes.conf.cpu().numpy()
    class_ids = results[0].boxes.cls.cpu().numpy().astype(int)

    # Prepare detections for tracking
    detections = []
    for i in range(len(boxes)):
        x1, y1, x2, y2 = boxes[i]
        bbox = [x1, y1, x2 - x1, y2 - y1]  # Convert to [x, y, w, h]
        detection = (bbox, confidences[i], class_ids[i])
        detections.append(detection)

    # Update the tracker with detections
    tracked_objects = tracker.update_tracks(detections, frame=frame)

    # Annotate the frame with boxes and labels
    for obj in tracked_objects:
        if not obj.is_confirmed():
            continue

        box = obj.to_ltwh()  # Get the bounding box as [left, top, width, height]
        obj_id = obj.track_id
        class_id = obj.det_class

        if class_id == 16:  # 16 is 'dog' in the COCO labels of the pretrained weights; a custom model may differ
            center = (int(box[0] + box[2] / 2), int(box[1] + box[3] / 2))

            if obj_id in previous_positions:
                speed = calculate_speed(previous_positions[obj_id], center, fps)
                label = f"ID {obj_id} | Speed: {speed:.2f} px/s"
            else:
                label = f"ID {obj_id} | Speed: Calculating..."

            previous_positions[obj_id] = center

            # Draw the bounding box
            cv2.rectangle(frame, (int(box[0]), int(box[1])), (int(box[0] + box[2]), int(box[1] + box[3])), (0, 255, 0), 2)
            # Put the label on top of the bounding box
            cv2.putText(frame, label, (int(box[0]), int(box[1]) - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255, 0, 0), 2)

    # Write the processed frame to the output video
    video_writer.write(frame)


0: 736x1280 1 person, 28.8ms
Speed: 4.6ms preprocess, 28.8ms inference, 76.9ms postprocess per image at shape (1, 3, 736, 1280)

0: 736x1280 1 person, 3.0ms
Speed: 4.7ms preprocess, 3.0ms inference, 0.6ms postprocess per image at shape (1, 3, 736, 1280)

0: 736x1280 1 person, 3.0ms
Speed: 3.7ms preprocess, 3.0ms inference, 0.8ms postprocess per image at shape (1, 3, 736, 1280)

0: 736x1280 1 person, 3.1ms
Speed: 3.9ms preprocess, 3.1ms inference, 0.6ms postprocess per image at shape (1, 3, 736, 1280)

0: 736x1280 1 person, 3.1ms
Speed: 3.7ms preprocess, 3.1ms inference, 0.6ms postprocess per image at shape (1, 3, 736, 1280)

0: 736x1280 1 person, 3.0ms
Speed: 3.6ms preprocess, 3.0ms inference, 0.6ms postprocess per image at shape (1, 3, 736, 1280)

0: 736x1280 1 person, 3.0ms
Speed: 3.6ms preprocess, 3.0ms inference, 0.6ms postprocess per image at shape (1, 3, 736, 1280)

0: 736x1280 1 person, 3.4ms
Speed: 4.7ms preprocess, 3.4ms inference, 0.7ms postprocess per image at shape (1, 3, 

#### Clean up

In [6]:
cap.release()
video_writer.release()
cv2.destroyAllWindows()

print(f"Processed video saved to {output_path}")

Processed video saved to ./output/pixel_per_second.mp4
