# Object detection in video
In this notebook we run object detection and tracking on a video, using a YOLO model trained to detect garbage.

# Load Data
As a first step, let's fetch the results from our training run.

In [1]:
from pathlib import Path
import pandas as pd
import matplotlib.pyplot as plt

In [None]:
!curl -L https://aml-2023.s3.eu-north-1.amazonaws.com/final-project/yolo_runs_epoch_90.zip > yolo_runs_epoch_90.zip

Then extract the archive into a chosen directory.

In [3]:
import zipfile

run_data_dir = "run_data"
Path(run_data_dir).mkdir(exist_ok=True, parents=True)

with zipfile.ZipFile("yolo_runs_epoch_90.zip", 'r') as zip_ref:
    zip_ref.extractall(run_data_dir)
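
A failed or truncated download only shows up later, when a file turns out to be missing. The extraction step above could be made a little more defensive, roughly as in the sketch below. The helper name `extract_archive` is hypothetical and not part of the original notebook; it uses only the standard library.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the extracted file paths.

    extract_archive is a hypothetical helper, not part of the original notebook.
    """
    zip_path = Path(zip_path)
    dest_dir = Path(dest_dir)
    if not zipfile.is_zipfile(zip_path):
        # A failed download often leaves an error page or an empty file behind
        raise ValueError(f"{zip_path} is not a valid zip archive")

    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        zip_ref.extractall(dest_dir)
        # Directory entries end with "/"; keep only regular files
        names = [n for n in zip_ref.namelist() if not n.endswith("/")]
    return [dest_dir / n for n in names]
```

Calling `extract_archive("yolo_runs_epoch_90.zip", "run_data")` would then return the list of extracted paths, which makes it easy to check that `results.csv` and the weights are present.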

Let's load the run dataframe.

In [4]:
# Build the path from run_data_dir so the cell also works outside Colab
run_results = pd.read_csv(Path(run_data_dir) / "runs" / "detect" / "train" / "results.csv")





Next, let's fetch the training and validation data, which we need later to validate the YOLO model.

In [None]:
!wget -O fetch_data.sh https://raw.githubusercontent.com/aml-2023/final-project/main/fetch_data.sh
!bash fetch_data.sh --type yolo --output garbage_subset --percentage subset

# Extract Train and Validation Results
Then, we extract the training and validation columns from the dataframe.

In [6]:
# Select the columns whose names mention the training or validation split
train_columns = [col for col in run_results.columns if "train" in col]
train_results = run_results[train_columns]

val_columns = [col for col in run_results.columns if "val" in col]
val_results = run_results[val_columns]
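
The column split above can be tried out on a small synthetic frame. The sketch below assumes the column names may be padded with spaces, as some Ultralytics versions write them in `results.csv`; the CSV content and the `demo_results` name are made up for illustration.

```python
import io

import pandas as pd

# Synthetic stand-in for results.csv; the values are made up
csv_text = (
    "   epoch,  train/box_loss,  val/box_loss,  metrics/mAP50(B)\n"
    "       1,            1.20,          1.35,              0.41\n"
    "       2,            1.05,          1.22,              0.48\n"
)
demo_results = pd.read_csv(io.StringIO(csv_text))

# Strip padding so that lookups such as demo_results["epoch"] work
demo_results.columns = demo_results.columns.str.strip()

# Matching on the "train/" and "val/" prefixes avoids accidental substring hits
train_cols = [c for c in demo_results.columns if c.startswith("train/")]
val_cols = [c for c in demo_results.columns if c.startswith("val/")]

print(train_cols)
print(val_cols)
```

Prefix matching is a little stricter than the substring test used in the notebook; whether that matters depends on which other columns the run produced.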

# Setup Model
Next, we set up the model in four steps:

1. Create a `DetectionModel` with the garbage architecture, which predicts a single class instead of the many classes a standard YOLO model uses.
2. Load the best weights from the training run into this model.
3. Create the YOLO model from the same best weights with the detection task, since we want to do object detection here.
4. Assign the detection model to the `model` field of the YOLO object. This is a bit hacky, but it is the only way we found to let YOLO know that it should predict a single class.

In [7]:
# Install the dependencies before importing them
!pip install ultralytics -qq
!pip install opencv-python -qq

import os

import cv2
import torch
from PIL import Image, ImageDraw
from ultralytics import YOLO
from ultralytics.nn.tasks import DetectionModel

In [None]:
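# Steps 1-2: build the single-class detection model and load the best training weights into it
# (on recent PyTorch versions, torch.load may need weights_only=False to unpickle this checkpoint)
det = DetectionModel("/content/model.yaml")
det.load(torch.load("/content/run_data/runs/detect/train/weights/best.pt"))
# Step 3: wrap the same weights in a YOLO object configured for detection
model = YOLO(model="/content/run_data/runs/detect/train/weights/best.pt", task="detect")
# Step 4: swap in the single-class detection model
model.model = det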

# Model Validation
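Next, we validate the model on the **test** data by calling the `val` method with the path to the `.yaml` file that specifies the dataset. Note that `val` evaluates the split listed under `val` in that file by default, so the test images must either be listed there or be selected with `split="test"`. The call returns a metrics object from which we can read all the metrics we are interested in.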

In [None]:
# Validate the model on the dataset described by data.yaml
data_path = os.path.abspath("/content/garbage_subset/data.yaml")
metrics = model.val(data=data_path)
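
To show how the returned object might be read, here is a minimal sketch. It assumes the object exposes `box.map`, `box.map50` and `box.map75`, the attribute names the Ultralytics documentation gives for detection validation; `summarize_box_metrics` is a hypothetical helper, and the numbers in the stand-in object are made up so that the sketch runs without a model.

```python
from types import SimpleNamespace


def summarize_box_metrics(metrics):
    """Collect the headline box metrics into a plain dict.

    Assumes metrics exposes box.map, box.map50 and box.map75.
    summarize_box_metrics is a hypothetical helper.
    """
    return {
        "mAP50-95": float(metrics.box.map),
        "mAP50": float(metrics.box.map50),
        "mAP75": float(metrics.box.map75),
    }


# Stand-in object with made-up numbers, used in place of a real validation run
fake_metrics = SimpleNamespace(box=SimpleNamespace(map=0.52, map50=0.81, map75=0.57))
summary = summarize_box_metrics(fake_metrics)
print(summary)
```

In the notebook, `summarize_box_metrics(metrics)` would be called on the object returned by `model.val`.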

# Tracking objects in video
Track mode follows objects across frames using a YOLOv8 model. The model is loaded from a checkpoint file and applied to a video source: a video file in this notebook, although a live video stream can be used as well.

In [12]:
# model = YOLO(model="yolov8m.pt")

In [None]:
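# Track objects in the video with the default tracker
# show=True needs a display; in a headless environment such as Colab, save=True writes the annotated video to disk instead
results1 = model.track(source="/content/video.mp4", show=True)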
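
One simple use of the tracking output is to count how many distinct objects were followed. The sketch below assumes each per-frame result has a `boxes.id` field that is either `None` (nothing tracked in that frame) or a tensor of track IDs, as in Ultralytics tracking results; `unique_track_ids` is a hypothetical helper name.

```python
def unique_track_ids(results):
    """Return the sorted track IDs seen across all frames.

    Assumes result.boxes.id is None or an object with a tolist() method,
    such as a tensor. unique_track_ids is a hypothetical helper.
    """
    seen = set()
    for result in results:
        ids = result.boxes.id
        if ids is None:
            # No tracked objects in this frame
            continue
        seen.update(int(i) for i in ids.tolist())
    return sorted(seen)
```

With the notebook's variables, `len(unique_track_ids(results1))` would give the number of distinct tracks in the video, keeping in mind that a tracker can assign a new ID to an object it lost and found again.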

### Detect garbage in each frame of the video

In [15]:
import cv2
from google.colab.patches import cv2_imshow

# Open the video file
video_path = "/content/video.mp4"
cap = cv2.VideoCapture(video_path)

# Loop through the video frames
while cap.isOpened():
    # Read a frame from the video
    success, frame = cap.read()
    if not success:
        # Stop when the end of the video is reached or a frame cannot be read
        break

    # Run YOLOv8 tracking on the frame, persisting tracks between frames
    results = model.track(frame, persist=True)

    # Draw the boxes and track IDs on the frame
    annotated_frame = results[0].plot()

    # cv2.imshow is not supported in Colab, so display the frame inline with cv2_imshow
    cv2_imshow(annotated_frame)

# Release the video capture object
cap.release()
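
Displaying every frame inline gets heavy for long videos. The frame loop can be factored so that the annotated frames go to any destination, for example a video writer, and so that only every n-th frame is processed. This is a minimal sketch; `process_frames`, `annotate`, `sink` and `every_n` are hypothetical names, and the function only assumes that `cap.read()` returns `(success, frame)` the way `cv2.VideoCapture` does.

```python
def process_frames(cap, annotate, sink, every_n=1):
    """Read frames from cap, annotate every n-th one, and pass it to sink.

    cap needs a read() method returning (success, frame).
    annotate and sink are caller-supplied callables.
    Returns the number of frames handed to sink.
    """
    if every_n < 1:
        raise ValueError("every_n must be at least 1")
    index = 0
    written = 0
    while True:
        success, frame = cap.read()
        if not success:
            # End of the video, or a frame could not be read
            break
        if index % every_n == 0:
            sink(annotate(frame))
            written += 1
        index += 1
    return written
```

In the notebook, `annotate` could be `lambda f: model.track(f, persist=True)[0].plot()` and `sink` could be `cv2_imshow` or the `write` method of a `cv2.VideoWriter`. Skipping frames makes the tracker see larger jumps between frames, so `every_n` is best kept small.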

# Predict on an image

In [None]:
results2 = model("/content/img1.jpeg")  # predict on an image
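
The prediction returns one result per image. A minimal sketch of pulling the confident detections out of such a result follows; it assumes `result.boxes.xyxy` and `result.boxes.conf` hold one row per detection and support `tolist()`, as Ultralytics result tensors do. `detections_above` and `min_conf` are hypothetical names.

```python
def detections_above(result, min_conf=0.5):
    """List (box, confidence) pairs whose confidence is at least min_conf.

    Each box is a list [x1, y1, x2, y2] in pixel coordinates.
    detections_above is a hypothetical helper.
    """
    boxes = result.boxes.xyxy.tolist()
    confs = result.boxes.conf.tolist()
    return [(box, conf) for box, conf in zip(boxes, confs) if conf >= min_conf]
```

With the notebook's variables, `detections_above(results2[0], min_conf=0.5)` would list the boxes the model is reasonably sure about.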