In [None]:
# Object Detection in videos: Detect objects in video sequences using YOLO

In [None]:
pip install torch torchvision opencv-python yolov5

In [None]:
# Object Detection in Video with YOLOv5

import cv2
import torch
import numpy as np
from yolov5 import YOLOv5
# Import the cv2_imshow function from google.colab.patches
from google.colab.patches import cv2_imshow

# Load YOLOv5 model (e.g., yolov5s.pt for a small, fast model)
model_path = 'yolov5s.pt'  # Download from YOLOv5 repo if needed
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Initialize YOLO model
yolo_model = YOLOv5(model_path, device)

# Open the video file
cap = cv2.VideoCapture('/content/5538137-hd_1920_1080_25fps.mp4')

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Perform object detection
    results = yolo_model.predict(frame)

    # Loop over detected objects and draw bounding boxes
    for *xyxy, conf, cls in results.pred[0]:  # x1, y1, x2, y2, confidence, class
        # Look up the human-readable class name; results.names maps class index to label
        label = results.names[int(cls)]
        confidence = f"{conf:.2f}"

        # Draw bounding box
        cv2.rectangle(frame, (int(xyxy[0]), int(xyxy[1])), (int(xyxy[2]), int(xyxy[3])), (0, 255, 0), 2)

        # Put label and confidence on box
        cv2.putText(frame, f"{label} {confidence}", (int(xyxy[0]), int(xyxy[1]) - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    # Display the frame with detections using cv2_imshow instead of cv2.imshow
    cv2_imshow(frame)

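    # Key check for local runs with an OpenCV window; in Colab there is no window to receive the key press
    if cv2.waitKey(1) & 0xFF == ord('q'):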
        break

cap.release()
cv2.destroyAllWindows()

### Step-by-Step Explanation

The execution logic of the code above breaks down as follows:

1. **Import Required Libraries**:
   - `cv2`: OpenCV library for video processing.
   - `torch`: PyTorch, used here only to check whether a CUDA GPU is available.
   - `numpy`: Imported but not used directly in this code.
   - `YOLOv5`: The wrapper class from the `yolov5` package that loads the model weights and runs detection.
   - `cv2_imshow`: Colab’s replacement for `cv2.imshow`, which displays images inline in the notebook.

2. **Model Path and Device Setup**:
   - `model_path`: Specifies the path to the YOLOv5 model file (`yolov5s.pt`).
   - `device`: Uses GPU if available (`cuda`), otherwise defaults to `cpu`.

3. **Initialize the YOLOv5 Model**:
   - `yolo_model`: Instantiates the YOLOv5 model with the specified path and device. The model can then be used to perform object detection.

4. **Open the Video File**:
   - `cv2.VideoCapture('/content/5538137-hd_1920_1080_25fps.mp4')`: Loads the video file for processing. Replace the path with your desired video file if different.

5. **Processing Each Frame**:
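   - `cap.isOpened()`: The loop condition. It is `True` while the capture is open; if the file could not be opened (for example, because the path is wrong), the loop body never runs and no error is shown.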
   - `cap.read()`: Reads each frame of the video one by one in a loop.
   - `if not ret`: If `ret` is `False`, it means the video has ended or an error occurred, so the loop breaks.
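
The read loop described above can be factored into a small generator. This is only a sketch: `iter_frames` and its `stride` parameter are hypothetical names, not part of OpenCV, and the function assumes only that `cap` has a `read()` method returning `(ret, frame)`, as `cv2.VideoCapture` does. Skipping frames with a stride is one possible way to reduce the amount of detection and display work on a long video.

```python
from typing import Any, Iterator, Tuple


def iter_frames(cap: Any, stride: int = 1) -> Iterator[Tuple[int, Any]]:
    """Yield (index, frame) for every `stride`-th frame until read() fails.

    `cap` is any object with a read() method returning (ret, frame),
    such as a cv2.VideoCapture. The name and the `stride` parameter are
    illustrative assumptions, not an OpenCV API.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    index = 0
    while True:
        ret, frame = cap.read()
        if not ret:  # end of video or read error
            break
        if index % stride == 0:
            yield index, frame
        index += 1
```

With the capture from the code above, `for i, frame in iter_frames(cap, stride=5):` would process every fifth frame.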

6. **Perform Object Detection**:
   - `results = yolo_model.predict(frame)`: Uses the YOLO model to detect objects in the current frame. The `results` object contains information about detected objects, including bounding box coordinates, confidence scores, and class labels.
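
One detail worth checking at this step is channel order. OpenCV decodes frames as BGR, while the YOLOv5 inference examples convert OpenCV images to RGB before passing them to the model. Assuming that convention also applies to `yolo_model.predict`, a conversion like the following sketch could be applied to the frame first. The helper name `bgr_to_rgb` is hypothetical.

```python
import numpy as np


def bgr_to_rgb(frame: np.ndarray) -> np.ndarray:
    """Return a contiguous RGB copy of an HxWx3 BGR frame.

    Hypothetical helper: assumes the detector expects RGB input.
    """
    if frame.ndim != 3 or frame.shape[2] != 3:
        raise ValueError("expected an HxWx3 image")
    # Reverse the channel axis; the copy leaves the original BGR frame
    # untouched so it can still be drawn on and displayed with OpenCV.
    return np.ascontiguousarray(frame[:, :, ::-1])
```

A possible use is `results = yolo_model.predict(bgr_to_rgb(frame))`, while still drawing boxes on the original `frame`.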

7. **Draw Bounding Boxes and Labels**:
   - `for *xyxy, conf, cls in results.pred[0]`: Loops over the detections for the frame. Each row holds six values.
     - `*xyxy`: Collects the first four values, the bounding box corners `(x1, y1, x2, y2)` in pixels.
     - `conf`: Confidence score of the detection, between 0 and 1.
     - `cls`: Class index of the detected object. It is stored as a float, hence the `int(cls)` conversion.
   - `label = results.names[int(cls)]`: Retrieves the class label name using `results.names`.
   - `cv2.rectangle(...)`: Draws a bounding box around each detected object using coordinates `(x1, y1, x2, y2)`.
   - `cv2.putText(...)`: Adds the class label and confidence score above each bounding box.
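
Two small refinements to the drawing step can be sketched in plain Python: dropping low-confidence detections, and keeping the label on screen when a box touches the top of the frame (where `y1 - 10` would be negative). Both helpers, `filter_detections` and `label_origin`, and the default threshold of 0.5 are illustrative assumptions, not part of the `yolov5` package or OpenCV.

```python
from typing import Iterable, List, Sequence, Tuple

Detection = Tuple[int, int, int, int, float, int]


def filter_detections(rows: Iterable[Sequence[float]],
                      min_conf: float = 0.5) -> List[Detection]:
    """Keep rows (x1, y1, x2, y2, conf, cls) with conf >= min_conf.

    Coordinates and class index are converted to int for drawing.
    """
    kept = []
    for x1, y1, x2, y2, conf, cls in rows:
        if float(conf) >= min_conf:
            kept.append((int(x1), int(y1), int(x2), int(y2),
                         float(conf), int(cls)))
    return kept


def label_origin(x1: int, y1: int, offset: int = 10) -> Tuple[int, int]:
    """Return the text position: above the box, or inside it near the top edge."""
    y = y1 - offset
    if y < offset:  # text would be cut off at the top of the frame
        y = y1 + 2 * offset
    return x1, y
```

Assuming `results.pred[0]` is a tensor of detection rows, `filter_detections(results.pred[0].tolist())` would produce plain Python tuples ready to pass to `cv2.rectangle` and `cv2.putText`.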

8. **Display the Frame with Detections**:
   - `cv2_imshow(frame)`: Displays the annotated frame inline in Google Colab. Each call adds a new image to the cell output, so a long video produces one image per frame.

9. **Exit Condition**:
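   - `cv2.waitKey(1) & 0xFF == ord('q')`: Checks for a 'q' key press to quit the loop. This only has an effect when frames are shown in an OpenCV window with `cv2.imshow`; in Colab there is no such window, so the loop normally runs until the video ends.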
   - `cap.release()`: Releases the video capture object.
   - `cv2.destroyAllWindows()`: Closes all OpenCV windows (mainly used outside of Colab).
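
If detection or drawing raises an exception partway through the video, the `cap.release()` call at the end of the script is never reached. One way to guard against that is a small context manager, sketched below. The name `released` is hypothetical, and the only assumption about `cap` is that it has a `release()` method.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def released(cap: Any) -> Iterator[Any]:
    """Yield `cap` and always call cap.release() on exit, even after an error."""
    try:
        yield cap
    finally:
        cap.release()
```

Usage would look like `with released(cv2.VideoCapture(path)) as cap:` followed by the frame loop in the body of the `with` block.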