# Task
Generate Python code to load a YOLO model from a ".pt" file, apply it to an input video to detect suspicious objects with bounding boxes, and save the output as a new video file. The code should use the `ultralytics` and `opencv-python` libraries.

## Install necessary libraries

### Subtask:
Install libraries like `ultralytics` and `opencv-python` to work with YOLO models and videos.


**Reasoning**:
The subtask requires installing the `ultralytics` and `opencv-python` libraries.



In [1]:
%pip install ultralytics opencv-python

Collecting ultralytics
  Downloading ultralytics-8.3.202-py3-none-any.whl.metadata (37 kB)
Collecting ultralytics-thop>=2.0.0 (from ultralytics)
  Downloading ultralytics_thop-2.0.17-py3-none-any.whl.metadata (14 kB)
Downloading ultralytics-8.3.202-py3-none-any.whl (1.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 15.6 MB/s eta 0:00:00
Downloading ultralytics_thop-2.0.17-py3-none-any.whl (28 kB)
Installing collected packages: ultralytics-thop, ultralytics
Successfully installed ultralytics-8.3.202 ultralytics-thop-2.0.17
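
The install log above only shows `ultralytics` and `ultralytics-thop` being downloaded, so it can be worth confirming which versions of both required packages the runtime actually has. The following is a minimal sketch using the standard-library `importlib.metadata`; the helper name `installed_version` is an assumption made for illustration and is not part of either library.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Report the two distributions this notebook depends on
for name in ("ultralytics", "opencv-python"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```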


## Load the trained model

### Subtask:
Load the `.pt` file containing the trained model.


**Reasoning**:
Import the YOLO class, define the model path, and load the model.



In [2]:
from ultralytics import YOLO

model_path = "/content/best_suspicious_obj.pt"  # Replace with the actual path to your model file
model = YOLO(model_path)

Creating new Ultralytics Settings v0.0.6 file ✅ 
View Ultralytics Settings with 'yolo settings' or at '/root/.config/Ultralytics/settings.json'
Update Settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'. For help see https://docs.ultralytics.com/quickstart/#ultralytics-settings.
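
A wrong path is the most common reason this step fails, so one option is to check the file before handing it to `YOLO`. The sketch below assumes a hypothetical helper named `load_yolo_model`; the path check uses only the standard library, and `ultralytics` is imported inside the function so the check runs first.

```python
from pathlib import Path


def load_yolo_model(model_path):
    """Load a YOLO checkpoint, failing early if the .pt file is missing."""
    path = Path(model_path)
    if not path.is_file():
        raise FileNotFoundError(f"Model file not found: {path}")
    if path.suffix != ".pt":
        raise ValueError(f"Expected a .pt checkpoint, got: {path.name}")

    # Imported here so the path checks above run before the heavier import
    from ultralytics import YOLO

    return YOLO(str(path))
```

With the model loaded, `model.names` gives the mapping from class index to label, which is a quick way to confirm the checkpoint is the one you expect.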


## Load the input video and process it

### Subtask:
Load the input video, process each frame to detect objects using the loaded model, and draw bounding boxes.

**Reasoning**:
Import necessary libraries, define the video path, load the video, and then loop through each frame, running the model and drawing bounding boxes. We'll also set up the video writer to save the output video.
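
The detection part of this loop can also be separated from the drawing part. The sketch below is one way to do that, under stated assumptions: `detect_objects` and `min_conf` are hypothetical names, and the function relies on the `conf` and `verbose` arguments that Ultralytics accepts at prediction time, so the model itself drops low-confidence boxes and skips its per-frame log lines.

```python
def detect_objects(model, frame, min_conf=0.5):
    """Run the model on one frame and return (x1, y1, x2, y2, label, confidence) tuples."""
    # conf= filters out low-confidence boxes; verbose=False silences per-frame logging
    results = model(frame, conf=min_conf, verbose=False)

    detections = []
    for result in results:
        for box in result.boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0])  # pixel corners of the box
            label = model.names[int(box.cls[0])]     # class index -> class name
            detections.append((x1, y1, x2, y2, label, float(box.conf[0])))
    return detections
```

Returning plain tuples keeps the drawing code independent of the Ultralytics result objects, which makes it easier to reuse for both file processing and a UI callback.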

In [5]:
import cv2
from ultralytics import YOLO

# Assuming the model is already loaded as 'model'
# from ultralytics import YOLO
# model_path = "/content/best_suspicious_obj.pt"
# model = YOLO(model_path)

def process_video_frame(frame, model):
    """
    Processes a single video frame using the YOLO model and returns the frame with bounding boxes.
    """
    results = model(frame)
    for result in results:
        for box in result.boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            label = model.names[int(box.cls[0])]
            confidence = float(box.conf[0])

            # Skip low-confidence detections (0.5 is an example threshold; tune it for your model)
            if confidence > 0.5:
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
                # Clamp the label's y position so it stays visible for boxes touching the top edge
                cv2.putText(frame, f'{label} {confidence:.2f}', (x1, max(y1 - 10, 20)), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
    return frame

def process_video_file(video_path, output_path, model):
    """
    Processes a video file using the YOLO model and saves the output.
    """
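    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        raise IOError(f"Could not open video: {video_path}")

    # Get video properties
    frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    # Keep fractional frame rates (e.g. 29.97); fall back to 30 if the container reports 0
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0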

    # Define the codec and create VideoWriter object
    fourcc = cv2.VideoWriter_fourcc(*'mp4v') # You can use other codecs like 'XVID'
    out = cv2.VideoWriter(output_path, fourcc, fps, (frame_width, frame_height))

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break

        processed_frame = process_video_frame(frame, model)
        out.write(processed_frame)

    cap.release()
    out.release()
    cv2.destroyAllWindows()

    return output_path

# Example usage (assuming 'model' is already loaded)
input_video_path = "/content/demo.mp4"
output_video_path = "/content/output_video.mp4"
processed_video_path = process_video_file(input_video_path, output_video_path, model)
print(f"Processed video saved to: {processed_video_path}")


0: 640x384 (no detections), 331.3ms
Speed: 22.1ms preprocess, 331.3ms inference, 20.7ms postprocess per image at shape (1, 3, 640, 384)

0: 640x384 (no detections), 142.5ms
Speed: 3.3ms preprocess, 142.5ms inference, 0.8ms postprocess per image at shape (1, 3, 640, 384)

0: 640x384 (no detections), 146.2ms
Speed: 4.8ms preprocess, 146.2ms inference, 0.5ms postprocess per image at shape (1, 3, 640, 384)

0: 640x384 (no detections), 139.1ms
Speed: 4.7ms preprocess, 139.1ms inference, 0.5ms postprocess per image at shape (1, 3, 640, 384)

0: 640x384 (no detections), 142.3ms
Speed: 4.4ms preprocess, 142.3ms inference, 0.6ms postprocess per image at shape (1, 3, 640, 384)

0: 640x384 (no detections), 140.3ms
Speed: 4.5ms preprocess, 140.3ms inference, 0.5ms postprocess per image at shape (1, 3, 640, 384)

0: 640x384 (no detections), 140.8ms
Speed: 4.7ms preprocess, 140.8ms inference, 0.5ms postprocess per image at shape (1, 3, 640, 384)

0: 640x384 (no detections), 140.7ms
Speed: 4.7ms pre
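
The log lines above report roughly 140 ms of inference per frame, which gives a quick way to estimate how long a whole video will take. The arithmetic is sketched below; the function names, the 900-frame clip, and the 30 fps rate are assumptions chosen for illustration, not values from this notebook.

```python
def estimate_processing_seconds(frame_count, ms_per_frame):
    """Total inference time in seconds for a clip of frame_count frames."""
    return frame_count * ms_per_frame / 1000.0


def realtime_factor(fps, ms_per_frame):
    """Seconds of compute needed for each second of video."""
    return fps * ms_per_frame / 1000.0


# Hypothetical 30-second clip at 30 fps, using ~140 ms per frame
print(estimate_processing_seconds(900, 140))  # 126.0
print(realtime_factor(30, 140))               # 4.2
```

A factor above 1 means processing runs slower than playback, so at these timings the pipeline suits offline processing rather than live streams unless a GPU or a smaller model is used.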

## Finish task

### Subtask:
Display the path to the output video.

**Reasoning**:
Print the path to the saved output video so the user knows where to find it.

In [None]:
print(f"Output video is saved at: {processed_video_path}")

Output video is saved at: /content/output_video.mp4
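
Printing the path does not confirm that the file was written: `cv2.VideoWriter` can fail without raising, for example when the requested codec is unavailable. A minimal check, assuming a hypothetical helper named `check_output_video`, is to confirm that the file exists and is not empty.

```python
from pathlib import Path


def check_output_video(path):
    """Return a short status message describing the output file."""
    p = Path(path)
    if not p.is_file():
        return f"Missing: {p} was not created"
    size = p.stat().st_size
    if size == 0:
        return f"Empty: {p} exists but contains no data"
    return f"OK: {p} ({size / 1_000_000:.2f} MB)"


print(check_output_video("/content/output_video.mp4"))
```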


In [None]:
import gradio as gr
import cv2
import numpy as np # Import numpy
from ultralytics import YOLO
import shutil # Import shutil for file operations

# Load the YOLO model
model_path = "/content/best_suspicious_obj.pt" # Make sure this path is correct
model = YOLO(model_path)

def process_video(video_input, output_path="/content/output_video.mp4"):
    """
    Processes a video file or webcam input using the YOLO model and returns the processed video.
    Optionally saves the output to output_path.
    """
    if video_input is None:
        return None

    # If input is a file path (from file upload)
    if isinstance(video_input, str):
        cap = cv2.VideoCapture(video_input)
        # Get video properties for file input
        frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
        frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
        fps = cap.get(cv2.CAP_PROP_FPS) or 30.0  # keep fractional rates; fall back to 30 if the container reports 0

        processed_frames = []
        while cap.isOpened():
            ret, frame = cap.read()
            if not ret:
                break

            results = model(frame)

            for result in results:
                for box in result.boxes:
                    x1, y1, x2, y2 = map(int, box.xyxy[0])
                    label = model.names[int(box.cls[0])]
                    confidence = float(box.conf[0])

                    if confidence > 0.5:
                        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
                        cv2.putText(frame, f'{label} {confidence:.2f}', (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
            processed_frames.append(frame)

        cap.release()

        # Save processed video to a temporary file
        if processed_frames:
            temp_output_path = "/content/temp_output_video.mp4"
            temp_out = cv2.VideoWriter(temp_output_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (frame_width, frame_height))
            for temp_frame in processed_frames:
                temp_out.write(temp_frame)
            temp_out.release()

            # Copy the temporary file to the desired output_path
            shutil.copyfile(temp_output_path, output_path)

            return temp_output_path # Return temporary path for Gradio display/download
        else:
            return None

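    # Fallback for a single frame passed as a NumPy array. Note that gr.Video
    # delivers both uploads and webcam recordings as file paths, so the
    # interface below does not normally reach this branch.
    elif isinstance(video_input, np.ndarray):
        # Process the single frame directly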
        frame = video_input
        results = model(frame)
        for result in results:
            for box in result.boxes:
                x1, y1, x2, y2 = map(int, box.xyxy[0])
                label = model.names[int(box.cls[0])]
                confidence = float(box.conf[0])
                if confidence > 0.5:
                    cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
                    cv2.putText(frame, f'{label} {confidence:.2f}', (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
        return frame
    else:
        return None


# Create the Gradio interface
interface = gr.Interface(
    fn=process_video,
    inputs=gr.Video(sources=["upload", "webcam"], label="Input Video (Upload or Webcam)"),
    outputs=gr.Video(label="Output Video with Detections", streaming=False), # Added streaming=False for download
    title="YOLO Object Detection on Video",
    description="Upload a video or use your webcam to detect suspicious objects using a YOLO model."
)

# Launch the interface
interface.launch(debug=True)

It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
* Running on public URL: https://81d7cdda837674ad86.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)



0: 384x640 1 Terrorist, 131.5ms
Speed: 3.3ms preprocess, 131.5ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 Terrorist, 132.3ms
Speed: 3.0ms preprocess, 132.3ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 Terrorist, 132.7ms
Speed: 3.4ms preprocess, 132.7ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 Terrorist, 131.3ms
Speed: 3.2ms preprocess, 131.3ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 Terrorist, 140.4ms
Speed: 3.2ms preprocess, 140.4ms inference, 1.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 Terrorist, 136.7ms
Speed: 2.8ms preprocess, 136.7ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 Terrorist, 130.6ms
Speed: 3.1ms preprocess, 130.6ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 Terrorist, 127.6ms
Speed: 3.0ms preprocess, 127.6ms inference, 1.0ms 
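
The Gradio handler above writes every request to the same hard-coded temporary file, so two overlapping requests could overwrite each other's output. One way around that is to generate a unique path per call with the standard-library `tempfile` module, as sketched below; `make_temp_video_path` is a hypothetical helper name.

```python
import os
import tempfile


def make_temp_video_path(suffix=".mp4"):
    """Create a uniquely named empty file and return its path."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # only the name is needed; the video writer opens the file itself
    return path


temp_output_path = make_temp_video_path()
print(temp_output_path)
```

The returned path can be passed to `cv2.VideoWriter` in place of the fixed `/content/temp_output_video.mp4`, and the file can be deleted once it is no longer needed.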



