# Video Detection Demo with PytorchWildlife

This tutorial shows how to use PytorchWildlife for video detection and classification. We will set up the environment, define the detection and classification models, run inference on the video frames, and save the results as an annotated video.

## Prerequisites
Install PytorchWildlife by running the following commands:
```bash
conda create -n pytorch_wildlife python=3.10 -y
conda activate pytorch_wildlife
pip install PytorchWildlife
```
If you intend to run the models on a GPU, make sure a CUDA-capable GPU is available. This notebook can also run on a CPU.
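
Before going further, it can help to confirm which device the notebook will end up using. The sketch below is only an illustration: `pick_device` is a hypothetical helper, not part of PytorchWildlife, and it assumes that falling back to the CPU is acceptable when PyTorch is missing or no CUDA device is visible.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" if PyTorch can see a GPU, otherwise "cpu"."""
    # find_spec returns None when the package is not installed.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch  # imported lazily so the check above works without PyTorch

    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"Models would run on: {pick_device()}")
```

If this prints `cpu` on a machine with a GPU, the installed PyTorch build or the GPU driver is the first thing to check.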

## Importing libraries
First, let's import the necessary libraries and modules.

```python
import os
import numpy as np
import supervision as sv
import torch
from PytorchWildlife.models import detection as pw_detection
from PytorchWildlife.models import classification as pw_classification
from PytorchWildlife import utils as pw_utils
```

## Model Initialization
We'll define the device the models will run on, set the input and output video paths, and then initialize the detection and classification models.

```python
# Setting the device to use for computations ('cuda' indicates GPU)
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
if DEVICE == "cuda":
    torch.cuda.set_device(0)

# Paths of the input video and of the annotated output video
SOURCE_VIDEO_PATH = os.path.join(".", "demo_data", "videos", "opossum_example.MP4")
TARGET_VIDEO_PATH = os.path.join(".", "demo_data", "videos", "opossum_example_processed.MP4")

# Initializing the MegaDetectorV6 model for detection
# Valid versions are MDV6-yolov9-c, MDV6-yolov9-e, MDV6-yolov10-c, MDV6-yolov10-e or MDV6-rtdetr-c
detection_model = pw_detection.MegaDetectorV6(device=DEVICE, pretrained=True, version="MDV6-yolov10-e")

# Uncomment the following line to use MegaDetectorV5 instead of MegaDetectorV6
# detection_model = pw_detection.MegaDetectorV5(device=DEVICE, pretrained=True, version="a")

# Initializing the AI4GOpossum model for classification
classification_model = pw_classification.AI4GOpossum(device=DEVICE, pretrained=True)
```
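
Because the MegaDetectorV6 version is passed as a plain string, a typo is easy to make. As a minimal sketch, the version can be checked against the list given in the comment above before any model is built. `MDV6_VERSIONS` and `check_mdv6_version` are hypothetical names introduced here for illustration; they are not part of PytorchWildlife.

```python
# Version strings copied from the comment in the model initialization cell.
MDV6_VERSIONS = (
    "MDV6-yolov9-c",
    "MDV6-yolov9-e",
    "MDV6-yolov10-c",
    "MDV6-yolov10-e",
    "MDV6-rtdetr-c",
)


def check_mdv6_version(version: str) -> str:
    """Return the version unchanged, or raise if it is not in the known list."""
    if version not in MDV6_VERSIONS:
        valid = ", ".join(MDV6_VERSIONS)
        raise ValueError(f"Unknown version {version!r}; expected one of: {valid}")
    return version


version = check_mdv6_version("MDV6-yolov10-e")
```

The checked string can then be passed as the `version` argument when initializing the detector.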

## Video Processing
For each processed frame, we'll run detection, classify the region inside each detected bounding box, and annotate the frame with the boxes and the predicted labels. The annotated frames are then saved as the processed video.
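
The detect, crop, classify, and label flow can be sketched without any model at all. The toy example below uses nested lists as a grayscale frame and a stand-in classifier; `crop`, `label_frame`, and `fake_classifier` are hypothetical names used only for illustration. The only things borrowed from the real pipeline are the `(x_min, y_min, x_max, y_max)` box layout and the `"prediction"` and `"confidence"` result keys.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels
Frame = List[List[int]]          # toy grayscale frame: rows of pixel values


def crop(frame: Frame, box: Box) -> Frame:
    """Return the rows and columns of the frame that fall inside the box."""
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max] for row in frame[y_min:y_max]]


def label_frame(
    frame: Frame,
    boxes: Sequence[Box],
    classify: Callable[[Frame], Dict[str, object]],
) -> List[str]:
    """Classify the crop of every box and build one text label per box."""
    labels = []
    for box in boxes:
        result = classify(crop(frame, box))
        labels.append("{} {:.2f}".format(result["prediction"], result["confidence"]))
    return labels


def fake_classifier(pixels: Frame) -> Dict[str, object]:
    """Stand-in classifier: calls bright crops "opossum", dark ones "other"."""
    count = max(1, sum(len(row) for row in pixels))
    mean = sum(sum(row) for row in pixels) / count
    return {"prediction": "opossum" if mean > 127 else "other", "confidence": mean / 255}


# A 4x4 frame whose top-left 2x2 corner is bright.
toy_frame = [
    [255, 255, 0, 0],
    [255, 255, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
toy_labels = label_frame(toy_frame, [(0, 0, 2, 2), (2, 2, 4, 4)], fake_classifier)
print(toy_labels)  # ['opossum 1.00', 'other 0.00']
```

The key point is that labels are produced in the same order as the boxes, which is what lets the label annotator pair each label with its detection.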

```python
# Annotators that draw bounding boxes and text labels on each frame
box_annotator = sv.BoxAnnotator(thickness=4)
lab_annotator = sv.LabelAnnotator(text_color=sv.Color.BLACK, text_thickness=4, text_scale=2)

def callback(frame: np.ndarray, index: int) -> np.ndarray:
    """Detect animals in one frame, classify each crop, and return the annotated frame."""
    # Run detection on the frame; the frame index is passed in place of an image path
    results_det = detection_model.single_image_detection(frame, img_path=index)
    labels = []
    # Classify the region inside each detected bounding box
    for xyxy in results_det["detections"].xyxy:
        cropped_image = sv.crop_image(image=frame, xyxy=xyxy)
        results_clf = classification_model.single_image_classification(cropped_image)
        labels.append("{} {:.2f}".format(results_clf["prediction"], results_clf["confidence"]))
    # Draw the boxes first, then the labels on top of the boxed frame
    annotated_frame = lab_annotator.annotate(
        scene=box_annotator.annotate(
            scene=frame,
            detections=results_det["detections"],
        ),
        detections=results_det["detections"],
        labels=labels,
    )
    return annotated_frame

# Apply the callback to the source video and write the annotated result
pw_utils.process_video(source_path=SOURCE_VIDEO_PATH, target_path=TARGET_VIDEO_PATH, callback=callback, target_fps=5)
```
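
The `target_fps=5` argument suggests that not every source frame is passed to the callback. As a rough sketch, and assuming the utility subsamples frames with a fixed stride derived from the source and target frame rates (an assumption we have not verified against the `process_video` implementation), the frames that get processed can be estimated as follows. `frame_stride` and `kept_frame_indices` are hypothetical helpers.

```python
from typing import List


def frame_stride(source_fps: float, target_fps: float) -> int:
    """Number of source frames to advance per processed frame (at least 1)."""
    if source_fps <= 0 or target_fps <= 0:
        raise ValueError("Frame rates must be positive")
    return max(1, int(source_fps // target_fps))


def kept_frame_indices(total_frames: int, source_fps: float, target_fps: float) -> List[int]:
    """Indices of the source frames that a fixed-stride sampler would keep."""
    return list(range(0, total_frames, frame_stride(source_fps, target_fps)))


# One second of a 30 fps video sampled at 5 fps keeps every 6th frame.
print(frame_stride(30, 5))            # 6
print(kept_frame_indices(30, 30, 5))  # [0, 6, 12, 18, 24]
```

Under this assumption, lowering `target_fps` reduces the number of detection and classification calls roughly in proportion, which is the main lever for shortening processing time on long videos.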

### Copyright (c) Microsoft Corporation. All rights reserved.
### Licensed under the MIT License.