# Annotate Video with Detections

---

[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/roboflow/supervision/blob/develop/docs/notebooks/annotate-video-with-detections.ipynb)

One of the most common requirements of computer vision applications is detecting objects in images and displaying bounding boxes around those objects. In this cookbook, we'll walk through how to use the open source Roboflow ecosystem to accomplish this task on a video. Let's dive in!

## Installing Dependencies 

In this cookbook we'll be utilizing the open source packages [Inference](https://inference.roboflow.com/) and [Supervision](https://supervision.roboflow.com/latest/) to accomplish our goals. Let's get those installed in our notebook with pip.

In [None]:

!pip install -q inference "supervision[assets]"

## Download a Video Asset

First, let's download a video that we can detect objects in. Supervision comes with a great utility called Assets to help us hit the ground running. When we run this script, the video is saved in our local directory and its path is stored in the variable `path_to_video`.

In [None]:
from supervision.assets import download_assets, VideoAssets

# Download a supervision video asset 
path_to_video = download_assets(VideoAssets.PEOPLE_WALKING)

## Detecting Objects

For this example, the objects in the video that we'd like to detect are people. In order to display bounding boxes around the people in the video, we first need a way to detect them. We'll be using the open source [Inference](https://github.com/roboflow/inference) package for this task. Inference allows us to quickly use thousands of models, including fine-tuned models from [Roboflow Universe](https://universe.roboflow.com/), with a few lines of code. We'll also use a few utilities from the [Supervision](https://github.com/roboflow/supervision) package for working with our video data.

In [None]:
import supervision as sv
from supervision.assets import download_assets, VideoAssets
from inference.models.utils import get_roboflow_model


if __name__ == "__main__":
    # Download the video asset from Supervision assets.
    PATH_TO_VIDEO = download_assets(VideoAssets.PEOPLE_WALKING)

    # Load a yolov8 nano model from roboflow.
    model = get_roboflow_model("yolov8n-640")

    # Create a frame generator from the video path using Supervision utilities.
    frame_generator = sv.get_video_frames_generator(PATH_TO_VIDEO)

    # Yield a single frame from the generator.
    frame = next(frame_generator)

    # Run inference on our frame.
    result = model.infer(frame)[0]

    # Parse result into detections data model.
    detections = sv.Detections.from_inference(result)

    # Display the detections data model.
    print(detections)

First, we load our model using the function `get_roboflow_model()`. Notice how we pass in a `model_id`? We're using an [alias](https://inference.roboflow.com/reference_pages/model_aliases/) here. This is also where we can pass in other models from Roboflow Universe, like this [rock, paper, scissors](https://universe.roboflow.com/roboflow-58fyf/rock-paper-scissors-sxsw) model, together with our Roboflow API key.

```
model = get_roboflow_model(
    model_id="rock-paper-scissors-sxsw/11", 
    api_key="roboflow_private_api_key"
)
```

If you don't have an API key, you can [create a free Roboflow account](https://app.roboflow.com/login). This model wouldn't be much help with detecting people, but it's a nice exercise to see how our code becomes model agnostic!
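Hardcoding a private key in a notebook makes it easy to leak by accident. Below is a minimal sketch of one alternative: reading the key from an environment variable. The variable name `ROBOFLOW_API_KEY` and the helper `load_api_key` are assumptions made for this illustration, so adjust them to your own setup.

```python
import os


def load_api_key(env_var: str = "ROBOFLOW_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The variable name is an assumption for this sketch, not something
    this cookbook prescribes.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return api_key
```

The returned value could then be passed as `api_key=load_api_key()` when calling `get_roboflow_model()`.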

We then create a `frame_generator` object and yield a single frame for inference using `next()`. We pass our frame to `model.infer()` to run inference, then pass the result into a little helper method called `sv.Detections.from_inference()` to parse it. Lastly, we print our detections to show that we are in fact detecting a few people in the frame!
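To make the parsing step feel less like a black box, here is a plain-Python sketch of the kind of conversion it involves. It assumes each prediction is a dictionary with center-based `x`, `y`, `width`, and `height` fields, which is how Roboflow's hosted API reports boxes. The helper name `center_to_xyxy` and the sample values are hypothetical, and this is not Supervision's actual implementation.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]


def center_to_xyxy(prediction: Dict[str, float]) -> Box:
    """Convert a center-based box to (x_min, y_min, x_max, y_max)."""
    half_width = prediction["width"] / 2
    half_height = prediction["height"] / 2
    return (
        prediction["x"] - half_width,
        prediction["y"] - half_height,
        prediction["x"] + half_width,
        prediction["y"] + half_height,
    )


# Made-up predictions for illustration only; real values come from the model.
sample_predictions: List[Dict[str, float]] = [
    {"x": 100.0, "y": 50.0, "width": 40.0, "height": 20.0},
    {"x": 10.0, "y": 10.0, "width": 4.0, "height": 4.0},
]

# Corner-format boxes are what annotators need to draw rectangles.
boxes = [center_to_xyxy(prediction) for prediction in sample_predictions]
print(boxes)
```

In the cookbook code, the corner-format boxes are available as `detections.xyxy` after parsing.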

## Saving Bounding Boxes to the Video

Now comes the fun part. Let's wrap up our code by using `Annotators` and a `VideoSink` to draw bounding boxes and save the resulting video, respectively. Take a peek at the final code example below. It can take up to a minute to run, since we're processing a full video.

In [None]:
import supervision as sv
from supervision.assets import download_assets, VideoAssets
from inference.models.utils import get_roboflow_model


if __name__ == "__main__":
    # Download the video asset from Supervision assets.
    PATH_TO_VIDEO = download_assets(VideoAssets.PEOPLE_WALKING)

    # Load a yolov8 nano model from roboflow.
    model = get_roboflow_model("yolov8n-640")

    # Initialize the bounding box frame annotator.
    box_annotator = sv.BoundingBoxAnnotator()

    # Create a frame generator object from video path.
    frame_generator = sv.get_video_frames_generator(PATH_TO_VIDEO)

    # Create a video info object from video path.
    video_info = sv.VideoInfo.from_video_path(PATH_TO_VIDEO)

    # Use a VideoSink context manager for saving frames of a video.
    with sv.VideoSink(target_path="output.mp4", video_info=video_info) as sink:

        # Iterate through frames yielded from the frame_generator.
        for frame in frame_generator:

            # Run inference on our frame.
            result = model.infer(frame)[0]

            # Parse the result into the detections data model.
            detections = sv.Detections.from_inference(result)

            # Draw bounding boxes for the detections on a copy of the frame.
            annotated_frame = box_annotator.annotate(
                scene=frame.copy(), 
                detections=detections
            )

            # Write the annotated frame to the video sink.
            sink.write_frame(frame=annotated_frame)

Notice that we create a `box_annotator` variable by initializing a [BoundingBoxAnnotator](https://supervision.roboflow.com/latest/annotators/#boundingboxannotator). We can change the color and thickness, but for simplicity we keep the defaults. Besides the bounding box, the Supervision package offers many other easy-to-use [annotators](https://supervision.roboflow.com/latest/annotators/) that are fun to play with. Next, we create a `video_info` variable to pass information about the video to our `VideoSink`. The `VideoSink` is a cool little context manager that lets us call `write_frame()` to write each frame to a video output file. Prior to writing the frame, we annotate a copy of it using `box_annotator.annotate()`. Let's take a look at the resulting video. It is saved locally as `output.mp4`. When you play it, you will see bounding boxes around the detected people. Pretty awesome!
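A sink is used inside a `with` block because the output has to be finalized once writing is done, even if something fails partway through the video. The following plain-Python sketch shows that pattern. It is not Supervision's implementation: `InMemorySink` is a hypothetical stand-in that collects frames in a list instead of encoding a video file.

```python
from typing import Any, List


class InMemorySink:
    """Hypothetical stand-in for a video sink; stores frames in a list."""

    def __init__(self) -> None:
        self.frames: List[Any] = []
        self.is_open = False

    def __enter__(self) -> "InMemorySink":
        # Acquire the resource (a real sink would open a video writer here).
        self.is_open = True
        return self

    def write_frame(self, frame: Any) -> None:
        if not self.is_open:
            raise RuntimeError("Sink is closed.")
        self.frames.append(frame)

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        # Runs on normal exit and on exceptions, so cleanup always happens.
        self.is_open = False


sink = InMemorySink()
with sink:
    # Strings stand in for image arrays in this sketch.
    for frame in ["frame-0", "frame-1", "frame-2"]:
        sink.write_frame(frame)

print(len(sink.frames), sink.is_open)
```

Once the `with` block ends, the sink is closed and further `write_frame()` calls raise an error, which mirrors why all writing in the cookbook code happens inside the block.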

We only scratched the surface of all the customizable Annotators and additional features that Supervision and Inference have to offer. Stay tuned for more cookbooks on how to take advantage of them in your computer vision applications. Happy building!