# Tracking Objects in a Video

In some cases, it's important to track objects across multiple frames of a video. For example, we may need to figure out the direction a vehicle is moving. In this cookbook, we'll cover how to get a tracker up and running for use in your computer vision applications.

## What is a Tracker?

A tracker is a piece of code that identifies the same objects across frames and assigns each one a unique id. Popular trackers at the time of writing include ByteTrack and BoT-SORT. Supervision makes using trackers a breeze and comes with ByteTrack built in. First, let's get our dependencies installed.
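
Before installing anything, it may help to see the core idea in miniature. The sketch below is a toy, hypothetical tracker (`ToyTracker` is our own name, not part of Supervision) that reuses an id whenever a new box overlaps a previously seen box enough, measured by intersection-over-union. It is not how ByteTrack works internally; it only illustrates what "assigning a unique id across frames" means. It also never forgets lost objects, which a real tracker would handle.

```python
from itertools import count


def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


class ToyTracker:
    """Hypothetical greedy IoU matcher, for illustration only (not ByteTrack)."""

    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold
        self.tracks = {}          # tracker id -> last seen box
        self._next_id = count(1)  # ids start at 1

    def update(self, boxes):
        """Return one tracker id per box, reusing ids for overlapping boxes."""
        assigned = []
        unmatched = dict(self.tracks)  # tracks not yet claimed in this frame
        for box in boxes:
            # pick the unclaimed track that overlaps this box the most
            best_id, best_iou = None, self.iou_threshold
            for track_id, last_box in unmatched.items():
                overlap = iou(box, last_box)
                if overlap > best_iou:
                    best_id, best_iou = track_id, overlap
            if best_id is None:
                best_id = next(self._next_id)  # unseen object: issue a new id
            else:
                del unmatched[best_id]
            self.tracks[best_id] = box
            assigned.append(best_id)
        return assigned


tracker = ToyTracker()
frame_1 = [(10, 10, 50, 50), (200, 200, 240, 240)]
# the same two objects, slightly moved and listed in a different order
frame_2 = [(205, 202, 245, 242), (12, 11, 52, 51)]
ids_1 = tracker.update(frame_1)
ids_2 = tracker.update(frame_2)
print(ids_1, ids_2)
```

Even though the boxes arrive in a different order in the second frame, each object keeps the id it was given in the first frame.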

## Install Dependencies

In [None]:
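# Note: each `!` command runs in its own subshell, so activating a virtual
# environment here would not carry over to later cells. If you want isolation,
# create and activate a virtual environment before launching the notebook.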
!pip install -q inference "supervision[assets]"

## Download a Video Asset

First, let's download a video that we can detect objects in. Supervision comes with a great utility to help us hit the ground running. The video is saved in our local directory, and its path is stored in the variable `path_to_video`.

In [None]:
from supervision.assets import download_assets, VideoAssets

# Download a supervision video asset 
path_to_video = download_assets(VideoAssets.PEOPLE_WALKING)

## Tracking Objects

Now that we have our video downloaded, let's get to work on tracking objects. We'll pull in a model from Roboflow Inference to detect people in our video. We'll then create a `byte_tracker` object and pass our detections to it, which gives each detection a `tracker_id`. Finally, we'll use that tracker id as the label text and draw it on each frame with a label annotator (`sv.LabelAnnotator`).

In [None]:
import supervision as sv
from supervision.assets import download_assets, VideoAssets
from inference.models.utils import get_roboflow_model


if __name__ == '__main__':

    # Download our video from supervision assets
    PATH_TO_VIDEO = download_assets(VideoAssets.PEOPLE_WALKING)

    # load the yolov8n model from roboflow inference
    model = get_roboflow_model('yolov8n-640')

    # get video info from the video path 
    video_info = sv.VideoInfo.from_video_path(PATH_TO_VIDEO)

    # create a label annotator to draw tracker ids on each frame
    label = sv.LabelAnnotator()

    # create a ByteTrack object to track detections
    byte_tracker = sv.ByteTrack(frame_rate=video_info.fps)

    # get frames iterable from video and loop over them
    frame_generator = sv.get_video_frames_generator(PATH_TO_VIDEO)

    # create a video sink context manager to write the annotated frames to
    with sv.VideoSink(target_path="output.mp4", video_info=video_info) as sink:
        for frame in frame_generator:

            # run inference on the frame
            result = model.infer(frame)[0]

            # convert the detections to a supervision detections object
            detections = sv.Detections.from_inference(result)

            # update detections with tracker ids
            tracked_detections = byte_tracker.update_with_detections(detections)

            # create label text for annotator
            labels = [f"{tracker_id}" for tracker_id in tracked_detections.tracker_id]

            # apply label annotator to frame
            annotated_frame = label.annotate(
                scene=frame.copy(),
                detections=tracked_detections,
                labels=labels,
            )

            # save the annotated frame to the video sink
            sink.write_frame(frame=annotated_frame)
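
Once every detection carries a tracker id, we can start answering questions like the one from the introduction: which way is an object moving? The sketch below is one simple way to do it, assuming we keep a history of box centers per tracker id and look at the net displacement. The helper names (`box_center`, `record_positions`, `heading_degrees`) are hypothetical and not part of Supervision, and the boxes are made-up values. In the loop above, the ids and boxes would likely come from `tracked_detections.tracker_id` and `tracked_detections.xyxy`.

```python
import math
from collections import defaultdict


def box_center(box):
    """Center point of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def record_positions(history, tracker_ids, boxes):
    """Append each tracked box's center to that tracker id's history."""
    for tracker_id, box in zip(tracker_ids, boxes):
        history[tracker_id].append(box_center(box))


def heading_degrees(points):
    """Angle of the net displacement in image coordinates.

    0 points right and 90 points down, because image y grows downward.
    Returns None when the object has not moved.
    """
    (x0, y0), (x1, y1) = points[0], points[-1]
    if (x0, y0) == (x1, y1):
        return None
    return math.degrees(math.atan2(y1 - y0, x1 - x0)) % 360


# made-up tracker ids and boxes standing in for two consecutive frames:
# id 1 moves to the right, id 2 moves up the image
history = defaultdict(list)
record_positions(history, [1, 2], [(0, 0, 10, 10), (100, 100, 110, 110)])
record_positions(history, [1, 2], [(20, 0, 30, 10), (100, 80, 110, 90)])

headings = {
    tracker_id: heading_degrees(points)
    for tracker_id, points in history.items()
}
print(headings)
```

Using only the first and last positions keeps the example short; on real footage, averaging over several recent frames would probably be less sensitive to jitter in the detections.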
