<a href="https://colab.research.google.com/github/volkodava/Binance_Futures_Java/blob/master/notebooks/how-to-use-ultralytics-yolo-with-sahi.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Setup

Install `ultralytics` and its [dependencies](https://github.com/ultralytics/ultralytics/blob/main/pyproject.toml), along with `sahi`, using pip, then check the software and hardware.

[![PyPI - Version](https://img.shields.io/pypi/v/ultralytics?logo=pypi&logoColor=white)](https://pypi.org/project/ultralytics/) [![Downloads](https://static.pepy.tech/badge/ultralytics)](https://www.pepy.tech/projects/ultralytics) [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/ultralytics?logo=python&logoColor=gold)](https://pypi.org/project/ultralytics/)

In [1]:
!pip install ultralytics sahi
import ultralytics
from ultralytics.utils.downloads import safe_download
ultralytics.checks()

Ultralytics 8.3.144 🚀 Python-3.11.12 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
Setup complete ✅ (2 CPUs, 12.7 GB RAM, 41.5/112.6 GB disk)
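
`ultralytics.checks()` reports the Ultralytics, Python, and torch versions but not the `sahi` version. The following is a minimal sketch, using only the standard library, of how the installed versions of both packages could be listed. The helper name `installed_versions` is hypothetical and not part of either library.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Report the two packages this notebook installs with pip
for name, version in installed_versions(["ultralytics", "sahi"]).items():
    print(f"{name}: {version or 'not installed'}")
```

A `None` entry suggests the `pip install` cell did not complete and should be re-run.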


### Clone Repository

- Clone the `ultralytics` repository.
- `%cd` to the examples section.
- Move to `YOLOv8-SAHI-Inference-Video` folder.

In [2]:
# Clone ultralytics repo
!git clone https://github.com/ultralytics/ultralytics

# cd to local directory
%cd ultralytics/examples/YOLOv8-SAHI-Inference-Video

Cloning into 'ultralytics'...
remote: Enumerating objects: 60469, done.
remote: Counting objects: 100% (53/53), done.
remote: Compressing objects: 100% (32/32), done.
remote: Total 60469 (delta 33), reused 32 (delta 21), pack-reused 60416 (from 2)
Receiving objects: 100% (60469/60469), 32.80 MiB | 22.11 MiB/s, done.
Resolving deltas: 100% (44943/44943), done.
/content/ultralytics/examples/YOLOv8-SAHI-Inference-Video


### Download the Sample Video

- If you want to use your own video, you can skip this step.

In [3]:
safe_download("https://github.com/ultralytics/assets/releases/download/v0.0.0/sahi.demo.video.mp4", dir="/content")

Downloading https://ultralytics.com/assets/sahi.demo.video.mp4 to '/content/sahi.demo.video.mp4'...


100%|██████████| 15.3M/15.3M [00:00<00:00, 105MB/s] 


PosixPath('/content/sahi.demo.video.mp4')

### Inference using SAHI

The output results are stored under `/content/ultralytics/examples/YOLOv8-SAHI-Inference-Video/`, the working directory set in the clone step.
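
The custom script later in this notebook builds its output folder with `increment_path("runs/detect/predict", exist_ok)`, and one run reports `runs/detect/predict12`. The following is a simplified sketch of that kind of numbering, not the Ultralytics implementation. The helper name `next_run_dir` is hypothetical, and starting the counter at 2 is an assumption.

```python
from pathlib import Path


def next_run_dir(base, exist_ok=False):
    """Return `base` if it is free (or reuse is allowed), else the first free `base<N>`."""
    base = Path(base)
    if exist_ok or not base.exists():
        return base
    n = 2  # assumption: numbering starts at 2 (predict, predict2, predict3, ...)
    while Path(f"{base}{n}").exists():
        n += 1
    return Path(f"{base}{n}")


# Prints the directory a fresh run would use, relative to the working directory
print(next_run_dir("runs/detect/predict"))
```

Passing `--exist-ok` to the script corresponds to `exist_ok=True` here: the existing folder is reused and earlier results may be overwritten.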

In [18]:
from pathlib import Path

script = Path('/content/ultralytics/examples/YOLOv8-SAHI-Inference-Video/yolov8_sahi.py')
content = script.read_text()

# Comment out the explicit weight-download call in the example script
old = 'download_model_weights(yolo11_model_path)  # Download model if not present'
new = '# download_model_weights(yolo11_model_path)  # Commented out for YOLOv11 support'

if old in content:
    script.write_text(content.replace(old, new))
    print("Script patched successfully!")
else:
    # Avoid reporting success when the line is missing (e.g. already patched)
    print("Nothing to patch: target line not found.")

Script patched successfully!


In [20]:
!python /content/ultralytics/examples/YOLOv8-SAHI-Inference-Video/yolov8_sahi.py --source "/content/sahi.demo.video.mp4" --weights "yolo11x.pt" --save-img


Ultralytics 8.3.144 🚀 Python-3.11.12 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11x.pt to 'models/yolo11x.pt'...
100% 109M/109M [00:00<00:00, 329MB/s] 
Performing prediction on 9 slices.
Performing prediction on 9 slic
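
Each `Performing prediction on 9 slices.` line corresponds to one frame: SAHI cuts the frame into overlapping tiles, runs the detector on each, and merges the results. The following is a simplified sketch of how such a tile grid can be computed. It is not SAHI's own code, the function name `slice_boxes` is hypothetical, the 0.2 overlap ratio is an assumption, and the frame sizes are illustrative only (the sample video's resolution is not stated here).

```python
def slice_boxes(image_w, image_h, slice_w=512, slice_h=512, overlap=0.2):
    """Return (x_min, y_min, x_max, y_max) tiles that cover the image with overlap."""
    x_step = slice_w - int(slice_w * overlap)  # horizontal stride between tiles
    y_step = slice_h - int(slice_h * overlap)  # vertical stride between tiles
    boxes = []
    y_min = 0
    while True:
        y_max = min(y_min + slice_h, image_h)
        x_min = 0
        while True:
            x_max = min(x_min + slice_w, image_w)
            # Tiles that would cross the border are shifted back inside the image
            boxes.append((max(0, x_max - slice_w), max(0, y_max - slice_h), x_max, y_max))
            if x_max >= image_w:
                break
            x_min += x_step
        if y_max >= image_h:
            break
        y_min += y_step
    return boxes


print(len(slice_boxes(1024, 1024)))  # 3 columns x 3 rows = 9 tiles
print(len(slice_boxes(1280, 720)))   # 3 columns x 2 rows = 6 tiles
```

Smaller `--slice-width` and `--slice-height` values produce more tiles per frame, which tends to help with small objects at the cost of longer inference time.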

In [26]:
%%writefile /content/yolov8_sahi_video.py
import argparse
import cv2
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction
from ultralytics.utils.files import increment_path
from pathlib import Path

class SAHIInference:
    def __init__(self):
        self.detection_model = None

    def load_model(self, weights: str, device: str) -> None:
        from ultralytics.utils.torch_utils import select_device
        yolo11_model_path = f"models/{weights}"
        self.detection_model = AutoDetectionModel.from_pretrained(
            model_type="ultralytics", model_path=yolo11_model_path, device=select_device(device)
        )

    def inference(
        self,
        weights: str = "yolo11n.pt",
        source: str = "test.mp4",
        view_img: bool = False,
        save_img: bool = False,
        exist_ok: bool = False,
        device: str = "",
        hide_conf: bool = False,
        slice_width: int = 512,
        slice_height: int = 512,
    ) -> None:
        cap = cv2.VideoCapture(source)
        assert cap.isOpened(), "Error reading video file"

        fps = int(cap.get(cv2.CAP_PROP_FPS))
        width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
        height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

        save_dir = increment_path("runs/detect/predict", exist_ok)
        save_dir.mkdir(parents=True, exist_ok=True)

        video_path = None
        if save_img:
            video_path = save_dir / f"{Path(source).stem}_output.mp4"
            fourcc = cv2.VideoWriter_fourcc(*'mp4v')
            out = cv2.VideoWriter(str(video_path), fourcc, fps, (width, height))
            print(f"Saving video to: {video_path.absolute()}")

        self.load_model(weights, device)

        frame_count = 0
        while cap.isOpened():
            success, frame = cap.read()
            if not success:
                break

            frame_count += 1
            print(f"Processing frame {frame_count}...", end='\r')

            results = get_sliced_prediction(
                frame[..., ::-1],
                self.detection_model,
                slice_height=slice_height,
                slice_width=slice_width,
            )

            for obj in results.object_prediction_list:
                x1 = int(obj.bbox.minx)
                y1 = int(obj.bbox.miny)
                x2 = int(obj.bbox.maxx)
                y2 = int(obj.bbox.maxy)
                score = obj.score.value
                category = obj.category.name

                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
                if not hide_conf:
                    label = f'{category} {score:.2f}'
                else:
                    label = category
                cv2.putText(frame, label, (x1, y1 - 10),
                           cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

            if save_img:
                out.write(frame)

            if view_img:
                cv2.imshow("Ultralytics YOLO Inference", frame)
                # Poll the keyboard only while a window is open; press "q" to stop early
                if cv2.waitKey(1) & 0xFF == ord("q"):
                    break

        cap.release()
        if save_img:
            out.release()
            print(f"\n✅ Video saved to: {video_path.absolute()}")
        cv2.destroyAllWindows()

    @staticmethod
    def parse_opt() -> argparse.Namespace:
        parser = argparse.ArgumentParser()
        parser.add_argument("--weights", type=str, default="yolo11n.pt", help="weights file name, loaded from models/")
        parser.add_argument("--source", type=str, required=True, help="video file path")
        parser.add_argument("--view-img", action="store_true", help="show results")
        parser.add_argument("--save-img", action="store_true", help="save results")
        parser.add_argument("--exist-ok", action="store_true", help="existing project/name ok, do not increment")
        parser.add_argument("--device", default="", help="cuda device, e.g. 0 or 0,1,2,3 or cpu")
        parser.add_argument("--hide-conf", default=False, action="store_true", help="hide confidence scores")
        parser.add_argument("--slice-width", default=512, type=int, help="Slice width for inference")
        parser.add_argument("--slice-height", default=512, type=int, help="Slice height for inference")
        return parser.parse_args()

if __name__ == "__main__":
    inference = SAHIInference()
    inference.inference(**vars(inference.parse_opt()))

Writing /content/yolov8_sahi_video.py
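
The last line of the script, `inference.inference(**vars(inference.parse_opt()))`, works because `argparse` stores `--save-img` as the attribute `save_img`, so the parsed namespace can be unpacked directly into keyword arguments. The following is a minimal sketch of that pattern with a reduced set of options. The `run` function and the `clip.mp4` file name are hypothetical stand-ins.

```python
import argparse


def build_parser():
    """Reduced parser with three of the script's options."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--source", type=str, required=True, help="video file path")
    parser.add_argument("--save-img", action="store_true", help="save results")
    parser.add_argument("--slice-width", default=512, type=int, help="slice width")
    return parser


def run(source, save_img=False, slice_width=512):
    """Stand-in for SAHIInference.inference; parameter names match the flags."""
    return f"source={source} save_img={save_img} slice_width={slice_width}"


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere
args = build_parser().parse_args(["--source", "clip.mp4", "--save-img"])
kwargs = vars(args)  # dashes in flag names become underscores in the keys
print(run(**kwargs))
```

This only works while every flag has a matching parameter name in the target function; an extra flag would raise a `TypeError` when unpacked.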


In [27]:
!python /content/yolov8_sahi_video.py --source "/content/sahi.demo.video.mp4" --weights "yolo11x.pt" --save-img

Saving video to: /content/ultralytics/examples/YOLOv8-SAHI-Inference-Video/runs/detect/predict12/sahi.demo.video_output.mp4
Ultralytics 8.3.144 🚀 Python-3.11.12 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
Performing prediction on 9 slices.
Performing prediction o
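
Because every run writes to a new numbered folder (here `predict12`), the annotated video can be hard to find after several runs. The following is a small sketch that picks the most recently modified `*_output.mp4` under `runs/detect`. The helper name `latest_output_video` is hypothetical, and the folder and file patterns are taken from the script above.

```python
from pathlib import Path


def latest_output_video(runs_root="runs/detect"):
    """Return the newest predict*/*_output.mp4 under runs_root, or None if there is none."""
    root = Path(runs_root)
    if not root.is_dir():
        return None
    videos = list(root.glob("predict*/*_output.mp4"))
    if not videos:
        return None
    # Modification time avoids sorting problems such as predict12 vs predict2
    return max(videos, key=lambda p: p.stat().st_mtime)


# Run from the directory the script was started in
print(latest_output_video())
```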