## Before you start

Let's make sure we have access to a GPU by running the `nvidia-smi` command. If it fails, navigate to `Edit` -> `Notebook settings` -> `Hardware accelerator` and set it to `GPU`.

In [None]:
!nvidia-smi

/bin/bash: line 1: nvidia-smi: command not found
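
The `command not found` message above means this runtime has no NVIDIA driver, so everything that follows runs on the CPU. The same check can be done from Python without raising an error. This is a minimal sketch using only the standard library; `gpu_summary` is a hypothetical helper name, not part of any package used in this notebook.

```python
import shutil
import subprocess

def gpu_summary():
    """Return the nvidia-smi report as text, or None when no GPU tooling is available."""
    exe = shutil.which("nvidia-smi")  # None when the command is not on PATH
    if exe is None:
        return None
    result = subprocess.run([exe], capture_output=True, text=True, check=False)
    return result.stdout if result.returncode == 0 else None

summary = gpu_summary()
print(summary if summary is not None else "No NVIDIA GPU visible; inference will run on CPU.")
```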


**NOTE:** To make it easier to manage datasets, images, and models, we create a `HOME` constant.

In [None]:
import os
HOME = os.getcwd()
print(HOME)

/content


## Mount Google Drive

In [None]:
from google.colab import drive
drive.mount("/content/drive")

Mounted at /content/drive


## Install YOLO11 via Ultralytics

In [None]:
%pip install "ultralytics<=8.3.40" supervision roboflow
import ultralytics
ultralytics.checks()

Ultralytics 8.3.40 🚀 Python-3.10.12 torch-2.5.1+cu121 CPU (Intel Xeon 2.20GHz)
Setup complete ✅ (2 CPUs, 12.7 GB RAM, 32.7/107.7 GB disk)


## Image Pre-processing

In [None]:
import cv2

def pad_and_resize(image, target_size, pad_color=(0, 0, 0)):
    """
    Resizes the image while preserving the aspect ratio, padding the shorter side.

    Parameters:
    - image: input image (numpy array)
    - target_size: tuple (width, height) of the target size
    - pad_color: color to use for padding, default is black (0, 0, 0)

    Returns:
    - resized image with padding
    """
    original_height, original_width = image.shape[:2]
    target_width, target_height = target_size

    # Calculate the aspect ratio of the image and the target size
    aspect_ratio_image = original_width / original_height
    aspect_ratio_target = target_width / target_height

    if aspect_ratio_image > aspect_ratio_target:
        # Wider than target, resize based on width
        new_width = target_width
        new_height = int(new_width / aspect_ratio_image)
    else:
        # Taller than target, resize based on height
        new_height = target_height
        new_width = int(new_height * aspect_ratio_image)

    resized_image = cv2.resize(image, (new_width, new_height), interpolation=cv2.INTER_AREA)

    # Calculate padding
    pad_top = (target_height - new_height) // 2
    pad_bottom = target_height - new_height - pad_top
    pad_left = (target_width - new_width) // 2
    pad_right = target_width - new_width - pad_left

    # Pad the image
    padded_image = cv2.copyMakeBorder(resized_image, pad_top, pad_bottom, pad_left, pad_right, cv2.BORDER_CONSTANT, value=pad_color)

    return padded_image
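
To see what `pad_and_resize` does to an image without loading one, the sizing and padding arithmetic can be run on its own. The sketch below repeats the same formulas in plain Python; `letterbox_geometry` is a hypothetical helper introduced only for illustration, and it takes sizes as `(width, height)` tuples.

```python
def letterbox_geometry(original_size, target_size):
    """Return ((new_width, new_height), (top, bottom, left, right)) for a padded resize."""
    original_width, original_height = original_size
    target_width, target_height = target_size

    aspect_ratio_image = original_width / original_height
    aspect_ratio_target = target_width / target_height

    if aspect_ratio_image > aspect_ratio_target:
        # Wider than target: width fills the frame, height shrinks
        new_width = target_width
        new_height = int(new_width / aspect_ratio_image)
    else:
        # Taller than (or same shape as) target: height fills the frame
        new_height = target_height
        new_width = int(new_height * aspect_ratio_image)

    # Split the leftover space as evenly as possible between the two sides
    pad_top = (target_height - new_height) // 2
    pad_bottom = target_height - new_height - pad_top
    pad_left = (target_width - new_width) // 2
    pad_right = target_width - new_width - pad_left
    return (new_width, new_height), (pad_top, pad_bottom, pad_left, pad_right)

wide = letterbox_geometry((1000, 500), (640, 640))
tall = letterbox_geometry((500, 1000), (640, 640))
print(wide)  # ((640, 320), (160, 160, 0, 0))
print(tall)  # ((320, 640), (0, 0, 160, 160))
```

The padding values are also what you would subtract later to map predicted coordinates from the 640x640 frame back to the original image.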

In [None]:
import numpy as np

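def image_loader(image_path, target_size=(640, 640)):
    img = cv2.imread(image_path)  # HWC, BGR, uint8
    if img is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    img = pad_and_resize(img, target_size)
    img = img[..., ::-1]  # BGR -> RGB (reverse the channel axis, not the rows)
    img = img[np.newaxis, ...].astype(np.float32) / 255.0  # add batch axis, scale to [0, 1]
    img = np.ascontiguousarray(img.transpose(0, 3, 1, 2))  # NHWC -> NCHW
    return img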
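
A note on the channel swap in the loader: OpenCV returns arrays in height x width x channel order, so reversing the first axis flips the picture upside down, while reversing the last axis turns BGR into RGB. The small synthetic example below shows the difference and the final NCHW layout. The division by 255 reflects my understanding that Ultralytics models expect inputs scaled to [0, 1]; treat it as an assumption and verify it against your own export.

```python
import numpy as np

# Synthetic 2x3 BGR image: every pixel is (B=10, G=20, R=30)
bgr = np.empty((2, 3, 3), dtype=np.uint8)
bgr[...] = (10, 20, 30)
bgr[0, 0] = (1, 2, 3)  # mark the top-left pixel so flips are visible

rows_flipped = bgr[::-1]  # reverses axis 0: vertical flip, channels untouched
rgb = bgr[..., ::-1]      # reverses axis 2: BGR becomes RGB

# HWC uint8 -> NCHW float32, scaled to [0, 1] (scaling is an assumption, see above)
tensor = np.ascontiguousarray(rgb.transpose(2, 0, 1)[np.newaxis]).astype(np.float32) / 255.0

print(rgb[0, 0])           # [3 2 1]
print(rows_flipped[0, 0])  # [10 20 30]
print(tensor.shape)        # (1, 3, 2, 3)
```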

In [None]:
input_img_path = "/content/Aerial_Location_1_14.jpg"
img = image_loader(input_img_path)
print(img.shape)

(1, 3, 640, 640)


## ONNX Runtime

In [None]:
!yolo task=segment mode=export model=/content/drive/MyDrive/Aerial_River_Plastic_Wastes/yolo11/result_1/runs/segment/train/weights/best.pt format=onnx imgsz=640 opset=13

Ultralytics 8.3.40 🚀 Python-3.10.12 torch-2.5.1+cu121 CPU (Intel Xeon 2.20GHz)
YOLO11s-seg summary (fused): 265 layers, 10,067,590 parameters, 0 gradients, 35.3 GFLOPs

[34m[1mPyTorch:[0m starting from '/content/drive/MyDrive/Aerial_River_Plastic_Wastes/yolo11/result_1/runs/segment/train/weights/best.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) ((1, 38, 8400), (1, 32, 160, 160)) (19.6 MB)
[31m[1mrequirements:[0m Ultralytics requirements ['onnx>=1.12.0', 'onnxslim', 'onnxruntime'] not found, attempting AutoUpdate...
Collecting onnx>=1.12.0
  Downloading onnx-1.17.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (16 kB)
Collecting onnxslim
  Downloading onnxslim-0.1.45-py3-none-any.whl.metadata (4.2 kB)
Collecting onnxruntime
  Downloading onnxruntime-1.20.1-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (4.5 kB)
Collecting coloredlogs (from onnxruntime)
  Downloading coloredlogs-15.0.1-py2.py3-none-any.whl.metadata (12 k

In [None]:
# Loading model using ONNX-Runtime
import onnxruntime as ort

model_path = "/content/drive/MyDrive/Aerial_River_Plastic_Wastes/yolo11/result_1/runs/segment/train/weights/best.onnx"
session = ort.InferenceSession(model_path)
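
Before running inference it helps to confirm the tensor names and shapes the exported model expects, rather than hard-coding them. ONNX Runtime exposes these through `get_inputs()` and `get_outputs()` on the session. The helper below is a small sketch; `describe_io` is a hypothetical name, and it only reads the `name`, `shape`, and `type` attributes of each node.

```python
def describe_io(session):
    """Collect (name, shape, type) for every input and output of an ONNX Runtime session."""
    inputs = [(node.name, node.shape, node.type) for node in session.get_inputs()]
    outputs = [(node.name, node.shape, node.type) for node in session.get_outputs()]
    return {"inputs": inputs, "outputs": outputs}
```

With the session loaded above, `describe_io(session)["inputs"][0][0]` gives the input name to use as the key in `session.run`.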

In [None]:
outputs = session.run(None, {"images": img})
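
The export log reports two outputs with shapes `(1, 38, 8400)` and `(1, 32, 160, 160)`. Assuming the usual Ultralytics segmentation layout (4 box values, then one score per class, then 32 mask coefficients per candidate), the first tensor can be split as sketched below. The layout and the name `split_seg_output` are assumptions for illustration; a zero-filled stand-in array is used instead of the real `outputs[0]`.

```python
import numpy as np

def split_seg_output(pred, num_mask_coeffs=32):
    """Split a (1, 4 + nc + nm, N) prediction tensor into boxes, class scores and mask coefficients."""
    pred = pred[0].T                                   # (N, 4 + nc + nm)
    num_classes = pred.shape[1] - 4 - num_mask_coeffs  # 38 - 4 - 32 = 2 here
    boxes = pred[:, :4]                                # assumed layout: cx, cy, w, h in input pixels
    scores = pred[:, 4:4 + num_classes]                # one score per class
    coeffs = pred[:, 4 + num_classes:]                 # weights for the mask prototypes
    return boxes, scores, coeffs

# Stand-in tensor with the shape reported in the export log
dummy = np.zeros((1, 38, 8400), dtype=np.float32)
boxes, scores, coeffs = split_seg_output(dummy)
print(boxes.shape, scores.shape, coeffs.shape)  # (8400, 4) (8400, 2) (8400, 32)

# 8400 candidates = grid cells at strides 8, 16 and 32 for a 640x640 input
print(sum((640 // s) ** 2 for s in (8, 16, 32)))  # 8400
```

The second output holds the 32 mask prototypes at 160x160; combining them with `coeffs` for each kept detection yields the instance masks.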

In [None]:
outputs

[array([[[     4.3351,      10.896,      19.059, ...,      551.15,      573.59,      604.79],
         [     4.6109,      4.6285,       4.585, ...,      572.17,      576.91,      589.59],
         [     8.5384,      11.751,      10.046, ...,      262.91,      250.42,      273.52],
         ...,
         [    0.12957,     0.15135,     0.13226, ...,    -0.66867,    -0.58661,    -0.41044],
         [    0.19579,     0.17518,     0.13301, ...,     -0.7402,    -0.65537,    -0.37453],
         [   -0.38748,    -0.29909,    -0.21724, ...,      -1.012,    -0.84028,    -0.24442]]], dtype=float32),
 array([[[[    0.15643,     0.25919,     0.31111, ...,     0.90424,     0.74214,     0.39832],
          [    0.26348,     0.41484,     0.46478, ...,      2.1057,      2.1741,      0.9542],
          [    0.37085,     0.37548,     0.57932, ...,      1.9129,      2.0605,     0.96066],
          ...,
          [    0.19764,     0.31145,     0.57659, ...,      1.3001,      1.0435,     0.46108],
         