# Instance Segmentation based on YOLO11

# Setup

Install `ultralytics` and its [dependencies](https://github.com/ultralytics/ultralytics/blob/main/pyproject.toml) with pip, then check the software and hardware environment.

[![PyPI - Version](https://img.shields.io/pypi/v/ultralytics?logo=pypi&logoColor=white)](https://pypi.org/project/ultralytics/) [![Downloads](https://static.pepy.tech/badge/ultralytics)](https://www.pepy.tech/projects/ultralytics) [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/ultralytics?logo=python&logoColor=gold)](https://pypi.org/project/ultralytics/)
[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/healthonrails/annolid/blob/main/docs/tutorials/Annolid_Instance_Segmentation_on_YOLO11_Tutorial.ipynb)

In [None]:
%pip install ultralytics
import ultralytics
ultralytics.checks()

In [None]:
import shutil
import os
from google.colab import files

# Upload custom dataset, labeled in Annolid

In [None]:
custom_dataset = files.upload()

In [None]:
!unzip YOLO_dataset.zip

In [None]:
# Path to the dataset directory
image_dir = '/content/YOLO_dataset/images/val'  # Directory containing images
label_dir = '/content/YOLO_dataset/labels/val'  # Directory containing YOLO Pose estimation labels (e.g., .txt files)
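Before visualizing or training, it can help to confirm that every image has a matching label file. The following is a minimal sketch, assuming the layout above in which each label shares its image's base name with a `.txt` extension. `find_unlabeled_images` is a hypothetical helper written for this tutorial, not part of Annolid or Ultralytics.

```python
import os

# Image extensions to consider; compared case-insensitively.
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png')


def find_unlabeled_images(image_dir, label_dir):
    """Return the sorted image file names in image_dir that have no matching .txt file in label_dir."""
    missing = []
    for name in sorted(os.listdir(image_dir)):
        if not name.lower().endswith(IMAGE_EXTENSIONS):
            continue  # skip non-image files
        stem = os.path.splitext(name)[0]
        if not os.path.exists(os.path.join(label_dir, stem + '.txt')):
            missing.append(name)
    return missing
```

Calling `find_unlabeled_images(image_dir, label_dir)` with the two paths defined above should return an empty list when the dataset is complete.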


# Optional (check dataset)

In [None]:
import os
import cv2
import matplotlib.pyplot as plt
import numpy as np

def visualize_pose_estimation_with_labels(image_path, label_path, keypoint_color=(0, 0, 255), keypoint_radius=5):
    """
    Visualizes pose estimation keypoints and bounding boxes from YOLO format label files on an image.

    Args:
        image_path (str): Path to the image file.
        label_path (str): Path to the YOLO format label file.
        keypoint_color (tuple): RGB color for keypoints, since the image is converted to RGB before drawing (default: blue).
        keypoint_radius (int): Radius of the circles drawn for keypoints (default: 5).
    """
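    img = cv2.imread(image_path)
    if img is None:  # cv2.imread returns None when the file cannot be read
        print(f"Could not read image {image_path}.")
        return
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)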

    if not os.path.exists(label_path):
        print(f"Label file {label_path} not found.")
        return

    with open(label_path, 'r') as f:
        lines = f.readlines()

    for line in lines:
        data = line.strip().split()
        class_id = int(data[0])

        # Extract bounding box coordinates (x_center, y_center, width, height)
        center_x = float(data[1])
        center_y = float(data[2])
        box_width = float(data[3])
        box_height = float(data[4])

        # Denormalize bounding box coordinates
        x_min = int((center_x - box_width / 2) * img.shape[1])
        y_min = int((center_y - box_height / 2) * img.shape[0])
        x_max = int((center_x + box_width / 2) * img.shape[1])
        y_max = int((center_y + box_height / 2) * img.shape[0])

        # Extract keypoint coordinates (px1, py1, px2, py2, ...); assumes two values per keypoint, with no visibility flag
        keypoints_data = data[5:]
        num_keypoints = len(keypoints_data) // 2
        keypoints = []
        for i in range(num_keypoints):
            px = float(keypoints_data[2*i])
            py = float(keypoints_data[2*i + 1])
            keypoints.append((px, py))

        # Visualize bounding box (optional, you can comment this out if you only want keypoints)
        cv2.rectangle(img, (x_min, y_min), (x_max, y_max), color=(0, 255, 0), thickness=1)

        # Visualize keypoints
        for px, py in keypoints:
            # Denormalize keypoint coordinates
            keypoint_x = int(px * img.shape[1])
            keypoint_y = int(py * img.shape[0])
            cv2.circle(img, (keypoint_x, keypoint_y), keypoint_radius, keypoint_color, -1) # Draw filled circle

        # Display Class ID near the bounding box (you can adjust position)
        cv2.putText(img, f'Class {class_id}', (x_min, y_min - 10 if y_min > 20 else y_min + 20), # Adjust text position
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 1, cv2.LINE_AA) # Reduced thickness for clarity

    plt.figure(figsize=(10, 10))
    plt.imshow(img)
    plt.title(f"Pose Estimation with Ground Truth: {os.path.basename(image_path)}")
    plt.axis("off")
    plt.show() # Added plt.show() to display each image in loop

if not os.path.exists(image_dir):
    print(f"Error: Image directory '{image_dir}' not found.")
elif not os.path.exists(label_dir):
    print(f"Error: Label directory '{label_dir}' not found.")
else:
    # Loop through images and visualize them with pose estimation labels
    for image_name in os.listdir(image_dir):
        if image_name.endswith(('.jpg', '.jpeg', '.png')): # Added common image extensions
            image_path = os.path.join(image_dir, image_name)
            label_path = os.path.join(label_dir, os.path.splitext(image_name)[0] + '.txt')
            visualize_pose_estimation_with_labels(image_path, label_path)
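The visualizer above assumes two values per keypoint. YOLO pose labels may also carry a third visibility value per keypoint, depending on the dataset's `kpt_shape`. The sketch below shows one way to parse a label line for either layout. `parse_pose_label_line` and its `dims` parameter are hypothetical names used for illustration only.

```python
def parse_pose_label_line(line, dims=2):
    """Split one YOLO pose label line into (class_id, box, keypoints).

    dims is 2 for (x, y) keypoints or 3 for (x, y, visibility).
    All coordinates stay normalized to the range [0, 1].
    """
    values = line.split()
    if len(values) < 5:
        raise ValueError("expected a class id followed by four box values")
    class_id = int(values[0])
    box = tuple(float(v) for v in values[1:5])  # x_center, y_center, width, height
    flat = [float(v) for v in values[5:]]
    if len(flat) % dims != 0:
        raise ValueError(f"{len(flat)} keypoint values is not a multiple of dims={dims}")
    keypoints = [tuple(flat[i:i + dims]) for i in range(0, len(flat), dims)]
    return class_id, box, keypoints
```

Raising an error on a length mismatch makes a wrong `dims` setting visible immediately instead of silently drawing keypoints in the wrong places.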

# Predict without fine-tuning

YOLO11 may be used directly in the Command Line Interface (CLI) with a `yolo` command for a variety of tasks and modes, and it accepts additional arguments, e.g. `imgsz=640`. See a full list of available `yolo` [arguments](https://docs.ultralytics.com/usage/cfg/) and other details in the [YOLO11 Predict Docs](https://docs.ultralytics.com/modes/predict/).


In [None]:
# Run inference on an image with the pretrained YOLO11n-pose model
!yolo predict model=yolo11n-pose.pt source='/content/YOLO_dataset/images/val/92-mouse-2_000000000.png'

# Train


In [None]:
#@title Select YOLO11 🚀 logger {run: 'auto'}
logger = 'TensorBoard' #@param ['Comet', 'TensorBoard']

if logger == 'Comet':
  %pip install -q comet_ml
  import comet_ml; comet_ml.init()
elif logger == 'TensorBoard':
  %load_ext tensorboard
  %tensorboard --logdir .

In [None]:
# Train YOLO11n-pose on the custom dataset for 300 epochs
!yolo train model=yolo11n-pose.pt data=/content/YOLO_dataset/data.yaml epochs=300 imgsz=640
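The training command reads the dataset description from `data.yaml`, which the uploaded dataset is expected to contain already. For pose datasets, Ultralytics expects a `kpt_shape` entry alongside the usual paths and class names. The sketch below only illustrates the fields: the keypoint count, the class name `mouse`, the `images/train` folder, and the helper `build_pose_data_yaml` are all assumptions, so compare the output with your own file rather than overwriting it.

```python
def build_pose_data_yaml(root, class_names, num_keypoints, dims=2):
    """Return the text of a minimal Ultralytics pose data.yaml (illustrative values only)."""
    lines = [
        f"path: {root}",          # dataset root directory
        "train: images/train",    # assumed training image folder, relative to path
        "val: images/val",        # validation image folder, relative to path
        f"kpt_shape: [{num_keypoints}, {dims}]",  # [keypoints per instance, values per keypoint]
        "names:",
    ]
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


# Hypothetical example: one class with four keypoints stored as (x, y)
print(build_pose_data_yaml("/content/YOLO_dataset", ["mouse"], num_keypoints=4))
```

The second value of `kpt_shape` should match the number of values stored per keypoint in the label files: 2 for (x, y), or 3 when a visibility flag is included.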

# Inference and save results to Annolid JSON files

In [None]:
# Run inference on an image with the fine-tuned model
!yolo predict model=runs/pose/train/weights/best.pt source='/content/YOLO_dataset/images/val/92-mouse-2_000000000.png'

In [None]:
from collections import defaultdict
import cv2
import numpy as np
import json
import os
import matplotlib.pyplot as plt
from IPython.display import clear_output
from ultralytics import YOLO

# Load the fine-tuned pose estimation model
#model = YOLO("yolo11n-pose.pt")  # Alternative: the pretrained YOLO11n pose model
model = YOLO("runs/pose/train/weights/best.pt")  # Update this if your training run was saved elsewhere (e.g. runs/pose/train2)

# Provide the path to your video file
video_path = "/content/92-mouse-2.mp4"  # Update this to your video file's path
cap = cv2.VideoCapture(video_path)

# Create an output directory named after the video file without the extension
video_name = os.path.splitext(os.path.basename(video_path))[0]
output_dir = video_name + "_pose_estimation" # Updated output directory name to reflect pose estimation
os.makedirs(output_dir, exist_ok=True)

# Store the track history
track_history = defaultdict(lambda: [])
frame_id = 0  # Track frame number
display_interval = 10  # Display visualization every n frames

# Loop through the video frames
while cap.isOpened():
    success, frame = cap.read()
    if success:
        # Run YOLO11 pose estimation tracking on the frame, persisting tracks between frames
        results = model.track(frame, persist=True)

        # Initialize LabelMe JSON structure
        labelme_data = {
            "version": "5.5.0", # Updated version to match current LabelMe version
            "flags": {},
            "shapes": [],
            "imagePath": os.path.basename(video_path), # Added imagePath to be more informative
            "imageHeight": frame.shape[0],
            "imageWidth": frame.shape[1],
            "imageData": None, # set to None as imageData is usually not needed for video annotation
        }

        # Get the boxes, track IDs, and pose keypoints
        boxes = results[0].boxes.xywh.cpu()
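        ids = results[0].boxes.id  # None when the tracker has not assigned IDs on this frame
        if ids is not None:
            track_ids = ids.int().cpu().tolist()
        else:
            track_ids = list(range(len(boxes)))  # fall back to per-frame indices so no detection is dropped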
        keypoints_list = results[0].keypoints.xy.cpu().numpy() # Get keypoints

        # Loop through detected objects
        for box, track_id, keypoints in zip(boxes, track_ids, keypoints_list): # Looping through keypoints as well
            x, y, w, h = box.tolist()
            track = track_history[track_id]
            track.append((float(x), float(y)))  # x, y center point

            if len(track) > 30:  # keep only the 30 most recent center points per track
                track.pop(0)

            # Bounding box coordinates
            x1, y1 = x - w/2, y - h/2
            x2, y2 = x + w/2, y + h/2

            # Add bounding box annotation
            bbox_shape = {
                "label": f"object_{track_id}",
                "shape_type": "rectangle",
                "points": [
                    [float(x1), float(y1)],
                    [float(x2), float(y2)]
                ],
                "group_id": track_id,
                "description": "Bounding Box", # Added description for clarity
                "flags": {},
                "line_color": None,
                "fill_color": None
            }
            labelme_data["shapes"].append(bbox_shape)

            # Add polygon for tracking history
            if len(track) > 1:
                points = np.array(track).tolist()
                shape_polygon = {
                    "label": f"track_{track_id}",
                    "shape_type": "polygon",
                    "points": points,
                    "group_id": track_id,
                    "description": "Tracking History", # Added description for clarity
                    "flags": {},
                    "line_color": None,
                    "fill_color": None
                }
                labelme_data["shapes"].append(shape_polygon)

            # Add keypoint annotations
            for idx, (kx, ky) in enumerate(keypoints):
                keypoint_shape = {
                    "label": f"keypoint_{track_id}_{idx}", # Label each keypoint with track ID and index
                    "shape_type": "point",
                    "points": [[float(kx), float(ky)]],
                    "group_id": track_id,
                    "description": f"Keypoint {idx}", # Added description for clarity
                    "flags": {},
                    "line_color": None,
                    "fill_color": None
                }
                labelme_data["shapes"].append(keypoint_shape)


        # Save the JSON annotation file with zero-padded numbering (e.g., 000000001.json)
        json_filename = os.path.join(output_dir, f"{frame_id:09d}.json")
        with open(json_filename, "w") as json_file:
            json.dump(labelme_data, json_file, indent=4)

        # Visualization: Display annotated frame every 'display_interval' frames
        if frame_id % display_interval == 0:
            annotated_frame = results[0].plot()  # Get annotated frame with pose estimation
            plt.figure(figsize=(10, 6))
            plt.axis('off')
            plt.imshow(cv2.cvtColor(annotated_frame, cv2.COLOR_BGR2RGB))
            clear_output(wait=True)  # Clear previous output for smoother display
            plt.show()

        frame_id += 1  # Increment frame number
    else:
        # Break the loop if the end of the video is reached
        break

# Release the video capture object
cap.release()
print(f"Pose estimation annotations saved in '{output_dir}' directory.")
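To spot-check the output, one of the saved files can be read back and its shapes counted by type. This is a minimal sketch, assuming the JSON structure written by the loop above. `count_shapes` is a hypothetical helper, not an Annolid or LabelMe function.

```python
import json
from collections import Counter


def count_shapes(json_path):
    """Count the shapes in one LabelMe-style JSON file, grouped by shape_type."""
    with open(json_path) as f:
        data = json.load(f)
    return Counter(shape["shape_type"] for shape in data["shapes"])
```

For example, `count_shapes(os.path.join(output_dir, "000000000.json"))` reports how many rectangles, polygons, and points were written for the first frame.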

# Zip and download the JSON results, which can be loaded into Annolid

In [None]:

# Provide the path to the folder you want to zip
folder_to_zip = output_dir  # Update with your folder path

# Output zip file path
output_zip_file = folder_to_zip + ".zip"

# Zip the folder
shutil.make_archive(folder_to_zip, 'zip', folder_to_zip)

print(f"Folder '{folder_to_zip}' has been zipped to '{output_zip_file}'.")
files.download(output_zip_file)

# Zip and download the runs folder, which contains the saved best model

In [None]:
# Replace 'folder_name' with the name of your folder
folder_to_download = 'runs'
output_filename = 'runs.zip'

# Compress the folder
shutil.make_archive(output_filename.replace('.zip', ''), 'zip', folder_to_download)

# Download the zipped folder
files.download(output_filename)

# Export

Export a YOLO11 model to any supported format below with the `format` argument, e.g. `format=onnx`. See [YOLO11 Export Docs](https://docs.ultralytics.com/modes/export/) for more information.

- 💡 ProTip: Export to [ONNX](https://docs.ultralytics.com/integrations/onnx/) or [OpenVINO](https://docs.ultralytics.com/integrations/openvino/) for up to 3x CPU speedup.  
- 💡 ProTip: Export to [TensorRT](https://docs.ultralytics.com/integrations/tensorrt/) for up to 5x GPU speedup.

| Format                                                                   | `format` Argument | Model                     | Metadata | Arguments                                                            |
|--------------------------------------------------------------------------|-------------------|---------------------------|----------|----------------------------------------------------------------------|
| [PyTorch](https://pytorch.org/)                                          | -                 | `yolo11n.pt`              | ✅        | -                                                                    |
| [TorchScript](https://docs.ultralytics.com/integrations/torchscript)     | `torchscript`     | `yolo11n.torchscript`     | ✅        | `imgsz`, `optimize`, `batch`                                         |
| [ONNX](https://docs.ultralytics.com/integrations/onnx)                   | `onnx`            | `yolo11n.onnx`            | ✅        | `imgsz`, `half`, `dynamic`, `simplify`, `opset`, `batch`             |
| [OpenVINO](https://docs.ultralytics.com/integrations/openvino)           | `openvino`        | `yolo11n_openvino_model/` | ✅        | `imgsz`, `half`, `int8`, `batch`                                     |
| [TensorRT](https://docs.ultralytics.com/integrations/tensorrt)           | `engine`          | `yolo11n.engine`          | ✅        | `imgsz`, `half`, `dynamic`, `simplify`, `workspace`, `int8`, `batch` |
| [CoreML](https://docs.ultralytics.com/integrations/coreml)               | `coreml`          | `yolo11n.mlpackage`       | ✅        | `imgsz`, `half`, `int8`, `nms`, `batch`                              |
| [TF SavedModel](https://docs.ultralytics.com/integrations/tf-savedmodel) | `saved_model`     | `yolo11n_saved_model/`    | ✅        | `imgsz`, `keras`, `int8`, `batch`                                    |
| [TF GraphDef](https://docs.ultralytics.com/integrations/tf-graphdef)     | `pb`              | `yolo11n.pb`              | ❌        | `imgsz`, `batch`                                                     |
| [TF Lite](https://docs.ultralytics.com/integrations/tflite)              | `tflite`          | `yolo11n.tflite`          | ✅        | `imgsz`, `half`, `int8`, `batch`                                     |
| [TF Edge TPU](https://docs.ultralytics.com/integrations/edge-tpu)        | `edgetpu`         | `yolo11n_edgetpu.tflite`  | ✅        | `imgsz`                                                              |
| [TF.js](https://docs.ultralytics.com/integrations/tfjs)                  | `tfjs`            | `yolo11n_web_model/`      | ✅        | `imgsz`, `half`, `int8`, `batch`                                     |
| [PaddlePaddle](https://docs.ultralytics.com/integrations/paddlepaddle)   | `paddle`          | `yolo11n_paddle_model/`   | ✅        | `imgsz`, `batch`                                                     |
| [NCNN](https://docs.ultralytics.com/integrations/ncnn)                   | `ncnn`            | `yolo11n_ncnn_model/`     | ✅        | `imgsz`, `half`, `batch`                                             |

In [None]:
!yolo export model=runs/pose/train/weights/best.pt format=onnx

# Python Usage

YOLO11 was reimagined using Python-first principles for the most seamless Python YOLO experience yet. YOLO11 models can be loaded from a trained checkpoint or created from scratch. Then methods are used to train, val, predict, and export the model. See detailed Python usage examples in the [YOLO11 Python Docs](https://docs.ultralytics.com/usage/python/).

In [None]:
from ultralytics import YOLO

# Load a model
model = YOLO('runs/pose/train/weights/best.pt')  # load the fine-tuned weights from the training run

# Use the model
results = model('/content/YOLO_dataset/images/val/92-mouse-2_000000000.png')  # predict on an image
onnx_path = model.export(format='onnx')  # export the model to ONNX format; returns the exported file path