<a href="https://colab.research.google.com/github/tahamsi/computer-vision/blob/main/week-9/Instance_segmentation_with_yolo11.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

 [![GitHub](https://badges.aleen42.com/src/github.svg)](https://github.com/tahamsi/computer-vision)

## YOLO
YOLO (You Only Look Once) is a popular deep learning model used for real-time object detection. It was introduced by Joseph Redmon and is known for its ability to detect multiple objects in an image or video frame with high speed and accuracy. YOLO stands out due to its innovative approach of treating object detection as a single regression problem, enabling it to predict bounding boxes and class probabilities directly from full images in one evaluation.

### Key Features

* Single-Pass Detection: Unlike traditional object detection methods that use a multi-stage process (e.g., region proposal and classification), YOLO processes an image in a single neural network pass, making it extremely fast and suitable for real-time applications.
* Grid-Based Prediction: YOLO divides the input image into a grid and assigns each grid cell the responsibility of predicting bounding boxes and their associated class probabilities if the center of an object falls within that cell.
* End-to-End Learning: The model is trained end-to-end, optimizing for both object localization and classification simultaneously.
* Speed and Efficiency: YOLO is capable of processing images at high frame rates, making it suitable for applications that require real-time performance, such as video surveillance, autonomous vehicles, and interactive systems.
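
The grid-based assignment described above can be sketched in a few lines. This is only an illustration of the idea, not the Ultralytics implementation: the function name `responsible_cell`, the image size, and the 7x7 grid are assumptions chosen for the example.

```python
def responsible_cell(cx, cy, img_w, img_h, grid_size):
    """Return the (row, col) of the grid cell that contains the object centre.

    cx, cy are the centre coordinates in pixels; the cell that contains the
    centre is the one responsible for predicting that object.
    """
    # Scale the centre into grid units and clamp so a centre on the
    # right/bottom edge still maps to the last cell.
    col = min(int(cx / img_w * grid_size), grid_size - 1)
    row = min(int(cy / img_h * grid_size), grid_size - 1)
    return row, col


# Hypothetical 640x480 image, 7x7 grid, object centred at (320, 120).
cell = responsible_cell(320, 120, 640, 480, grid_size=7)
print(cell)  # (1, 3)
```

Only one cell "owns" each object centre, which is why a single forward pass is enough to produce all predictions.
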
## YOLO11
[Ultralytics YOLO11](https://github.com/ultralytics/ultralytics) is a state-of-the-art model that builds on the success of previous YOLO versions, incorporating new features and enhancements to further improve performance and flexibility. YOLO11 is designed to be fast, accurate, and user-friendly, making it an ideal choice for a variety of tasks, including object detection, tracking, instance segmentation, image classification, and pose estimation.
YOLO11 builds on the advancements introduced in YOLOv9 and YOLOv10, both released earlier in 2024, incorporating improved architectural designs, enhanced feature extraction techniques, and optimized training methods.

YOLO11m achieves a higher mean Average Precision (mAP) on the COCO dataset while using 22% fewer parameters than YOLOv8m, making it computationally lighter without sacrificing accuracy.

YOLO11 is available in five sizes (n, s, m, l, and x), ranging from `2.6M` to `56.9M` parameters and reaching `39.5` to `54.7` mAP on the COCO dataset.

### Before you start

Let's confirm that we have `GPU` access by running the `nvidia-smi` command. If no GPU is listed, go to `Edit -> Notebook settings -> Hardware accelerator`, set it to GPU, and then click `Save`. The code here is adapted from [Ultralytics](https://github.com/ultralytics/ultralytics) and [Roboflow](https://github.com/roboflow/notebooks).
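
If you prefer to do this check from Python, for example to fail early in a script, a minimal sketch could look like the following. The helper name `gpu_visible` is an assumption for this example; it only relies on the `nvidia-smi -L` listing command and the standard library.

```python
import shutil
import subprocess


def gpu_visible():
    """Return True if nvidia-smi is installed and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # The driver tools are not on PATH, so no NVIDIA GPU is usable.
        return False
    try:
        # `nvidia-smi -L` prints one line per GPU, e.g. "GPU 0: ...".
        out = subprocess.run([exe, "-L"], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return out.returncode == 0 and "GPU" in out.stdout


print("GPU visible:", gpu_visible())
```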

In [None]:
!nvidia-smi

In [None]:
import os
HOME = os.getcwd()

## Install YOLO11 via Ultralytics

In [None]:
%pip install ultralytics supervision roboflow
import ultralytics
ultralytics.checks()

## Inference with model pre-trained on COCO dataset

### Download sample data

In [None]:
!mkdir -p {HOME}/data
!wget https://raw.githubusercontent.com/tahamsi/computer-vision/refs/heads/main/images/peel.jpg -P {HOME}/data
!wget https://raw.githubusercontent.com/tahamsi/computer-vision/refs/heads/main/images/London_bridge.jpg -P {HOME}/data

### CLI

In [None]:
!yolo task=segment mode=predict model=yolo11l-seg.pt conf=0.25 source='data/London_bridge.jpg' save=True

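The annotated result image is saved in `{HOME}/runs/segment/predict/`. If you rerun the command, Ultralytics writes to a new folder (`predict2`, `predict3`, and so on).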

In [None]:
from IPython.display import Image as IPyImage

IPyImage(filename=f'{HOME}/runs/segment/predict/London_bridge.jpg', width=600)

### SDK

**NOTE:** YOLO's Python interface allows for seamless integration into your Python projects, making it easy to load, run, and process the model's output.

In [None]:
from ultralytics import YOLO
from PIL import Image

model = YOLO('yolo11l-seg.pt')
image = Image.open('data/London_bridge.jpg')
result = model.predict(image, conf=0.25)[0]

**NOTE:** The returned `result` object stores the bounding boxes, class IDs, confidence scores, and segmentation masks of the detected objects.

In [None]:
result.boxes.xyxy

In [None]:
result.boxes.conf

In [None]:
result.boxes.cls

In [None]:
result.masks.data
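
To make the layout of these fields concrete, here is a small sketch that uses NumPy arrays as stand-ins. All values are made up for illustration; the real attributes are tensors (call `.cpu().numpy()` on them to get arrays), and `names` plays the role of `result.names`, the class-id-to-label mapping.

```python
import numpy as np

# Hypothetical stand-ins with the same layout as the fields shown above.
xyxy = np.array([[10.0, 20.0, 110.0, 220.0],
                 [50.0, 60.0, 90.0, 100.0]])   # like result.boxes.xyxy, shape (N, 4)
conf = np.array([0.91, 0.30])                  # like result.boxes.conf, shape (N,)
cls = np.array([0, 2])                         # like result.boxes.cls, shape (N,)
masks = np.zeros((2, 8, 8), dtype=np.uint8)    # like result.masks.data, shape (N, H, W)
masks[0, 2:6, 2:6] = 1
masks[1, 0:2, 0:3] = 1
names = {0: "person", 2: "car"}                # like result.names

# Each box is (x_min, y_min, x_max, y_max) in pixels.
widths = xyxy[:, 2] - xyxy[:, 0]
heights = xyxy[:, 3] - xyxy[:, 1]
# Each mask is a binary image, so its sum is the object's area in pixels.
areas = masks.reshape(len(masks), -1).sum(axis=1)

for i in range(len(xyxy)):
    label = names[int(cls[i])]
    print(f"{label}: conf={conf[i]:.2f}, box={widths[i]:.0f}x{heights[i]:.0f}, mask_px={areas[i]}")
```

Row `i` of every field refers to the same detected object, so boxes, scores, classes, and masks can be indexed together.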

**NOTE:** YOLO11 can be easily integrated with `supervision` using the familiar `from_ultralytics` connector.

**NOTE:** [Supervision](https://github.com/roboflow/supervision) is an open-source Python library from Roboflow that provides reusable computer vision utilities, such as a common `Detections` format and annotators for drawing boxes, masks, and labels. Roboflow itself is a platform for data annotation, preprocessing, model training, and deployment.

In [None]:
import supervision as sv

detections = sv.Detections.from_ultralytics(result)

In [None]:
mask_annotator = sv.MaskAnnotator()
label_annotator = sv.LabelAnnotator(text_color=sv.Color.BLACK, text_position=sv.Position.CENTER)

annotated_image = image.copy()
annotated_image = mask_annotator.annotate(annotated_image, detections=detections)
annotated_image = label_annotator.annotate(annotated_image, detections=detections)

sv.plot_image(annotated_image, size=(10, 10))
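
Conceptually, drawing a mask on an image is an alpha blend of a colour over the masked pixels. The sketch below shows that idea with plain NumPy; it is not how `supervision` implements its annotator, and `overlay_mask`, the colour, and the opacity are assumptions for the example.

```python
import numpy as np


def overlay_mask(image, mask, color, opacity=0.5):
    """Blend `color` into `image` wherever the boolean `mask` is True."""
    out = image.astype(np.float32)  # astype returns a copy, the input is untouched
    out[mask] = (1.0 - opacity) * out[mask] + opacity * np.asarray(color, dtype=np.float32)
    return out.round().astype(np.uint8)


# Hypothetical 4x4 grey image with a 2x2 object mask in the middle.
image_arr = np.full((4, 4, 3), 100, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True

blended = overlay_mask(image_arr, mask, color=(200, 0, 0))
print(blended[1, 1], blended[0, 0])  # masked pixel vs. untouched pixel
```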

## Fine-tune YOLO11

### API keys

To fine-tune YOLO11, you’ll need to provide your Roboflow API key. Follow these steps:

Visit your [`Roboflow Settings`](https://app.roboflow.com/settings/api) page, and click `Copy` to copy your private API key to the clipboard.
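
Rather than pasting the key directly into the notebook, you may want to read it from the environment. The following is a minimal sketch; the variable name `ROBOFLOW_API_KEY` and the helper `get_roboflow_api_key` are assumptions for this example, not something Roboflow requires.

```python
import os


def get_roboflow_api_key(env_var="ROBOFLOW_API_KEY"):
    """Read the API key from an environment variable and fail loudly if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your Roboflow API key.")
    return key
```

You could then create the client with `Roboflow(api_key=get_roboflow_api_key())`, which keeps the key out of any notebook you share.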

When training YOLO11, ensure your data is stored in the `datasets` folder. To modify the default location for your fine-tuning data, adjust the `datasets_dir` path in Ultralytics’ `settings.json`. In this tutorial, we’ll use a sample [dataset](https://universe.roboflow.com/detect-microplastic/new-project-2-bchc8/dataset/2#) from [Roboflow Universe](https://universe.roboflow.com/), specifically the microplastic dataset. When downloading, select the `YOLOv11` export format.

In [None]:
!mkdir -p {HOME}/datasets
%cd {HOME}/datasets

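from google.colab import userdata
from roboflow import Roboflow

# Paste your Roboflow API key below, or store it as a Colab secret and read it
# with userdata.get("ROBOFLOW_API_KEY") (the secret name is your choice).
rf = Roboflow(api_key="")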
project = rf.workspace("detect-microplastic").project("new-project-2-bchc8")
version = project.version(2)
dataset = version.download("yolov11")
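
Before training, it can help to confirm that the download has the layout the later commands rely on. This sketch assumes the usual export structure of a `data.yaml` file plus `train`, `valid`, and `test` folders, each with `images` and `labels`; only `data.yaml` and `test/images` are referenced in this notebook, the rest is an assumption, and `missing_dataset_parts` is a hypothetical helper.

```python
from pathlib import Path


def missing_dataset_parts(root, splits=("train", "valid", "test")):
    """Return the expected files/folders (relative to root) that do not exist."""
    root = Path(root)
    expected = [root / "data.yaml"]
    for split in splits:
        expected += [root / split / "images", root / split / "labels"]
    return [p.relative_to(root).as_posix() for p in expected if not p.exists()]
```

Calling `missing_dataset_parts(dataset.location)` should return an empty list when the dataset is complete.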

## Custom Training

In [None]:
%cd {HOME}

!yolo task=segment mode=train model=yolo11s-seg.pt data={dataset.location}/data.yaml epochs=10 imgsz=640 plots=True

**NOTE:** The results of the completed training are saved in `{HOME}/runs/segment/train/`. Let's examine them.

In [None]:
!ls {HOME}/runs/segment/train/

In [None]:
from IPython.display import Image as IPyImage

IPyImage(filename=f'{HOME}/runs/segment/train/confusion_matrix.png', width=600)

In [None]:
from IPython.display import Image as IPyImage

IPyImage(filename=f'{HOME}/runs/segment/train/results.png', width=600)
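
The curves in `results.png` are plotted from a `results.csv` file in the same run folder. If you want the numbers rather than the plot, a sketch like the one below finds the best epoch. The column name `metrics/mAP50-95(M)` (mask mAP) is an assumption about the CSV header, so check it against your own file; `best_epoch` is a hypothetical helper.

```python
import csv


def best_epoch(csv_path, metric="metrics/mAP50-95(M)"):
    """Return (epoch, value) for the row with the highest value of `metric`."""
    with open(csv_path, newline="") as f:
        # Some versions pad the header and values with spaces, so strip both.
        rows = [{k.strip(): v.strip() for k, v in row.items()} for row in csv.DictReader(f)]
    best = max(rows, key=lambda r: float(r[metric]))
    return int(float(best["epoch"])), float(best[metric])
```

For this run, the call would be `best_epoch(f'{HOME}/runs/segment/train/results.csv')`.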

In [None]:
from IPython.display import Image as IPyImage

IPyImage(filename=f'{HOME}/runs/segment/train/val_batch0_pred.jpg', width=600)

## Validate fine-tuned model

In [None]:
!yolo task=segment mode=val model={HOME}/runs/segment/train/weights/best.pt data={dataset.location}/data.yaml
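
The mask mAP values reported by validation are built on the intersection over union (IoU) between predicted and ground-truth masks. As a rough illustration of that building block (not the Ultralytics evaluation code), here is IoU for two binary masks; the masks are made-up examples.

```python
import numpy as np


def mask_iou(a, b):
    """Intersection over union of two binary masks of the same shape."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        # Both masks are empty; define IoU as 0 to avoid dividing by zero.
        return 0.0
    return float(np.logical_and(a, b).sum() / union)


# Two hypothetical 4x4 squares offset by one pixel in each direction.
pred = np.zeros((6, 6), dtype=bool)
pred[1:5, 1:5] = True
truth = np.zeros((6, 6), dtype=bool)
truth[2:6, 2:6] = True

print(round(mask_iou(pred, truth), 3))  # 9 shared pixels / 23 in the union
```

A prediction counts as correct at a given threshold (for example 0.5) only if its IoU with a ground-truth mask reaches that threshold.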

## Inference with custom model

In [None]:
!yolo task=segment mode=predict model={HOME}/runs/segment/train/weights/best.pt conf=0.25 source={dataset.location}/test/images save=True

Check a few results.

In [None]:
import glob
import os
from IPython.display import Image as IPyImage, display

latest_folder = max(glob.glob(f'{HOME}/runs/segment/predict*/'), key=os.path.getmtime)
for img in glob.glob(f'{latest_folder}/*.jpg')[:3]:
    display(IPyImage(filename=img, width=600))
    print("\n")