# Introduction to Ultralytics YOLO

Ultralytics provides state-of-the-art object detection models through the YOLO (You Only Look Once) family. YOLO models are known for their speed and accuracy, making them suitable for real-time object detection tasks in applications like autonomous driving, surveillance, retail, and robotics.

In this notebook we take a high-level look at the Ultralytics ecosystem and run our first detections with modern YOLO models. We use YOLO11 as our primary example, but the concepts apply to the broader YOLO family.

> **Note:** Ultralytics YOLO is a family of models (YOLOv3–YOLO11 and beyond). In this course we use YOLO11 models (e.g., `yolo11n.pt`) as a concrete example.

## Table of Contents

1. [Setup and Imports](#Setup-and-Imports)
2. [Download Example Images](#Download-Example-Images)
3. [Evolution of the YOLO Family](#Evolution-of-the-YOLO-Family)
4. [Ultralytics Framework Overview](#Ultralytics-Framework-Overview)
5. [Loading a YOLO11 Detection Model](#Loading-a-YOLO11-Detection-Model)
6. [First Detection on a Single Image](#First-Detection-on-a-Single-Image)
7. [Inspecting the Results Object](#Inspecting-the-Results-Object)
8. [Other YOLO Tasks](#Other-YOLO-Tasks)
9. [Recap and Exercises](#Recap-and-Exercises)

## Setup and Imports

First, let's set up the environment and import the necessary libraries.

In [None]:
# Optional: install Ultralytics and OpenCV in fresh environments (e.g. Colab)
# %pip install ultralytics opencv-python

In [None]:
import os
import cv2
import numpy as np
import matplotlib.pyplot as plt
from ultralytics import YOLO
import time

%matplotlib inline

## Download Example Images

This section downloads the example images used throughout the Ultralytics notebooks. Images are saved locally in `../images/` so they only need to be downloaded once.

In [None]:
import urllib.request
import ssl

def download_images(image_urls, target_dir="../images"):
    """
    Download images from URLs to a local directory.
    
    Args:
        image_urls: List of (url, filename) tuples
        target_dir: Target directory for saving images
    
    Returns:
        List of local file paths
    """
    os.makedirs(target_dir, exist_ok=True)
    
    # Set up opener with User-Agent to avoid 403 errors
    opener = urllib.request.build_opener()
    opener.addheaders = [('User-Agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36')]
    urllib.request.install_opener(opener)
    
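    # Work around SSL certificate errors by disabling verification globally.
    # This is insecure: use it only for the known example URLs in this notebook.
    ssl._create_default_https_context = ssl._create_unverified_context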
    
    downloaded_paths = []
    
    for url, filename in image_urls:
        file_path = os.path.join(target_dir, filename)
        
        if os.path.exists(file_path):
            size_mb = os.path.getsize(file_path) / (1024 * 1024)
            print(f"✓ {filename} already exists ({size_mb:.2f} MB)")
        else:
            try:
                print(f"Downloading {filename}...", end=" ")
                urllib.request.urlretrieve(url, file_path)
                size_mb = os.path.getsize(file_path) / (1024 * 1024)
                print(f"✓ ({size_mb:.2f} MB)")
            except Exception as e:
                print(f"✗ Failed: {e}")
                continue
        
        downloaded_paths.append(file_path)
    
    return downloaded_paths

# Define all example images for the Ultralytics notebooks
image_urls = [
    # Object detection examples
    ('https://akm-img-a-in.tosshub.com/indiatoday/images/story/201812/dogs_and_cats.jpeg?TAxD19DTCFE7WiSYLUdTu446cfW4AbuW&size=770:433', 'yolo_dog_cat.jpg'),
    ('https://i.ibb.co/R7pRTLy/beach-no-axis.png', 'yolo_beach_scene.jpg'),
    ('https://i.ibb.co/jL1kZRF/phones.png', 'yolo_phones_on_table.jpg'),
    ('https://i.ytimg.com/vi/1ZupwFOhjl4/maxresdefault.jpg', 'yolo_traffic.jpg'),
    # Segmentation example
    ('https://upload.wikimedia.org/wikipedia/commons/d/d3/Albert_Einstein_Head.jpg', 'yolo_einstein_head.jpg'),
    # Pose estimation examples
    ('https://images.unsplash.com/photo-1561049501-e1f96bdd98fd?q=80&w=2778&auto=format&fit=crop&ixlib=rb-4.0.3', 'yolo_yoga_1.jpg'),
    ('https://images.unsplash.com/photo-1545205597-3d9d02c29597?q=80&w=2940&auto=format&fit=crop&ixlib=rb-4.0.3', 'yolo_yoga_2.jpg'),
]

# Download all images
downloaded_paths = download_images(image_urls)
print(f"\nTotal images available: {len(downloaded_paths)}")

### Available Images

Let's verify that all images are available:

In [None]:
# List all downloaded YOLO images
image_dir = "../images"
yolo_images = sorted([f for f in os.listdir(image_dir) if f.startswith("yolo_")])

print("Available YOLO example images:")
print("-" * 45)
for img in yolo_images:
    img_path = os.path.join(image_dir, img)
    size_mb = os.path.getsize(img_path) / (1024 * 1024)
    print(f"  {img:<30} {size_mb:>6.2f} MB")
print("-" * 45)
print(f"Total: {len(yolo_images)} images")

In [None]:
# Preview a few sample images
sample_images = ["yolo_beach_scene.jpg", "yolo_traffic.jpg", "yolo_dog_cat.jpg"]

fig, axes = plt.subplots(1, 3, figsize=(15, 4))
for ax, img_name in zip(axes, sample_images):
    img_path = os.path.join(image_dir, img_name)
    img = cv2.imread(img_path)
    img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    ax.imshow(img_rgb)
    ax.set_title(img_name.replace("yolo_", "").replace(".jpg", ""))
    ax.axis("off")

plt.suptitle("Sample Images for Object Detection", fontsize=12)
plt.tight_layout()
plt.show()

## Evolution of the YOLO Family

YOLO has evolved significantly since its introduction:

- **YOLOv1–YOLOv3** (2016–2018): Established real-time, single-stage detection. YOLOv3 introduced multi-scale prediction.
- **YOLOv4** (2020): Improved training techniques ("Bag of Freebies" and "Bag of Specials") and better small-object detection.
- **YOLOv5** (2020): First YOLO version released by Ultralytics; PyTorch-based, with easy training and deployment.
- **YOLOv6–YOLOv7** (2022): Architecture and efficiency improvements; YOLOv6 (from Meituan) targeted industrial deployment.
- **YOLOv8** (2023): Ultralytics flagship with anchor-free detection and improved accuracy.
- **YOLOv9–YOLOv10** (2024): YOLOv9 introduced Programmable Gradient Information (PGI); YOLOv10 introduced NMS-free training.
- **YOLO11** (2024): The current generation, used throughout this course.

Each version brings improvements in speed, accuracy, or ease of use. **Ultralytics wraps these model families into a unified, user-friendly Python API** that you will use throughout the rest of this course. This means you can switch between model versions or tasks (detection, segmentation, pose) with minimal code changes.

## Ultralytics Framework Overview

The Ultralytics framework provides a consistent API for working with YOLO models:

```python
from ultralytics import YOLO

model = YOLO("model_file.pt")
results = model(source)
```

### Supported Tasks

The same API works for different computer vision tasks:

- **Object Detection**: Locate and classify objects with bounding boxes
- **Instance Segmentation**: Detect objects and their precise pixel masks
- **Image Classification**: Classify entire images into categories
- **Pose Estimation**: Detect human figures and key body points
- **Oriented Bounding Boxes (OBB)**: Rotated boxes for angled objects
- **Object Tracking**: Track objects across video frames

### Key Features

- Pre-trained models ready to use
- Simple training and fine-tuning
- Export to various formats (ONNX, TensorRT, CoreML, etc.)
- Extensive documentation and community support

## Loading a YOLO11 Detection Model

Let's load a YOLO11 detection model. Ultralytics provides different model sizes optimized for various use cases.

In [None]:
# Load YOLO11 nano model - fast and lightweight
model = YOLO("yolo11n.pt")

print(f"Model loaded: {model.model_name}")

### YOLO11 Model Variants

YOLO11 comes in different sizes, each balancing speed and accuracy:

| Model | Size | Speed | Accuracy | Best For |
|-------|------|-------|----------|----------|
| `yolo11n.pt` | Nano | Fastest | Good | Edge devices, mobile, real-time apps |
| `yolo11s.pt` | Small | Fast | Better | Balanced performance |
| `yolo11m.pt` | Medium | Moderate | High | General-purpose detection |
| `yolo11l.pt` | Large | Slower | Higher | Accuracy-focused applications |
| `yolo11x.pt` | Extra Large | Slowest | Highest | Maximum accuracy, offline processing |

The trade-off is simple:
- **Smaller models** → faster inference, lower accuracy, less memory
- **Larger models** → slower inference, higher accuracy, more memory
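
The speed side of this trade-off can be measured directly. The following is a minimal sketch, not part of the Ultralytics API: `time_inference` and `fake_predict` are hypothetical names introduced here, and the stand-in function only exists so the snippet runs without downloading any weights. A warm-up call is excluded from the measurement because the first call to a model often includes one-time setup cost.

```python
import time


def time_inference(predict, source, warmup=1, repeats=5):
    """Return the mean wall-clock time per call, in milliseconds.

    `predict` is any callable that accepts `source` (hypothetical helper).
    """
    # Warm-up calls are not timed: the first call may include one-time setup.
    for _ in range(warmup):
        predict(source)

    start = time.perf_counter()
    for _ in range(repeats):
        predict(source)
    elapsed = time.perf_counter() - start

    return elapsed * 1000 / repeats


# Stand-in for a model call so the sketch runs without any model weights.
def fake_predict(source):
    return sum(source)


mean_ms = time_inference(fake_predict, list(range(1000)))
print(f"Mean time per call: {mean_ms:.3f} ms")
```

A loaded `YOLO` model is itself callable, so under these assumptions the same helper could be used as `time_inference(model, image)` with an image path or array.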

## First Detection on a Single Image

Let's run our first object detection on a traffic scene image.

In [None]:
# Load image using local path
image_path = "../images/yolo_traffic.jpg"

img_bgr = cv2.imread(image_path)
img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)

plt.figure(figsize=(10, 6))
plt.imshow(img_rgb)
plt.title("Original image")
plt.axis("off")
plt.show()

In [None]:
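# Run detection - Ultralytics expects NumPy images in BGR order (the OpenCV default)
results = model(img_bgr)

# Get annotated image with bounding boxes; plot() returns BGR, so convert for matplotlib
annotated = cv2.cvtColor(results[0].plot(), cv2.COLOR_BGR2RGB)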

plt.figure(figsize=(10, 6))
plt.imshow(annotated)
plt.title("YOLO11 detections")
plt.axis("off")
plt.show()

**What we see:**

- Bounding boxes around detected objects (cars, trucks, buses, traffic lights, etc.)
- Class labels showing what each object is
- Confidence scores indicating model certainty
- Multiple objects detected in a single pass

## Inspecting the Results Object

The `results` object contains detailed information about the detections. Let's explore its structure.

In [None]:
# Get the first (and only) result
r = results[0]

# Show available class names (COCO dataset classes)
print("Available classes (first 10):")
for idx, name in list(r.names.items())[:10]:
    print(f"  {idx} → {name}")

print(f"\nTotal COCO classes: {len(r.names)}")

In [None]:
# List all detected objects with their confidence scores
print("Detected objects:")
print("-" * 40)

for i, box in enumerate(r.boxes):
    cls_id = int(box.cls)  # Class ID (integer)
    conf = float(box.conf)  # Confidence score (0-1)
    class_name = r.names[cls_id]  # Human-readable class name
    print(f"{i+1}. {class_name:15} (confidence: {conf:.2f})")

print("-" * 40)
print(f"Total detections: {len(r.boxes)}")
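
Low-confidence detections are often discarded before further processing. The sketch below shows one way to do that on plain `(class_name, confidence)` pairs; `filter_detections` is a hypothetical helper and the sample values are made up for illustration, not taken from a real model run.

```python
def filter_detections(detections, min_conf=0.5):
    """Keep detections with confidence >= min_conf, highest confidence first."""
    kept = [(name, conf) for name, conf in detections if conf >= min_conf]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Made-up example values; real pairs would come from a results object.
detections = [
    ("car", 0.91),
    ("truck", 0.42),
    ("traffic light", 0.67),
    ("car", 0.30),
]

# Keeps ("car", 0.91) and ("traffic light", 0.67); the other two fall below 0.5.
for name, conf in filter_detections(detections, min_conf=0.5):
    print(f"{name:15} (confidence: {conf:.2f})")
```

With real results, the input list could be built as `[(r.names[int(b.cls)], float(b.conf)) for b in r.boxes]`. Ultralytics also accepts a `conf` argument at prediction time, for example `model(source, conf=0.5)`, which applies the threshold inside the model call instead.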

In [None]:
# Inspect bounding box coordinates (guard against images with no detections)
if len(r.boxes) > 0:
    print("First bounding box details:")
    print("  Format: xyxy (x1, y1, x2, y2), in pixels")
    print(f"  Coordinates: {r.boxes.xyxy[0].cpu().numpy()}")
    print("\nCoordinate meaning:")
    print("  x1, y1 = top-left corner")
    print("  x2, y2 = bottom-right corner")
else:
    print("No detections in this image.")
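
The `xyxy` corner format converts easily to a center-based format or to an area. The sketch below works on a plain tuple with made-up coordinates; `xyxy_to_xywh` and `box_area` are hypothetical helper names introduced for illustration.

```python
def xyxy_to_xywh(box):
    """Convert (x1, y1, x2, y2) corners to (center_x, center_y, width, height)."""
    x1, y1, x2, y2 = box
    width = x2 - x1
    height = y2 - y1
    return (x1 + width / 2, y1 + height / 2, width, height)


def box_area(box):
    """Area of an (x1, y1, x2, y2) box in square pixels."""
    x1, y1, x2, y2 = box
    return (x2 - x1) * (y2 - y1)


# Made-up box: top-left corner at (100, 50), bottom-right corner at (300, 250).
box = (100.0, 50.0, 300.0, 250.0)

print(xyxy_to_xywh(box))  # (200.0, 150.0, 200.0, 200.0)
print(box_area(box))      # 40000.0
```

Ultralytics exposes the center-based format directly as `r.boxes.xywh`, so the manual conversion is mainly useful for understanding how the two formats relate.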

## Other YOLO Tasks

Beyond object detection, Ultralytics YOLO supports several other computer vision tasks. The model file suffix indicates the task type.

### Task-Specific Models

| Task | Model File | Description |
|------|-----------|-------------|
| Detection | `yolo11n.pt` | Bounding boxes around objects |
| Segmentation | `yolo11n-seg.pt` | Pixel-level object masks |
| Classification | `yolo11n-cls.pt` | Image-level classification |
| Pose | `yolo11n-pose.pt` | Human keypoint detection |
| OBB | `yolo11n-obb.pt` | Oriented (rotated) bounding boxes |
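
The naming pattern in this table, combined with the size letters from the model variants table, can be expressed as a small helper. This is only a sketch: `yolo11_weights`, `TASK_SUFFIXES`, and `SIZES` are hypothetical names, and the task keys are labels chosen for this example.

```python
# Suffixes copied from the task table; detection weights carry no suffix.
TASK_SUFFIXES = {
    "detect": "",
    "segment": "-seg",
    "classify": "-cls",
    "pose": "-pose",
    "obb": "-obb",
}

# Size letters from the model variants table (nano to extra large).
SIZES = ("n", "s", "m", "l", "x")


def yolo11_weights(size="n", task="detect"):
    """Build a YOLO11 weights filename such as 'yolo11n-seg.pt'."""
    if size not in SIZES:
        raise ValueError(f"Unknown size {size!r}; expected one of {SIZES}")
    if task not in TASK_SUFFIXES:
        raise ValueError(f"Unknown task {task!r}; expected one of {list(TASK_SUFFIXES)}")
    return f"yolo11{size}{TASK_SUFFIXES[task]}.pt"


print(yolo11_weights())                # yolo11n.pt
print(yolo11_weights("m", "segment"))  # yolo11m-seg.pt
```

The returned string can be passed to `YOLO(...)`, which is how switching size or task stays a one-line change.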

Let's try a quick segmentation example:
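
A minimal sketch of such an example follows, assuming Ultralytics is installed and the `yolo11n-seg.pt` weights can be downloaded on first use. The function name `run_segmentation` is hypothetical; the import is placed inside the function only so the definition itself does not require Ultralytics.

```python
def run_segmentation(image_path, weights="yolo11n-seg.pt"):
    """Run a segmentation model on one image.

    Returns (result, mask_count), where result is the Results object
    for the image and mask_count is the number of instance masks found.
    """
    # Imported lazily so this function can be defined without Ultralytics installed.
    from ultralytics import YOLO

    seg_model = YOLO(weights)          # weights are downloaded on first use
    result = seg_model(image_path)[0]  # one Results object per input image

    # result.masks is None when no instances were segmented.
    mask_count = 0 if result.masks is None else len(result.masks)
    return result, mask_count
```

In the notebook this could be called as `result, n_masks = run_segmentation("../images/yolo_einstein_head.jpg")`. As with detection, `result.plot()` returns a BGR image, so convert it with `cv2.cvtColor(..., cv2.COLOR_BGR2RGB)` before passing it to `plt.imshow`.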

## Recap and Exercises

### Key Takeaways

- **YOLO** (You Only Look Once) is a real-time object detection system that processes entire images in a single pass
- **Ultralytics** provides a unified, easy-to-use API for YOLO models
- **Loading a model** is simple: `model = YOLO("yolo11n.pt")`
- **Running detection**: `results = model(image)` returns a list of result objects, one per input image
- **Inspecting results**: Access bounding boxes (`.boxes.xyxy`), confidence (`.boxes.conf`), and class names (`.names`)
- **Multiple tasks**: Same API supports detection, segmentation, pose estimation, and more
- **Model variants**: Choose a size from nano (fastest) to extra large (most accurate) based on your needs

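### Exercise 1: Compare Model Sizes

Run the same image through two model sizes and compare the detection count and inference time.

In [None]:
# Exercise 1: Compare different model sizes

image_path = "../images/yolo_traffic.jpg"
# Keep the BGR array from OpenCV: Ultralytics expects BGR order for NumPy inputs
img_bgr = cv2.imread(image_path)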

model_names = ["yolo11n.pt", "yolo11s.pt"]

for model_name in model_names:
    # TODO: Load the model
    # Hint: Use YOLO(model_name)
    
    # TODO: Measure inference time
    # Hint: Use time.time() before and after model inference
    # Hint: Convert to milliseconds by multiplying by 1000
    
    # TODO: Count detections
    # Hint: len(results[0].boxes) gives the count
    
    # TODO: Print results
    # Hint: Print model name, detection count, and inference time
    
    pass  # Remove this line when you complete the TODO

### Exercise 2: Count Unique Classes

Choose another image and count how many different object classes appear in the predictions.

In [None]:
# Exercise 2: Count unique classes in detections

image_path = "../images/yolo_beach_scene.jpg"

# TODO: Load and run detection
# Hint: Use cv2.imread(), YOLO(), and model(img_bgr); Ultralytics expects BGR order for NumPy inputs

# TODO: Get unique class names
# Hint: Create a set to store unique names
# Hint: Loop over results[0].boxes
# Hint: Use int(box.cls) to get class ID, then r.names[cls_id] for name

# TODO: Print results
# Hint: Print the count and list of unique class names

pass  # Remove this line when you complete the TODO