# Object Detection with YOLOv8: A Practical Application

This notebook demonstrates a complete object detection workflow using YOLOv8, one of the most practical and efficient models for real-world applications.

**What you'll learn:**
- Using pre-trained YOLOv8 for immediate object detection
- Fine-tuning on a custom dataset (pedestrian detection)
- Evaluating and visualizing results

**Why YOLOv8?** It offers fast inference, strong accuracy for its size, a simple Python API, and built-in export to common deployment formats.

In [None]:
# Installation
!pip install ultralytics opencv-python matplotlib pillow

In [None]:
from ultralytics import YOLO
import cv2
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
import urllib.request
import os

print('Environment ready!')

## Part 1: Quick Start - Pre-trained Detection

Let's start with a YOLOv8 model pre-trained on the COCO dataset (80 common object classes).

In [None]:
# Load pre-trained YOLOv8
model = YOLO('yolov8n.pt')  # n = nano (fastest), also available: s, m, l, x

print('YOLOv8 model loaded!')
print(f'Model can detect {len(model.names)} classes')
print(f'Classes: {list(model.names.values())[:10]}...')  # Show first 10

In [None]:
# Download sample images
sample_urls = [
    'http://images.cocodataset.org/val2017/000000039769.jpg',  # Cats
    'http://images.cocodataset.org/val2017/000000397133.jpg',  # Sports
    'http://images.cocodataset.org/val2017/000000037777.jpg',  # Traffic
]

os.makedirs('samples', exist_ok=True)
image_paths = []

for i, url in enumerate(sample_urls):
    try:
        path = f'samples/image_{i}.jpg'
        urllib.request.urlretrieve(url, path)
        image_paths.append(path)
        print(f'Downloaded image {i+1}')
    except Exception as e:  # A bare except would also swallow KeyboardInterrupt
        print(f'Failed to download image {i+1}: {e}')

print(f'\nReady to detect on {len(image_paths)} images!')

In [None]:
# Run detection on all images
results = model(image_paths, conf=0.5)  # conf = confidence threshold

# Visualize results
# squeeze=False keeps axes 2-D even when there is only one image
fig, axes = plt.subplots(len(results), 2, figsize=(16, 6*len(results)), squeeze=False)

for idx, (result, path) in enumerate(zip(results, image_paths)):
    # Original image
    img = cv2.imread(path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    axes[idx, 0].imshow(img)
    axes[idx, 0].set_title('Original', fontsize=14)
    axes[idx, 0].axis('off')
    
    # Detection result
    result_img = result.plot()  # Annotated image as a BGR NumPy array
    result_img = cv2.cvtColor(result_img, cv2.COLOR_BGR2RGB)  # matplotlib expects RGB
    axes[idx, 1].imshow(result_img)
    axes[idx, 1].set_title(f'Detected: {len(result.boxes)} objects', fontsize=14)
    axes[idx, 1].axis('off')
    
    # Print detected objects
    print(f'\nImage {idx+1}:')
    for box in result.boxes:
        cls = int(box.cls[0])
        conf = float(box.conf[0])
        print(f'  {model.names[cls]}: {conf:.3f}')

plt.tight_layout()
plt.show()
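
The `conf=0.5` argument above discards low-confidence detections. To make the effect of that threshold concrete, here is a minimal sketch on hand-written data. The `raw_detections` list and the `filter_by_confidence` helper are hypothetical and are not part of the Ultralytics API; the library applies its threshold internally during post-processing.

```python
from collections import Counter

# Hypothetical detections as (class_name, confidence) pairs; the values are made up
raw_detections = [
    ('cat', 0.92),
    ('cat', 0.88),
    ('remote', 0.47),
    ('couch', 0.61),
    ('remote', 0.30),
]

def filter_by_confidence(detections, threshold=0.5):
    """Keep only detections whose confidence is at least `threshold`."""
    return [(name, conf) for name, conf in detections if conf >= threshold]

kept = filter_by_confidence(raw_detections, threshold=0.5)
counts = Counter(name for name, _ in kept)

print(f'Kept {len(kept)} of {len(raw_detections)} detections')
for name, n in counts.most_common():
    print(f'  {name}: {n}')
```

Raising the threshold trades recall for precision: fewer false positives, but more missed objects.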

## Part 2: Fine-tuning on Custom Dataset

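Now let's fine-tune YOLOv8 for pedestrian detection on the Penn-Fudan dataset. Penn-Fudan ships per-pixel instance masks rather than bounding boxes, so we first convert each mask into YOLO-format boxes.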

In [None]:
# Download Penn-Fudan dataset
!wget -q https://www.cis.upenn.edu/~jshi/ped_html/PennFudanPed.zip
!unzip -q PennFudanPed.zip

print('Dataset downloaded!')

In [None]:
# Prepare dataset in YOLO format
import shutil
from pathlib import Path

# Create directory structure
dataset_root = Path('pedestrian_dataset')
for split in ['train', 'val']:
    (dataset_root / split / 'images').mkdir(parents=True, exist_ok=True)
    (dataset_root / split / 'labels').mkdir(parents=True, exist_ok=True)

# Convert masks to YOLO format bounding boxes
from PIL import Image
import numpy as np

def mask_to_bbox(mask_path):
    """Convert mask to YOLO format: class x_center y_center width height (normalized)"""
    mask = np.array(Image.open(mask_path))
    h, w = mask.shape
    
    bboxes = []
    obj_ids = np.unique(mask)[1:]  # Skip background
    
    for obj_id in obj_ids:
        pos = np.where(mask == obj_id)
        xmin, xmax = np.min(pos[1]), np.max(pos[1])
        ymin, ymax = np.min(pos[0]), np.max(pos[0])
        
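        # Convert to YOLO format (normalized). Pixel indices are inclusive,
        # so the box spans (xmax - xmin + 1) pixels horizontally.
        x_center = ((xmin + xmax + 1) / 2) / w
        y_center = ((ymin + ymax + 1) / 2) / h
        width = (xmax - xmin + 1) / w
        height = (ymax - ymin + 1) / h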
        
        bboxes.append(f'0 {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}')
    
    return bboxes

# Process all images
img_dir = Path('PennFudanPed/PNGImages')
mask_dir = Path('PennFudanPed/PedMasks')
images = sorted(list(img_dir.glob('*.png')))

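# Split train/val (80/20). Shuffle with a fixed seed first: the sorted file list
# groups images by source (FudanPed*, then PennPed*), which would bias the split.
rng = np.random.default_rng(0)
images = [images[i] for i in rng.permutation(len(images))]
split_idx = int(0.8 * len(images))
train_images = images[:split_idx]
val_images = images[split_idx:]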

for split, img_list in [('train', train_images), ('val', val_images)]:
    for img_path in img_list:
        # Copy image
        shutil.copy(img_path, dataset_root / split / 'images' / img_path.name)
        
        # Create label file
        mask_path = mask_dir / img_path.name.replace('.png', '_mask.png')
        bboxes = mask_to_bbox(mask_path)
        
        label_path = dataset_root / split / 'labels' / img_path.name.replace('.png', '.txt')
        with open(label_path, 'w') as f:
            f.write('\n'.join(bboxes))

print('Dataset prepared:')
print(f'  Train: {len(train_images)} images')
print(f'  Val: {len(val_images)} images')
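
Each line of a YOLO label file has the form `class x_center y_center width height`, with all four box values normalized to the range 0 to 1 by the image size. The round trip can be sketched as below. The helper names `pixel_box_to_yolo` and `yolo_to_pixel_box` are hypothetical, and the corner coordinates are treated as continuous edge positions rather than inclusive pixel indices.

```python
def pixel_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel corner coordinates to normalized (x_center, y_center, width, height)."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height

def yolo_to_pixel_box(x_center, y_center, width, height, img_w, img_h):
    """Convert a normalized YOLO box back to pixel corners (xmin, ymin, xmax, ymax)."""
    box_w, box_h = width * img_w, height * img_h
    xmin = x_center * img_w - box_w / 2
    ymin = y_center * img_h - box_h / 2
    return xmin, ymin, xmin + box_w, ymin + box_h

# Example: a 200x400 px box inside a 400x500 px image
yolo_box = pixel_box_to_yolo(100, 50, 300, 450, img_w=400, img_h=500)
pixel_box = yolo_to_pixel_box(*yolo_box, img_w=400, img_h=500)

print('YOLO box: ', ' '.join(f'{v:.6f}' for v in yolo_box))
print('Pixel box:', [round(v) for v in pixel_box])
```

Because the values are normalized, the same label file stays valid when the image is resized for training.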

In [None]:
# Create dataset config file
config = f"""
path: {dataset_root.absolute()}
train: train/images
val: val/images

nc: 1
names: ['pedestrian']
"""

with open('pedestrian.yaml', 'w') as f:
    f.write(config)

print('Config file created!')
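
The config only points at the `images` folders; the label files are found by swapping `images` for `labels` in each path and changing the extension to `.txt`. Images without a label file are treated as background, so a silent mismatch can go unnoticed. A minimal sanity check might look like the sketch below. The helper `find_unlabeled_images` is hypothetical, and the demo runs on a throwaway directory rather than the real dataset.

```python
import tempfile
from pathlib import Path

def find_unlabeled_images(split_dir, image_ext='.png'):
    """Return the names of images in split_dir/images with no matching split_dir/labels/*.txt."""
    split_dir = Path(split_dir)
    missing = []
    for img_path in sorted((split_dir / 'images').glob(f'*{image_ext}')):
        label_path = split_dir / 'labels' / (img_path.stem + '.txt')
        if not label_path.exists():
            missing.append(img_path.name)
    return missing

# Demo on a temporary directory that mimics the train/ layout used above
with tempfile.TemporaryDirectory() as tmp:
    demo_split = Path(tmp) / 'train'
    (demo_split / 'images').mkdir(parents=True)
    (demo_split / 'labels').mkdir(parents=True)
    for name in ['a.png', 'b.png']:
        (demo_split / 'images' / name).touch()
    (demo_split / 'labels' / 'a.txt').write_text('0 0.5 0.5 0.2 0.4')

    missing = find_unlabeled_images(demo_split)

print(f'Images without labels: {missing}')
```

On the real data, calling `find_unlabeled_images(dataset_root / 'train')` should return an empty list.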

In [None]:
# Visualize a sample from training data
sample_img = train_images[0]
sample_label = dataset_root / 'train' / 'labels' / sample_img.name.replace('.png', '.txt')

# Read image
img = cv2.imread(str(dataset_root / 'train' / 'images' / sample_img.name))
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
h, w = img.shape[:2]

# Read and draw boxes
with open(sample_label) as f:
    for line in f:
        cls, x_c, y_c, width, height = map(float, line.strip().split())
        
        # Convert back to pixel coordinates
        x_c, y_c, width, height = x_c * w, y_c * h, width * w, height * h
        x1 = int(x_c - width/2)
        y1 = int(y_c - height/2)
        x2 = int(x_c + width/2)
        y2 = int(y_c + height/2)
        
        cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)

plt.figure(figsize=(10, 8))
plt.imshow(img)
plt.title('Sample Training Image with Annotations')
plt.axis('off')
plt.show()

In [None]:
# Fine-tune YOLOv8
model = YOLO('yolov8n.pt')  # Start from pre-trained weights

# Train
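results = model.train(
    data='pedestrian.yaml',
    epochs=20,
    imgsz=640,
    batch=8,
    name='pedestrian_detector',
    exist_ok=True,  # Reuse the run directory so the paths in Part 3 stay valid on re-runs
    patience=5,  # Early stopping: stop if validation does not improve for 5 epochs
    save=True,
    verbose=True
)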

print('Training complete!')
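
The `patience=5` setting stops training once the validation score has gone 5 epochs without improving. The logic can be sketched in a few lines. This is an illustration only, not Ultralytics' actual implementation; the `early_stopping_epoch` helper and the `history` values are hypothetical.

```python
def early_stopping_epoch(fitness_per_epoch, patience=5):
    """Return the 0-based epoch at which training would stop, or None if it never triggers."""
    best_fitness = float('-inf')
    best_epoch = 0
    for epoch, fitness in enumerate(fitness_per_epoch):
        if fitness > best_fitness:
            # New best score: remember it and reset the waiting period
            best_fitness = fitness
            best_epoch = epoch
        if epoch - best_epoch >= patience:
            return epoch
    return None

# Made-up validation scores that plateau after the third epoch
history = [0.30, 0.45, 0.52, 0.51, 0.50, 0.52, 0.49, 0.51, 0.50, 0.48]
stop_epoch = early_stopping_epoch(history, patience=5)

print(f'Training would stop at epoch index: {stop_epoch}')
```

With only 20 epochs, a patience of 5 mostly guards against wasting time once the small dataset has been fit.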

## Part 3: Evaluation and Results

In [None]:
# Load best model
model = YOLO('runs/detect/pedestrian_detector/weights/best.pt')

# Evaluate on validation set
metrics = model.val(data='pedestrian.yaml')

print('\nValidation Metrics:')
print(f'  mAP50: {metrics.box.map50:.3f}')
print(f'  mAP50-95: {metrics.box.map:.3f}')
print(f'  Precision: {metrics.box.mp:.3f}')
print(f'  Recall: {metrics.box.mr:.3f}')
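
Both mAP numbers are built on intersection over union (IoU). For mAP50, a prediction counts as correct when its IoU with a ground-truth box is at least 0.5; mAP50-95 averages over IoU thresholds from 0.5 to 0.95 in steps of 0.05. A minimal sketch of IoU for two boxes in `(x1, y1, x2, y2)` format follows. The `box_iou` helper and the example boxes are hypothetical.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Overlap width and height are clamped at zero for disjoint boxes
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

# Made-up boxes: the prediction is shifted 50 px to the left of the ground truth
prediction = (50, 50, 150, 150)
ground_truth = (100, 50, 200, 150)
iou = box_iou(prediction, ground_truth)

print(f'IoU = {iou:.3f}, counts as a match at IoU >= 0.5: {iou >= 0.5}')
```

This is why mAP50-95 is always the stricter of the two numbers: it also rewards how tightly the boxes fit.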

In [None]:
# Test on validation images
test_images = sorted((dataset_root / 'val' / 'images').glob('*.png'))[:6]  # sorted for a reproducible selection

fig, axes = plt.subplots(2, 3, figsize=(18, 12))
axes = axes.flat

for ax, img_path in zip(axes, test_images):
    # Run detection
    result = model(str(img_path), conf=0.5)[0]
    
    # Visualize (plot() returns BGR; convert to RGB for matplotlib)
    result_img = cv2.cvtColor(result.plot(), cv2.COLOR_BGR2RGB)
    ax.imshow(result_img)
    ax.set_title(f'Detected: {len(result.boxes)} pedestrians', fontsize=12)
    ax.axis('off')

plt.tight_layout()
plt.show()

In [None]:
# Visualize training curves
from IPython.display import Image as IPImage, display

print('Training Results:')
display(IPImage('runs/detect/pedestrian_detector/results.png'))

## Part 4: Deploy Your Model

The trained model can be wrapped in a small inference function or exported to formats such as ONNX and TorchScript for use outside this notebook.

In [None]:
# Simple inference function
def detect_pedestrians(image_path, confidence=0.5):
    """
    Detect pedestrians in an image.
    
    Args:
        image_path: Path to image file
        confidence: Detection confidence threshold
    
    Returns:
        List of detections with bounding boxes and scores
    """
    results = model(image_path, conf=confidence)[0]
    
    detections = []
    for box in results.boxes:
        x1, y1, x2, y2 = box.xyxy[0].cpu().numpy()
        conf = float(box.conf[0])
        detections.append({
            'bbox': [int(x1), int(y1), int(x2), int(y2)],
            'confidence': conf
        })
    
    return detections

# Test the function
test_path = test_images[0]
detections = detect_pedestrians(test_path)

print(f'Found {len(detections)} pedestrians:')
for i, det in enumerate(detections):
    print(f'  Pedestrian {i+1}: bbox={det["bbox"]}, conf={det["confidence"]:.3f}')
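
Because `detect_pedestrians` casts every value to a plain `int` or `float`, its output can be serialized directly, which is convenient when the detector sits behind a web service. A minimal sketch follows. The `detections_to_json` helper and the example values are hypothetical; the list has the same shape that `detect_pedestrians` returns.

```python
import json

# Hypothetical detections in the same shape that detect_pedestrians() returns
example_detections = [
    {'bbox': [12, 40, 118, 310], 'confidence': 0.91},
    {'bbox': [200, 35, 290, 300], 'confidence': 0.64},
]

def detections_to_json(detections, image_name):
    """Bundle detections for one image into a JSON string."""
    payload = {
        'image': image_name,
        'num_detections': len(detections),
        'detections': detections,
    }
    return json.dumps(payload, indent=2)

json_text = detections_to_json(example_detections, 'example.png')
print(json_text)
```

NumPy scalars such as `np.float32` are not JSON-serializable, which is why the explicit casts in `detect_pedestrians` matter.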

In [None]:
# Export model for deployment
# model.export(format='onnx')  # For production deployment
# model.export(format='torchscript')  # For PyTorch serving

print('Model ready for deployment!')
print('\nModel files:')
print('  PyTorch: runs/detect/pedestrian_detector/weights/best.pt')
print('\nTo export for production:')
print('  model.export(format="onnx")  # Cross-platform')
print('  model.export(format="torchscript")  # PyTorch serving')

## Summary

**What we accomplished:**
1. Used pre-trained YOLOv8 for immediate object detection
2. Fine-tuned on custom pedestrian dataset
3. Evaluated the fine-tuned model (mAP, precision, recall) after a short 20-epoch training run
4. Created deployment-ready model

**Key Takeaways:**
- YOLOv8 is practical: loading, training, validation, and export are each a single API call
- Transfer learning helps: starting from COCO weights lets a small dataset (170 images here) go a long way
- Speed is tunable: the nano model favors fast inference, while larger variants trade speed for accuracy
- Easy deployment: export to ONNX, TorchScript, TFLite, etc.

**Next Steps:**
- Try different YOLOv8 variants (s, m, l, x) for accuracy/speed tradeoffs
- Add data augmentation for better generalization
- Fine-tune on your own custom dataset
- Deploy to edge devices or cloud services
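
As a starting point for the data augmentation step, remember that geometric augmentations must transform the labels together with the image. Ultralytics already applies some augmentation by default during training, so the sketch below only illustrates the idea for a horizontal flip. The `hflip_with_labels` helper and the toy image are hypothetical.

```python
import numpy as np

def hflip_with_labels(image, labels):
    """Flip an HxWxC image left-right and mirror YOLO labels (cls, x_c, y_c, w, h) to match."""
    flipped_image = image[:, ::-1].copy()
    # Only x_center changes; y_center, width, and height are unaffected by a horizontal flip
    flipped_labels = [(cls, 1.0 - x_c, y_c, w, h) for cls, x_c, y_c, w, h in labels]
    return flipped_image, flipped_labels

# Toy 4x6 image with a white leftmost column and one box centered at x = 0.25
image = np.zeros((4, 6, 3), dtype=np.uint8)
image[:, 0] = 255
labels = [(0, 0.25, 0.5, 0.2, 0.6)]

flipped_image, flipped_labels = hflip_with_labels(image, labels)

print('Flipped label:', flipped_labels[0])
```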