# EE267 Spring 2025 Term Project: Swimmer Segmentation and Pose Estimation

## Objective
Artificial Intelligence (AI) has transformed computer vision, enabling machines to interpret visual data
much like humans. Two critical techniques in this field are image segmentation and human pose
estimation. Image segmentation isolates objects from their backgrounds, essential for tasks like sports
analysis and medical imaging. Human pose estimation, on the other hand, detects body joints and
movement, providing valuable insights for fields such as biomechanics and augmented reality.

In this assignment, we will do the following:
1. Segment the swimmer from the background using Meta's Segment Anything Model (SAM).
2. Detect and label the swimmer's joints using a human pose estimation model of your choice
(either YOLO or Meta Sapiens).

## Experiment

### Task 0: Data Pre-processing

Complete Guide to Image Preprocessing Techniques in Python: https://medium.com/@maahip1304/the-complete-guide-to-image-preprocessing-techniques-in-python-dca30804550c

In [25]:
# Import libraries
import os
import numpy as np
from glob import glob
import cv2

# Data types
from typing import List
from numpy import ndarray

In [22]:
# Image dataset paths
INPUT_FOLDER = 'data/input/'
OUTPUT_FOLDER = 'data/output/'

os.makedirs(OUTPUT_FOLDER, exist_ok=True)

In [23]:
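# Collect paths of all JPG and PNG images in the input image folder
image_paths = sorted(glob(os.path.join(INPUT_FOLDER, '*.jpg')) +
                     glob(os.path.join(INPUT_FOLDER, '*.png')))

# Load each image as a BGR ndarray, since preprocess() expects arrays, not paths
images = [cv2.imread(path) for path in image_paths]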

#### Image Pre-processing

In [28]:
def preprocess(images: List[ndarray]) -> List[ndarray]:
    """
    Image pre-processing for image segmentation & pose estimation:
    1. Resizing
    2. Image denoising
    3. Image enhancement
    4. Normalization

    Parameters:
        images (List[ndarray]): images to pre-process

    Returns:
        List[ndarray]: pre-processed images
    """
    # Image resizing

    # Grayscaling (probably not, because we want to keep color)

    # Image denoising: smoothing, blurring, filtering, sharpening
    # We probably don't want to smooth or blur for our use case. As noted in
    # lecture, the blue water and bubbles make it hard to segment swimmers from
    # the rest of the image. If anything, we want to sharpen the image and
    # detect edges.

    ## Edge-preserving smoothing filter: bilateral filter?

    ## Image sharpening: Laplacian filter, Unsharp masking

    ## Edge detection: Canny operator

    # Image enhancement
    ## Contrast enhancement: histogram equalization

    # Normalization (pixel intensity)

    # Batching
    
    return images

In [None]:
# Process images
processed_images = preprocess(images)

#### Image Augmentation (if needed)

### Task 1: Swimmer Segmentation with SAM
Description:

Use Meta’s SAM model to isolate the swimmer from each frame provided.

Instructions:
1. Load the provided dataset of swimmer images.
2. Apply SAM to each image to segment the swimmer, separating them from the background.
3. Save the segmented images for the next task.

Meta SAM Github: https://github.com/facebookresearch/segment-anything

In [16]:
# Import libraries
import torch
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

In [17]:
# Check if GPU is available for PyTorch
device = "cuda" if torch.cuda.is_available() else "cpu"

In [19]:
# Meta Segment Anything Model (SAM) variables
MODEL_PATH = 'models/sam_vit_b_01ec64.pth'
SAM_TYPE = 'vit_b'

# Load Meta SAM variant
sam_model = sam_model_registry[SAM_TYPE](checkpoint=MODEL_PATH)
sam_model.to(device=device)

# Create an automatic mask generator using SAM
mask_generator = SamAutomaticMaskGenerator(sam_model)

In [29]:
def segment(image: ndarray) -> ndarray:
    """
    Segment swimmer in image

    Parameters:
        image (ndarray): image to segment

    Returns:
        ndarray: segmented image
    """
    return
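
One way to fill in `segment` is to run `mask_generator.generate` on an RGB copy of the frame and then choose one of the returned masks. `generate` returns a list of dicts, each with a boolean `segmentation` array and an `area`, among other fields. The sketch below covers only the selection and cut-out step, so it runs without a checkpoint. The helper names `select_mask` and `apply_mask`, and the heuristic of taking the largest mask that covers less than half the frame (on the assumption that the very largest region is usually water), are illustrative assumptions, not part of the assignment or of SAM.

```python
from typing import Dict, List, Optional

import numpy as np


def select_mask(masks: List[Dict], max_fraction: float = 0.5) -> Optional[np.ndarray]:
    """Pick the largest mask covering less than max_fraction of the frame.

    Assumed heuristic: the very largest region tends to be water, so it is skipped.
    """
    candidates = [
        m for m in masks
        if m["area"] < max_fraction * m["segmentation"].size
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m["area"])["segmentation"]


def apply_mask(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Keep pixels inside the mask and set everything else to black."""
    return np.where(mask[..., None], image, 0).astype(image.dtype)
```

Inside `segment`, this might be used as `masks = mask_generator.generate(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))` followed by `apply_mask(image, select_mask(masks))`, with a fallback for the `None` case. If the automatic masks prove unreliable, a prompt-based `SamPredictor` with a point or box on the swimmer is an alternative worth trying.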

In [30]:
def segment_dataset(images: List[ndarray]) -> List[ndarray]:
    """
    Segment the swimmer in every image of the dataset

    Parameters:
        images (List[ndarray]): image dataset to segment

    Returns:
        List[ndarray]: segmented image dataset
    """
    return [segment(image) for image in images]

### Task 2: Pose Estimation on the Segmented Swimmer

Description:

With the segmented swimmer images, utilize a human pose estimation model to
detect the swimmer’s joints.

Model Options:
1. YOLO (configured for pose estimation).
2. Meta Sapiens

Instructions:
1. Choose either YOLO or Meta Sapiens for pose estimation.
2. Run the selected model on the segmented images to identify key joint positions.
3. Save images with overlaid joint positions and export joint coordinates for each frame.


In [4]:
# Initialize Pose Estimation Model

In [5]:
# Identify key joint positions

In [6]:
# Overlay images with joint positions

In [7]:
# Save overlaid images