# Pose Estimation

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/sensein/senselab/blob/main/tutorials/video/pose_estimation.ipynb)

This tutorial demonstrates how to use Senselab's Pose Estimation API for estimating human poses in images. Senselab supports multiple pose estimation backends, such as MediaPipe and YOLO.

## Setup

Let's get started by installing Senselab with its video extras and importing the pose estimation and visualization functions.

In [None]:
%pip install 'senselab[video]'

In [1]:
from senselab.video.tasks.pose_estimation import estimate_pose, visualize_pose

In [None]:
!mkdir -p tutorial_images
!wget -O tutorial_images/no_people.jpeg https://raw.githubusercontent.com/sensein/senselab/main/src/tests/data_for_testing/pose_data/no_people.jpeg
!wget -O tutorial_images/single_person.jpg https://raw.githubusercontent.com/sensein/senselab/main/src/tests/data_for_testing/pose_data/single_person.jpg
!wget -O tutorial_images/three_people.jpg https://raw.githubusercontent.com/sensein/senselab/main/src/tests/data_for_testing/pose_data/three_people.jpg

## MediaPipe Pose Estimation

### Perform Pose Estimation
Now, let's perform pose estimation on the single-person example image using MediaPipe. This tutorial uses the "full" model type.


In [None]:
image_path = "tutorial_images/single_person.jpg"
result = estimate_pose(image_path, model="mediapipe", model_type="full")

# Check the number of individuals detected
print(f"Number of individuals detected: {len(result.individuals)}")

For each individual, MediaPipe returns 33 three-dimensional keypoints, available in both normalized and world coordinates, each with a visibility score between 0 and 1:

In [None]:
# Print detailed information about each detected individual
for i, individual in enumerate(result.individuals):
    print(f"Individual {i+1}:")
    for landmark_name, landmark in individual.normalized_landmarks.items():
        # Use individual.world_landmarks.items() instead to get world coordinates
        print(f"  {landmark_name}: (x={round(landmark.x, 2)}, "
              f"y={round(landmark.y, 2)}, z={round(landmark.z, 2)}, "
              f"visibility={round(landmark.visibility, 2)})")
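
The normalized coordinates printed above are often converted to pixel positions before drawing or measuring. The following is a minimal sketch of that step, assuming the usual MediaPipe convention that normalized `x` and `y` are fractions of the image width and height. The `Landmark` dataclass, the `visible_pixel_coords` helper, and the 0.5 visibility threshold are hypothetical stand-ins for illustration, not part of Senselab's API.

```python
from dataclasses import dataclass


@dataclass
class Landmark:
    """Hypothetical stand-in exposing the same fields printed in the cell above."""
    x: float
    y: float
    z: float
    visibility: float


def visible_pixel_coords(landmarks, width, height, min_visibility=0.5):
    """Map normalized (x, y) to pixel coordinates, skipping low-visibility keypoints.

    Assumes x and y are fractions of the image width and height, respectively.
    """
    pixels = {}
    for name, lm in landmarks.items():
        if lm.visibility < min_visibility:
            continue  # drop keypoints the model considers unlikely to be visible
        pixels[name] = (round(lm.x * width), round(lm.y * height))
    return pixels


# Made-up sample values for illustration only, not real model output.
sample = {
    "nose": Landmark(x=0.5, y=0.25, z=-0.1, visibility=0.99),
    "left_ankle": Landmark(x=0.4, y=0.9, z=0.2, visibility=0.12),
}
print(visible_pixel_coords(sample, width=640, height=480))
```

With real results, the dictionary passed in would be `individual.normalized_landmarks`, and the width and height would come from the image itself. The threshold is a judgment call and may need tuning per dataset.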

### Visualize Results
To visualize the estimated poses, use Senselab's built-in visualization utilities.

In [None]:
visualize_pose(result, output_path="visualize/mediapipe.jpg", plot=True)

## YOLO Pose Estimation

### Perform Pose Estimation
Run the YOLO model on the same example image.

In [None]:
result = estimate_pose(image_path, model="yolo", model_type="11n")

# Check the number of individuals detected
print(f"Number of individuals detected: {len(result.individuals)}")

For each individual, YOLO returns 17 two-dimensional keypoints, each with a confidence score between 0 and 1:

In [None]:
# Print detailed information about each detected individual
for i, individual in enumerate(result.individuals):
    print(f"Individual {i+1}:")
    for landmark_name, landmark in individual.normalized_landmarks.items():
        print(f"  {landmark_name}: (x={round(landmark.x, 2)}, "
              f"y={round(landmark.y, 2)}, "
              f"confidence={round(landmark.confidence, 2)})")
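
A common next step with 2D keypoints like these is measuring a joint angle, such as the elbow angle between shoulder, elbow, and wrist. The sketch below shows one way to do that with plain `(x, y)` tuples. The `joint_angle` helper and the sample points are hypothetical and not part of Senselab. Note that normalized coordinates scale x and y differently on non-square images, so the points are assumed to be in pixel coordinates.

```python
import math


def joint_angle(a, b, c):
    """Return the angle at point b, in degrees, formed by the segments b-a and b-c.

    Each point is an (x, y) pair, assumed to be in pixel coordinates.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("a or c coincides with b, so the angle is undefined")
    cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


# Made-up pixel positions for a shoulder, elbow, and wrist.
shoulder, elbow, wrist = (100, 100), (100, 200), (200, 200)
print(f"Elbow angle: {joint_angle(shoulder, elbow, wrist):.1f} degrees")
```

Low-confidence keypoints can make such angles unreliable, so it may be worth checking each keypoint's confidence before using it.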

### Visualize Results
Plot the YOLO-estimated poses on the image.

In [None]:
visualize_pose(result, output_path="visualize/yolo.jpg", plot=True)

## Extended Cases

### Estimating Poses in Multiple-Person Images

In [None]:
multi_person_image = "tutorial_images/three_people.jpg"
result = estimate_pose(multi_person_image, model="yolo", model_type="11n")
visualize_pose(result, output_path="visualize/multi-person-yolo.jpg")

To cap the number of individuals detected, pass the `num_individuals` parameter (supported by the MediaPipe backend only):

In [None]:
# Detect at most two individuals
result = estimate_pose(multi_person_image, model="mediapipe", model_type="full", num_individuals=2)
visualize_pose(result, output_path="visualize/multi-person-mp.jpg")
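
Since `num_individuals` applies to MediaPipe only, a similar cap for other backends would have to be applied after the fact. The sketch below illustrates one possible approach: rank individuals by mean keypoint confidence and keep the top `k`. The `top_k_by_confidence` helper and the plain-dict representation of each individual are assumptions made for illustration, not Senselab functionality.

```python
from statistics import fmean


def top_k_by_confidence(individuals, k):
    """Keep the k individuals with the highest mean keypoint confidence.

    Each individual is represented here as a non-empty dict mapping a
    keypoint name to its confidence score (a hypothetical simplification).
    """
    ranked = sorted(individuals, key=lambda kp: fmean(kp.values()), reverse=True)
    return ranked[:k]


# Made-up confidence scores for three detected people.
people = [
    {"nose": 0.9, "left_wrist": 0.8},
    {"nose": 0.3, "left_wrist": 0.2},
    {"nose": 0.7, "left_wrist": 0.6},
]
kept = top_k_by_confidence(people, k=2)
print(len(kept))
```

With real YOLO results, the confidences would come from each landmark's `confidence` attribute, as printed earlier in this tutorial.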

### Handling No Person Detected
If no person is detected in the image, the output will have zero individuals.

In [None]:
no_person_image = "tutorial_images/no_people.jpeg"
result = estimate_pose(no_person_image, model="mediapipe", model_type="full")

if len(result.individuals) == 0:
    print("No individuals detected in the image.")
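
The same check extends naturally to a batch of images. The sketch below counts detected individuals per image and lists the images where nobody was found. The `count_individuals` helper is hypothetical, and `fake_estimate` is a stub that stands in for the real estimator so the example runs without models or image files.

```python
from types import SimpleNamespace


def count_individuals(image_paths, estimate):
    """Return {path: number of detected individuals} for each image path.

    `estimate` is any callable that takes a path and returns an object
    with an `individuals` attribute supporting len().
    """
    return {path: len(estimate(path).individuals) for path in image_paths}


def fake_estimate(path):
    # Hypothetical stub: returns a fixed number of placeholder individuals.
    people = {"no_people.jpeg": 0, "single_person.jpg": 1, "three_people.jpg": 3}
    return SimpleNamespace(individuals=[object()] * people[path])


counts = count_individuals(["no_people.jpeg", "three_people.jpg"], fake_estimate)
empty = [path for path, n in counts.items() if n == 0]
print(counts)
print(empty)
```

In this notebook, the stub could be replaced with `lambda path: estimate_pose(path, model="mediapipe", model_type="full")`, using full paths under `tutorial_images/`.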