# Pose Estimation with MediaPipe

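This notebook demonstrates pose detection and exercise counting on video using MediaPipe's pose estimation model. We'll build a system that tracks body landmarks frame by frame and counts repetitions based on joint angles. The worked example uses a glute bridge clip (`glutebridge.mp4`) with the hip and shoulder landmarks; the same approach applies to curls and other exercises with different joints and thresholds.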

## Setup & Dependencies

In [1]:
! pip install opencv-python mediapipe

Collecting numpy>=1.21.2 (from opencv-python)
  Using cached numpy-1.26.4-cp310-cp310-win_amd64.whl.metadata (61 kB)
Using cached numpy-1.26.4-cp310-cp310-win_amd64.whl (15.8 MB)
Installing collected packages: numpy
  Attempting uninstall: numpy
    Found existing installation: numpy 2.2.6
    Uninstalling numpy-2.2.6:
      Successfully uninstalled numpy-2.2.6
Successfully installed numpy-1.26.4


ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
inference-sdk 0.62.4 requires numpy<2.4.0,>=2.0.0, but you have numpy 1.26.4 which is incompatible.

[notice] A new release of pip is available: 25.1.1 -> 25.3
[notice] To update, run: python.exe -m pip install --upgrade pip


In [4]:
import cv2
import mediapipe as mp
import numpy as np
mp_drawing = mp.solutions.drawing_utils
mp_pose = mp.solutions.pose

## Exploration: Basic Video Feed

In [3]:
ord('A')

65

In [21]:
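# Path to your video file (a forward slash avoids the invalid "\g" escape sequence)
video_path = "./glutebridge.mp4"

# Create video capture object
cap = cv2.VideoCapture(video_path)

# Check if opened successfully; raise instead of calling exit(),
# which would shut down the notebook kernel
if not cap.isOpened():
    raise RuntimeError("Error: Could not open video.")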

while True:
    ret, frame = cap.read()  # Read a frame
    if not ret:
        break  # End of video
    
    cv2.imshow("Video", frame)  # Show the frame

    # Press 'q' to exit early
    if cv2.waitKey(50) & 0xFF == ord('q'):
        break

cap.release()       # Release resources
cv2.destroyAllWindows()
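
The fixed `cv2.waitKey(50)` delay caps playback at roughly 20 frames per second, regardless of the clip's own frame rate. Below is a minimal sketch of deriving the delay from the reported frame rate instead. The helper name `frame_delay_ms` and its `fallback_ms` default are hypothetical, introduced here for illustration only.

```python
def frame_delay_ms(fps, fallback_ms=33):
    """Return a waitKey delay in milliseconds for the given frame rate.

    Some containers report an FPS of 0 (or nothing useful), so a
    fallback delay is used in that case. Both names are illustrative.
    """
    if not fps or fps <= 0:
        return fallback_ms
    # waitKey(0) blocks until a key is pressed, so never return less than 1
    return max(1, int(round(1000.0 / fps)))


print(frame_delay_ms(25))  # 25 fps -> 40 ms between frames
print(frame_delay_ms(0))   # unknown fps -> fallback of 33 ms
```

In the playback loop this could be used as `cv2.waitKey(frame_delay_ms(cap.get(cv2.CAP_PROP_FPS)))`. Decoding and display time are ignored, so playback may still run slightly slower than the source.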
    

## Pose Detection Pipeline

The core workflow for this project:
1. Perform Detection - Use MediaPipe to detect pose landmarks
2. Identify Joints - Extract specific joint coordinates (e.g., hip and shoulder, or shoulder, elbow, and wrist for curls)
3. Compute Angles - Calculate angles from the joint coordinates
4. Count Repetitions - Track movement stages to count exercise repetitions

In [7]:
# cap = cv2.VideoCapture(0)
# Create video capture object
cap = cv2.VideoCapture(video_path)

with mp_pose.Pose(min_detection_confidence=0.5, min_tracking_confidence=0.5) as pose:
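    while cap.isOpened():
        res, frame = cap.read()
        if not res:
            break  # End of video (frame is None, so cvtColor would fail)

        # Convert BGR to RGB, since MediaPipe expects RGB input
        image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        image.flags.writeable = False

        # Make detection
        results = pose.process(image)

        # Convert RGB back to BGR for OpenCV rendering
        image.flags.writeable = True
        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

        # pose_landmarks is None when no pose is detected in the frame;
        # keep the most recent landmarks for the analysis cells below
        if results.pose_landmarks:
            landmarks = results.pose_landmarks.landmark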

        # Render detections
        mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_pose.POSE_CONNECTIONS,
                                   mp_drawing.DrawingSpec(color=(245,117,66), thickness=2, circle_radius=2), 
                                   mp_drawing.DrawingSpec(color=(245,66,230), thickness=2, circle_radius=2) 
                                      )   

        cv2.imshow("MediaPipe Feed", image)


        if cv2.waitKey(10) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()

## Landmark Analysis

MediaPipe detects 33 pose landmarks across the body. Let's explore the available joint names and their coordinates.

In [10]:
len(landmarks)

33

In [11]:
for landm in mp_pose.PoseLandmark:
    print(landm)

PoseLandmark.NOSE
PoseLandmark.LEFT_EYE_INNER
PoseLandmark.LEFT_EYE
PoseLandmark.LEFT_EYE_OUTER
PoseLandmark.RIGHT_EYE_INNER
PoseLandmark.RIGHT_EYE
PoseLandmark.RIGHT_EYE_OUTER
PoseLandmark.LEFT_EAR
PoseLandmark.RIGHT_EAR
PoseLandmark.MOUTH_LEFT
PoseLandmark.MOUTH_RIGHT
PoseLandmark.LEFT_SHOULDER
PoseLandmark.RIGHT_SHOULDER
PoseLandmark.LEFT_ELBOW
PoseLandmark.RIGHT_ELBOW
PoseLandmark.LEFT_WRIST
PoseLandmark.RIGHT_WRIST
PoseLandmark.LEFT_PINKY
PoseLandmark.RIGHT_PINKY
PoseLandmark.LEFT_INDEX
PoseLandmark.RIGHT_INDEX
PoseLandmark.LEFT_THUMB
PoseLandmark.RIGHT_THUMB
PoseLandmark.LEFT_HIP
PoseLandmark.RIGHT_HIP
PoseLandmark.LEFT_KNEE
PoseLandmark.RIGHT_KNEE
PoseLandmark.LEFT_ANKLE
PoseLandmark.RIGHT_ANKLE
PoseLandmark.LEFT_HEEL
PoseLandmark.RIGHT_HEEL
PoseLandmark.LEFT_FOOT_INDEX
PoseLandmark.RIGHT_FOOT_INDEX


### Accessing Landmark Coordinates

Each landmark contains normalized coordinates (x, y, z) and visibility confidence.

In [9]:
L_HIP = landmarks[mp_pose.PoseLandmark.LEFT_HIP.value]
R_HIP = landmarks[mp_pose.PoseLandmark.RIGHT_HIP.value]

print(L_HIP , R_HIP)

x: 0.534925938
y: 0.375804931
z: 0.116260409
visibility: 0.999196947
 x: 0.553550184
y: 0.432411
z: -0.115669429
visibility: 0.999880075



In [10]:
L_HIP.x

0.5349259376525879

In [14]:
HIP = landmarks[mp_pose.PoseLandmark.RIGHT_HIP.value]
print(HIP)

x: 0.553550184
y: 0.432411
z: -0.115669429
visibility: 0.999880075



In [12]:

shoulder = landmarks[mp_pose.PoseLandmark.RIGHT_SHOULDER.value]
print(shoulder)

x: 0.250368297
y: 0.730311513
z: -0.181980282
visibility: 0.999988079
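
Because `x` and `y` are normalized to the frame size, drawing at a landmark requires scaling by the frame's width and height. Below is a minimal sketch of that conversion. The `Point` tuple is only a stand-in for a MediaPipe landmark so the example runs without a video. The helper name `to_pixel` and the `min_visibility` cutoff of 0.5 are assumptions, not part of MediaPipe. The clamping step assumes that estimated landmarks may fall slightly outside the frame.

```python
from collections import namedtuple

# Stand-in for a MediaPipe landmark (illustrative only)
Point = namedtuple("Point", ["x", "y", "visibility"])


def to_pixel(landmark, width, height, min_visibility=0.5):
    """Convert a normalized landmark to integer pixel coordinates.

    Returns None when the landmark's visibility is below the cutoff.
    Coordinates are clamped to the frame bounds.
    """
    if landmark.visibility < min_visibility:
        return None
    px = min(max(int(landmark.x * width), 0), width - 1)
    py = min(max(int(landmark.y * height), 0), height - 1)
    return px, py


# Values taken from the right-shoulder output above, on a 640x480 frame
shoulder_pt = Point(x=0.250368297, y=0.730311513, visibility=0.999988079)
print(to_pixel(shoulder_pt, 640, 480))  # (160, 350)
```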



## Angle Calculation

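Calculate the angle of the line joining two landmarks, measured against the horizontal image axis, using the inverse tangent (`arctan2`). Applied to the hip and shoulder, this gives the inclination of the torso in the range 0° to 180°. Because the coordinates are normalized separately by width and height, the angle is only approximate for non-square frames.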

In [22]:
def calculate_angle(a, b):
    """Angle in degrees, in [0, 180], of the line from b to a relative to the image x-axis."""
    radians = np.arctan2(a.y-b.y, a.x-b.x)
    angle = np.abs(radians*180/np.pi)
    return round(angle, 2)

In [17]:
calculate_angle(HIP, shoulder)

44.5
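
For exercises where the bend at a joint matters, such as the elbow angle between shoulder and wrist in a curl, a three-point variant is needed. The following is a minimal sketch, not the function used in this notebook. The name `angle_at_vertex` is hypothetical, and points are passed as plain `(x, y)` pairs rather than landmark objects so the example runs on its own.

```python
import math


def angle_at_vertex(a, b, c):
    """Angle in degrees at point b between the rays b->a and b->c.

    Each point is an (x, y) pair. The result lies in [0, 180].
    """
    ray_a = math.atan2(a[1] - b[1], a[0] - b[0])
    ray_c = math.atan2(c[1] - b[1], c[0] - b[0])
    angle = abs(math.degrees(ray_c - ray_a))
    # Fold reflex angles back into [0, 180]
    if angle > 180.0:
        angle = 360.0 - angle
    return round(angle, 2)


print(angle_at_vertex((1, 0), (0, 0), (0, 1)))   # right angle: 90.0
print(angle_at_vertex((1, 0), (0, 0), (-1, 0)))  # straight line: 180.0
```

With MediaPipe landmarks, the call would take pairs such as `(elbow.x, elbow.y)` for the vertex.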

## Real-time Angle Display

Overlay the calculated angle on the video feed in real-time.

In [20]:
tuple(np.multiply((HIP.x,HIP.y), [640, 480]).astype(int))

(354, 208)

In [19]:


cap = cv2.VideoCapture(video_path)

with mp_pose.Pose(min_detection_confidence=0.5, min_tracking_confidence=0.5) as pose:
    while cap.isOpened():
        res, frame = cap.read()
        if not res:
            break  # End of video (frame is None, so cvtColor would fail)

        # Convert BGR to RGB, since MediaPipe expects RGB input
        image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        image.flags.writeable = False

        # Make detection
        results = pose.process(image)

        # Convert RGB back to BGR for OpenCV rendering
        image.flags.writeable = True
        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

        # pose_landmarks is None when no pose is detected in the frame
        if results.pose_landmarks:
            landmarks = results.pose_landmarks.landmark
            HIP = landmarks[mp_pose.PoseLandmark.RIGHT_HIP.value]
            shoulder = landmarks[mp_pose.PoseLandmark.RIGHT_SHOULDER.value]

            angle = calculate_angle(HIP, shoulder)

            # Scale normalized coordinates by the actual frame size
            # rather than a fixed 640x480
            h, w, _ = image.shape
            cv2.putText(image, 
                        str(angle),
                        (int(HIP.x * w), int(HIP.y * h)),
                        cv2.FONT_HERSHEY_SIMPLEX, 
                        0.5, 
                        (255, 255, 255), 
                        2, 
                        cv2.LINE_AA
                        )

        

        # Render detections
        mp_drawing.draw_landmarks(image, results.pose_landmarks, mp_pose.POSE_CONNECTIONS,
                                   mp_drawing.DrawingSpec(color=(245,117,66), thickness=2, circle_radius=2), 
                                   mp_drawing.DrawingSpec(color=(245,66,230), thickness=2, circle_radius=2) 
                                      )   

        cv2.imshow("MediaPipe Feed", image)


        if cv2.waitKey(10) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()

## Exercise Counter

Complete system that counts repetitions in the video and writes the annotated frames to `output_pose.avi`. The system tracks:
- **Stage**: "down" (hip-to-shoulder angle small) or "up" (hip-to-shoulder angle large)
- **Counter**: Number of completed repetitions

Logic: A repetition is counted when the hip-to-shoulder angle first drops below 7° (stage "down") and then rises above 40° (stage "up").

In [None]:
import cv2
import mediapipe as mp
import numpy as np

video_path = "./glutebridge.mp4"
mp_drawing = mp.solutions.drawing_utils
mp_pose = mp.solutions.pose


def calculate_angle(a, b):
    radians = np.arctan2(a.y - b.y, a.x - b.x)
    angle = np.abs(radians * 180 / np.pi)
    return round(angle, 2)

cap = cv2.VideoCapture(video_path)

output_path = "output_pose.avi"
# Get video properties
width  = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps    = cap.get(cv2.CAP_PROP_FPS)
# Video writer (XVID codec in an AVI container)
fourcc = cv2.VideoWriter_fourcc(*"XVID")
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))


counter = 0
stage = None

with mp_pose.Pose(min_detection_confidence=0.5,
                  min_tracking_confidence=0.5) as pose:

    while cap.isOpened():
        res, frame = cap.read()
        if not res:
            break  # End of video: frame is None, so stop before cvtColor

        image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        image.flags.writeable = False
        results = pose.process(image)

        image.flags.writeable = True
        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

        h, w, _ = image.shape

        # pose_landmarks is None when no pose is detected in the frame
        if results.pose_landmarks:
            landmarks = results.pose_landmarks.landmark
            hip = landmarks[mp_pose.PoseLandmark.RIGHT_HIP.value]
            shoulder = landmarks[mp_pose.PoseLandmark.RIGHT_SHOULDER.value]

            angle = calculate_angle(hip, shoulder)
            # Draw the angle near the hip: convert normalized coordinates
            # to pixels, then nudge the text 20 px away from the joint
            cv2.putText(
                image,
                str(angle),
                (int(hip.x * w) + 20, int(hip.y * h) + 20),
                cv2.FONT_HERSHEY_SIMPLEX,
                0.5,
                (255, 255, 255),
                2,
                cv2.LINE_AA
            )

            # Stage logic: "down" below 7 degrees, one rep on rising above 40
            if angle < 7:
                stage = "down"
            if angle > 40 and stage == "down":
                stage = "up"
                counter += 1

        # UI overlays
        cv2.rectangle(image, (0, 0), (200, 80), (255, 100, 0), -1)
        cv2.putText(image, "REPS:", (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 0), 2)
        cv2.putText(image, str(counter), (100, 65),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 255), 2)

        cv2.rectangle(image, (w - 200, 0), (w, 80), (255, 100, 0), -1)
        cv2.putText(image, "Stage:", (w - 190, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 0), 2)
        cv2.putText(image, str(stage), (w - 90, 65),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 255), 2)

        mp_drawing.draw_landmarks(
            image,
            results.pose_landmarks,
            mp_pose.POSE_CONNECTIONS
        )

        # Write and show every frame, including frames without a detected pose
        out.write(image)
        cv2.imshow("MediaPipe Feed", image)

        if cv2.waitKey(10) & 0xFF == ord('q'):
            break

 


cap.release()
out.release()  # Finalize the output video file
cv2.destroyAllWindows()
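
The counting logic in the loop can also be isolated from the video code, which makes the thresholds easier to test and tune. Below is a minimal sketch using the same 7° and 40° thresholds as the cell above. The class name `RepCounter` and its parameter names are hypothetical and not part of MediaPipe or OpenCV.

```python
class RepCounter:
    """Two-threshold state machine for counting repetitions.

    The gap between the two thresholds acts as hysteresis, so small
    jitter in the angle around a single value does not add reps.
    """

    def __init__(self, down_below=7.0, up_above=40.0):
        self.down_below = down_below
        self.up_above = up_above
        self.stage = None  # None until the first "down" is seen
        self.count = 0

    def update(self, angle):
        """Feed one per-frame angle in degrees; return the current count."""
        if angle < self.down_below:
            self.stage = "down"
        elif angle > self.up_above and self.stage == "down":
            self.stage = "up"
            self.count += 1
        return self.count


# Synthetic per-frame angles: two full down -> up cycles
reps = RepCounter()
for angle in [5, 20, 45, 50, 20, 3, 41]:
    reps.update(angle)
print(reps.count, reps.stage)  # 2 up
```

Frames that arrive before the first "down" position are ignored, matching the `stage = None` initialization in the cell above.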
