In [1]:
import cv2 as cv
import mediapipe as mp

2024-02-23 13:55:22.418061: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


## Import Libraries

This section imports the necessary libraries for the pose estimation task.

- `cv2`: The OpenCV library, imported here under the alias `cv`, which provides tools for image and video processing.
- `mediapipe`: A library developed by Google, imported as `mp`, that ships pre-trained models for tasks such as pose estimation.


In [2]:
mp_pose = mp.solutions.pose
pose = mp_pose.Pose()


## Initialize MediaPipe Pose

Here, we initialize the pose estimation model provided by MediaPipe.

- `mp_pose`: A shorthand for the `mp.solutions.pose` module, which contains the pose model and constants such as `POSE_CONNECTIONS`.
- `pose`: An instance of the pose model created with its default settings, ready to process images and video frames.
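Calling `mp_pose.Pose()` with no arguments relies on the library defaults. As a minimal sketch, the most commonly adjusted settings can be written out explicitly; the dictionary name `pose_options` is an assumption made for illustration, and the values are the defaults of the `Pose` solution as far as its documentation describes them.

```python
# Keyword arguments accepted by mp.solutions.pose.Pose.
# "pose_options" is an illustrative name, not part of the notebook.
pose_options = {
    # False: treat the input as a video stream and track landmarks between frames
    "static_image_mode": False,
    # 0, 1 or 2: higher values trade speed for accuracy
    "model_complexity": 1,
    # Minimum confidence for the initial person detection to count as successful
    "min_detection_confidence": 0.5,
    # Minimum confidence for landmarks to keep being tracked across frames
    "min_tracking_confidence": 0.5,
}

# In the notebook this would replace the bare constructor call:
# pose = mp_pose.Pose(**pose_options)
```

Raising the two confidence thresholds should yield fewer but more reliable detections, which may help with noisy footage.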


In [3]:
video_path = '/Users/jolie/CODING/MOVEMENT TRACING/IMG_0521.mov'
cap = cv.VideoCapture(video_path)


INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
OpenCV: Couldn't read video stream from file "/Users/jolie/CODING/MOVEMENT TRACING/IMG_0521.mov"


## Initialize Video Capture

In this section, we set up the video capture using OpenCV.

- `video_path`: The path to the video file that will be processed.
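- `cap`: The video capture object used to read frames from the video at `video_path`.

Note that `cv.VideoCapture` does not raise an error when a file cannot be opened. The output above shows OpenCV failing to read this path, in which case every later `cap.read()` call returns no frame.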
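Because a bad path fails silently, it can help to validate it before handing it to OpenCV. The following is a small sketch using only the standard library; the helper name `require_video_file` is hypothetical and not part of the notebook.

```python
from pathlib import Path


def require_video_file(path):
    """Return the path as a Path object, or raise if no such file exists.

    Hypothetical helper for illustration; it only checks that the file is
    present, not that OpenCV can decode it.
    """
    video = Path(path).expanduser()
    if not video.is_file():
        raise FileNotFoundError(f"Video file not found: {video}")
    return video
```

In the notebook, the result would be passed on as `cv.VideoCapture(str(video))`, and `cap.isOpened()` could then confirm that OpenCV was actually able to open the stream.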


In [4]:
while cv.waitKey(1) < 0:
    hasFrame, frame = cap.read()
    if not hasFrame:
        # No more frames: hold the last image until a key is pressed, then stop
        cv.waitKey()
        break

    # Convert the BGR image to RGB
    rgb_frame = cv.cvtColor(frame, cv.COLOR_BGR2RGB)
    results = pose.process(rgb_frame)

    # Draw the pose landmarks on the frame
    if results.pose_landmarks:
        landmarks = results.pose_landmarks.landmark
        height, width = frame.shape[:2]
        for start_idx, end_idx in mp_pose.POSE_CONNECTIONS:
            start_lm, end_lm = landmarks[start_idx], landmarks[end_idx]
            # Only draw connections whose two endpoints are both likely visible
            if start_lm.visibility > 0.5 and end_lm.visibility > 0.5:
                # Landmark coordinates are normalized, so scale them to pixels
                start = (int(start_lm.x * width), int(start_lm.y * height))
                end = (int(end_lm.x * width), int(end_lm.y * height))
                cv.line(frame, start, end, (0, 255, 0), 3)
                cv.ellipse(frame, start, (3, 3), 0, 0, 360, (0, 0, 255), cv.FILLED)
                cv.ellipse(frame, end, (3, 3), 0, 0, 360, (0, 0, 255), cv.FILLED)

    cv.imshow('Pose Estimation using MediaPipe', frame)


## Process and Display Video Frames

This is the main loop where each frame of the video is processed and displayed.

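- The loop runs until a key is pressed or the video runs out of frames.
- Each frame is read from the video capture object.
- The frame is converted from BGR to RGB, because OpenCV reads frames in BGR order while MediaPipe expects RGB.
- The pose estimation model processes the RGB frame and returns pose landmarks.
- If landmarks are detected, each skeleton connection whose two endpoints both have a visibility above 0.5 is drawn as a green line with a red dot at each joint.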
- The processed frame, with pose landmarks, is displayed in a window titled 'Pose Estimation using MediaPipe'.
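The drawing step depends on two details: landmark coordinates are normalized to the range of the frame, and a connection is skipped unless both of its endpoints are sufficiently visible. The sketch below isolates that logic without OpenCV or MediaPipe. The `Landmark` tuple is a hypothetical stand-in for a MediaPipe landmark, and `to_pixel` and `visible_segments` are illustrative names that do not appear in the notebook.

```python
from collections import namedtuple

# Hypothetical stand-in for a MediaPipe landmark:
# normalized x and y coordinates plus a visibility score
Landmark = namedtuple("Landmark", ["x", "y", "visibility"])


def to_pixel(landmark, width, height):
    """Scale normalized coordinates to integer pixel coordinates."""
    return int(landmark.x * width), int(landmark.y * height)


def visible_segments(landmarks, connections, width, height, threshold=0.5):
    """Return pixel line segments for connections with two visible endpoints."""
    segments = []
    for start_idx, end_idx in connections:
        start, end = landmarks[start_idx], landmarks[end_idx]
        if start.visibility > threshold and end.visibility > threshold:
            segments.append(
                (to_pixel(start, width, height), to_pixel(end, width, height))
            )
    return segments


landmarks = [
    Landmark(0.5, 0.25, 0.9),
    Landmark(0.75, 0.5, 0.8),
    Landmark(0.1, 0.1, 0.2),  # low visibility, so its connection is dropped
]
connections = [(0, 1), (1, 2)]
print(visible_segments(landmarks, connections, width=640, height=480))
# [((320, 120), (480, 240))]
```

Each returned pair corresponds to the `start` and `end` points that the loop passes to `cv.line`.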


In [None]:
cap.release()
cv.destroyAllWindows()


## Release Video Capture and Close Windows

After all frames have been processed, or playback has been stopped with a key press:

- The video capture object (`cap`) is released, freeing up resources.
- All OpenCV windows are destroyed, closing any open display windows.
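If the processing loop raises an exception, a separate cleanup cell like this one never runs. One possible way to guarantee the release is a small context manager, sketched here; the name `released` is hypothetical, and the only assumption about the wrapped object is that it has a `release()` method, as `cv.VideoCapture` does.

```python
from contextlib import contextmanager


@contextmanager
def released(capture):
    """Yield the capture object and always call release() on exit.

    Hypothetical helper for illustration; works with any object that
    exposes a release() method.
    """
    try:
        yield capture
    finally:
        capture.release()
```

In the notebook this could wrap the main loop as `with released(cv.VideoCapture(video_path)) as cap:`, with `cv.destroyAllWindows()` called after the block.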
