Feat/keypoints from mediapipe #1232

Merged
merged 11 commits into roboflow:develop
Jun 13, 2024

Conversation

David-rn
Contributor

Description

As described in issue #1174, this PR extends the KeyPoints class with a from_mediapipe connector.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How has this change been tested? Please provide a test case or an example of how you tested the change.

The following Google Colab describes the process used for testing this new feature. It lets you upload a picture to test with. (The Colab upload feature only worked in Chrome for me.)

Any specific deployment considerations

  • This connector needs an additional resolution parameter, since MediaPipe results only provide normalized coordinates in the range [0.0, 1.0] (see the sketch after this list). I didn't find an equivalent of VideoInfo for images, so the parameter accepts a tuple of ints.

  • Another consideration is that MediaPipe only outputs the prediction for the most prominent person in the image. So even if there are several people in the image, the output will contain the keypoints of only one person. The implementation takes this into account.
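For illustration, here is a minimal sketch of the scaling that the resolution parameter makes possible. The helper name is hypothetical, and the exact parameter name may differ from the PR's final API:

import numpy as np

def scale_landmarks(landmarks, resolution_wh):
    # Hypothetical helper: MediaPipe landmarks expose .x and .y
    # normalized to [0.0, 1.0]; multiplying by the image size
    # yields pixel coordinates.
    width, height = resolution_wh
    return np.array([[lm.x * width, lm.y * height] for lm in landmarks])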

Docs

  • Docs updated? What were the changes:

@CLAassistant

CLAassistant commented May 27, 2024

CLA assistant check
All committers have signed the CLA.

@SkalskiP
Collaborator

Hi, @David-rn 👋🏻 Thanks a lot for submitting your PR. @LinasKo, could you take a look?

Collaborator

@LinasKo left a comment


Overall, this is a very neat addition to our codebase: easy to test, immediately visible benefit, neatly coded up.

I left a few smaller changes, and I'd like to ask for some larger ones:

  1. I'd like to verify that using this inside a PoseLandmarker context block works. If you follow the guide, you'll see PoseLandmarkerOptions being created and then with PoseLandmarker.create_from_options(options) as landmarker: being entered. Inside, KeyPoints.from_mediapipe should work.
  2. Next, and this is something I had overlooked for the longest time - MediaPipe does in fact support more than one person! To enable it, you need to set the number of poses explicitly by passing num_poses=N to PoseLandmarkerOptions.

Would you mind making sure these work and adding the tests to the Colab?
The from_mediapipe docstring doesn't need to mention these examples.
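A minimal sketch of the two requests above, assuming the MediaPipe Tasks API (the model file path is a placeholder, and the .task model would need to be downloaded beforehand; a complete test script appears later in this thread):

import mediapipe as mp
import numpy as np
import supervision as sv

options = mp.tasks.vision.PoseLandmarkerOptions(
    base_options=mp.tasks.BaseOptions(model_asset_path="pose_landmarker_lite.task"),
    running_mode=mp.tasks.vision.RunningMode.IMAGE,
    num_poses=4,  # request up to 4 people instead of the default single pose
)

rgb = np.zeros((480, 640, 3), dtype=np.uint8)  # placeholder RGB frame
mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb)

# The connector should work inside the context block:
with mp.tasks.vision.PoseLandmarker.create_from_options(options) as landmarker:
    result = landmarker.detect(mp_image)
    key_points = sv.KeyPoints.from_mediapipe(result, (640, 480))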

(8 resolved review threads on supervision/keypoint/core.py)
@David-rn
Contributor Author

David-rn commented Jun 2, 2024

Hi @LinasKo! Thanks for the suggestions!

  1. I'd like to verify that using this inside a PoseLandmarker context block works. If you follow the guide, you'll see PoseLandmarkerOptions being created and then with PoseLandmarker.create_from_options(options) as landmarker: being entered. Inside, KeyPoints.from_mediapipe should work.

Regarding the first comment: I discovered that mp.solutions.pose.Pose (the one I implemented) has been a legacy solution since March 1, 2023, as stated here. The confusion originated in the example Colab. The API you suggest in your comment is the new way to go; however, its result object is not the same as the legacy one.

For this reason, should we provide support for both result types, or just focus on the new one? This is closely linked to the next comment.

  2. Next, and this is something I had overlooked for the longest time - MediaPipe does in fact support more than one person! To enable it, you need to set the number of poses explicitly by passing num_poses=N to PoseLandmarkerOptions.

Regarding the second comment: indeed, it is now possible to obtain more than one pose, but only with the new implementation, whose output object is of type PoseLandmarkerResult. That object holds a list of poses, whereas the legacy result holds a different structure.
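To illustrate the difference (my reading of the two APIs; attribute names taken from the MediaPipe docs):

# Legacy Solutions API: one person per result.
#   legacy = mp.solutions.pose.Pose().process(rgb_image)
#   legacy.pose_landmarks           -> NormalizedLandmarkList or None
#   legacy.pose_landmarks.landmark  -> the landmarks of that single person
#
# New Tasks API: a list of people per result.
#   result = landmarker.detect(mp_image)  # PoseLandmarkerResult
#   result.pose_landmarks                 -> list, one entry per detected person
#   result.pose_landmarks[i]              -> the landmarks of person i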

@LinasKo
Copy link
Collaborator

LinasKo commented Jun 2, 2024

  1. How hard would it be to support both? If it's tricky to reason about or the code involved is complicated, let's stick to the new version. Even a few years back, I was already using the newer API, I believe.
  2. Aye, let's add the newer one for sure in that case.

@David-rn
Copy link
Contributor Author

David-rn commented Jun 2, 2024

I think it wasn't that hard, but let me know whether this is a nice solution:

I parsed the legacy result into a list (matching the new format), so the loop that fills the new KeyPoints object stays the same for both result types; a sketch of the idea follows below. I also updated the Google Colab to make the checking process easier.
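Roughly (the helper name is hypothetical; this is the idea, not the exact code from the PR):

def to_pose_list(results):
    # Hypothetical helper: coerce either MediaPipe result type into a
    # list of per-person landmark sequences, matching the new Tasks API.
    landmarks = results.pose_landmarks
    if landmarks is None:
        return []  # legacy API with no detection
    if isinstance(landmarks, list):
        return landmarks  # new Tasks API: already one entry per person
    return [landmarks.landmark]  # legacy API: wrap the single detected pose

The loop filling the KeyPoints object can then iterate over to_pose_list(results) regardless of which API produced the result.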

@LinasKo
Copy link
Collaborator

LinasKo commented Jun 6, 2024

Hi @David-rn,

Apologies for the delay! Good to hear it supports both cases - I'm prioritizing this for review next week.

@LinasKo
Copy link
Collaborator

LinasKo commented Jun 11, 2024

Tidied up a few small aspects of this:

  1. Used ( instead of [ to make mypy happy.
  2. Removed the legacy example. With our current plans to also add support for MediaPipe face landmarks, keeping it would bloat the docs.
  3. Brought the branch up to speed with develop.
  4. Added an inline suggestion to download the model task file. I initially intended to add a URL outside the code block, but it doesn't render nicely.
  5. Tested on a local webcam.
Code
import cv2
import mediapipe as mp
import supervision as sv

BaseOptions = mp.tasks.BaseOptions
PoseLandmarker = mp.tasks.vision.PoseLandmarker
PoseLandmarkerOptions = mp.tasks.vision.PoseLandmarkerOptions
VisionRunningMode = mp.tasks.vision.RunningMode

options = PoseLandmarkerOptions(
    base_options=BaseOptions(model_asset_path="pose_landmarker_lite.task"),
    running_mode=VisionRunningMode.IMAGE,
    num_poses=4,
)

cap = cv2.VideoCapture(0)
edge_annotator = sv.EdgeAnnotator(color=sv.Color.GREEN, thickness=2)

with PoseLandmarker.create_from_options(options) as landmarker:
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break

        # MediaPipe expects RGB input; OpenCV captures BGR.
        mp_image = mp.Image(
            image_format=mp.ImageFormat.SRGB,
            data=cv2.cvtColor(frame, cv2.COLOR_BGR2RGB),
        )
        pose_landmarker_result = landmarker.detect(mp_image)

        image_height, image_width, _ = frame.shape
        keypoints = sv.KeyPoints.from_mediapipe(
            pose_landmarker_result, (image_width, image_height)
        )

        annotated_image = edge_annotator.annotate(
            scene=frame.copy(),
            key_points=keypoints,
        )

        cv2.imshow("image", annotated_image)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

cap.release()
cv2.destroyAllWindows()

I like how this looks. @SkalskiP, up for giving the code a quick glance?

@SkalskiP
Collaborator

Fantastic job, @David-rn and @LinasKo 🔥 ! Merging!

@SkalskiP SkalskiP merged commit f96e776 into roboflow:develop Jun 13, 2024
9 checks passed
Development

Successfully merging this pull request may close these issues.

[KeyPoints] - add from_mediapipe connector allowing Supervision to be used with Google MediaPipe models