-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pose detection api #39
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! I'm good for this to merge, I left one small comment about the example that does need to be addressed now. Other examples to make (which can be done in a new branch)
- Example for single pose from image
- Example for parts by keyword
Other general comments / questions:
- With the PoseNet model there was also a "skeleton" which if my memory is right was a data structure that indicated which body parts are connected. This doesn't need to be part of the ml5 function, but might be a nice example?
- Should we "future proof" this and build in an optional "string" argument to indicate what model to use. Right now it would only be "BlazePose", is that correct? It might be nice also from a pedagogical standpoint to emphasize there is a specific pre-trained model being used and not a "magic" algorithm.
I'm happy for this to be merged and any of these questions can be moved to separate issues for tracking.
for (let j = 0; j < pose.keypoints.length; j++) { | ||
let keypoint = pose.keypoints[j]; | ||
// Only draw a circle if the keypoint's confidence is bigger than 0.2 | ||
if (keypoint.score > 0.2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is useful to show the possibility but I have a few questions about it:
- The example is simpler to look at without it, should we include it?
- If we include it, maybe a lower threshold is fine? (0.1?)
- Is
score
the term from the model itself? Another option isconfidence
. I don't have a preference I just seem to remember we're usingconfidence
for the image classifiers (teachable machine, etc) so maybe we should be consistent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I included the check to make the example output look less confusing since MoveNet will try to fit all the landmark points inside the video even though a body part is clearly out of frame, but I would also be happy to remove it for simplicity.
(The top image is without the confidence check, and the bottom image has the check).
If we decide to include it I think 0.1 would be a good idea.
I believe score
is the term used for tf.js landmark models. I do agree that it should be consistent across different features. Personally, I think confidence
is a little more intuitive, but I also don't have a strong preference.
Please let me know your thoughts!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes, in that case I think we should keep the if statement, it's confusing to see all the extra points that are out of frame!
I always hesitate to change the name of something from the original source model, but I think we could make the decision to use "confidence" across the board instead of "score". Any thoughts @MOQN @sproutleaf @QuinnHe @gohai? I'm fine either way!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just my personal opinion, "confidence" will make more sense to beginners and different naming conventions may easily create confusion. The "scores" are confidence scores in the end 😉.
Hi @shiffman, thank you for the review! I think the single pose, parts by keyword, and skeleton examples are great ideas. I would be happy to work on those after merging. The current model used is MoveNet. It may be a little tricky to implement BlazePose since some of its option arguments are different from MoveNet. Personally, I think the best way to incorporate the BlazePose model would be through the existing |
Thanks for the detailed explanation! Apologies, I got things backwards (MoveNet vs. BlazePose). What I was imagining was the following: let bodypose = ml5.bodyPose(); // Default MoveNet
let bodypose = ml5.bodyPose("MoveNet"); // Explicitly using MoveNet
let bodypose = ml5.bodyPose("BlazePose"); // Explicitly using BlazePose (if and when we implement it or other models). But I see now that there is more nuance here and lots of different versions of the model. What is the default one if none is specified: I don't think we need to solve this now. Since it's only MoveNet we can just default to that and let people change the "specific" model with |
Ah, sorry I misunderstood your suggestion earlier. I do think it would be a good idea to have an extra string argument to An alternative solution on top of my head is to create a new ml5 model: |
@shiffman I think this branch is ready to merge. Please let me know if there is any issue that needs immediate addressing. I will make updates (add more examples, change "score" to "confidence", add BlazePose...) in other branches. :) |
Yes, this is great! I'll merge and we can discuss and implement next steps in other branches! |
@ziyuan-linn Thank you for all of your amazing work! Quinn and I are planning to explore the newly implemented models/features for the website. Let me reach out via email! |
Hi everyone, this PR updates the Bodypose (poseDetection) API. The changes here are very similar to the changes made to the handpose API.
detect()
,detectStart()
, anddetectStop()
functionTO-DO: