Keypoint representation as input to IKNet #98

jdambre opened this issue Nov 18, 2022 · 3 comments


@jdambre

jdambre commented Nov 18, 2022

I am trying to use IKNet on its own, starting from hand keypoints extracted with MediaPipe. For this to work, the MediaPipe hand coordinates have to be preprocessed to match the input format IKNet expects (origin, scale, possibly rotation as well?).

I ran into two questions here:

  1. I can see from your code that the keypoints have to be shifted to make 'M1' the origin. But what is the assumed scale? In the code you use IK_UNIT_LENGTH when rescaling from the MANO reference keypoints, but it is not clear what this relates to or where it comes from. Also, is there an assumption about the rotation of the hand (e.g. palm orientation)?

  2. I was assuming that the 'mpii_ref' keypoint set you pass as input to IKNet would be some kind of "relaxed" reference hand (converted from the MANO code base). When I plot it, the projection onto the xz plane matches this assumption, but the y coordinates look very strange, so I assume I am interpreting something incorrectly. Or maybe it encodes some assumption about the IKNet model input that I also need to apply to my xyz keypoints, since it seems to be passed as a reference hand? Could you clarify?

Examples:
(1) mpii_ref hand in front view (looking fine)
[image: mpii_ref_hand_xz]
(2) mpii_ref hand in rotated xyz view, showing unnaturally curved fingers and a very long wrist-to-thumb connection
[image: mpii_ref_hand_xyz]
(3) For comparison: MediaPipe hand in front view
[image: Mediapipe_hand_xz]
(4) For comparison: MediaPipe hand in the same xyz view as above
[image: Mediapipe_hand_xyz]

@CalciferZh
Owner

  1. The unit length is the bone length from M1 to the wrist.
  2. I think it's because you are using different scales on the x, y, and z axes in your visualization. The offset along the y axis is 0.2 units, which is roughly 1.8 cm. This is reasonable for a human hand.
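The axis-scale point can be checked by giving all three plot axes the same span before drawing the keypoints. The following is a minimal sketch, not code from this repository; the helper name `equal_axis_limits` is hypothetical and the input is assumed to be an (N, 3) array of keypoints.

```python
import numpy as np


def equal_axis_limits(xyz):
    """Return (lo, hi), each of shape (3,), so that every axis has the same span.

    Hypothetical helper: `xyz` is assumed to be an (N, 3) array of keypoints.
    """
    xyz = np.asarray(xyz, dtype=float)
    mins = xyz.min(axis=0)
    maxs = xyz.max(axis=0)
    center = (maxs + mins) / 2.0
    # Use the largest per-axis extent for all three axes.
    half_span = (maxs - mins).max() / 2.0
    return center - half_span, center + half_span
```

With a matplotlib 3D axes object `ax`, the result could be applied as `ax.set_xlim(lo[0], hi[0])`, `ax.set_ylim(lo[1], hi[1])`, and `ax.set_zlim(lo[2], hi[2])`, so that a 0.2 unit offset along y is drawn at the same scale as the extent of the hand in x and z.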

To use IKNet, I think the safest approach is to replace only xyz and delta and keep the other parameters unchanged. Also make sure you have converted the keypoints into the MPII format and scale.
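The origin and scale described in this thread (origin at 'M1', unit length equal to the M1-to-wrist bone length) could be applied to 21 hand keypoints roughly as follows. This is a sketch under stated assumptions, not the repository's own preprocessing: the function name `normalize_keypoints` is hypothetical, and the default indices are MediaPipe's `WRIST` (0) and `MIDDLE_FINGER_MCP` (9). Whether index 9 corresponds to 'M1' in the repository's joint ordering is an assumption that needs to be checked against its config. The sketch does not reorder joints, compute delta, or handle hand rotation. As a sanity check, the figures quoted above (0.2 units ≈ 1.8 cm) imply a unit length of roughly 9 cm.

```python
import numpy as np

# Assumed indices (MediaPipe hand landmark ordering); verify against the
# joint ordering the model actually expects.
WRIST_IDX = 0
ORIGIN_IDX = 9  # MediaPipe MIDDLE_FINGER_MCP, assumed to play the role of 'M1'


def normalize_keypoints(xyz, origin_idx=ORIGIN_IDX, wrist_idx=WRIST_IDX):
    """Shift keypoints so the origin joint is at 0 and the origin-to-wrist
    bone has length 1. Hypothetical helper; expects a (21, 3) array."""
    xyz = np.asarray(xyz, dtype=float)
    if xyz.shape != (21, 3):
        raise ValueError(f"expected shape (21, 3), got {xyz.shape}")
    centered = xyz - xyz[origin_idx]
    # After centering, the wrist position is the origin-to-wrist bone vector.
    unit_length = np.linalg.norm(centered[wrist_idx])
    if unit_length == 0.0:
        raise ValueError("wrist and origin joint coincide; cannot rescale")
    return centered / unit_length
```

After this step the wrist lies at distance 1 from the origin joint regardless of the input units, so the same function applies to MediaPipe world landmarks in metres or to any other metric keypoints.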

@jiangfeng999

Sorry to bother you, but did you use mpii_ref_xyz to draw the 3D gesture, and if so, how did you draw the gesture in the coordinate system?

@jdambre
Author

jdambre commented Jun 20, 2023

Hi @jiangfeng999, if you're asking me: I think so, but I gave up on this approach months ago, so I don't remember the details ...
