
2D joints from 3D prediction #21

Closed
kallivad opened this issue Jan 21, 2020 · 6 comments

Comments

@kallivad

Hi @mkocabas, thank you a lot for your repo!
I wonder, is there any way to obtain the 2D joints not from the STAF tracker, but from the 3D joints predicted by your net? Should I use pred_cam from https://github.com/mkocabas/VIBE/blob/master/doc/demo.md, and if so, how can I use it?

@mkocabas mkocabas reopened this Jan 21, 2020
@mkocabas
Owner

Hi @kallivad,

You can check this projection function:

def perspective_projection(points, rotation, translation, focal_length, camera_center):

For a sample usage, check this:

projected_joints = perspective_projection(model_joints, rotation, camera_t, focal_length, camera_center)
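
For anyone landing here later, a minimal usage sketch with VIBE's outputs (joints3d and pred_cam are the per-frame demo outputs; the 5000 px focal length and the 224 px crop size are assumptions borrowed from SPIN's defaults, not values stated in this thread):

import torch

# joints3d: (B, N, 3) 3D joints from VIBE, pred_cam: (B, 3) weak-perspective camera [s, tx, ty]
focal_length = 5000.0   # assumed fixed focal length (SPIN default)
img_res = 224           # assumed square crop size

# convert the weak-perspective camera into a full camera translation
s, tx, ty = pred_cam[:, 0], pred_cam[:, 1], pred_cam[:, 2]
cam_t = torch.stack([tx, ty, 2 * focal_length / (img_res * s + 1e-9)], dim=-1)

batch_size = joints3d.shape[0]
rotation = torch.eye(3, device=joints3d.device).unsqueeze(0).expand(batch_size, -1, -1)
camera_center = torch.full((batch_size, 2), img_res / 2.0, device=joints3d.device)

joints2d = perspective_projection(
    joints3d, rotation, cam_t,
    torch.full((batch_size,), focal_length, device=joints3d.device),
    camera_center,
)  # (B, N, 2) pixel coordinates in the crop

Note this gives pixel coordinates in the network's input crop; mapping back to the original image still requires the bounding box used for cropping.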

@mkocabas
Owner

I am closing this now; feel free to ask if you need further help with it.

@cbsudux

cbsudux commented Jun 29, 2020

Hey! I'm trying to get the 2D pose from the VIBE output. How do I use the perspective_projection function to do this? What inputs should I pass?

@RainkLH

RainkLH commented Sep 11, 2020

Same question.
For the function perspective_projection(points, rotation, translation, focal_length, camera_center):
how do I calculate the camera rotation, camera translation, focal length, and camera center?

@tegusi

tegusi commented Aug 13, 2021

For a classical intrinsic and extrinsic matrix setup, the following solution works well.

import numpy as np

cam = dicts['orig_cam']                 # weak-perspective camera [sx, sy, tx, ty]
cam_s = cam[0:1]                        # scale (sx)
cam_pos = cam[2:]                       # image-plane translation (tx, ty)
flength = w / 2.                        # focal length in pixels
tz = flength / (0.5 * w * cam_s)        # depth recovered from the weak-perspective scale
trans = -np.hstack([cam_pos, tz])
camera_data['color_focal_length'].append(np.array([w / 2, w / 2]))
camera_data['color_center'].append(np.array([[w / 2, h / 2]]))
camera_data['c2w'].append(np.eye(4))
camera_data['c2w'][-1][:3, 3] = trans   # write the translation into the newly appended c2w
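
If the end goal is the 2D joints, a sketch of projecting VIBE's joints3d through the matrices built above could look like this (joints3d is assumed to be the (N, 3) per-frame output; w and h are the original image size):

# project one frame's 3D joints with the intrinsics/extrinsics built above
w2c = np.linalg.inv(camera_data['c2w'][-1])                # world-to-camera extrinsics
pts_h = np.hstack([joints3d, np.ones((joints3d.shape[0], 1))])
pts_cam = (w2c @ pts_h.T).T[:, :3]                          # joints in camera coordinates
fx, fy = camera_data['color_focal_length'][-1]
cx, cy = camera_data['color_center'][-1][0]
joints2d = np.stack([fx * pts_cam[:, 0] / pts_cam[:, 2] + cx,
                     fy * pts_cam[:, 1] / pts_cam[:, 2] + cy], axis=-1)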

@lvZic

lvZic commented Nov 11, 2022

(quoting @tegusi's solution above)

I found the cam params converge poorly. I use weak perspective in my code, where kpt_2d = scale * (kpt_3d[:, :2] + txy). I think the key reason is that the focal length in my dataset differs per image, ranging from 400 mm to 800 mm, so maybe the network cannot regress the scale well?
Since "It is common to assume a fixed focal length to perform perspective projection.", I wonder whether performance would improve if I used perspective projection instead of weak perspective?
