
perspective projection #125

Closed
dengyang11 opened this issue May 31, 2024 · 5 comments

Comments

@dengyang11

Hi, thanks for your wonderful work!

I am wondering why the third (depth) component of pred_cam_t is computed as

    pred_cam_t = torch.stack([pred_cam[:, 1],
                              pred_cam[:, 2],
                              2 * focal_length[:, 0] / (self.cfg.MODEL.IMAGE_SIZE * pred_cam[:, 0] + 1e-9)],
                             dim=-1)

I look forward to your reply. Thanks!

@geopavlakos
Copy link
Collaborator

In this case, the original pred_cam[:, 0] value corresponds to s, the scaling factor of the weak-perspective projection, which approximates f/Z. So the depth of the human is Z = f/s. Then we also divide by the factor bbox_size/2, so that the human is projected to [-0.5, 0.5].
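Putting those two steps together gives the depth term in the snippet above: t_z = f/s after dividing s by bbox_size/2, i.e. t_z = 2f / (bbox_size * s). A minimal NumPy sketch (the function name and the use of bbox_size in place of self.cfg.MODEL.IMAGE_SIZE are illustrative, not the repository's API):

```python
import numpy as np

def cam_to_translation(pred_cam, focal_length, bbox_size):
    """Convert a weak-perspective camera (s, tx, ty) into a full
    translation (tx, ty, tz) usable for perspective projection.

    s approximates f / Z after being normalized by bbox_size / 2,
    so tz = 2 * f / (bbox_size * s). The 1e-9 guards against
    division by zero, as in the original code.
    """
    s, tx, ty = pred_cam[:, 0], pred_cam[:, 1], pred_cam[:, 2]
    tz = 2.0 * focal_length / (bbox_size * s + 1e-9)
    return np.stack([tx, ty, tz], axis=-1)
```

For example, with s = 1, f = 5000 and bbox_size = 256, the recovered depth is 2 * 5000 / 256 ≈ 39.06.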

@dengyang11
Author

Thanks again. In addition, why does the focal length change with the image size? Thanks!

  1. scaled_focal_length = model_cfg.EXTRA.FOCAL_LENGTH / model_cfg.MODEL.IMAGE_SIZE * img_size.max()
  2. pred_keypoints_2d = perspective_projection(pred_keypoints_3d,
                                                translation=pred_cam_t,
                                                focal_length=focal_length / self.cfg.MODEL.IMAGE_SIZE)

@geopavlakos
Collaborator

You can use an arbitrary focal length value in the equation above. We adopt the design decisions of ProHMR. Note that self.cfg.MODEL.IMAGE_SIZE is a constant (set to 256).
For the demo code, this is just a design choice so that the results are generally visualized with a larger focal length. You could experiment with other values too.
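In other words, the rescaling in line 1 keeps the field of view consistent when the demo image is larger than the 256-pixel training crop. A quick numeric sketch (the value 5000 for the base focal length and the image size 1024 are illustrative stand-ins for model_cfg.EXTRA.FOCAL_LENGTH and img_size.max()):

```python
# Illustrative values, not the repository's actual config:
FOCAL_LENGTH = 5000.0   # stand-in for model_cfg.EXTRA.FOCAL_LENGTH
IMAGE_SIZE = 256        # constant training crop size (self.cfg.MODEL.IMAGE_SIZE)
img_size_max = 1024     # longer side of the full demo image

# Rescale the focal length proportionally to the output resolution,
# so the rendered person keeps the same apparent size.
scaled_focal_length = FOCAL_LENGTH / IMAGE_SIZE * img_size_max
print(scaled_focal_length)  # 20000.0
```

Because the focal length scales linearly with the resolution, the projection f * X / Z lands on the same relative image location in the 1024-pixel image as it would in the 256-pixel crop.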

@dengyang11
Author

Thanks again

@nnop

nnop commented Jul 1, 2024

> Then, we also divide by the factor bbox_size/2, so that we project the human to [-0.5,0.5].

You mean normalize to [-1, 1]? @geopavlakos
Also, I think it would be more proper to normalize by bbox_size rather than the image size, since it is the bounding-box crop that gets resized to MODEL.IMAGE_SIZE.
