Problem with the transform in ProHMR #1
Comments
Can you explain what you mean by "not consistent"?
I am using transform_to_noise to check the quality of my fitting, but I found that the value of
The difference you are seeing is probably due to the non-uniqueness of the 6D rotation representation for rotation matrices. Let me explain. When you sample a z, it gets mapped to a set of 24 6D representations (global orientation and body pose). At training time we use a loss that encourages this representation to be close to the canonical one (please check the paper for details), but this cannot be enforced as a hard constraint. When you compute the z for a particular fitting output, I assume you do it by passing in the 6D representation obtained from the first two columns of the rotation matrix. However, this is not the same as the original one you got by decoding z. They both get mapped to the same rotation matrix, but they have different values. If you look at this line, the last argument that this function returns (and we ignore) is the 6D representation of the pose. If you pass this into the flow network, then the z value should be the same.
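To make the non-uniqueness concrete, here is a small sketch of the standard Gram-Schmidt 6D-to-rotation mapping (in the style of Zhou et al.'s representation, which ProHMR uses; this is illustrative numpy code, not the actual ProHMR implementation). Two different 6D vectors decode to the same rotation matrix, so re-encoding the fitted rotation's first two columns does not reproduce the 6D vector the flow originally emitted:

```python
import numpy as np

def rot6d_to_rotmat(x):
    # Gram-Schmidt: map a 6D vector (two 3D columns) to a rotation matrix.
    a1, a2 = x[:3], x[3:]
    b1 = a1 / np.linalg.norm(a1)
    b2 = a2 - np.dot(b1, a2) * b1
    b2 = b2 / np.linalg.norm(b2)
    b3 = np.cross(b1, b2)
    return np.stack([b1, b2, b3], axis=1)

# A non-canonical 6D vector: scaled, non-orthogonal columns.
x_noncanonical = np.array([2.0, 0.0, 0.0, 0.5, 3.0, 0.0])
R = rot6d_to_rotmat(x_noncanonical)

# The canonical 6D vector for the same rotation: first two columns of R.
x_canonical = R[:, :2].T.reshape(-1)

# Both decode to the same rotation matrix...
assert np.allclose(rot6d_to_rotmat(x_canonical), R)
# ...but the 6D vectors differ, so the flow's inverse path maps them to
# different z values.
assert not np.allclose(x_noncanonical, x_canonical)
```

Passing `x_canonical`-style inputs (the 6D vector the decoder itself produced, as returned by the ignored last argument mentioned above) avoids this mismatch.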
I transform the 6D representation to noise here, but the diff of z is not zero.
Indeed, there was an issue in the implementation of the LU decomposition that must have been introduced after the refactoring. This only affected the pose -> z path, which is why the prediction and fitting were working properly.
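For context, a Glow-style invertible linear layer is parameterized by an LU decomposition (as in the nflows library that ProHMR builds on), and a correct implementation must make the forward and inverse directions exact mutual inverses. The sketch below is illustrative numpy, not the actual nflows/ProHMR API, and shows the round-trip property that the bug broke in one direction:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 6

# Parameterize an invertible weight matrix as W = L @ U:
# L unit lower-triangular, U upper-triangular with a nonzero diagonal.
lower = np.tril(rng.normal(size=(dim, dim)), k=-1) + np.eye(dim)
upper = np.triu(rng.normal(size=(dim, dim)), k=1)
diag = np.abs(rng.normal(size=dim)) + 0.5   # kept away from zero
W = lower @ (upper + np.diag(diag))

x = rng.normal(size=dim)
z = W @ x                        # one direction (e.g. pose features -> noise)
x_rec = np.linalg.solve(W, z)    # the other direction (noise -> pose features)

# With a correct implementation the round trip is exact to float precision;
# the bug discussed above broke only the pose -> z direction, so sampling
# (z -> pose) still behaved normally.
assert np.allclose(x, x_rec, atol=1e-8)
```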
Hi @nkolot, thank you for your work. I want to ask one related question. "Empirically we found that putting no constraints on the 6D representation results in large discrepancy between examples with full 3D SMPL parameter supervision and examples with only 2D keypoint annotations." Could you tell us more about the discrepancy in detail? Thank you!
@zhihaolee So what happens is that the examples with only 2D keypoint annotations are not trained using the inverse path, i.e., gt_pose -> z -> maximize log prob. So when we sample from the latent space and transform it to a pose using the forward path, it generates 6D pose vectors that are not in the "canonical" orthonormal representation. For example, when R = I, the identity matrix, it might as well output [a, 0, 0 | b, c, 0] with a != 1, b != 0 and c != 1. This still gets mapped to the identity rotation matrix. For the examples with ground-truth 3D pose the output is practically orthonormal. So if we want the same behavior from the data without 3D supervision, it makes sense to encourage the similarity in the output domain. Empirically, we observed that this helped to achieve better diversity in the output samples.
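The idea of "encouraging similarity in the output domain" can be sketched as a penalty on the distance between the raw 6D output and its canonical (orthonormal) form. The function names and the exact penalty below are illustrative assumptions, not ProHMR's actual loss:

```python
import numpy as np

def rot6d_to_rotmat(x):
    # Gram-Schmidt map from a 6D vector (two 3D columns) to a rotation matrix.
    a1, a2 = x[:3], x[3:]
    b1 = a1 / np.linalg.norm(a1)
    b2 = a2 - np.dot(b1, a2) * b1
    b2 = b2 / np.linalg.norm(b2)
    return np.stack([b1, b2, np.cross(b1, b2)], axis=1)

def canonicalize_6d(x):
    # Canonical 6D representation: the first two columns of the decoded rotation.
    R = rot6d_to_rotmat(x)
    return R[:, :2].T.reshape(-1)

def orthonormality_penalty(x):
    # Illustrative penalty: distance of the raw 6D output from its canonical form.
    return float(np.mean((x - canonicalize_6d(x)) ** 2))

x_bad = np.array([2.0, 0.0, 0.0, 0.5, 3.0, 0.0])  # decodes to R = I, like the
                                                  # [a, 0, 0 | b, c, 0] example
x_good = canonicalize_6d(x_bad)                   # [1, 0, 0, 0, 1, 0]

assert orthonormality_penalty(x_bad) > 1e-3    # non-canonical output is penalized
assert orthonormality_penalty(x_good) < 1e-12  # canonical output pays nothing
```

A penalty like this nudges the 2D-only examples toward the same orthonormal outputs that the 3D-supervised examples produce naturally.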
I got it: with 3D pose supervision, the prediction is "practically" orthonormal, but without 3D pose supervision, the prediction may be far from orthonormal, although it can be mapped to the right rotation matrix in the end.
It works now!!! Thanks for your reply!
Hi @Mercurialzhang and @nkolot @geopavlakos, I also did it in a similar way as yours, but found that it (the LU module) still had non-negligible reconstruction errors, around 1e-2 for some entries. Though it has been a long time, could you recap it and check if that is the case? Thanks a lot, and regards,
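One way to localize a round-trip error of that magnitude is to test each invertible layer's forward/inverse consistency separately instead of only the full composed flow. This diagnostic is an illustrative numpy sketch (the layer construction and dimensions here are assumptions, not ProHMR's actual flow):

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 144  # e.g. 24 joints x 6D, as in ProHMR's pose vector

# Stand-ins for a stack of invertible linear layers: well-conditioned
# perturbations of the identity.
layers = [np.eye(dim) + 0.1 * rng.normal(size=(dim, dim)) / np.sqrt(dim)
          for _ in range(4)]

x = rng.normal(size=dim)
z = x
for W in layers:               # forward through the stack
    z = W @ z
x_rec = z
for W in reversed(layers):     # inverse through the stack
    x_rec = np.linalg.solve(W, x_rec)

full_err = np.max(np.abs(x - x_rec))
per_layer_err = [np.max(np.abs(np.linalg.solve(W, W @ x) - x)) for W in layers]
print("full round-trip max err:", full_err)
print("per-layer max err:", per_layer_err)
# In float64 these should sit near machine precision; an error around 1e-2
# usually points to a bug in a specific layer, or to float32 arithmetic with
# ill-conditioned weights, rather than to the flow formulation itself.
```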
Hi, thanks for your great work! But I have some questions about the transforms in ConditionalGlow. It seems that the forward and reverse of the transform in the flow are not consistent.