
Questions about the pose optimization (paper and code) #93

Open
leblond14u opened this issue May 21, 2024 · 6 comments
@leblond14u

leblond14u commented May 21, 2024

Dear authors, dear community,

Could you clarify my understanding of how the pose optimization works?

In my understanding, the pose is optimized via gradient descent to minimize the reprojection error, where the Jacobian is derived analytically from that error (Equations 3 to 6 in Section 3.2).

  1. However, I'm a bit confused by Section 3.3.1, which presents an L1 loss to minimize the reprojection error. I can't figure out how the Jacobian can be used with the L1 loss. Could you shed some light on this part?
  2. To my understanding, this optimization process is done via the rasterizer submodule, but I don't see any output of the estimated transform. Is it directly modifying the viewpoint_camera.cam_rot_delta and viewpoint_camera.cam_trans_delta parameters during rasterization?
    I'm trying to reuse your rasterizer with the original 3DGS implementation (with the GT path), but I can't see the delta parameters updating. Could you indicate how to get a pose estimate out of the rasterizer?

Thanks in advance,
Best regards,

Hugo

@leblond14u leblond14u changed the title Questions about the pose optimization Questions about the pose optimization (paper and code) May 22, 2024
@identxxy

I think viewpoint_camera.cam_rot_delta and viewpoint_camera.cam_trans_delta are optimizable parameters, updated here with an Adam optimizer:

def tracking(self, cur_frame_idx, viewpoint):
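To illustrate the mechanism, here is a minimal, self-contained sketch (not MonoGS's actual code) of how per-frame pose corrections can be stored as leaf tensors on the camera and updated by Adam. The `Viewpoint` class and the `render` function are toy stand-ins; the real differentiable rasterizer plays the role of `render` here.

```python
import torch

class Viewpoint:
    """Toy camera holding the pose-correction tensors, mirroring
    viewpoint_camera.cam_rot_delta / cam_trans_delta in MonoGS."""
    def __init__(self):
        self.cam_rot_delta = torch.zeros(3, requires_grad=True)
        self.cam_trans_delta = torch.zeros(3, requires_grad=True)

def render(viewpoint):
    # Stand-in for the differentiable rasterizer: any function whose output
    # depends on the delta tensors lets gradients flow back into them.
    return viewpoint.cam_rot_delta.sum() + viewpoint.cam_trans_delta.sum()

viewpoint = Viewpoint()
opt = torch.optim.Adam(
    [viewpoint.cam_rot_delta, viewpoint.cam_trans_delta], lr=1e-3
)

target = torch.tensor(1.0)                  # stand-in for the observed image
loss = torch.abs(render(viewpoint) - target)  # L1-style objective, as in Sec. 3.3.1
opt.zero_grad()
loss.backward()  # the rasterizer's backward pass fills .grad of the deltas
opt.step()       # Adam moves the deltas using those gradients
```

After `opt.step()`, the delta tensors have moved away from zero; in MonoGS, these small corrections are then folded back into the camera pose each tracking iteration.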

I've tried using the rasterizer with the original 3DGS implementation. My experience is that you cannot optimize the camera pose together with the map... The resulting map gets more and more blurry because the camera is shaking. Although the camera pose only needs a small refinement, the original pose already leaves a great impact on the map, which is hard to eliminate or amend.

I think the frontend/backend design in MonoGS is necessary and smart in that it avoids this problem: when tracking the camera, the map is fixed; when mapping, you trust the pose.

This is only my personal experience, maybe there are some other ways to optimize camera poses and the map simultaneously.

@leblond14u
Author

leblond14u commented May 22, 2024

Hi, thanks for your answer.

Have you succeeded in getting a pose estimate, though?

For now, printing the viewpoint_cam deltas doesn't show any update through training.
So, to my latest understanding, I should be able to update the pose estimate by running the tracking() function inside the base 3DGS training loop with the MonoGS rasterizer, right?

According to Section 3.2 of the paper, the gradient is thus provided by the rasterizer's Jacobian computation and descended by the Adam optimizer in the tracking() function.
And technically, the Jacobian is derived from the Section 3.3.1 L1 losses, which capture the errors between the Gaussian reprojections and the captured photometry and depth.

If my statements above are right, the only thing left for me to understand is how the Jacobian gradient is communicated between the rasterizer and the optimizer. Could you explain this link to me?

Many thanks,
Best,

@identxxy

I used some prior poses and tried optimizing the map and refining the poses simultaneously, which turned out to be a really bad idea. T^T

I should be able to update the pose estimate by running the tracking() function inside of the base 3DGS training with the MonoGS rasterizer

Yes, I think so.

And technically the Jacobian is derived from the section 3.3.1 L1 losses which capture the errors between the gaussian reprojections and the captured photometry and depth.

Yes, it is in the paper, but in the code I think it's based on the silhouette; see my question here: #90 (comment).

how the jacobian gradient is comunicated between the rasterizer and the optimizer

Well, I am also not clear about this part... But my understanding and intuition is that if the rasterizer has a position gradient for every Gaussian under one camera, then the opposite direction of the mean of all the Gaussian gradients, projected onto the 2D camera plane, is the camera gradient. The optimizer then just uses this gradient and some learning rate to optimize.
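That intuition can be made concrete with a tiny NumPy sketch. This is only an illustration of the idea in the comment above, not MonoGS's actual camera Jacobian: moving the camera by t is roughly equivalent to moving every Gaussian by -t, so a camera-translation gradient would be the negated aggregate of the per-Gaussian position gradients. The gradient values below are made up.

```python
import numpy as np

# Hypothetical per-Gaussian position gradients in the camera frame,
# shape (N, 3): dLoss/dMean for each of N Gaussians.
gauss_grads = np.array([
    [0.2, -0.1, 0.0],
    [0.1,  0.3, 0.0],
    [0.0, -0.2, 0.0],
])

# Shifting the camera by t shifts every Gaussian by -t in the camera frame,
# so the camera-translation gradient is the negated mean of the per-Gaussian
# gradients (the "opposite direction of the mean" from the comment above).
cam_trans_grad = -gauss_grads.mean(axis=0)
print(cam_trans_grad)  # direction in which to move the camera pose
```

In the real rasterizer this aggregation happens analytically inside the CUDA backward pass, including the rotation part, but the averaging intuition is the same.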

@WFram

WFram commented May 23, 2024

how the jacobian gradient is comunicated between the rasterizer and the optimizer

I think it's defined by the order in which the input tensors are specified when calling forward.

The implementation of backward outputs the gradients, writing the values into the grad attributes of the tensors, in the same order as they were passed to forward.

That's why viewpoint_camera.cam_rot_delta and viewpoint_camera.cam_trans_delta are passed to forward but not used in there (see the definition of _RasterizeGaussians in submodules/diff-gaussian-rasterization-w-pose/diff_gaussian_rasterization/__init__.py).

When calling .step() in Adam optimizer, the values of these tensors get updated according to the gradients stored in grad for these tensors.
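This mechanism can be demonstrated with a minimal custom `torch.autograd.Function` (a toy, not the real `_RasterizeGaussians`): forward can accept a tensor without ever reading it, and backward can still return a gradient for it, as long as backward returns one gradient per forward input, in the same order.

```python
import torch

class ToyRasterize(torch.autograd.Function):
    """Toy stand-in for _RasterizeGaussians, showing how pose-delta tensors
    receive gradients despite being unused in forward."""

    @staticmethod
    def forward(ctx, means, cam_rot_delta, cam_trans_delta):
        # The deltas are accepted but never read here, just like in the
        # pose-enabled rasterizer's forward pass.
        ctx.save_for_backward(means)
        return means.sum()

    @staticmethod
    def backward(ctx, grad_out):
        (means,) = ctx.saved_tensors
        grad_means = grad_out * torch.ones_like(means)
        # Gradients for the pose deltas, computed inside backward; in the
        # real CUDA kernel this is where the analytical camera Jacobian
        # from Sec. 3.2 is evaluated. Ones are placeholders here.
        grad_rot = grad_out * torch.ones(3)
        grad_trans = grad_out * torch.ones(3)
        # One gradient per forward input, in the same order.
        return grad_means, grad_rot, grad_trans

means = torch.randn(5, 3, requires_grad=True)
rot_delta = torch.zeros(3, requires_grad=True)
trans_delta = torch.zeros(3, requires_grad=True)

ToyRasterize.apply(means, rot_delta, trans_delta).backward()
# rot_delta.grad / trans_delta.grad are now populated even though forward
# never read those tensors; Adam's .step() would consume exactly these values.
```

So the "communication link" is simply PyTorch's autograd contract: backward's return order matches forward's argument order, and the optimizer later reads whatever landed in `.grad`.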

@leblond14u
Author

@identxxy I have a question concerning your attempt at getting a pose estimate.
Have you managed to get "rather good" estimates of the pose?
For now, when I try to track my camera's pose with my adapted tracking() function in my vanilla 3DGS environment, I get weird convergence results. I opened a new issue, #98, on this.

@identxxy

identxxy commented Jun 3, 2024

I was trying to use tracking() to get refined poses, since I already had "rather good" poses; it's just that some details couldn't align very well... It turns out that optimizing both the poses and the map makes both worse, which is actually intuitive: everything just messes up. So I think that's why MonoGS is designed to separate tracking() and mapping().
