Use AbsGrad to get better results with less gaussians #3113
Conversation
@@ -80,6 +80,21 @@ def SH2RGB(sh):
     return sh * C0 + 0.5


 def resize_image(image: torch.Tensor, d: int):
     """
     Downscale images using the same 'area' method in opencv
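For reference, integer-factor "area" downscaling can be sketched in PyTorch as a strided box-filter convolution. This is a sketch of the idea, not the PR's exact code, and `area_downscale` is a hypothetical name:

```python
import torch
import torch.nn.functional as F


def area_downscale(image: torch.Tensor, d: int) -> torch.Tensor:
    """Downscale an (H, W, C) image by integer factor d via block averaging,
    which matches OpenCV's INTER_AREA for integer scale factors."""
    weight = torch.ones(1, 1, d, d, device=image.device) / (d * d)
    # Move channels into the batch dimension so each channel is averaged
    # independently: (H, W, C) -> (C, 1, H, W)
    x = image.permute(2, 0, 1).unsqueeze(1).float()
    out = F.conv2d(x, weight, stride=d)  # (C, 1, H // d, W // d)
    return out.squeeze(1).permute(1, 2, 0)  # back to (H // d, W // d, C)
```

Because the kernel exactly tiles the image at integer strides, every output pixel is the mean of a disjoint d x d block, which is what makes the result antialiased.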
Hi @jb-ye, just wondering, do you have any reference for this idea? Would like to learn more, thanks.
I found that absgrad is rather sensitive to how we downscale images during coarse-to-fine training. The previous implementation is not antialiased and introduces more noise into the absgrad computation. The new downscale method in this PR is essentially the same as the INTER_AREA method in OpenCV. This approach is actually recommended for most NeRF-based methods when downscaling images.
Actually, OpenCV's INTER_AREA is a bit different because it also works with non-integer scales. I would say this change is closer to skimage.transform.downscale_local_mean.
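For integer scales the two coincide; a minimal NumPy sketch of the block-mean that `skimage.transform.downscale_local_mean` computes over 2-D tiles (`block_mean` is a hypothetical name used here for illustration):

```python
import numpy as np


def block_mean(img: np.ndarray, d: int) -> np.ndarray:
    """Average non-overlapping d x d tiles (integer downscale factor only)."""
    # Crop so the height and width are divisible by d, as
    # non-integer remainders are what INTER_AREA additionally handles.
    h = img.shape[0] - img.shape[0] % d
    w = img.shape[1] - img.shape[1] % d
    tiles = img[:h, :w].reshape(h // d, d, w // d, d, *img.shape[2:])
    return tiles.mean(axis=(1, 3))
```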
Does this implementation support the classic rasterizer? @jb-ye
I got an error when opening the viewer while it trains.
It runs smoothly until about 6000 iters; then I open the viewer. The error:
The dataset: Waterfall_Kopu
@ichsan2895 The viewer crashes sometimes and kills the training, but this is a separate issue: #3064
Yes
Some experiments done on the bicycle dataset (it looks like our splatfacto-big is the new state of the art!)
bicycle (downscale 2x):
bicycle (downscale 4x):
Another mipnerf360 dataset, garden (downscale 4x):
@@ -103,7 +118,7 @@ class SplatfactoModelConfig(ModelConfig):
     """If True, continue to cull gaussians post refinement"""
     reset_alpha_every: int = 30
     """Every this many refinement steps, reset the alpha"""
-    densify_grad_thresh: float = 0.0002
+    densify_grad_thresh: float = 0.0008
Have we verified that this threshold results in roughly the same PSNR as the standard grad threshold?
This is the recommended setting in the paper. Based on my limited experiments so far, it creates fewer gaussians (~30% fewer) in splatfacto but comes with slightly better quality, so training is faster in general.
Just need to bump the gsplat version to 0.1.11, then looks good!
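For context, the densification decision boils down to thresholding a per-gaussian running average of the (absolute) screen-space gradient norm. A hedged sketch of that criterion; the names `absgrad_accum` and `counts` are assumptions for illustration, not gsplat's actual API:

```python
import torch


def densify_mask(absgrad_accum: torch.Tensor, counts: torch.Tensor,
                 densify_grad_thresh: float = 0.0008) -> torch.Tensor:
    """Flag gaussians whose average absolute 2D gradient exceeds the threshold.

    absgrad_accum: per-gaussian sum of absolute screen-space gradient norms
    counts: number of steps each gaussian was visible since the last reset
    """
    avg = absgrad_accum / counts.clamp(min=1)
    # Flagged gaussians are then split (if large) or duplicated (if small);
    # absgrad accumulates |grad| per pixel, so its magnitudes are larger than
    # the signed-sum gradient, hence the larger 0.0008 threshold.
    return avg >= densify_grad_thresh
```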
lgtm!
I see that the absgrad function has now been pushed to the latest release. Do we have to enable it, or is it configured by default when running ns-train splatfacto-big? @jb-ye Thank you!
@abrahamezzeddine it is used by default |
I am following the parameter settings recommended in https://arxiv.org/pdf/2404.10484
To experiment with this PR, one needs to install the customized gsplat library from nerfstudio-project/gsplat#166
I will follow up with my experiments here. I recommend experimenting with one of the two settings (A, B) and using an input image resolution higher than 1600 (e.g. 2k or 4k):

O: splatfacto-big on the latest main branch (--rasterize_mode antialiased)
A: splatfacto-big of this PR (--rasterize_mode antialiased --densify_grad_thresh=0.0008)
B: splatfacto-big of this PR (--rasterize_mode antialiased --densify_grad_thresh=0.0004)

O: baseline.
A: faster than B, high quality (expected to be better than O), smaller PLY asset.
B: slowest training, highest quality, largest PLY asset.