Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fast SH implementation #165

Merged
merged 12 commits into from
Apr 22, 2024
Merged

Fast SH implementation #165

merged 12 commits into from
Apr 22, 2024

Conversation

jb-ye
Copy link
Collaborator

@jb-ye jb-ye commented Apr 20, 2024

I recently read this paper: https://jcgt.org/published/0002/02/06/ (it has sample codes) there is a faster way to evaluate spherical harmonics. I made an attempt to implement the idea in gsplat and was able to observe 40% time saving in forward calls of degree 3 and 50% time saving of degree 4.

@liruilong940607
Copy link
Collaborator

This is very cool! And I also verified the diff is basically zero comparing with the original one.

basis_dim = 25
dirs = torch.rand((100000, 3), device="cuda")
dirs = dirs / dirs.norm(dim=-1, keepdim=True)
result1 = eval_sh_bases(basis_dim, dirs)
result2 = eval_sh_bases_fast(basis_dim, dirs)
diff = (result1 - result2).abs().mean(dim=0)
>> tensor([0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 7.1335e-09,
        2.3338e-08, 7.0950e-09, 0.0000e+00, 1.2602e-08, 8.1815e-09, 1.8609e-08,
        3.7991e-08, 1.8580e-08, 0.0000e+00, 1.2594e-08, 1.0718e-08, 1.6333e-08,
        1.4538e-08, 2.3519e-08, 3.1723e-08, 2.3399e-08, 1.2196e-08, 1.6297e-08,
        1.3505e-08], device='cuda:0')

@liruilong940607
Copy link
Collaborator

@jb-ye I add some profile code in the test_sh.py, and this is what I get by running python tests/test_sh.py on a Tesla V100-SXM2:

[Fwd] Method: poly, ellipsed: 2.19 ms
[Bwd] Method: poly, ellipsed: 35.19 ms
[Fwd] Method: fast, ellipsed: 2.46 ms
[Bwd] Method: fast, ellipsed: 35.16 ms

Somehow I couldn't get speedup on my end. Could you run the profiler on your end and post the results?

Ruilong Li and others added 2 commits April 21, 2024 02:57
@jb-ye
Copy link
Collaborator Author

jb-ye commented Apr 21, 2024

@liruilong940607 Your previous profile code has a typo ... after fixing it, this is what I got:

[Fwd] Method: poly, ellipsed: 5.58 ms
[Bwd] Method: poly, ellipsed: 31.67 ms
[Fwd] Method: fast, ellipsed: 2.02 ms
[Bwd] Method: fast, ellipsed: 31.47 ms

@liruilong940607
Copy link
Collaborator

Oh yeah you are right thanks for the fix! It now makes sense.

Copy link
Collaborator

@liruilong940607 liruilong940607 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! A few minor comments

z2 = z * z

fTmpA = -0.48860251190292
result[..., 2] = 0.4886025119029199 * z
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it intentional that this 0.4886025119029199 does not equal to -fTmpA?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just for matching the sample code from the original authors. They should be equal to -fTmpA.

gsplat/_torch_impl.py Outdated Show resolved Hide resolved
@jb-ye jb-ye merged commit 0cef46c into nerfstudio-project:main Apr 22, 2024
2 checks passed
@ichsan2895
Copy link

How to activate this feature with splatfacto? @jb-ye ?

I did not find any speed improvement, maybe I did it wrong because it was not activated?
image

black line = newest main gsplat (included this merged PR)
red line = nerfstudio v1.0.3 and gsplat 0.1.10

@jb-ye
Copy link
Collaborator Author

jb-ye commented Apr 23, 2024

@ichsan2895 It doesn't bring speed-ups for training since the bwd calls take rough the same time. It may increase ns-viewer fps rate.

@jb-ye jb-ye deleted the new_sh branch April 23, 2024 15:09
JiamingSuen referenced this pull request in mkkellogg/GaussianSplats3D Apr 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants