
What makes this faster than instant-ngp precisely? #66

Closed
kwea123 opened this issue Oct 12, 2022 · 2 comments

Comments

@kwea123

kwea123 commented Oct 12, 2022

Reading the paper, I couldn't find clear evidence of what makes your method faster than instant-ngp, since you state that most of the acceleration techniques are borrowed from it. Do you use any special technique that is not described in the paper?

Or is #64 related? Maybe by training on the train set you get results similar to instant-ngp's only after 5 min.

@liruilong940607 (Collaborator)

Hi Kui, as pointed out in #64, it was not a fair comparison. So we ran a full test in #68 under the fair comparison setting, and the results are basically identical to the official implementation. We just updated the docs to reflect that.

For runtime, although we converge to the same quality within the same amount of time, we actually trained for fewer iterations (20k) than the official impl (35k). In other words, nerfacc still runs slower per iteration because of the Python overhead (which I think is the tradeoff for flexibility), but somehow converges more efficiently.

We didn't investigate this too much, as replicating NGP is not the focus here. It may have something to do with the training recipe we used for NGP, which is slightly different from the paper's.

On the CUDA side, there are also some minor design differences I can share with you.

  • One is that in nerfacc each sample is taken at the midpoint of the interval (t_start, t_end), instead of just at t_start as in the NGP impl. This gives slightly better quality (roughly ~0.2 dB). The purpose of this design is to support the Mip-NeRF line of work in the future.
  • Another is that in nerfacc we try to minimize the computation that needs to be done in CUDA kernels parallelized across rays (we mention this in the arXiv report). For example, we keep the per-ray kernel focused on a single task, like this:
# this function is paralleled across rays
visibility, packed_info_visible = render_visibility(
    packed_info, t_starts, t_ends, sigma, early_stop_eps, alpha_thre
)
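The midpoint-sampling difference from the first bullet can be sketched in NumPy as follows. This is only an illustration, not the actual nerfacc code (which does this inside a CUDA kernel); the function name and signature here are hypothetical.

```python
import numpy as np

def sample_positions(rays_o, rays_d, t_starts, t_ends, use_midpoint=True):
    """Hypothetical sketch of where samples land along each ray.

    use_midpoint=True  -> sample at the interval midpoint (nerfacc-style)
    use_midpoint=False -> sample at t_start (NGP-style)
    """
    # Per-sample distance along the ray.
    t = 0.5 * (t_starts + t_ends) if use_midpoint else t_starts
    # Broadcast t over the xyz dimension to get 3D positions.
    return rays_o + rays_d * t[:, None]

# One ray from the origin along +z, one interval [1.0, 3.0]:
pos_mid = sample_positions(
    np.zeros((1, 3)), np.array([[0.0, 0.0, 1.0]]),
    np.array([1.0]), np.array([3.0]),
)  # midpoint sample sits at z = 2.0
```

Sampling at the midpoint makes each sample representative of its whole interval, which is also what interval-based parameterizations such as Mip-NeRF's expect.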

Nevertheless, I think there is plenty of room for improvement in this codebase in terms of efficiency. Contributions are welcome if you are interested!

@kwea123 (Author)

kwea123 commented Oct 12, 2022

Thanks. So with respect to instant-ngp, I guess this is essentially a re-implementation in Python.

kwea123 closed this as completed Oct 12, 2022