
Some questions about the paper #2

Closed
kwea123 opened this issue Nov 11, 2020 · 5 comments

Comments


kwea123 commented Nov 11, 2020

Hi, thanks for the great work!

I have some questions:

  1. How much computational overhead does the CNN feature extraction introduce? At inference it's probably not much, since we only need one forward pass per image and can store the features in a buffer; but at training we have to run it on the entire images at every iteration while training on only a very small portion of rays (800-1000). So isn't that somewhat inefficient and slow, or do you have some clever implementation to accelerate this part?
  2. As for generalization, is it correct to understand that the model only generalizes to objects within the same class (as in the ShapeNetV2 experiments) with very similar visual and pose settings? For example, if we train on 7 NeRF-synthetic scenes, does it generalize to the 8th?
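To make question 1 concrete, here is a minimal sketch of the caching idea: run the (expensive) encoder once per image, then at each iteration gather only the features at the sampled ray pixels. This is not code from the repo; `extract_features`, `FeatureCache`, and the nearest-neighbour lookup are all hypothetical stand-ins.

```python
import numpy as np

def extract_features(image):
    # Stand-in for an expensive CNN encoder; a real model would run a
    # conv net and return an (H, W, C) feature map.
    return np.stack([image.mean(axis=-1)] * 4, axis=-1)

class FeatureCache:
    """Run the extractor once per image and reuse the result."""
    def __init__(self):
        self._cache = {}

    def get(self, image_id, image):
        if image_id not in self._cache:  # cache miss: pay the cost once
            self._cache[image_id] = extract_features(image)
        return self._cache[image_id]

def sample_ray_features(feat_map, pixel_xy):
    # Gather features only at the ~1000 sampled ray pixels instead of
    # re-encoding the whole image (nearest neighbour for simplicity;
    # bilinear interpolation would be the usual choice).
    xs, ys = pixel_xy[:, 0], pixel_xy[:, 1]
    return feat_map[ys, xs]

cache = FeatureCache()
img = np.random.rand(8, 8, 3)
feats = cache.get("view0", img)        # encoder runs here...
feats_again = cache.get("view0", img)  # ...but not here
rays = np.array([[1, 2], [5, 7]])
ray_feats = sample_ray_features(feats, rays)  # shape (2, 4)
```

The saving at training time comes from the last function: the per-iteration cost scales with the number of sampled rays, not the image resolution.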
alextrevithick (Owner)

Thanks for your interest in our work!

  1. The computational overhead of the feature extraction is expensive in both time and space, and you're right, some implementations allow us to cache the features.
  2. That's a really important question, and we are working on it right now. Personally, I'm pretty sure it could generalize across object classes if it were trained on multiple classes. As for the synthetic question, we will know the answer soon.

alextrevithick (Owner)

Here is a result for your final question. This was a model trained on just 4 synthetic scenes, not including lego.
[attached image: 200rendert]


kwea123 commented Nov 17, 2020

It seems it doesn't generalize that well in this case. If you then fine-tune the model on the lego scene from this point, does it take less time to reach the same (or better) performance compared to training on the lego scene from scratch?


alextrevithick commented Nov 17, 2020

Thanks for your insightful question. Here is an example after 1,000 iterations (it usually takes 250k iterations from scratch). It seems the model can reach good performance very fast.
[attached image: ficus_100rendert]
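The speed-up from fine-tuning is essentially warm-starting the optimization near a good solution. A toy illustration of that effect (pure NumPy, entirely unrelated to the actual NeRF training code): fitting a 1-D linear model by gradient descent, starting either from scratch or from a "pretrained" weight close to the optimum.

```python
import numpy as np

def steps_to_converge(w, lr=0.05, tol=1e-3):
    # Gradient descent on least squares for y = 3x; counts iterations
    # until the weight is within tol of the optimum w* = 3.
    x = np.array([1.0, 2.0, 3.0])
    y = 3.0 * x
    steps = 0
    while abs(w - 3.0) > tol:
        grad = 2.0 * np.mean((w * x - y) * x)  # d/dw of mean squared error
        w -= lr * grad
        steps += 1
    return steps

from_scratch = steps_to_converge(0.0)  # cold start
fine_tuned = steps_to_converge(2.9)    # "pretrained" start near the optimum
```

The warm start reaches the tolerance in fewer steps; the 1k-vs-250k observation above is the same behaviour on a vastly harder problem.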


kwea123 commented Nov 17, 2020

Wow, just 1k iterations! That's really fast convergence! How long does it take to actually reach the same level of performance? From the image I'd say it's still at about 26 to 27 PSNR, while the best model trained from scratch can reach 32 to 33.
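For reference, the PSNR figures quoted here are computed from the mean squared error between rendered and ground-truth images. This helper is a generic definition of the metric, not code from the repo:

```python
import numpy as np

def psnr(pred, gt, max_val=1.0):
    # Peak signal-to-noise ratio in dB for images scaled to [0, max_val].
    mse = np.mean((pred - gt) ** 2)
    return 20.0 * np.log10(max_val) - 10.0 * np.log10(mse)

# A uniform error of 0.1 on a [0, 1] image gives MSE = 0.01, i.e. 20 dB.
gt = np.zeros((4, 4))
pred = np.full((4, 4), 0.1)
value = psnr(pred, gt)
```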

@kwea123 kwea123 closed this as completed Jul 19, 2021