Evaluation set used in paper? #3

Closed
wangjksjtu opened this issue Apr 22, 2022 · 10 comments

Comments

@wangjksjtu

@jasonyzhang Thanks for the awesome work and public code! In the paper, 20 actors from the MVMC dataset are used for quantitative evaluation. Could you please share the actor IDs in the evaluation set for easier comparison?

Thanks for your great help!

@wangjksjtu
Author

@jasonyzhang Sorry for the follow-up message! Could you also share more details about the quantitative evaluation (e.g., which view is held out)? That would be super helpful for reproduction and comparison, thanks!

@jasonyzhang
Owner

Hi, I'm working on releasing it asap! Will post the code, models, and splits to recreate the numbers, hopefully by end of the week.

@wangjksjtu
Author

Thanks so much for your help! Really appreciate it ;)

@jasonyzhang
Owner

Hi,

Sorry for the delay! I've now posted all the data for evaluation, which includes the off-the-shelf cameras (pre-processed to minimize the re-projection error between the template car mesh and the mask) and the optimized cameras (which have also been processed with some manual input).

The data also includes the rendered views from NeRS in the NVS evaluation protocol. I show how to replicate the numbers using the rendered views as well.

Please let me know if you encounter any issues!

@wangjksjtu
Author

Hi @jasonyzhang,

Thanks for the update! Really appreciate it! I could reproduce the numbers using the provided evaluation protocol. However, I noticed that if we use clean-fid to compute the FID scores, the numbers are inconsistent with the paper:

```
Name            MSE   PSNR   SSIM  LPIPS   clean-FID
ners_fixed   0.0254   16.5  0.720  0.172   113.
```

I guess you are using pytorch-fid to compute the FID scores in the paper. Would you mind sharing the clean-FID scores for all the baseline models? Thanks a lot!
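For reference, I computed the clean-FID roughly like this (a minimal sketch, assuming the `clean-fid` package; the directory names are placeholders, not paths from this repo):

```python
# Minimal clean-FID sketch: compare rendered NeRS views against the
# corresponding ground-truth images. Directory names are hypothetical.
from cleanfid import fid

score = fid.compute_fid("eval/ners_fixed_renders", "eval/gt_images")
print(f"clean-FID: {score:.1f}")
```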

@wangjksjtu
Author

Sorry, another question: from the eval code, it seems that the evaluation is done on all views (both training views and the held-out view). Is that the correct setting? I thought we should only evaluate on the novel views.

@wangjksjtu
Author

Hi @jasonyzhang, another question: the retrained results obtained by running train_evaluation_model.py are blurrier than the dumped results in data/evaluation. Here is one example:

re-trained model:
render_00

dumped results:
ners_00_fixed

Is this due to different hyperparameters? Thanks a lot in advance for your great help!

@jasonyzhang
Owner

Hi,

Re: FID
I computed FID over all of the generated outputs (i.e., every image generated for every instance) rather than averaging the FID per instance, as is done for the other metrics. I've posted the code for this now, and here are the numbers I get:

```
Name            MSE   PSNR   SSIM  LPIPS    FID
ners_fixed   0.0254   16.5  0.720  0.172   60.4
```

The FID for ners_fixed in the paper was 60.9, so only slightly off.
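For clarity, here's an illustrative sketch of the difference between the two aggregation strategies; it uses clean-fid only as an example backend, and the directory layout is hypothetical:

```python
# Sketch of the two FID aggregation strategies. Directory layout is made up
# for illustration: one sub-folder of predictions/ground truth per instance.
import os
from cleanfid import fid

instances = sorted(os.listdir("renders"))  # one sub-folder per car instance

# Per-instance FID, then averaged (how the other metrics are aggregated):
per_instance = [
    fid.compute_fid(f"renders/{inst}/pred", f"renders/{inst}/gt")
    for inst in instances
]
avg_fid = sum(per_instance) / len(per_instance)

# Pooled FID (how the paper number is computed): every rendered image from
# every instance goes into one set, compared against the pooled real images.
pooled_fid = fid.compute_fid("renders_all/pred", "renders_all/gt")
```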

Re: Evaluation protocol.
In the evaluation training code, each image/camera pair is independently treated as a target image/camera. For example, if an instance has 10 images, we train 10 models, each holding out one of the target views. For each model, we render from its held-out view for evaluation. Thus, we end up evaluating on all of the input images, but each one only ever as a held-out view.
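A minimal sketch of that leave-one-out loop, with placeholder helpers (`train_ners`, `render_view`, `compute_metrics`) standing in for the actual functions in this repo:

```python
# Leave-one-out NVS evaluation sketch. The helper names are placeholders,
# not the repo's API.
def evaluate_instance(images, cameras):
    """Train one model per held-out view; evaluate each model on its held-out view."""
    all_metrics = []
    for held_out in range(len(images)):
        train_ids = [i for i in range(len(images)) if i != held_out]
        model = train_ners(
            [images[i] for i in train_ids],
            [cameras[i] for i in train_ids],
        )
        rendered = render_view(model, cameras[held_out])
        all_metrics.append(compute_metrics(rendered, images[held_out]))
    # Every input image is evaluated exactly once, always as a held-out view.
    return all_metrics
```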

Re: blurry results.
I was trying to train a smaller model to save time, but it looks like the performance is much worse. The evaluation code was training an 8-layer texnet for 1000 iterations, whereas the demo code trains a 12-layer texnet for 3000 iterations. I've switched back to the latter set of hyperparameters and am currently re-running it as well.
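For reference, the two settings side by side (key names are illustrative, not the actual config fields in the repo):

```python
# Hyperparameter settings mentioned above; keys are illustrative placeholders.
EVAL_HPARAMS = {"texnet_layers": 8,  "iterations": 1000}  # faster, but blurrier
DEMO_HPARAMS = {"texnet_layers": 12, "iterations": 3000}  # matches the paper renders
```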

@wangjksjtu
Author

I see, thanks a lot for the detailed reply! Really appreciate it ;)

@jasonyzhang
Owner

jasonyzhang commented Apr 29, 2022

Ah, actually the blurry results are because the number of Fourier bases in the default config is too low. The default is 6, but 10 seems to work much better.

Rendering used for evaluation in main paper
render_submission

8-layer tex net, 1k training iterations, L=6
render_8_layer

12-layer tex net, 3k training iterations, L=6
render_12_layer

12-layer tex net, 3k training iterations, L=10
render_12_layer_L10

I have updated the code so that evaluation defaults to L=10. This was already the default for the demo script.
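For context, here is a rough sketch of what L controls, assuming the texture network uses a NeRF-style Fourier feature encoding of its input coordinates (an assumption for illustration, not the exact implementation here). Higher L lets the network represent higher-frequency texture detail, which is why L=10 renders sharper than L=6:

```python
# Illustrative NeRF-style Fourier feature encoding; not the repo's exact code.
import math
import torch

def fourier_encode(x: torch.Tensor, num_bases: int = 10) -> torch.Tensor:
    """Map coordinates x of shape (..., D) to Fourier features of shape (..., D * 2 * num_bases)."""
    freqs = 2.0 ** torch.arange(num_bases, dtype=x.dtype, device=x.device)
    angles = x[..., None] * freqs * math.pi                             # (..., D, L)
    feats = torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)   # (..., D, 2L)
    return feats.flatten(start_dim=-2)                                  # (..., D * 2L)
```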
