
you should change inception model to evaluation mode before calculating FID score #3

Open
jychoi118 opened this issue Apr 6, 2020 · 4 comments


@jychoi118

You should switch the Inception model to evaluation mode before calculating the FID score. The Inception model contains batch normalization layers, whose training and evaluation behaviors differ.

For example, you should add `inception.eval()` below line 486 of stylegan/finetune.py.

With this correction, I got significantly different FID scores compared to those reported in your paper.
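To illustrate why this matters (a minimal, self-contained sketch, not code from this repo): in PyTorch, a `BatchNorm2d` layer normalizes with per-batch statistics in `train()` mode but with its accumulated running statistics in `eval()` mode, so the same input yields different activations — and hence different FID features — depending on the mode.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(3)          # stand-in for a BN layer inside Inception
x = torch.randn(2, 3, 4, 4)

bn.train()                      # normalizes with the current batch's statistics
out_train = bn(x)

bn.eval()                       # normalizes with the stored running statistics
out_eval = bn(x)

# Same input, different normalization statistics -> different outputs.
print(torch.allclose(out_train, out_eval))  # False
```

Since FID is computed from Inception activations, evaluating with the model left in `train()` mode silently changes those activations and therefore the reported score.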

@sangwoomo
Owner

Hi, thank you for noticing that! I completely forgot about this issue.

By the way, while correcting the mode may worsen the absolute FID scores of both fine-tuning and FreezeD, I'd guess the relative order remains the same. Could you report your values if my guess turns out to be wrong?

sangwoomo added a commit that referenced this issue Apr 6, 2020
@jychoi118
Author

Yes, the relative order remains the same.

Unfortunately, I lost the values from my runs with your experiment settings; I will report them after I rerun the experiments. However, I'm currently experimenting with StyleGAN-V2 and the AFHQ dataset (500-image dog test set) from stargan-v2 using your FreezeD method. I got an FID of 49.3 without eval(), and 98.1 with eval(). StyleGAN-V1 will probably also show roughly doubled FID scores.

Thank you for your nice research!

@sangwoomo
Owner

Happy to hear that the relative order is the same!
I hope you develop a better method and report the updated values in your manuscript :)

@Hsintien-Ng

> I got FID score of 49.3 without eval(), and 98.1 with eval().

Hi, do you mean FID score of 49.3 with eval() and 98.1 with train() mode?
