Cannot reproduce the FID results of provided pre-trained models (LSUN - Churches & CelebA) #213
Comments
I think a larger sample count, like 50k, could further lower the FID by around 30%?
Hi @hieuphung97, have you reproduced the results reported in the paper? I sampled 35k images using the provided pre-trained model on lsun_churches and calculated the FID via torch-fidelity, and the FID was 15.89.
Hi @hieuphung97, I wonder whether you are using the CelebA or the CelebA-HQ dataset? I got an FID of 27 on the CelebA-HQ dataset.
Is the dataset you use for calculating FID pre-processed correctly? It's quite important.
I use pytorch-fid, which takes the path to the dataset folder directly as an argument... so it should not be a pre-processing issue.
Are the images in the dataset 256×256?
Yes... I downloaded the CelebA-HQ dataset from here. But I would greatly appreciate it if you could provide the URL of your dataset; I would give it a try.
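As an aside on that pre-processing point: CelebA/LSUN evaluation images are typically center-cropped on the shorter side and then resized to 256×256 before FID is computed. A minimal sketch of just the crop geometry (pure arithmetic, no image library; the function name is ours, not from any of the packages mentioned here):

```python
def center_crop_box(width, height):
    """Return the (left, top, right, bottom) box that center-crops an
    image to a square on its shorter side, as is typically done before
    resizing evaluation images to 256x256."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)

# e.g. a 1024x768 image is cropped to its central 768x768 region
print(center_crop_box(1024, 768))  # -> (128, 0, 896, 768)
```

If the real and generated image folders are cropped or resized differently, the resulting FIDs are not comparable, which is one common cause of gaps like the ones in this thread.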
Update: I also got an FID of ~17 on the CelebA dataset, which is similar to @hieuphung97's result.
Hi @ader47 @ThisisBillhe
@ThisisBillhe I used the CelebA-HQ dataset.
Thank you so much :)
Hi @hieuphung97. Details of what I did: I sampled 50k images with the settings you mentioned. Then I wrote a script that instantiates two dataloaders, with the same pre-processing as the original ones, for the real and fake image directories. I pass these two dataloaders to the torch-fidelity package, as the paper suggests.
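For anyone sanity-checking the numbers themselves: FID is the Fréchet distance between Gaussians fitted to Inception features of the real and generated sets. A minimal NumPy/SciPy sketch of the underlying formula (this is not the torch-fidelity or pytorch-fid implementation, just the math they both compute on the extracted feature statistics):

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2):
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * (sigma1 @ sigma2)^{1/2})."""
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    if np.iscomplexobj(covmean):
        # numerical noise in sqrtm can leave tiny imaginary parts
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))

# identical statistics give a distance of (numerically) zero
mu, sigma = np.zeros(4), np.eye(4)
print(frechet_distance(mu, sigma, mu, sigma))
```

Because the formula depends only on feature means and covariances, any mismatch in feature extraction (image size, crop, resize filter) between the two directories shifts the result, independently of model quality.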
Has anyone had issues when re-evaluating the provided pre-trained models on the LSUN-Churches and CelebA datasets?
I cannot get the FIDs reported in the paper.
In fact, my results are far from the reported ones (LSUN-Churches: 4.02 in the paper vs. 11.5 here; CelebA: 5.11 in the paper vs. 17.4 here).
The only difference I noticed is that I sampled only 10k images for each case instead of 50k as in the paper (to save time). I don't know whether the number of samples has such a significant impact.
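On the sample-count question: FID is a biased estimator, and estimates from fewer samples are systematically higher, so part (though probably not all) of the gap may come from using 10k instead of the paper's 50k. The bias is easy to see in a toy 1-D simulation, where the Fréchet distance between two fitted Gaussians has a simple closed form (illustrative only, not the image metric itself):

```python
import numpy as np

def fid_1d(x, y):
    # Closed-form Frechet distance between 1-D Gaussians fitted to x and y:
    # (difference of means)^2 + (difference of stds)^2
    return (x.mean() - y.mean()) ** 2 + (x.std() - y.std()) ** 2

rng = np.random.default_rng(0)
# Both sample sets come from the same N(0, 1), so the true distance is 0;
# the finite-sample estimate shrinks toward 0 as n grows.
for n in (100, 10_000):
    print(n, fid_1d(rng.normal(size=n), rng.normal(size=n)))
```

The same effect applies to real FID evaluations, which is one reason papers fix the sample count (usually 50k) when reporting numbers.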