
Why the refined results are worse than coarse results? #6

Closed
cats-food opened this issue Jun 23, 2022 · 2 comments

Comments

@cats-food

Hi, Thank you very much for your great work!

I am testing your pretrained model on the example images you provided, but at a resolution of 256×256 (the original size is 512×512, but I want to test at 256×256). Here are my args:

python test.py --name celeba --img_file ./examples/celeba/img/ --mask_file ./examples/celeba/mask/ --results_dir ./results --model tc --coarse_or_refine refine --gpu_id 0 --no_shuffle --batch_size 1 --preprocess scale_shortside --mask_type 3 --load_size 256 --fine_size 256 --attn_G --add_noise

And here are the output results (all the images below are 256×256):

[image: coarse vs. refined inpainting results]

These are just two examples. I didn't modify any part of the code, so I wonder why the refined results are worse than the coarse results. Did I use the wrong args? I would be grateful if you could reply to my question. Thanks in advance :)

@lyndonzheng (Owner)

Hi @Yang-Shiyuan, thanks for pointing out this issue. I have not tested the model at 256×256 resolution. As our refined model is trained on 512×512 images, the attention-aware layer is tuned on high-resolution features, which 256×256 inputs do not provide. Therefore, the refined model may perform worse at that resolution. You can instead evaluate the model directly at 512×512 and then downsample the refined results to 256×256.
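Following that suggestion, a minimal sketch of the downsampling step (assuming Pillow is installed; the directory paths and PNG extension are illustrative, not part of the repo):

```python
from pathlib import Path
from PIL import Image

def downsample_results(src_dir: str, dst_dir: str, size: int = 256) -> int:
    """Resize every PNG in src_dir to size x size and save it to dst_dir.

    Returns the number of images written.
    """
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(Path(src_dir).glob("*.png")):
        img = Image.open(path)
        # LANCZOS resampling gives high-quality downscaling for natural images
        img.resize((size, size), Image.LANCZOS).save(out / path.name)
        count += 1
    return count
```

So the workflow would be: run test.py with --load_size 512 --fine_size 512 to get 512×512 refined outputs, then call something like downsample_results("./results", "./results_256") to obtain the 256×256 versions.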

@cats-food (Author)

@lyndonzheng Thank you for your swift reply! Yes, I have also noticed the implementation details in your paper: the refined model is trained on 512×512 images, so working at the 512 size should yield much more reasonable results. Thank you again for your suggestions :)

