When should I use --attn-res-layers, and by what principle should I set its values? #31
Comments
Yes, using attention does improve quality, at least as reflected in the FID scores, which tend to go lower.
Which is better: changing [32] to [96], or to [32,64]? What is the difference?
I think it should be a power of 2, so 96 would not be valid. See lightweight-gan/lightweight_gan/lightweight_gan.py, lines 396 to 401 at commit 845eb9d.
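A minimal sketch of the kind of gating those lines perform (the names here are illustrative, not the repository's exact code). It also shows why 96 can never match: the feature maps produced while upsampling only ever have power-of-2 resolutions.

```python
# Hedged sketch: how a list like [32, 64] can gate where attention blocks
# are inserted while a generator upsamples. Names are placeholders,
# not the repo's exact code.
import math

def build_layer_plan(image_size, attn_res_layers):
    assert math.log2(image_size).is_integer(), 'image size must be a power of 2'
    # resolutions produced while upsampling from 4x4 to the final size
    resolutions = [2 ** i for i in range(2, int(math.log2(image_size)) + 1)]
    plan = []
    for res in resolutions:
        use_attn = res in attn_res_layers  # attention only at listed resolutions
        plan.append((res, 'upsample + conv' + (' + attention' if use_attn else '')))
    return plan

for res, layers in build_layer_plan(256, attn_res_layers=[32, 64]):
    print(f'{res:>3}x{res:<3} -> {layers}')
```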
Of course, but my question is about the difference between one large value and two smaller values.
@Dok11 I think you are misunderstanding the value. It puts an attention layer into the neural network graph at each resolution you specify, so the more resolutions, the better, of course, as you'll get attention at different levels. It's the same as with convolutions. If you can only afford one, it depends on your training data: if it has a lot of global structure, a lower-resolution layer is beneficial; if it has a lot of local structure, a higher-resolution layer is more beneficial.
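To make the "different levels" point concrete, here is a rough illustration (my own, not from the repo, and ignoring the extra receptive field that the surrounding convolutions add) of how much of the final image one feature-map cell spans at each candidate resolution in `--attn-res-layers`:

```python
# Rough illustration (not repo code): how much of the output image a single
# feature-map cell covers at each candidate attention resolution.
# Lower resolutions -> each cell spans more of the image -> attention there
# relates large-scale (global) structure; higher resolutions relate fine detail.
image_size = 256
for res in [32, 64, 128, 256]:
    cell_px = image_size // res  # output pixels per feature-map cell (per side)
    print(f'attention at {res:>3}: each cell ~ {cell_px}x{cell_px} px of the output')
```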
@Mut1nyJD I still don't understand attention layers, but I think I have a reasonable question. Changing attn-res-layers from [32] to [32,64,128,256] increases the model file size by no more than two megabytes. So does it really improve quality?
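The small change in file size is actually expected: an attention layer's weights live in its channel-wise projections, so parameter count scales with the channel count at that stage, not with the 32x32 vs 256x256 grid the layer runs over. A back-of-the-envelope sketch (the channel counts and head sizes below are my assumptions, not the repo's):

```python
# Back-of-the-envelope (illustrative channel counts, not the repo's):
# attention parameters come from 1x1 query/key/value/output projections,
# so they depend on channels, not on the spatial resolution they are applied at.
def attn_params(channels, dim_head=64, heads=4):
    inner = dim_head * heads
    qkv = 3 * channels * inner  # three 1x1 projections, no bias
    out = inner * channels      # output projection
    return qkv + out

for res, ch in [(32, 256), (64, 128), (128, 64), (256, 32)]:
    p = attn_params(ch)
    print(f'res {res:>3} (~{ch} channels): ~{p/1e6:.2f} M params (~{p*4/1e6:.1f} MB fp32)')
```

Since channel counts typically shrink as resolution grows, the high-resolution entries add the fewest parameters of all, which matches the "barely two megabytes" observation.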
The implementation of GSA (global self-attention) in the code comes from lucidrains' repository for it; one could refer to that for the prior work. Apparently, it is a cheaper way to have attention.
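The "cheaper" part comes from avoiding the N x N pixel-pair matrix of standard attention: keys are aggregated with values first, so the cost grows linearly with the number of pixels. A minimal sketch of that content-attention trick (core einsums only, not the full GSA module; the positional branch is omitted):

```python
# Minimal sketch of linear-cost content attention in the spirit of GSA
# (not the repo's exact module; positional attention omitted).
import torch

def content_attention(q, k, v):
    # q, k, v: (batch, n_pixels, dim)
    k = k.softmax(dim=1)                          # normalize keys over pixels
    context = torch.einsum('bnd,bne->bde', k, v)  # dim x dim summary: O(n * d^2)
    return torch.einsum('bnd,bde->bne', q, context)

q = k = v = torch.randn(1, 64 * 64, 32)  # 4096 pixels, no 4096x4096 matrix formed
out = content_attention(q, k, v)
print(out.shape)  # torch.Size([1, 4096, 32])
```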
Why is the default [32], and when does one need to add more entries (e.g. [32,64,128]) or change the values?
I see that it uses more memory, so I assume it should improve quality, but where is the tradeoff?
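The tradeoff is mostly activation memory and compute rather than file size: each attention layer works over res^2 positions, so every high-resolution entry you add multiplies the number of tokens attended over. A rough count (illustrative numbers, assuming linear-cost attention as sketched above):

```python
# Rough per-layer work count (illustrative): even with linear-cost attention,
# compute and activation memory grow with the number of spatial positions,
# so high-resolution entries in attn-res-layers cost the most.
dim = 32
for res in [32, 64, 128, 256]:
    n = res * res              # spatial positions the layer attends over
    flops = 2 * n * dim * dim  # the two einsums of the linear-attention sketch
    print(f'res {res:>3}: {n:>6} tokens, ~{flops/1e6:.0f} MFLOPs per pass')
```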