Learned Prior #11
I took it from the official implementation. It was beneficial for training stability. I don't think the intuition behind it is given in the papers, but I believe having spatial, content-dependent priors makes the overall problem easier. |
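For readers following along, the idea under discussion can be sketched roughly as follows: the mean and log-std of the Gaussian prior are predicted from the block's own output by a small zero-initialized convolution, so the prior starts as a standard N(0, 1) and only becomes content-dependent during training. This is a minimal illustrative sketch, not the repo's exact code (class and function names here are made up):

```python
import math
import torch
from torch import nn

class LearnedPrior(nn.Module):
    """Content-dependent Gaussian prior: mean and log-std are
    predicted from the block's own output by a small conv."""

    def __init__(self, channels):
        super().__init__()
        # Zero-initialized, so the prior is exactly N(0, 1) at the
        # start of training and drifts away only as needed.
        self.conv = nn.Conv2d(channels, 2 * channels, 3, padding=1)
        self.conv.weight.data.zero_()
        self.conv.bias.data.zero_()

    def forward(self, out):
        # Split the 2*channels output into per-pixel mean and log-std.
        mean, log_sd = self.conv(out).chunk(2, dim=1)
        return mean, log_sd

def gaussian_log_p(z, mean, log_sd):
    # Elementwise log N(z; mean, exp(log_sd)^2).
    return (-0.5 * math.log(2 * math.pi) - log_sd
            - 0.5 * (z - mean) ** 2 / torch.exp(2 * log_sd))
```

With the zero init, the learned prior costs nothing at initialization and gives the model extra freedom later, which is plausibly where the stability benefit comes from.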
Thanks for the response! |
Hi, I have some detailed questions regarding your code. I would be happy if you have the time to answer them.
I appreciate your time. |
Actually, I don't know the exact reasons for these details. I tried to replicate the paper, and after that didn't work well, I took these details from the official implementation. Anyway, they are related to training stability. |
Hi, thanks for your response; you are right. I appreciate your thoughts. |
Hmm, wouldn't it be better to have noise slightly smaller than actual pixel value changes in [0, 255]? Though I guess it would not make much difference. |
Yes, might be. I also think it does not make a huge difference overall. |
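For context, the noise being discussed is the uniform dequantization noise: pixels in [0, 255] are re-quantized to n_bins levels, and uniform noise of width 1/n_bins (exactly one quantization step) is added before training. A minimal sketch of that preprocessing under these assumptions (variable names are illustrative):

```python
import torch

def preprocess(image, n_bits=5):
    """Quantize to n_bits and add uniform dequantization noise.

    image: uint8 tensor with values in [0, 255].
    Returns values in roughly [-0.5, 0.5).
    """
    n_bins = 2.0 ** n_bits
    x = image.float()
    if n_bits < 8:
        # Drop the lowest bits: re-quantize 256 levels to n_bins levels.
        x = torch.floor(x / 2 ** (8 - n_bits))
    x = x / n_bins - 0.5
    # Noise width 1 / n_bins == exactly one quantization step,
    # which makes the density model a proper bound on discrete
    # log-likelihood; slightly smaller noise would break that.
    x = x + torch.rand_like(x) / n_bins
    return x
```

The step size of the noise matching the quantization step is what turns the continuous log-likelihood into a valid bound on the discrete one, which is one reason not to shrink it.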
Hi, after investigating the Glow official implementation more thoroughly, I wanted to make some clarifications regarding this topic:
Thanks. |
Slightly different. They also used zero padding, but concatenated additional input channels that are all zeros except at the borders. |
Absolutely right. Do you understand what the usage of that additional input (the |
It will help feature maps to have different values (a kind of bias) at the edges. I think zero padding can also achieve that to some degree, though. |
Right. As you mentioned, zero-padding allows for different values, since the kernel convolved with the input has a bias by default; so different values are possible even though the kernel weights are multiplied by zeros (the pads). But the |
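The padding scheme being described can be sketched as follows: the convolution zero-pads its input and also concatenates one extra indicator channel that is 1 on the padded frame and 0 in the interior, letting the kernel learn an explicit edge bias. This is an illustrative sketch of that idea, not the official implementation's exact code (the class name is made up):

```python
import torch
import torch.nn.functional as F
from torch import nn

class EdgeBiasConv2d(nn.Module):
    """Conv that zero-pads and appends a border-indicator channel.

    The indicator channel is 0 in the interior and 1 on the padded
    frame, so the kernel can learn a distinct bias at image edges
    instead of relying only on the implicit zeros of the padding.
    """

    def __init__(self, in_channels, out_channels, kernel_size=3):
        super().__init__()
        self.pad = kernel_size // 2
        # One extra input channel for the border indicator.
        self.conv = nn.Conv2d(in_channels + 1, out_channels, kernel_size)

    def forward(self, x):
        x = F.pad(x, [self.pad] * 4)  # plain zero padding
        b, _, h, w = x.shape
        edge = torch.ones(b, 1, h, w, device=x.device)
        edge[:, :, self.pad:h - self.pad, self.pad:w - self.pad] = 0
        return self.conv(torch.cat([x, edge], dim=1))
```

Compared with plain zero padding, the indicator makes the "this pixel is at a border" signal explicit rather than something the network must infer from the padded zeros.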
Hi,
Thank you for your great implementation.
Regarding the "learned" prior, I wanted to ask:
1- Why are you considering the prior to be a Gaussian with trainable parameters rather than, for instance, a unit Gaussian?
2- For the Gaussian prior, what is the motivation behind obtaining the mean and std of the Gaussian by passing `out` through the CNN? Is it just because you found it to be more useful? (https://github.com/rosinality/glow-pytorch/blob/master/model.py#L285)
Thanks in advance.