Tanh in the Generator last activation #522
Comments
The goal is to match the range of the real images. After normalization, real images lie in [-1, 1], and tanh outputs values in that same range. |
Clear! But suppose I normalize the whole image [HxW] to [-1, 1], then randomly crop it to [H/8 x W/8] and feed the crop to the network. The values in the [H/8 x W/8] crop will clearly not span the full range [-1, 1]. Should I then not use tanh in the last layer? How would you prefer to handle it? I cannot feed the whole [HxW] image because of memory limits. |
[-1, 1] is the range of each pixel's value (the brightness / color of each pixel should be between -1 and 1), so it has nothing to do with the width and height of the image. |
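As a minimal sketch of the point above, assuming 8-bit pixel values in [0, 255] and a mean and std of 0.5 (the helper name `normalize_pixel` is hypothetical, not taken from this repository), per-pixel normalization and the tanh output range can be compared like this:

```python
import math

def normalize_pixel(value):
    """Map an 8-bit pixel value in [0, 255] to [-1, 1]."""
    scaled = value / 255.0          # [0, 255] -> [0, 1]
    return (scaled - 0.5) / 0.5     # [0, 1]   -> [-1, 1]

# The range is a per-pixel property; image width and height do not enter.
pixels = [0, 64, 128, 255]
normalized = [normalize_pixel(p) for p in pixels]

# tanh squashes any real pre-activation into (-1, 1), the same interval.
pre_activations = [-10.0, -1.0, 0.0, 1.0, 10.0]
generated = [math.tanh(z) for z in pre_activations]

print(min(normalized), max(normalized))
print(min(generated), max(generated))
```

Both printed ranges fall inside [-1, 1], regardless of how many pixels the image has.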
@taesungp: No, I think you misunderstood my question. Let's
Now the image intensities are in [-1, 1]. If I randomly crop the image to [H/8 x W/8], do you think the range of the cropped image is still [-1, 1]? No. It will be a different range. |
In the first line you should do
|
Yes. But we crop the image after normalization. I know that we should normalize after cropping, but in my case I want to normalize before cropping. |
I think |
It is correct. But the problem here is that if an image of size WxH is normalized to [-1, 1] and a region is then cropped from it, the region may not span [-1, 1]; it may cover only, say, [-0.5, 0.5]. The output of tanh is in [-1, 1], so the range of the cropped input and the range of the network output are inconsistent. |
|
I gave an example for that
|
Yes...? The cropped image can simply be thought of as a smaller uncropped image. You are just training with smaller images. Let's say you don't use cropping. What if the input image is
so that all values are within [101, 119]? As such, cropping does not introduce any extra problem. If images are within the range [-0.5, 0.5], the generator will learn pre-activations in [arctanh(-0.5), arctanh(0.5)], so that its tanh output stays within [-0.5, 0.5]. |
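The argument above can be checked numerically. The sketch below uses a made-up 4x4 image and hypothetical `normalize` and `crop` helpers; it only illustrates that a crop of a normalized image stays inside [-1, 1], and that tanh reaches a target of 0.5 at a pre-activation of arctanh(0.5).

```python
import math

def normalize(image):
    """Map 8-bit values in [0, 255] to [-1, 1], pixel by pixel."""
    return [[(v / 255.0 - 0.5) / 0.5 for v in row] for row in image]

def crop(image, top, left, height, width):
    """Return a height x width window of a 2D list."""
    return [row[left:left + width] for row in image[top:top + height]]

# Made-up image: dark and bright extremes on the border, mid-grey centre.
image = [
    [0, 10,  20,  255],
    [5, 101, 110, 250],
    [8, 115, 119, 245],
    [0, 12,  18,  255],
]

patch = crop(normalize(image), top=1, left=1, height=2, width=2)
values = [v for row in patch for v in row]

# The patch covers a narrower range, but it is still inside [-1, 1].
print(min(values), max(values))

# tanh can output any value in (-1, 1); 0.5 needs pre-activation atanh(0.5).
z = math.atanh(0.5)
print(z, math.tanh(z))
```

A narrower target range is therefore a subset of what tanh can produce, not something outside it.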
Yes, I have heard elsewhere as well that the range of real images is [-1, 1].
What I do know is that torchvision.transforms.ToTensor divides values in [0, 255] by 255, scaling them to [0, 1].
But it has been my belief that this doesn't matter much because of the normalization layers. They give the activations a mean of 0 and a std of 1, so the range of the original input shouldn't matter, even if it has been shifted to [0, 1]. Is that a wrong statement or an ill-conceived notion? What are your thoughts on that? |
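One part of the claim above can be illustrated in isolation: standardizing a set of values to zero mean and unit standard deviation gives the same result whether the values started in [0, 255], [0, 1], or [-1, 1], because those ranges differ only by a positive scale and a shift. The sketch below is plain Python, not the normalization layer of any library, and `standardize` is a hypothetical helper. It says nothing about the generator's output side, where the range still has to match the real images.

```python
import math

def standardize(values):
    """Shift and scale values to zero mean and unit standard deviation."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var)
    return [(v - mean) / std for v in values]

raw = [0.0, 51.0, 102.0, 204.0, 255.0]      # values in [0, 255]
unit = [v / 255.0 for v in raw]              # values in [0, 1]
signed = [(u - 0.5) / 0.5 for u in unit]     # values in [-1, 1]

a = standardize(raw)
b = standardize(unit)
c = standardize(signed)

# The three results agree up to floating-point rounding.
print(max(abs(x - y) for x, y in zip(a, b)))
print(max(abs(x - y) for x, y in zip(a, c)))
```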
|
Ah, so that is what you meant! I was worried this whole time that images originally contained values in [-1, 1] instead of what I have been telling people (i.e. [0, 255]). Also yes, that would do it. I got so used to using different precomputed means and stds, which don't give [-1, 1] for new data, that I forgot you were using 0.5 for all of them. Does this mean, though, that instead of an input in [0, 1] and an output in [0, 1] through Sigmoid, it is better to normalize the [0, 1] input to [-1, 1] and produce a [-1, 1] output through Tanh, since the latter is normalized? |
Yes, your understanding is correct. |
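For reference, the two conventions are related by the identity tanh(x) = 2 * sigmoid(2x) - 1, so a sigmoid output in [0, 1] and a tanh output in [-1, 1] differ by the same affine map that takes a [0, 1] input to [-1, 1]. A small sketch in plain Python (the helper names `sigmoid` and `to_signed` are hypothetical) checks this numerically:

```python
import math

def sigmoid(x):
    """Logistic function, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def to_signed(u):
    """Map a value in [0, 1] to [-1, 1] (mean 0.5, std 0.5)."""
    return (u - 0.5) / 0.5

xs = [-3.0, -0.5, 0.0, 0.5, 3.0]

# Largest gap between tanh(x) and the rescaled sigmoid over the samples.
gap = max(abs(math.tanh(x) - to_signed(sigmoid(2.0 * x))) for x in xs)
print(gap)
```

The gap is at the level of floating-point rounding, which is why the choice between the two pairings comes down to which range the data is normalized to.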
Thanks for your work.
I am wondering why you are using tanh as the last activation of the generator.
Thanks again.