
Tanh in the Generator last activation #522

Closed
YuvalFrommer opened this issue Feb 10, 2019 · 15 comments

Comments

@YuvalFrommer

Thanks for your work.
I am wondering why you are using tanh as the last activation of the generator.
Thanks again.

@junyanz
Owner

junyanz commented Feb 10, 2019

The goal is to match the range. The range of the (normalized) real images is [-1, 1], and tanh outputs values in [-1, 1].

@John1231983

Clear!

If I normalize the whole image [HxW] to [-1, 1] and then randomly crop a patch of size [H/8 x W/8] to feed to the network, clearly the range of the [H/8 x W/8] patch will not span the full [-1, 1]. Should I not use tanh in the last layer? How would you prefer to handle this? I cannot feed the whole [HxW] image due to memory constraints.

@taesungp
Collaborator

[-1, 1] is the range of the value of each pixel (the brightness/color of each pixel should be within -1 and 1), so it has nothing to do with the width and height of the image.

@John1231983

@taesungp : No, you misunderstood my question. Let's say I is an image of size HxW. The normalization will be

I = I / max(I)
I = (I - 0.5) / 0.5

Now the image intensity is in [-1, 1]. If I randomly crop the image to [H/8 x W/8], do you think the cropped image's range is still [-1, 1]? No, it will be in a different range.

@taesungp
Collaborator

In the first line you should do

I = I / 255.0

instead of I = I / max(I), so that the normalization becomes independent of the values in the current crop of I.
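A small numpy illustration of this point (a sketch with made-up pixel values, not code from the repo): dividing by max(I) changes with the crop, while dividing by the constant 255 commutes with cropping.

```python
import numpy as np

img = np.array([[10, 20], [30, 240]], dtype=np.float64)
crop = img[:1, :]  # top row only: [10, 20]

# max-based normalization: the scale depends on the crop's own maximum
full_max = img / img.max()    # divides by 240
crop_max = crop / crop.max()  # divides by 20 -- a different scale!
assert not np.allclose(full_max[:1, :], crop_max)

# constant normalization: cropping and scaling commute
full_255 = img / 255.0
crop_255 = crop / 255.0
assert np.allclose(full_255[:1, :], crop_255)  # identical
```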

@John1231983

Yes. But after normalization, we crop the image. I know we should normalize after cropping, but in my case I want to normalize before cropping.

@taesungp
Collaborator

I think I = I / 255.0 is independent of cropping: cropping and then dividing by 255 is the same as dividing by 255 and then cropping.

@John1231983

That is correct. But the problem is that if an image of size WxH is normalized to [-1, 1] and we then crop a region, the region may not span [-1, 1]; it may be, say, [-0.5, 0.5]. Since the output of tanh spans [-1, 1], this makes the range of the cropped input inconsistent with the range the network can output.

@taesungp
Collaborator

  • Even with tanh, if the ground-truth cropped image is in the range [-0.5, 0.5], the generator network will learn to output values in [-0.5, 0.5]. In other words, tanh does not force all outputs to have max value 1. For example, if the generator outputs zero everywhere, the image will also be zero everywhere, not -1 or 1.
  • You actually have exactly the same situation with uncropped images. Some images are bright, so they lie in the [0, 1] range, not [-1, 1]. Some images are greyish, so they lie within [-0.5, 0.5]. You have the same amount of "problem" with or without cropping.
  • Tanh merely constrains the minimum and maximum output of the generator to be -1 and 1. The network could probably do just as well with .clamp(-1, 1) instead of Tanh().
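A minimal numpy sketch of those bullets (an illustration only, not the repo's code; numpy's tanh and clip stand in for PyTorch's Tanh() and clamp()):

```python
import numpy as np

# raw, unbounded generator pre-activations
raw = np.array([-3.0, -0.4, 0.0, 0.4, 3.0])

via_tanh = np.tanh(raw)          # smooth squash into (-1, 1)
via_clamp = np.clip(raw, -1, 1)  # hard cut at -1 and 1

# both keep outputs within [-1, 1] ...
assert via_tanh.min() >= -1 and via_tanh.max() <= 1
assert via_clamp.min() == -1 and via_clamp.max() == 1

# ... and neither forces small values out to +/-1:
# tanh(0) == 0, and clip leaves in-range values untouched
assert via_tanh[2] == 0.0
assert np.allclose(via_clamp[1:4], raw[1:4])
```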

@John1231983

Here is an example of that:

import numpy as np

I = [[1, 2, 3, 4, 5, 6, 7, 8, 9], [11, 12, 13, 14, 15, 16, 17, 18, 255],
     [1, 2, 3, 4, 5, 6, 7, 8, 9], [11, 12, 13, 14, 15, 16, 17, 18, 255],
     [1, 2, 3, 4, 5, 6, 7, 8, 9], [11, 12, 13, 14, 15, 16, 17, 18, 255],
     [1, 2, 3, 4, 5, 6, 7, 8, 9], [11, 12, 13, 14, 15, 16, 17, 18, 255],
     [1, 2, 3, 4, 5, 6, 7, 8, 9], [11, 12, 13, 14, 15, 16, 17, 18, 255]]
I = np.asarray(I)
I = I / 255.0
I = (I - 0.5) / 0.5
print(I.min(), I.max())            # -0.9921568627450981 1.0
I_crop = I[4:6, 4:6]
print(I_crop.min(), I_crop.max())  # -0.9607843137254902 -0.8745098039215686

@taesungp
Collaborator

Yes...? The cropped image can just be thought of as a smaller uncropped image. You are just training with smaller images.

Let's say you don't use cropping. What if the input image is

I = [[1,2,3,4,5,6,7,8,9],[11,12,13,14,15,16,17,18,19],
     [1,2,3,4,5,6,7,8,9],[11,12,13,14,15,16,17,18,19],
     [1,2,3,4,5,6,7,8,9],[11,12,13,14,15,16,17,18,19],
     [1,2,3,4,5,6,7,8,9],[11,12,13,14,15,16,17,18,19],
     [1,2,3,4,5,6,7,8,9],[11,12,13,14,15,16,17,18,19]] + 100

so that all values are within [101, 119]? As such, cropping does not introduce any extra problem. If images are within the range [-0.5, 0.5], the generator's pre-tanh activations will learn to lie in [arctanh(-0.5), arctanh(0.5)], so its tanh outputs stay in [-0.5, 0.5].
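A quick numpy check of that range claim (a sketch for illustration, not code from the repo):

```python
import numpy as np

# If ground-truth images lie in [-0.5, 0.5], the pre-tanh activations the
# generator must learn lie in [arctanh(-0.5), arctanh(0.5)].
lo, hi = np.arctanh(-0.5), np.arctanh(0.5)

assert np.isclose(lo, -hi)            # arctanh is an odd function
assert np.isclose(np.tanh(lo), -0.5)  # tanh maps the bounds back ...
assert np.isclose(np.tanh(hi), 0.5)   # ... to [-0.5, 0.5]
print(round(float(hi), 4))  # 0.5493
```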

@pyoungkangkim

pyoungkangkim commented Dec 13, 2019

The goal is to match the range. The range of real images is [-1, 1]. Tanh outputs a value between [-1, 1].

Yes, I have heard elsewhere as well that the range of real images is [-1, 1].
However, I have two sequential questions.

  1. When I open an image using PIL, e.g. PIL.Image.open('some_img.jpg'), does PIL automatically convert the pixel values from [-1, 1] to [0, 255]? Or did you mean something different when saying the range of real images is [-1, 1]? I'm not totally sure whether the actual pixel values originally range over [-1, 1]; this may be my misunderstanding.

What I do know is that torchvision.transforms.ToTensor divides values ranging over [0, 255] by 255, thus scaling them to [0, 1].

  2. It seemed a bit odd to me that we usually first shift an image (if PIL does what question 1 suggests) to [0, 1] as the input AND THEN try to output something in [-1, 1], then plot by shifting back to [0, 1]. I thought it might be better to just take the original [-1, 1] values as input, output something in [-1, 1], and then plot by shifting back to [0, 1].

But it has been my belief that this doesn't actually matter much because of the normalization layers: normalization gives the activations a mean of 0 and a std of 1, so it doesn't matter what range the original input is in, even if it has been shifted to [0, 1]. Is that a badly conceived notion? What are your thoughts on that?

@junyanz
Owner

junyanz commented Dec 14, 2019

  1. The PIL image is [0, 255]. We convert it into [-1, 1] using torchvision.transforms.Normalize, as neural networks work better with zero-mean data. The input to the networks is [-1, 1] after this conversion. See this line for more details.
  2. The range for both the original images and generated images is [-1, 1].
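The two-step conversion described above, sketched as arithmetic in numpy (the repo's actual pipeline uses torchvision's ToTensor followed by Normalize with mean 0.5 and std 0.5 per channel; this block only checks the numbers):

```python
import numpy as np

pixels = np.array([0, 64, 128, 255], dtype=np.float64)

x = pixels / 255.0   # ToTensor: [0, 255] -> [0, 1]
x = (x - 0.5) / 0.5  # Normalize(mean=0.5, std=0.5): [0, 1] -> [-1, 1]

assert x.min() == -1.0 and x.max() == 1.0
print(x.round(3).tolist())  # [-1.0, -0.498, 0.004, 1.0]
```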

@pyoungkangkim

pyoungkangkim commented Dec 15, 2019

  1. The PIL image is [0, 255]. We convert it into [-1, 1] using torchvision.transforms.Normalize, as neural networks work better with zero-mean data. The input to the networks is [-1, 1] after this conversion. See this line for more details.
  2. The range for both the original images and generated images is [-1, 1].

Ah, so that is what you meant! I was worried this whole time that images originally contained values ranging over [-1, 1], instead of what I have been telling people (i.e., [0, 255]).

Also yes, that would do it. I got so used to normalizing with different precomputed means and stds, which doesn't give [-1, 1] for new data, that I forgot you were using 0.5 for all channels.

Does this mean, though, that instead of taking input in [0, 1] and producing output in [0, 1] through a Sigmoid, it's better to normalize the [0, 1] input to [-1, 1] and produce a [-1, 1] output through Tanh, since the latter is zero-centered?

@junyanz
Owner

junyanz commented Dec 15, 2019

Yes, your understanding is correct.
