
Question: PatchGAN Discriminator #39

Closed · johnkorn opened this issue Jun 1, 2017 · 32 comments
@johnkorn commented Jun 1, 2017

Hi there.
I was investigating your CycleGAN paper and code, and it looks like the discriminator you've implemented is just a conv net, not the PatchGAN mentioned in the paper.
Maybe I've missed something. Could you point me to where the processing of 70x70 patches happens?
Thanks in advance!

@phillipi (Collaborator) commented Jun 1, 2017

In fact, a "PatchGAN" is just a convnet! Or you could say all convnets are patchnets: the power of convnets is that they process each image patch identically and independently, which makes things very cheap (# params, time, memory), and, amazingly, turns out to work.

The difference between a PatchGAN and a regular GAN discriminator is that the regular GAN maps from a 256x256 image to a single scalar output, which signifies "real" or "fake", whereas the PatchGAN maps from 256x256 to an NxN array of outputs X, where each X_ij signifies whether patch ij in the image is real or fake. Which patch is patch ij in the input? Well, output X_ij is just a neuron in a convnet, and we can trace back its receptive field to see which input pixels it is sensitive to. In the CycleGAN architecture, the receptive fields of the discriminator turn out to be 70x70 patches in the input image!

This is all mathematically equivalent to if we had manually chopped up the image into 70x70 overlapping patches, run a regular discriminator over each patch, and averaged the results.

Maybe it would have been better if we called it a "Fully Convolutional GAN" like in FCNs... it's the same idea :)
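
For concreteness, here is a minimal sketch of a 70x70 PatchGAN-style discriminator in PyTorch, following the C64-C128-C256-C512 stack described in the paper (a simplified stand-in for the repo's defineD, not the exact code; norm layers and input channels vary by model):

import torch
import torch.nn as nn

netD = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1),
    nn.LeakyReLU(0.2),
    nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),
    nn.BatchNorm2d(128),
    nn.LeakyReLU(0.2),
    nn.Conv2d(128, 256, kernel_size=4, stride=2, padding=1),
    nn.BatchNorm2d(256),
    nn.LeakyReLU(0.2),
    nn.Conv2d(256, 512, kernel_size=4, stride=1, padding=1),
    nn.BatchNorm2d(512),
    nn.LeakyReLU(0.2),
    nn.Conv2d(512, 1, kernel_size=4, stride=1, padding=1),  # one logit per patch
)

x = torch.randn(1, 3, 256, 256)
print(netD(x).shape)  # torch.Size([1, 1, 30, 30]): each output sees a 70x70 patch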

phillipi closed this as completed Jun 1, 2017
@taki0112

Can you tell me which line in the code implements the PatchGAN?

@phillipi (Collaborator) commented Sep 27, 2017

Edit: see defineD

@taki0112 commented Oct 15, 2017

I have a question.

  1. I looked at the code (class NLayerDiscriminator(nn.Module)), but I don't see the number 70 anywhere.
    So why is it called a 70x70 PatchGAN?
    That is, where does the number 70 come from?

  2. The output of the code is 30x30x1 (the X_ij grid),
    but the patches of the PatchGAN were said to be 70x70 (patch ij).
    You said you traced back and found that patch ij is 70x70; how did you do that?

@phillipi (Collaborator)

  1. The "70" is implicit, it's not written anywhere in the code but instead emerges as a mathematical consequence of the network architecture.

  2. The math is here: https://github.com/phillipi/pix2pix/blob/master/scripts/receptive_field_sizes.m

@emilwallner

Here is a visual receptive field calculator: https://fomoro.com/tools/receptive-fields/#

I converted the math into Python to make it easier to understand:

def f(output_size, ksize, stride):
    """Receptive field of one conv layer, traced backwards from its output."""
    return (output_size - 1) * stride + ksize

# Walk backwards through the five conv layers, from the single output
# neuron down to the input pixels it can see:
last_layer = f(output_size=1, ksize=4, stride=1)              # receptive field: 4
fourth_layer = f(output_size=last_layer, ksize=4, stride=1)   # receptive field: 7
third_layer = f(output_size=fourth_layer, ksize=4, stride=2)  # receptive field: 16
second_layer = f(output_size=third_layer, ksize=4, stride=2)  # receptive field: 34
first_layer = f(output_size=second_layer, ksize=4, stride=2)  # receptive field: 70

print(first_layer)  # 70

@utkarshojha

Hi @phillipi @junyanz,
I understand how patch sizes are calculated implicitly by tracing back the receptive field sizes of successive convolutional layers. But don't you think batch normalization sort of harms the overall idea of the PatchGAN discriminator? Theoretically, each member X_ij of the final NxN output should depend only on some 70x70 patch in the original image, and any change beyond that 70x70 patch should not change the value of X_ij. But if we use batch normalization, that won't necessarily be true, right?

@phillipi (Collaborator)

That's a good point! Batchnorm does couple the outputs across patches. So to be precise, we should say the PatchGAN architecture is equivalent to chopping up the image into 70x70 overlapping patches, making a big batch out of these patches, and running a discriminator on each patch, with batchnorm applied across the batch, then averaging the results.
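
To make the equivalence concrete, here is a rough sketch of the "chop into patches" view (assuming the output grid moves 8 input pixels per neuron, i.e. three stride-2 convs, and ignoring the padding the real discriminator uses, which is why this gives a 24x24 grid rather than 30x30):

import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 256, 256)
# extract all overlapping 70x70 patches, one per output neuron
patches = F.unfold(x, kernel_size=70, stride=8)            # (1, 3*70*70, 576)
patches = patches.transpose(1, 2).reshape(-1, 3, 70, 70)   # big batch of 24*24 patches
# running a regular 70x70 discriminator (with batchnorm across this batch)
# and averaging the scores is the computation described above:
# scores = patch_discriminator(patches).mean()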

@utkarshojha

Yes, that would be a better explanation! Thanks for your response.

@edoardogiacomello

Hello phillipi,
thanks for your explanation and for sharing your implementation!
I'm also trying to better understand the PatchGAN discriminator, and I see that it is equivalent to a convnet from a design point of view. In other words, if I have to implement a PatchGAN discriminator, I should do as you did.
But what happens if I already have a (pre-trained) network that accepts receptive-field-sized inputs (in this case 70x70 images) cropped from a bigger image (e.g., 1024x1024)? I couldn't figure out how to integrate the network efficiently, or rewrite it using convolutional layers, without modifying the architecture of the pre-trained network.
P.S. I'm implementing this in TensorFlow, but I don't think it's a platform-specific issue.

Thank you!

@iperov commented Dec 23, 2018

TensorFlow's extract_image_patches is a differentiable function and can be used during training.
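
In TF2 this op is tf.image.extract_patches; a rough sketch of pulling out the 70x70 patches (the stride of 8 is an assumption matching the PatchGAN's output grid):

import tensorflow as tf

images = tf.random.normal([1, 256, 256, 3])
patches = tf.image.extract_patches(
    images,
    sizes=[1, 70, 70, 1],
    strides=[1, 8, 8, 1],
    rates=[1, 1, 1, 1],
    padding="VALID",
)  # (1, 24, 24, 70*70*3); gradients flow through this op
patches = tf.reshape(patches, [-1, 70, 70, 3])  # batch of patches for a per-patch D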

@huicongzhang

Well, now I understand how the PatchGAN works. Thanks!

@daifeng2016

Hi, I am wondering why a sigmoid activation is not used in the PatchGAN, since the output for a real patch should be close to 1 and for a fake patch close to 0.

@phillipi (Collaborator)

The sigmoid is contained in the loss function here. But note that some variants of GAN discriminators don't use a sigmoid (e.g., see LSGANs or WGANs).
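
In PyTorch terms, this just means the discriminator emits raw logits and the sigmoid lives inside the criterion; a minimal sketch:

import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()        # sigmoid + binary cross-entropy, fused
logits = torch.randn(1, 1, 30, 30)        # raw 30x30 PatchGAN outputs, no sigmoid
loss = criterion(logits, torch.ones_like(logits))  # target 1.0 = "real"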

@daifeng2016

Thanks. Then what difference does it make that the output of D has no sigmoid? For example, in LSGAN, if the output of D is very large (far from 1 or 0), can the loss function still work, since the real labels are still set to 1 and the fake labels to 0?

@phillipi (Collaborator)

I believe in LSGAN the loss is squared distance from the labels. So if the output of D is very large, D will get a large penalty and it will learn to make a smaller output. Eventually, D should learn to output the correct labels, since those minimize the loss (and the loss is nice and smooth, just squared distance).
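
As a sketch, the LSGAN discriminator loss is just squared distance from the labels, so an overly large output earns a large penalty:

import torch
import torch.nn as nn

mse = nn.MSELoss()
d_real = torch.full((1, 1, 30, 30), 5.0)     # D output far above the "real" label 1
print(mse(d_real, torch.ones_like(d_real)))  # (5 - 1)^2 = 16: a big penalty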

@FunkyKoki

I would like to share some points on why the receptive field (patch size) is computed as:
(output_size - 1) * stride + ksize

Here is how I think of it. For any i (input feature map size), k (kernel size), p (zero-padding size), and s (stride), the output feature map size o is:
o = floor((i + 2*p - k)/s) + 1

When calculating the patch size we assume p = 0, so the receptive-field calculation above is simply the inverse of this forward calculation.
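
A quick numeric check that the two formulas are inverses (with p = 0), using the five PatchGAN layers:

import math

def conv_out(i, k, s, p=0):
    # forward: output size of one conv layer
    return math.floor((i + 2 * p - k) / s) + 1

def receptive(o, k, s):
    # backward: input span covered by o output positions (p = 0)
    return (o - 1) * s + k

i = 70
for k, s in [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]:
    i = conv_out(i, k, s)
print(i)  # 1: a 70x70 input collapses to a single output neuron

o = 1
for k, s in [(4, 1), (4, 1), (4, 2), (4, 2), (4, 2)]:
    o = receptive(o, k, s)
print(o)  # 70: walking back up recovers the 70x70 receptive field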

@serwansj

Why is a padding of 1 used in every convolution in the discriminator? If we feed the discriminator an image of size 70x70, we get an output of 6x6. Wouldn't it make more sense to use no padding and instead get a single 1x1 output for a 70x70 input?

@phillipi (Collaborator)

I think the padding was a holdover from the DCGAN architecture. I can't remember if there is a good reason for it. Might have been to make a 256x256 input map to a 1x1 output, in the DCGAN discriminator.

Zero padding also helps the network localize where it is in the image, since it can see the border of zeros when near an image boundary. That can sometimes be beneficial.

@JustinAsdz commented Jun 3, 2020

> Can you tell me which line in the code implements the PatchGAN?

It lies in here, at line 538 of networks.py.

@xcc13 commented Jun 8, 2020

> 70

Thank you!
But what does 'output_size' mean here?

@JustinAsdz

> 70
>
> Thank you!
> But what does 'output_size' mean here?

It just means the width/height of the output feature map.
We can calculate the receptive field of the prior layer from its output_size.

@shaurov2253

Hi, since the discriminator outputs a 30x30x1 matrix, does that mean the 70x70 patch was moved over the input image 30 times in each direction (horizontal and vertical), with each position mapping to a single output?

@junyanz (Owner) commented Sep 6, 2020

Answered at #1106.

@yfwang-master

Hello phillipi,
I am wondering whether the 'padding' is necessary in the conv layers?

@phillipi (Collaborator)

I doubt it has a big effect. You could try removing it and see what happens.

@yfwang-master

Thanks. I also wonder whether a 'PatchGAN' discriminator (in fact a convnet, as you responded) still works when applied to a 3D volume (C-H-W-L, i.e., 4D tensors in code)?
If so, we just use conv3d() instead, right? And such a '3D PatchGAN' would discriminate whether local regions of the 3D volume are real or fake?
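Something like this minimal, hypothetical sketch, just swapping Conv2d for Conv3d:

import torch
import torch.nn as nn

netD3d = nn.Sequential(
    nn.Conv3d(1, 64, kernel_size=4, stride=2, padding=1),
    nn.LeakyReLU(0.2),
    nn.Conv3d(64, 1, kernel_size=4, stride=1, padding=1),  # one logit per 3D patch
)
vol = torch.randn(1, 1, 64, 64, 64)
print(netD3d(vol).shape)  # torch.Size([1, 1, 31, 31, 31])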

@johndpope

thanks @emilwallner
[screenshot: the receptive-field calculation]

@emcrobert

The one thing I'm struggling to understand is that the discriminator looks at 70x70 patches. But if I understand correctly, its input is the conditional image concatenated with either the real image or the synthesized image. So if it's only looking at small patches at a time, how does it learn the relationship between the two images? How does it check that the conditional input has actually informed the generated image?

@junyanz (Owner) commented Jan 20, 2022

Most of the applications in the paper only require local color and texture transfer. In these cases, 70x70 patches might be enough (for a 256x256 input image). Later work (e.g., pix2pixHD) has explored multi-scale discriminators, which can look at more pixels.
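
A rough sketch of the multi-scale idea (in the spirit of pix2pixHD, which actually trains a separate discriminator per scale; this simplified version reuses one):

import torch.nn.functional as F

def multiscale_scores(disc, x, num_scales=3):
    # score an image pyramid: each downsampling doubles the receptive
    # field of the patch discriminator, measured in original input pixels
    scores = []
    for _ in range(num_scales):
        scores.append(disc(x))
        x = F.avg_pool2d(x, kernel_size=3, stride=2, padding=1)
    return scores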

@yearep7 commented Jul 21, 2022

If this structure is added to the generator, will it have a good effect? Is there any ablation experiment in this regard?

@CHENHUI-X

> thanks @emilwallner [screenshot: the receptive-field calculation]

Great picture, like it!
