In the CycleGAN solution notebook, the Discriminator architecture diagram shows the last output (the logit) with dimensions 1x1x1. But in the following code from the Discriminator:
the output of self.conv4 is 8x8x512, so after it goes through self.conv5 the output should have shape 7x7x1. How can it be 1x1x1, as shown in the architecture diagram and as stated in the video lecture (where you said it should output a single value)?
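The 7x7 figure can be checked with the standard convolution output-size formula. The sketch below is only an illustration: the helper name `conv_output_size` is hypothetical, and the kernel size 4, stride 1, and padding 1 used for self.conv5 are assumptions chosen because they reproduce the 8 to 7 reduction described above; the notebook's actual settings are not shown in this thread.

```python
def conv_output_size(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumed settings for the last layer (not confirmed by the thread):
# kernel 4, stride 1, padding 1 applied to the 8x8 output of conv4.
last_layer_size = conv_output_size(8, kernel=4, stride=1, padding=1)
print(last_layer_size)  # 7, i.e. a 7x7x1 output rather than 1x1x1
```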
Ah, you're right! Thank you for this feedback. To be more specific, we take the mean of those output values and later use that single value to calculate the real and fake losses.
Is there a reason for these two steps? Why not go for the single step of creating the final layer with a kernel size of 8, padding 0 and actually have it output just one value?
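As a quick check of the arithmetic behind this suggestion, assuming a stride of 1 (the stride is not stated in the question) and the usual output-size formula for a convolution with input size $n$, kernel size $k$, padding $p$, and stride $s$:

```latex
n_{\text{out}} = \left\lfloor \frac{n + 2p - k}{s} \right\rfloor + 1
              = \left\lfloor \frac{8 + 2 \cdot 0 - 8}{1} \right\rfloor + 1
              = 1
```

So a final layer with kernel size 8 and padding 0 would indeed reduce the 8x8 feature map to a single value.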
He links to this explanation (junyanz/pytorch-CycleGAN-and-pix2pix#39). My understanding of it is that this design lets the discriminator more easily identify which specific patch in the image (i.e. which of the 7x7 cells) looks fake or real, which can then be traced back through the network via the receptive field of the CNN.

To calculate the overall Discriminator loss, this 7x7 output is then averaged. Each value indicates whether a given patch is "real" or not, so the average should indicate whether the whole image is "real" or not.
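The averaging step can be sketched in plain Python. This is a minimal illustration, not the notebook's code: the function names `average_patch_score` and `real_loss` are hypothetical, and the least-squares form of the loss is an assumption. An implementation may instead average the per-patch squared errors, which gives a different number in general.

```python
from typing import List


def average_patch_score(patch_outputs: List[List[float]]) -> float:
    """Collapse a grid of per-patch discriminator outputs into one value."""
    values = [v for row in patch_outputs for v in row]
    return sum(values) / len(values)


def real_loss(patch_outputs: List[List[float]]) -> float:
    """Assumed least-squares loss against the target 1 ("real")."""
    return (average_patch_score(patch_outputs) - 1.0) ** 2


# A made-up 7x7 grid in which every patch scores 0.5.
patches = [[0.5] * 7 for _ in range(7)]
print(average_patch_score(patches))  # 0.5
print(real_loss(patches))            # 0.25
```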