Is the objective correct? #106
Comments
I think using MSE is based on the Least Squares GAN paper: https://arxiv.org/pdf/1611.04076.pdf
Thanks for sharing, I was not aware of that article. Still, my question persists: even in that article the least-squares objective is described as minimizing avg((y_real_original - y_real_prediction)² + (y_fake_original - y_fake_prediction)²) [for the discriminator], which (in my mind) is different from just setting the discriminator loss to "mse" in Keras. I see now that the author has a PyTorch library where things are implemented more like they are described in the papers. So I'm assuming this is a simplification because of the way loss functions are implemented in Keras. He even states in the Keras library help that "These models are in some cases simplified versions of the ones ultimately described in the papers". I just wish to confirm whether this makes them significantly different from the papers, whether any instability in training may be due to this, and whether I'd be better off writing the loss functions in TensorFlow or PyTorch. Thanks again.
In the LSGANs paper, the discriminator's objective function is described as equation (2) (you can check it in the original LSGANs paper). In this code it is:

```python
fake_A = self.generator.predict(imgs_B)

# Train the discriminators (original images = real / generated = fake)
d_loss_real = self.discriminator.train_on_batch([imgs_A, imgs_B], valid)
d_loss_fake = self.discriminator.train_on_batch([fake_A, imgs_B], fake)
d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)
```

This code makes some modifications to the original pix2pix paper, and these modifications are based on LSGANs. So, using mse won't make a significant difference, in my opinion. Thanks
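A small NumPy sketch (all names and values are illustrative, not the repo's code) showing that averaging the two per-batch mean-squared errors produces the same value as the single LSGAN discriminator objective from equation (2):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical discriminator outputs on a batch of real and fake images
d_real = rng.uniform(size=8)   # D(x) for real samples
d_fake = rng.uniform(size=8)   # D(G(z)) for fake samples

valid = np.ones(8)             # target label for real images
fake = np.zeros(8)             # target label for fake images

# Keras-style: two separate mse losses, then averaged
mse_real = np.mean((d_real - valid) ** 2)
mse_fake = np.mean((d_fake - fake) ** 2)
d_loss = 0.5 * (mse_real + mse_fake)

# LSGAN eq. (2): 0.5*E[(D(x) - 1)^2] + 0.5*E[D(G(z))^2]
lsgan_obj = 0.5 * np.mean((d_real - 1) ** 2) + 0.5 * np.mean(d_fake ** 2)

assert np.isclose(d_loss, lsgan_obj)
```

Note that this only shows the reported loss *values* agree; the updates are still applied in two separate steps, which is the point raised below.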
Oh, I see what you mean. The loss printed out during training is certainly correct; the problem is that this is not the loss being used in the training. From Keras-GAN/lsgan.py lines 29-31, the discriminator is trained with the regular 'mse' loss implemented in Keras. This is not the same as the PyTorch implementation, where the combined d_loss is what is actually being back-propagated.
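For intuition, here is a toy sketch (a one-parameter linear "discriminator", purely illustrative) of why the two procedures differ: two sequential `train_on_batch` updates evaluate the second gradient at already-updated weights, whereas back-propagating the combined loss takes a single step on the summed gradient:

```python
import numpy as np

lr = 0.1
w0 = 0.5                      # toy discriminator parameter
x_real, x_fake = 1.0, -1.0    # one "real" and one "fake" input

def grad(w, x, target):
    # d/dw of (w*x - target)^2 for a linear "discriminator" D(x) = w*x
    return 2 * (w * x - target) * x

# Keras-style: two separate updates (real batch, then fake batch)
w = w0
w -= lr * grad(w, x_real, 1.0)   # step on mse over the real batch
w -= lr * grad(w, x_fake, 0.0)   # step on mse over the fake batch
w_two_steps = w

# PyTorch-style: one update on the combined 0.5*(loss_real + loss_fake)
w = w0
w -= lr * 0.5 * (grad(w, x_real, 1.0) + grad(w, x_fake, 0.0))
w_combined = w

# The two procedures land on different weights
assert not np.isclose(w_two_steps, w_combined)
```

In practice the gap per step is small, but the two schemes are not strictly identical optimization procedures.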
Yes, I missed this. That's a problem; I'm looking for the answer too.
@gustavoeb Why do d_loss_real and d_loss_fake give loss values like [0.002, 0.222], a two-element list, instead of just one single value?
One is the loss and the other is the metric. In gan.py, for example, the loss is 'binary_crossentropy' and the metric is 'accuracy'.
@gustavoeb Okay, I got it. Thank you!
@gustavoeb In the conditional GAN pix2pix.py, g_loss = self.combined.train_on_batch([imgs_A, imgs_B], [valid, imgs_A]) returns a list like [32.91153, 0.8362595, 0.3207527]. So the first value is the total loss and the last two values are the accuracy for valid and the generated image?
@jerevon 32.91153 = 0.8362595 + 100*0.3207527 |
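That is, the last two entries are the individual (unweighted) loss terms, and the total is their weighted sum. A sketch of that combination (the weights [1, 100] match the 100× factor in the comment above; the snippet is illustrative, not the repo's code):

```python
# Illustrative: how Keras combines multiple outputs' losses via loss_weights.
# Here the first term plays the role of the adversarial loss and the second
# the L1 reconstruction term, weighted by 100 as in the comment above.
loss_weights = [1, 100]
partial_losses = [0.8362595, 0.3207527]   # values from the comment above

total = sum(w * l for w, l in zip(loss_weights, partial_losses))
assert abs(total - 32.91153) < 1e-4
```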
@gustavoeb The negative log-likelihood mentioned in the paper and the cross-entropy (binary, in this case) loss used in the implementation are equivalent. You can find more information e.g. here: https://stats.stackexchange.com/questions/198038/cross-entropy-or-log-likelihood-in-output-layer.
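The equivalence is easy to check numerically; a minimal NumPy sketch (values are illustrative):

```python
import numpy as np

# Predicted probabilities and binary labels (illustrative values)
p = np.array([0.9, 0.2, 0.7])
y = np.array([1.0, 0.0, 1.0])

# Binary cross-entropy, as implemented in most frameworks
bce = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Negative log-likelihood of the same Bernoulli model:
# the likelihood of each label is p if y=1, else 1-p
nll = -np.mean(np.log(np.where(y == 1, p, 1 - p)))

assert np.isclose(bce, nll)
```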
If using MSE for the GAN loss in the generator (as in Erik's implementation), should we still use real labels when training the generator? This 'label flipping' for the generator is claimed by some online posts to help when using binary cross-entropy, and I'm just wondering if it similarly benefits MSE. Any intuition for using real labels on the generator? Thanks in advance.
Sorry for the lay question, but is the objective of these GANs in accord with the original paper?
In the original paper, they seem to be maximizing log(prob_real) + log(1 - prob_fakes) for the discriminator; but in most Keras implementations I find on the internet, people train the discriminator with binary cross-entropy. Does this end up being the same, mathematically?
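(It does: minimizing binary cross-entropy with labels 1 for real and 0 for fake is exactly maximizing the original paper's objective. A minimal NumPy check, with illustrative discriminator outputs:)

```python
import numpy as np

# Discriminator outputs on real and fake batches (illustrative)
d_real = np.array([0.8, 0.6, 0.9])   # D(x)
d_fake = np.array([0.3, 0.1, 0.4])   # D(G(z))

# Original-paper objective the discriminator maximizes:
# E[log D(x)] + E[log(1 - D(G(z)))]
paper_obj = np.mean(np.log(d_real)) + np.mean(np.log(1 - d_fake))

# Binary cross-entropy with labels 1 for the real batch, 0 for the fake batch
bce = (-np.mean(np.log(d_real))        # BCE on the real batch
       - np.mean(np.log(1 - d_fake)))  # BCE on the fake batch

# Minimizing the BCE is maximizing the paper's objective
assert np.isclose(bce, -paper_obj)
```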