
Is the objective correct? #106

Closed
gustavoeb opened this issue Dec 14, 2018 · 12 comments

@gustavoeb

Sorry for the layman's question, but is the objective of these GANs in accordance with the original paper?

In the original paper they seem to be maximizing log(prob_real) + log(1 - prob_fake) for the discriminator; but in most Keras implementations I find on the internet, people train the discriminator with binary cross-entropy. Does this end up being the same, mathematically?

@suzoosuagr

I think using MSE is based on the Least Squares GAN (LSGAN) paper: https://arxiv.org/pdf/1611.04076.pdf

@gustavoeb
Author

Thanks for sharing, I was not aware of that article. Still, my question persists: even in that article the least-squares objective is described as minimizing avg((y_real_original - y_real_prediction)² + (y_fake_original - y_fake_prediction)²) [for the discriminator], which (in my mind) is different from just setting the discriminator loss to "mse" in Keras.

I see now that the author has a PyTorch library where things are implemented more like they are described in the papers. So I'm assuming this is a simplification due to the way loss functions are implemented in Keras. He even states in the Keras library's description that "These models are in some cases simplified versions of the ones ultimately described in the papers".

I just wish to confirm whether this makes them significantly different from the papers, whether any instability in training may be due to this, and whether I'd be better off writing the loss functions in TensorFlow or PyTorch.

Thanks again.
Cheers

@suzoosuagr

In the LSGANs paper, the discriminator's objective function is given as equation (2) (you can check it in the original LSGANs paper). In this code it is:

fake_A = self.generator.predict(imgs_B)
# Train the discriminators (original images = real / generated = Fake)
d_loss_real = self.discriminator.train_on_batch([imgs_A, imgs_B], valid)
d_loss_fake = self.discriminator.train_on_batch([fake_A, imgs_B], fake)
d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)

This code makes some modifications to the original pix2pix paper, and those modifications are based on LSGANs. So, using mse won't make a significant difference, in my opinion.
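
For reference, equation (2) of the LSGAN paper (the discriminator objective; b is the label for real data and a the label for fake data, and this repo uses b = 1, a = 0):

$$
\min_D V_{\mathrm{LSGAN}}(D) = \tfrac{1}{2}\,\mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\!\left[(D(x) - b)^2\right] + \tfrac{1}{2}\,\mathbb{E}_{z \sim p_z(z)}\!\left[(D(G(z)) - a)^2\right]
$$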

Thanks

@gustavoeb
Author

gustavoeb commented Dec 21, 2018

Oh, I see what you mean. The loss printed out during training is certainly correct; the problem is that this is not the loss being used for training. From Keras-GAN/lsgan.py lines 29-31:

        self.discriminator.compile(loss='mse',
            optimizer=optimizer,
            metrics=['accuracy'])

That is the regular 'mse' implemented in Keras, which is actually this:
K.mean(K.square(y_pred - y_true), axis=-1)

This is not the same as the PyTorch implementation. In PyTorch the d_loss is what is actually being back-propagated:

        real_loss = adversarial_loss(discriminator(real_imgs), valid)
        fake_loss = adversarial_loss(discriminator(gen_imgs.detach()), fake)
        d_loss = 0.5 * (real_loss + fake_loss)

        d_loss.backward()
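
If one wanted the Keras version to take a single gradient step on the averaged real/fake loss, closer to the PyTorch snippet above, one option would be to train on the concatenated batch. A rough sketch only, not what the repo does; the variable names (imgs, gen_imgs, valid, fake, self.discriminator) are assumed to roughly match the training loop in lsgan.py:

        import numpy as np

        # Rough sketch: one train_on_batch call on the stacked real+fake batch
        # backpropagates the mean MSE over both halves in a single step
        # (batch-norm statistics will differ from two separate calls).
        x = np.concatenate([imgs, gen_imgs], axis=0)
        y = np.concatenate([valid, fake], axis=0)
        d_loss = self.discriminator.train_on_batch(x, y)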

@suzoosuagr

Yes, I missed this. That's a problem; I'm looking for the answer too.

@jerevon

jerevon commented Dec 22, 2018

@gustavoeb why do d_loss_real and d_loss_fake give loss values like [0.002, 0.222], a two-element list, instead of just a single value?

@gustavoeb
Author

One is the loss and the other is the metric. In gan.py, for example, the loss is 'binary_crossentropy' and the metric is 'accuracy'.
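
A minimal sketch of where the two numbers come from (illustrative only, not code from the repo): compiling a Keras model with a loss and metrics=['accuracy'] makes train_on_batch return [loss, accuracy].

        import numpy as np
        from keras.models import Sequential
        from keras.layers import Dense

        # toy model, just to show the shape of train_on_batch's return value
        model = Sequential([Dense(1, activation='sigmoid', input_dim=4)])
        model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

        x = np.random.rand(8, 4)
        y = np.ones((8, 1))
        print(model.train_on_batch(x, y))  # -> [loss, accuracy], a two-element list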

@jerevon

jerevon commented Dec 22, 2018

@gustavoeb okay, I got it. Thank you!

@jerevon

jerevon commented Dec 22, 2018

@gustavoeb in the conditional GAN pix2pix.py, g_loss = self.combined.train_on_batch([imgs_A, imgs_B], [valid, imgs_A]) returns a list like [32.91153, 0.8362595, 0.3207527], so the first value is the total loss and the last two values are the accuracies for valid and the generated image?

@bailiqun

@jerevon 32.91153 = 0.8362595 + 100*0.3207527
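
That matches how the combined model in pix2pix.py is compiled, roughly like this (the exact lines may differ slightly; the weighting comes from the paper's λ = 100 on the L1 term):

        # roughly what pix2pix.py does: adversarial mse loss plus L1 (mae)
        # reconstruction loss, weighted 1 : 100, so train_on_batch returns
        # [total, mse, mae] with total = 1 * mse + 100 * mae
        self.combined.compile(loss=['mse', 'mae'],
                              loss_weights=[1, 100],
                              optimizer=optimizer)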

@eriklindernoren
Owner

@gustavoeb The negative log-likelihood mentioned in the paper and the crossentropy (binary in this case) loss used in the implementation are equivalent. You can find more information e.g. here https://stats.stackexchange.com/questions/198038/cross-entropy-or-log-likelihood-in-output-layer.
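
A quick numerical check of that equivalence (a plain-NumPy sketch, not from the repo): with target 1 for real samples and target 0 for fake samples, binary cross-entropy reduces to -log(D(x)) and -log(1 - D(G(z))) respectively, so minimizing it is the same as maximizing log D(x) + log(1 - D(G(z))) from the original paper.

        import numpy as np

        def bce(y_true, y_pred):
            # standard per-sample binary cross-entropy
            return -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

        d_real = 0.9   # D(x): discriminator output on a real sample (illustrative value)
        d_fake = 0.2   # D(G(z)): discriminator output on a generated sample

        print(bce(1.0, d_real), -np.log(d_real))      # identical: loss on real = -log(D(x))
        print(bce(0.0, d_fake), -np.log(1 - d_fake))  # identical: loss on fake = -log(1 - D(G(z)))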

@kl-31

kl-31 commented Aug 15, 2019

If using MSE for the GAN loss in the generator (such as in Erik's implementation), should we still be using real labels when training the generator? This 'label flipping' for the generator is claimed by some online posts to help when using binary cross-entropy, and I'm just wondering if it similarly benefits the use of MSE. Any intuition for using real labels on the generator? Thanks in advance.
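
For context, the relevant line in lsgan.py does train the generator against the real/valid labels, so with mse the generator minimizes (D(G(z)) - 1)², which corresponds to equation (3) of the LSGAN paper with c = 1. A rough sketch of that step (names approximate, not verbatim from the repo):

        # rough sketch of the generator update in lsgan.py: the combined model
        # (generator followed by a frozen discriminator) is compiled with loss='mse',
        # and the generator is pushed toward making D output the "real" label
        noise = np.random.normal(0, 1, (batch_size, self.latent_dim))
        g_loss = self.combined.train_on_batch(noise, valid)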
