Label smoothing should be one-sided? #10
Comments
I think your advice is slightly different: you are replacing the labels with a random number instead of a fixed label. This would ameliorate, and probably avoid, the optimal-discriminator numerator problem caused by a fixed fake label. Is this your reasoning? Thanks.
https://arxiv.org/abs/1701.00160 Have you read this paper? Hope it helps.
@zhangqianhui, thank you. As you can read there, the suggestion is to do one-sided label smoothing. I was wondering why Soumith chose a double-sided one with randomization, and whether there is empirical evidence that this works better.
I'd actually also be interested to know this. Did you find any good theoretical or empirical hints on this?
From what I understand, Soumith is describing one-sided label smoothing: only the labels of the discriminator receive label smoothing, not those of the generator. Double-sided would be smoothing the labels of both the discriminator and the generator.
From what I understood, this trick (no. 6) would smooth both the real and fake labels for the discriminator. But in the NIPS 2016 tutorial, Soumith explicitly writes:
So I'm still wondering if he has new insight on that and, if so, where to read up on it. :)
Also wondering about this. Ian Goodfellow specifically said not to smooth the fake labels.
@MustafaMustafa This issue did not get any concrete answer; it merely opened more questions. Maybe leave it open?
What is meant by
E.g., should we occasionally assign the fake label to a datapoint with a real label, and vice versa? Why?
@plopd, I think the advice is to smooth the real and fake labels of the discriminator only. It could be rephrased to make that clearer.
See the implementation of Salimans et al. (2016). According to it, only the real labels of the discriminator should be smoothed.
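To make that concrete, here is a minimal, framework-agnostic sketch of one-sided label smoothing for the discriminator's targets. The function name and the 0.1 smoothing amount are my own choices for illustration; the 0.9 real target follows the value commonly cited from Salimans et al. (2016):

```python
def discriminator_targets(batch_size, smooth=0.1):
    """One-sided label smoothing: only the REAL targets are softened.

    Real samples get 1 - smooth (e.g. 0.9) instead of 1.0;
    fake samples keep the hard label 0.0.
    """
    real_targets = [1.0 - smooth] * batch_size  # smoothed
    fake_targets = [0.0] * batch_size           # NOT smoothed
    return real_targets, fake_targets

real, fake = discriminator_targets(4)
print(real)  # [0.9, 0.9, 0.9, 0.9]
print(fake)  # [0.0, 0.0, 0.0, 0.0]
```

These targets would then be fed to whatever binary cross-entropy loss the discriminator uses; the generator's loss is left untouched.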
I tried smoothing both real and fake samples, and things got messed up. Smoothing only the real samples seems to give better results.
I don't think this issue should be closed. There are two opinions:
and which one is correct is not clear yet.
I was just trying to figure that out and wanted to give my two cents. As others here have mentioned, I believe "one-sided" refers to real samples only. After all, it makes sense to talk about sides when referring to the interval [0, 1], but not so much when talking about the generator/discriminator pair, so I don't see ambiguity there. Moreover, the whole motivation behind this technique seems to be making the discriminator's job harder in order to avoid a series of stability problems [1]. TL;DR: smooth only real samples, and only for the discriminator.
I think Section 4.2 of "NIPS 2016 Tutorial: Generative Adversarial Networks" (Goodfellow) points out that the label smoothing proposed for GANs applies to D only, and only to the real data (there is sample TensorFlow code in Section 4.2 as well). If I understand equation 15 and the quote correctly:
the most important reason to do one-sided smoothing on the real data is to cap the optimal D, so that there is more adversarial signal for G. That mitigates the problem of a too-powerful discriminator.
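A small numerical sketch of that argument, assuming the optimal smoothed discriminator from equation 15 of the tutorial, D*(x) = (α·p_data(x) + β·p_model(x)) / (p_data(x) + p_model(x)), with target α for real labels and β for fake labels (the function name and the probe values are mine):

```python
def optimal_d(p_data, p_model, alpha=0.9, beta=0.0):
    """Optimal discriminator under smoothed targets (eq. 15 in the
    NIPS 2016 tutorial): weighted average of the two densities."""
    return (alpha * p_data + beta * p_model) / (p_data + p_model)

# A spurious mode: the generator puts mass where the data has none.
# One-sided smoothing (beta = 0): D* goes to 0, so G receives a
# corrective signal there.
print(optimal_d(p_data=0.0, p_model=1.0, beta=0.0))  # 0.0

# Two-sided smoothing (beta > 0): D* is pinned at beta even though no
# real data exists there, reinforcing the generator's current behavior.
print(optimal_d(p_data=0.0, p_model=1.0, beta=0.1))  # 0.1
```

This is exactly the "numerator problem" discussed above: a nonzero β puts p_model into the numerator of D*, so the discriminator stops penalizing modes that only the generator produces.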
Regarding trick 6: label smoothing should be one-sided, applied to real images only (Salimans et al., 2016). Their rationale makes sense. Did you find evidence to the contrary?