Probable things to investigate when Generator falls behind the Discriminator #2
Your first case seems to be a collapsed-GAN problem. Try some of the tricks in this document, such as training the discriminator more per generator iteration. About your second point: pix2pix and text-to-image are both derived from the dcgan.torch code, and maybe they never needed to change it.
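For concreteness, here is a minimal PyTorch sketch of the update-ratio idea (not code from this repo): the toy models, the random stand-in data, and the `d_steps` value are all placeholder assumptions, and setting the ratio the other way around trains the generator more often instead.

```python
import torch
import torch.nn as nn

# Hypothetical toy models and data; substitute your own G, D, and data loader.
latent_dim, data_dim, batch = 64, 128, 32
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()
d_steps = 3  # discriminator updates per generator update; invert the loop for the converse

for step in range(1000):
    for _ in range(d_steps):
        real = torch.randn(batch, data_dim)                # stand-in for a real minibatch
        fake = G(torch.randn(batch, latent_dim)).detach()  # no gradient into G here
        loss_d = (bce(D(real), torch.ones(batch, 1))
                  + bce(D(fake), torch.zeros(batch, 1)))
        opt_d.zero_grad()
        loss_d.backward()
        opt_d.step()

    # One generator update with the non-saturating loss.
    loss_g = bce(D(G(torch.randn(batch, latent_dim))), torch.ones(batch, 1))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
```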
This is unclear apart from a loose intuition.
Thanks for the prompt reply. My discriminator is already performing well above the generator. It is a bit unclear to me why training it more per generator iteration will help.
Oh right, in this case, look at (13); it will help. https://github.com/soumith/ganhacks#13-add-noise-to-inputs-decay-over-time
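For anyone landing here later, a small sketch of what tip #13 can look like in PyTorch. The linear decay schedule and the `sigma0` starting value are my own assumptions, not something the repo prescribes:

```python
import torch

def with_instance_noise(x, step, total_steps, sigma0=0.1):
    """Add zero-mean Gaussian noise whose std decays linearly to 0 over training."""
    sigma = sigma0 * max(0.0, 1.0 - step / total_steps)
    return x + sigma * torch.randn_like(x)

# Usage inside the discriminator update, assuming `real`, `fake`, `step`,
# and `total_steps` exist in the training loop:
#   loss_d = bce(D(with_instance_noise(real, step, total_steps)), ones) \
#          + bce(D(with_instance_noise(fake, step, total_steps)), zeros)
```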
Yeah, this may help. I will try it out. Thanks! :) I'll close this issue then for the time being.
@soumith Hi, one trick mentioned here is to train the discriminator more per generator iteration. Have you tried the converse, i.e., training the generator more per discriminator iteration? From an intuition point of view, I think it might allow the generator to catch up to the discriminator a bit more. Does this make sense from your experience?
Same question as @omair-kg.
I do not understand why almost all research works train the discriminator >= 1 time(s) per generator update. My own experience on various tasks (sampling for recognition, super-resolution, and others) is the opposite: all of them showed that the generator is much, much harder to train than the discriminator. I usually need to update the generator dozens of times per discriminator update. Even when I used published GAN code, the behavior was the same: generator loss stays really high while discriminator loss stays really low. I tried almost 10 published codebases, and all showed the same situation. Has anyone experienced the same as me?
Most of the posts I have read mention that if the discriminator isn't good enough, the generator can get away with producing garbage and won't learn anything. Therefore, it is in the generator's interest to have a strong discriminator.
Could you post a code snippet of how you accomplish this?
I have been using the TensorFlow port of pix2pix and have modified it as well. I experienced the same as you when training on the Cityscapes dataset. By adding label flipping I was able to even out the losses, so that the generator's loss is only a bit higher than the discriminator's.
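A rough sketch of what that label flipping could look like in PyTorch; the flip probability and the exact place it is applied are assumptions on my part, not taken from the pix2pix port:

```python
import torch

def noisy_labels(labels, flip_prob=0.05):
    """Randomly flip a small fraction of the 0/1 labels fed to the discriminator."""
    flip = torch.rand_like(labels) < flip_prob
    return torch.where(flip, 1.0 - labels, labels)

# Usage in the discriminator update (labels are float tensors of shape [batch, 1]):
#   loss_d = bce(D(real), noisy_labels(torch.ones(batch, 1))) \
#          + bce(D(fake), noisy_labels(torch.zeros(batch, 1)))
```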
How did that work out for you?
I have a similar experience. I usually train the generator two times more than the discriminator to ensure it catches up.
Hi,
I am unsure whether this is worth creating a new issue, so please feel free to let me know if it's not. I am quite new to training GANs and was hoping someone with more experience could provide some guidance.
My problem is that my generator error keeps increasing steadily (not spiking suddenly, but gradually) while the discriminator error keeps decreasing at the same time. Below I provide the statistics:
When this is the case, what are some things I should investigate or look out for while trying to mitigate the problem?
For example, I came across: "Make the discriminator much less expressive by using a smaller model. Generation is a much harder task and requires more parameters, so the generator should be significantly bigger." [1]
If someone has similar pointers up their sleeves, that would be very helpful.
Secondly, just out of curiosity, is there a reason that most of the implementations I have come across (DCGAN, pix2pix, text-to-image) use the same `lr` for both the generator and the discriminator? I mean, since the generator's job is much harder (generating something plausible from random noise), giving it a higher `lr` would seem to make more sense in hindsight. Or is it simply application specific, and the same `lr` just works out for the works mentioned above? Thanks in advance!
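On the learning-rate question: mechanically, nothing ties the two networks together, since each gets its own optimizer, so per-network rates are trivial to set. A hedged PyTorch sketch, with toy stand-in models and purely illustrative values (not taken from DCGAN, pix2pix, or text-to-image):

```python
import torch
import torch.nn as nn

# Toy stand-ins; replace with your actual generator and discriminator.
G = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 784))
D = nn.Sequential(nn.Linear(784, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))

# Separate optimizers make per-network learning rates trivial; the 4:1 ratio
# here is illustrative only, giving the generator the faster rate.
opt_g = torch.optim.Adam(G.parameters(), lr=4e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4, betas=(0.5, 0.999))
```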