
Probable things to investigate when Generator falls behind the Discriminator #2

Closed
kmul00 opened this issue Dec 12, 2016 · 10 comments

kmul00 commented Dec 12, 2016

Hi,

I am unsure whether this is worth creating a new issue, so please feel free to let me know if it's not. I am quite new to training GANs, and I was hoping someone with more experience could provide some guidance.

My problem is that my Generator error keeps increasing steadily (not spiking suddenly, but growing gradually) while the Discriminator error keeps decreasing. Below are the statistics:

Generator error / Discriminator error
0.75959807634354 / 0.59769108891487
1.3820139408112 / 0.35363309383392
1.9390360116959 / 0.2000379934907
2.1018676519394 / 0.16694237589836
2.5574874728918 / 0.10423161834478
2.8098415493965 / 0.082516837120056
3.2860078886151 / 0.046023709699512
3.630749514699 / 0.028832530975342
3.7707495708019 / 0.022863862104714
3.8990840911865 / 0.020417057722807
4.1248006802052 / 0.017872251570225
4.259504699707 / 0.01507920846343
4.2479295730591 / 0.013462643604726
4.4426490783691 / 0.010646429285407
4.6057481756434 / 0.0098107368685305
4.6718273162842 / 0.0096474666148424
4.8214926728979 / 0.0079655896406621
4.7656826004386 / 0.0076067917048931
4.8425741195679 / 0.0080536706373096
4.9743659980595 / 0.0066521260887384

When this is the case, what are some of the things I should investigate or look out for while trying to mitigate the problem?

For example, I came across this advice: "Make the discriminator much less expressive by using a smaller model. Generation is a much harder task and requires more parameters, so the generator should be significantly bigger." [1]
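
For concreteness, a minimal sketch of that capacity asymmetry might look like this (PyTorch-style; the widths are hypothetical, chosen only to illustrate a deliberately smaller discriminator for 32x32 images):

```python
import torch.nn as nn

nz, ngf, ndf = 100, 128, 32  # latent dim; generator base width ~4x the discriminator's

generator = nn.Sequential(
    nn.ConvTranspose2d(nz, ngf * 4, 4, 1, 0, bias=False),       # 1x1 -> 4x4
    nn.BatchNorm2d(ngf * 4), nn.ReLU(True),
    nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),  # 4x4 -> 8x8
    nn.BatchNorm2d(ngf * 2), nn.ReLU(True),
    nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),      # 8x8 -> 16x16
    nn.BatchNorm2d(ngf), nn.ReLU(True),
    nn.ConvTranspose2d(ngf, 3, 4, 2, 1, bias=False),            # 16x16 -> 32x32
    nn.Tanh(),
)

discriminator = nn.Sequential(  # deliberately narrower and shallower
    nn.Conv2d(3, ndf, 4, 2, 1, bias=False),        # 32x32 -> 16x16
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),  # 16x16 -> 8x8
    nn.BatchNorm2d(ndf * 2), nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf * 2, 1, 8, 1, 0, bias=False),    # 8x8 -> 1x1 score
    nn.Sigmoid(),
)
```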

If anyone has similar pointers up their sleeve, it would be very helpful.

Secondly, just out of curiosity, is there a reason that most of the implementations I have come across use the same lr for both the generator and the discriminator? (DC-GAN, pix2pix, Text-to-image.)

I mean, since the generator's job is much harder (generating something plausible from random noise), intuitively, giving it a higher lr makes more sense. Or is it simply application specific, and the same lr just works out for the works mentioned above?
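
For instance, assuming the generator/discriminator modules sketched above, giving the two networks different learning rates is just a matter of two optimizers (the values are hypothetical):

```python
import torch.optim as optim

# Hypothetical learning rates: a larger step size for the generator than for
# the discriminator, following the intuition above; betas as in DCGAN.
opt_g = optim.Adam(generator.parameters(), lr=4e-4, betas=(0.5, 0.999))
opt_d = optim.Adam(discriminator.parameters(), lr=1e-4, betas=(0.5, 0.999))
```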

Thanks in advance!

soumith (Owner) commented Dec 12, 2016

Your first case looks like a collapsed GAN. Try some of the tricks in this document, such as training the discriminator more per generator iteration; a sketch follows.
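
A minimal sketch of such a schedule, assuming PyTorch-style generator/discriminator modules, optimizers opt_g/opt_d, and a dataloader yielding image batches (all hypothetical names; standard non-saturating BCE losses):

```python
import torch
import torch.nn as nn

criterion = nn.BCELoss()
nz = 100                 # latent dimension (hypothetical)
d_steps, g_steps = 5, 1  # hypothetical ratio; swap the counts to train G more than D

for real in dataloader:
    b = real.size(0)
    # Update the discriminator d_steps times per batch...
    for _ in range(d_steps):
        fake = generator(torch.randn(b, nz, 1, 1)).detach()  # detach: no G gradient here
        loss_d = (criterion(discriminator(real).view(-1), torch.ones(b)) +
                  criterion(discriminator(fake).view(-1), torch.zeros(b)))
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # ...then the generator g_steps times.
    for _ in range(g_steps):
        fake = generator(torch.randn(b, nz, 1, 1))
        loss_g = criterion(discriminator(fake).view(-1), torch.ones(b))  # non-saturating loss
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

Swapping the two counts gives the converse schedule (training G more per D step).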

About your second point, pix2pix and text-to-image are both derived from dcgan.torch code, and maybe they never needed to change it.

> I mean, since the generator's job is much harder (generating something plausible from random noise), intuitively, giving it a higher lr makes more sense. Or is it simply application specific?

Beyond that loose intuition, this is unclear.

kmul00 (Author) commented Dec 12, 2016

Thanks for the prompt reply.

My discriminator is already performing well above the generator. It is a bit unclear to me why training it even more per generator iteration would help.

soumith (Owner) commented Dec 12, 2016

Oh right, in that case look at trick (13); it will help: https://github.com/soumith/ganhacks#13-add-noise-to-inputs-decay-over-time
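
Concretely, a minimal sketch of that trick, adding Gaussian noise to the discriminator's inputs and decaying it over training (the scale and schedule are hypothetical):

```python
import torch

def noisy(x, step, total_steps, sigma0=0.1):
    """Add Gaussian noise whose std decays linearly from sigma0 to 0 over training."""
    sigma = sigma0 * max(0.0, 1.0 - step / total_steps)
    return x + sigma * torch.randn_like(x)

# Inside the discriminator update, both real and generated inputs get the noise:
#   d_real = discriminator(noisy(real, step, total_steps))
#   d_fake = discriminator(noisy(fake, step, total_steps))
```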

kmul00 (Author) commented Dec 12, 2016

Yeah, this may help. I will try it out. Thanks! :)

I'll close this issue for the time being, then.

kmul00 closed this as completed Dec 12, 2016
omair-kg commented

@soumith Hi, one trick mentioned here is to train the discriminator more per generator iteration. Have you tried the converse, i.e. training the generator more per discriminator iteration? From an intuition point of view, I think it might allow the generator to catch up to the discriminator a bit more.

Does this kinda make sense from your experience?

ahmed-fau commented

Same question as @omair-kg.


ybsave commented Oct 11, 2018

I do not understand why almost all research works train the discriminator >= 1 time(s) per generator update.

My own experience on various tasks (sample generation for recognition, super resolution, and others) is the opposite. All of them showed that the generator is much, much harder to learn than the discriminator. I usually need to update the generator dozens of times for each discriminator update.

Also, even when I used some published GAN code from the internet, the behavior was the same: the generator loss is really high while the discriminator loss is really low. I tried almost 10 published codebases, and all showed the same situation.

Has anyone experienced the same as me?

davesean commented

> I do not understand why almost all research works train the discriminator >= 1 time(s) per generator update.

Most of the posts I read mentioned that if the discriminator isn't good enough, the generator gets away with creating trash, so it doesn't learn anything. Therefore, it is in the generator's interest to have a strong discriminator.

> My own experience on various tasks (sample generation for recognition, super resolution, and others) is the opposite. All of them showed that the generator is much, much harder to learn than the discriminator. I usually need to update the generator dozens of times for each discriminator update.

Could you post a code snippet of how you accomplish this?

> Also, even when I used some published GAN code from the internet, the behavior was the same: the generator loss is really high while the discriminator loss is really low. I tried almost 10 published codebases, and all showed the same situation.

> Has anyone experienced the same as me?

I have been using the TensorFlow port of pix2pix, which I have also modified. I experienced the same as you when training on the Cityscapes dataset. By adding label flipping I was able to even out the losses, such that the generator's loss is only a bit higher than the discriminator's.
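
Roughly, the label flipping I mean looks like this (the flip probability is a hypothetical value): with some small probability, the discriminator's real/fake targets are swapped, which weakens it slightly.

```python
import torch

def flip_labels(labels, p=0.05):
    """With probability p per example, swap a 0/1 target label."""
    flip = torch.rand_like(labels) < p
    return torch.where(flip, 1.0 - labels, labels)

batch_size = 64  # hypothetical
real_targets = flip_labels(torch.ones(batch_size))   # mostly 1s, a few flipped to 0
fake_targets = flip_labels(torch.zeros(batch_size))  # mostly 0s, a few flipped to 1
```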

davesean commented

> Yeah, this may help. I will try it out. Thanks! :)

> I'll close this issue for the time being, then.

How did that work out for you?

yushuinanrong commented

@ybsave

I have had a similar experience. I usually train the generator 2x more than the discriminator to make sure it keeps catching up.
I guess it also depends on the relative learning capacities of the generator and discriminator: if you cripple your discriminator, you may need to train it more than the generator.
I personally like the idea of GANs, but I don't enjoy playing with them :(
