Probable things to investigate when Generator falls behind the Discriminator #2
Your first case seems to be a collapsed-GAN problem. Try some of the tricks in this document, such as training the discriminator more per generator iteration. About your second point: pix2pix and text-to-image are both derived from the dcgan.torch code, and maybe they never needed to change it.
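For concreteness, here is a minimal PyTorch sketch of the update-ratio idea (not code from this repo): the toy models, the random stand-in data, and the `d_steps` value are all placeholder assumptions, and setting the ratio the other way around trains the generator more often instead.

```python
import torch
import torch.nn as nn

# Hypothetical toy models and data; substitute your own G, D, and data loader.
latent_dim, data_dim, batch = 64, 128, 32
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()
d_steps = 3  # discriminator updates per generator update; invert the loop for the converse

for step in range(1000):
    for _ in range(d_steps):
        real = torch.randn(batch, data_dim)                # stand-in for a real minibatch
        fake = G(torch.randn(batch, latent_dim)).detach()  # no gradient into G here
        loss_d = (bce(D(real), torch.ones(batch, 1))
                  + bce(D(fake), torch.zeros(batch, 1)))
        opt_d.zero_grad()
        loss_d.backward()
        opt_d.step()

    # One generator update with the non-saturating loss.
    loss_g = bce(D(G(torch.randn(batch, latent_dim))), torch.ones(batch, 1))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
```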
This is unclear apart from a loose intuition.
Thanks for the prompt reply. My discriminator is already performing well above the generator. It is a bit unclear to me why training it more per generator iteration will help.
Oh right, in this case, look at (13); it will help. https://github.com/soumith/ganhacks#13-add-noise-to-inputs-decay-over-time
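For anyone landing here later, a small sketch of what tip #13 can look like in PyTorch. The linear decay schedule and the `sigma0` starting value are my own assumptions, not something the repo prescribes:

```python
import torch

def with_instance_noise(x, step, total_steps, sigma0=0.1):
    """Add zero-mean Gaussian noise whose std decays linearly to 0 over training."""
    sigma = sigma0 * max(0.0, 1.0 - step / total_steps)
    return x + sigma * torch.randn_like(x)

# Usage inside the discriminator update, assuming `real`, `fake`, `step`,
# and `total_steps` exist in the training loop:
#   loss_d = bce(D(with_instance_noise(real, step, total_steps)), ones) \
#          + bce(D(with_instance_noise(fake, step, total_steps)), zeros)
```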
Yeah, this may help. I will try it out. Thanks! :) I'll close this issue then for the time being.
@soumith Hi, one trick mentioned here is to train the discriminator more per generator iteration. Have you tried the converse, i.e., training the generator more per discriminator iteration? From an intuition point of view, I think it might allow the generator to catch up to the discriminator a bit more. Does this make sense from your experience?
Same question as @omair-kg.
I do not understand why almost all research works train the discriminator >= 1 time(s) per generator update. My own experience on various tasks (sampling for recognition, super-resolution, and others) is the opposite: all of them showed that the generator is much, much harder to train than the discriminator. I usually need to update the generator dozens of times per discriminator update. Even when I used published GAN code, the behavior was the same: generator loss stays really high while discriminator loss stays really low. I tried almost 10 published codebases, and all showed the same situation. Has anyone experienced the same as me?
Most of the posts I have read mention that if the discriminator isn't good enough, the generator can get away with producing garbage and won't learn anything. Therefore, it is in the generator's interest to have a strong discriminator.
Could you post a code snippet of how you accomplish this?
I have been using the TensorFlow port of pix2pix and have modified it as well. I experienced the same as you when training on the Cityscapes dataset. By adding label flipping I was able to even out the losses, so that the generator's loss is only a bit higher than the discriminator's.
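A rough sketch of what that label flipping could look like in PyTorch; the flip probability and the exact place it is applied are assumptions on my part, not taken from the pix2pix port:

```python
import torch

def noisy_labels(labels, flip_prob=0.05):
    """Randomly flip a small fraction of the 0/1 labels fed to the discriminator."""
    flip = torch.rand_like(labels) < flip_prob
    return torch.where(flip, 1.0 - labels, labels)

# Usage in the discriminator update (labels are float tensors of shape [batch, 1]):
#   loss_d = bce(D(real), noisy_labels(torch.ones(batch, 1))) \
#          + bce(D(fake), noisy_labels(torch.zeros(batch, 1)))
```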
How did that work out for you?
I have a similar experience. I usually train the generator two times more than the discriminator to ensure it catches up.
Hi,
I am unsure whether this is worth creating a new issue, so please feel free to let me know if it's not. I am quite new to training GANs and was hoping someone with more experience could provide some guidance.
My problem is that my generator error keeps increasing steadily (not spiking suddenly, but gradually) while the discriminator error keeps decreasing at the same time. Below I provide the statistics:
When this is the case, what are some things I should investigate or look out for while trying to mitigate the problem?
For example, I came across: "Make the discriminator much less expressive by using a smaller model. Generation is a much harder task and requires more parameters, so the generator should be significantly bigger." [1]
If someone has similar pointers up their sleeves, that would be very helpful.
Secondly, just out of curiosity, is there a reason that most of the implementations I have come across (DCGAN, pix2pix, text-to-image) use the same `lr` for both the generator and the discriminator? I mean, since the generator's job is much harder (generating something plausible from random noise), giving it a higher `lr` would seem to make more sense in hindsight. Or is it simply application specific, and the same `lr` just works out for the works mentioned above? Thanks in advance!
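On the learning-rate question: mechanically, nothing ties the two networks together, since each gets its own optimizer, so per-network rates are trivial to set. A hedged PyTorch sketch, with toy stand-in models and purely illustrative values (not taken from DCGAN, pix2pix, or text-to-image):

```python
import torch
import torch.nn as nn

# Toy stand-ins; replace with your actual generator and discriminator.
G = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 784))
D = nn.Sequential(nn.Linear(784, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))

# Separate optimizers make per-network learning rates trivial; the 4:1 ratio
# here is illustrative only, giving the generator the faster rate.
opt_g = torch.optim.Adam(G.parameters(), lr=4e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4, betas=(0.5, 0.999))
```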