
Question on conditional image generation #6

Open
tackgeun opened this issue Oct 19, 2017 · 4 comments

@tackgeun

With the help of your advice, I reproduced the main part of the paper and succeeded in training your model on ImageNet with specific classes (minibus, dog).

Now I am trying to reproduce Section 6.8 (conditional image generation) from the paper, in order to use this architecture for unsupervised/weakly-supervised object segmentation.
But I have had a hard time finding a trainable setup. Could you share the settings you used in that experiment?

[Encoder architecture]

  • Use the same architecture as the discriminator.
  • Given an image, the encoder returns a vector with the same dimension as the random noise vector (nz).
  • To generate the background vector, I use Encoder(real image) = background vector.
  • To generate the foreground vector, I use Encoder(real image - generated image) = foreground vector (see the sketch after this list).
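
A minimal PyTorch sketch of the two encoder passes described above. The module names (netE, netG) and the tiny stand-in architectures are hypothetical placeholders, not the repository's actual code:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the encoder and generator; the real networks
# follow the DCGAN-style architectures discussed in this thread.
nz = 100
netE = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, nz))
netG = nn.Sequential(nn.Linear(nz, 3 * 64 * 64), nn.Unflatten(1, (3, 64, 64)))

x_real = torch.randn(16, 3, 64, 64)   # dummy batch of real 64x64 images
z_bg = netE(x_real)                   # background vector = Encoder(real image)
x_gen = netG(z_bg)                    # image generated from the background code
z_fg = netE(x_real - x_gen)           # foreground vector = Encoder(real - generated)
```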

[Optimization method]

  • The original code (1) updates D by maximizing log(D(x)) + log(1 - D(G(z))) on real and fake images, then (2) updates G by maximizing log(D(G(z))), i.e., by labeling fake images as real (sketched after this list).
  • I tried three options for the optimization: (1) additionally update the auto-encoder + generator part; (2) update the auto-encoder part during the maximize-log(D(G(z))) step; (3) additionally update only the auto-encoder part. None of these variants works.
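
For reference, a minimal sketch of that original two-step update, again with tiny hypothetical stand-in networks rather than the repository's real DCGAN models:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins; the real code uses DCGAN generator/discriminator.
nz, b = 100, 16
netG = nn.Sequential(nn.Linear(nz, 3 * 64 * 64), nn.Unflatten(1, (3, 64, 64)))
netD = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 1), nn.Sigmoid())
optD = torch.optim.Adam(netD.parameters(), lr=2e-4)
optG = torch.optim.Adam(netG.parameters(), lr=2e-4)
criterion = nn.BCELoss()
x_real = torch.randn(b, 3, 64, 64)          # dummy batch of real images

# (1) update D: maximize log(D(x)) + log(1 - D(G(z)))
optD.zero_grad()
fake = netG(torch.randn(b, nz))
loss_d = criterion(netD(x_real).squeeze(1), torch.ones(b)) \
       + criterion(netD(fake.detach()).squeeze(1), torch.zeros(b))
loss_d.backward()
optD.step()

# (2) update G: maximize log(D(G(z))) by labeling the fake images as real
optG.zero_grad()
loss_g = criterion(netD(fake).squeeze(1), torch.ones(b))
loss_g.backward()
optG.step()
```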

Thanks.

@jwyang
Copy link
Owner

jwyang commented Oct 19, 2017

Hi, @tackgeun ,

glad to know you have successfully reproduced the results from the paper!

Regarding the conditional image generation,

[Encoder architecture]

  • Use the same architecture as the discriminator.

Yes, I also used the same architecture as the discriminator, except that the output layer is an fc layer.

  • Given an image, the encoder returns a vector with the same dimension as the random noise vector (nz).

Yes, exactly, but I put a batch normalization layer before the last fc layer.

  • To generate the background vector, I use Encoder(real image) = background vector.

Yeah, that's what I did.

  • To generate the foreground vector, I use Encoder(real image - generated image) = foreground vector.

Yes, that's what I did.
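
Putting those answers together, here is one plausible encoder, sketched under assumed sizes (64x64 inputs, ndf = 64, nz = 100); the exact layer widths are guesses, not the author's code:

```python
import torch.nn as nn

nz, ndf = 100, 64
# DCGAN-discriminator-style trunk, with the classifier head replaced by a
# batch norm followed by a fully-connected layer producing an nz-dim code.
encoder = nn.Sequential(
    nn.Conv2d(3, ndf, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),         # 64 -> 32
    nn.Conv2d(ndf, ndf * 2, 4, 2, 1), nn.BatchNorm2d(ndf * 2),
    nn.LeakyReLU(0.2, inplace=True),                                      # 32 -> 16
    nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1), nn.BatchNorm2d(ndf * 4),
    nn.LeakyReLU(0.2, inplace=True),                                      # 16 -> 8
    nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1), nn.BatchNorm2d(ndf * 8),
    nn.LeakyReLU(0.2, inplace=True),                                      # 8 -> 4
    nn.Conv2d(ndf * 8, nz, 4, 1, 0),      # last conv: 4x4 -> 1x1, nz channels
    nn.BatchNorm2d(nz),                   # batch norm before the final fc
    nn.Flatten(),
    nn.Linear(nz, nz),                    # fc output layer -> nz-dim code
)
```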

[Optimization method]

  • The original code (1) updates D by maximizing log(D(x)) + log(1 - D(G(z))) on real and fake images, then (2) updates G by maximizing log(D(G(z))), i.e., by labeling fake images as real.

  • I tried three options for the optimization: (1) additionally update the auto-encoder + generator part; (2) update the auto-encoder part during the maximize-log(D(G(z))) step; (3) additionally update only the auto-encoder part. None of these variants works.

In my experiments, I used a reconstruction loss (with weight 1e-3) plus the adversarial loss for training the generator, and the same loss as in the original GAN for training the discriminator.
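
A sketch of that loss composition, reusing the same kind of tiny hypothetical stand-in modules as above (none of this is the repository's actual code):

```python
import torch
import torch.nn as nn

nz, b = 100, 16
netE = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, nz))
netG = nn.Sequential(nn.Linear(nz, 3 * 64 * 64), nn.Unflatten(1, (3, 64, 64)))
netD = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 1), nn.Sigmoid())
bce, mse = nn.BCELoss(), nn.MSELoss()
x_real = torch.randn(b, 3, 64, 64)

# generator side: adversarial loss + reconstruction loss weighted by 1e-3
x_rec = netG(netE(x_real))
loss_g = bce(netD(x_rec).squeeze(1), torch.ones(b)) + 1e-3 * mse(x_rec, x_real)

# discriminator side: the same real-vs-fake loss as in the original GAN
loss_d = bce(netD(x_real).squeeze(1), torch.ones(b)) \
       + bce(netD(x_rec.detach()).squeeze(1), torch.zeros(b))
```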

Also, please remember to regularize the transformation parameters. The rotation should be minor, or even zero, during training.
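
One possible way to implement such a regularizer (an assumption on my part, not necessarily the author's exact scheme): penalize the predicted affine parameters of the spatial transformer for deviating from the identity transform, which in particular suppresses rotation:

```python
import torch

def transform_penalty(theta, weight=1e-2):
    # theta: (batch, 2, 3) affine matrices predicted for the spatial
    # transformer; the weight is a hypothetical value, to be tuned.
    identity = torch.tensor([[1., 0., 0.], [0., 1., 0.]], device=theta.device)
    return weight * (theta - identity).pow(2).mean()

theta = torch.randn(16, 2, 3, requires_grad=True)  # dummy predictions
loss_reg = transform_penalty(theta)                # add this to the generator loss
```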

Hope these are helpful.

@tackgeun
Author

tackgeun commented Oct 23, 2017

With your help, my code has started generating images, but the reconstructed images are blurry compared to the paper. I am also confused about the encoder architecture and the way the generator weights are updated.

Architecture

Do you mean that the last part of the encoder network looks like this?

  • Discriminator CNN trunk - last conv - batch norm (dimension nz) - ReLU - fc

Optimization

The encoder is embedded within the generator class, so I compute gradients for:

  1. the adversarial loss for the generator (random noise);
  2. the weighted reconstruction loss (encoded image).

Those gradients are accumulated to compute the gradient of the sum of all losses.
Following the description above, the generator's weights are updated w.r.t. (1) + (2), while the encoder's weights are updated w.r.t. (2) only.

Thank you.

@jwyang
Owner

jwyang commented Oct 24, 2017

Hi, @tackgeun

With your help, my code has started generating images, but the reconstructed images are blurry compared to the paper. I am also confused about the encoder architecture and the way the generator weights are updated.

Architecture

Do you mean that the last part of the encoder network looks like this?

  • Discriminator CNN trunk - last conv - batch norm (dimension nz) - ReLU - fc

I used Tanh() as the last layer of the encoder, to keep the output in a reasonable range.
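
Under that description, the encoder head might look like this (a sketch; nz and the ReLU placement follow the layout quoted above and are assumptions):

```python
import torch.nn as nn

nz = 100
head = nn.Sequential(
    nn.BatchNorm2d(nz),      # after the last conv (nz feature maps at 1x1)
    nn.ReLU(inplace=True),
    nn.Flatten(),
    nn.Linear(nz, nz),       # fc layer producing the nz-dim code
    nn.Tanh(),               # keeps the code in the bounded range [-1, 1]
)
```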

Optimization

The encoder is embedded within the generator class, so I compute gradients for:

  1. the adversarial loss for the generator (random noise);
  2. the weighted reconstruction loss (encoded image).

Those gradients are accumulated to compute the gradient of the sum of all losses.
Following the description above, the generator's weights are updated w.r.t. (1) + (2), while the encoder's weights are updated w.r.t. (2) only.

Actually, I update both the generator and the encoder with (1) and (2). And the weight for the reconstruction loss is much lower: 1e-3 to 1e-2.
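
A sketch of that corrected update: one optimizer over both encoder and generator parameters, so both modules receive gradients from the adversarial loss (1) and the weighted reconstruction loss (2). Again, all modules are tiny hypothetical stand-ins:

```python
import itertools
import torch
import torch.nn as nn

nz, b = 100, 16
netE = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, nz))
netG = nn.Sequential(nn.Linear(nz, 3 * 64 * 64), nn.Unflatten(1, (3, 64, 64)))
netD = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 1), nn.Sigmoid())
bce, mse = nn.BCELoss(), nn.MSELoss()
optEG = torch.optim.Adam(
    itertools.chain(netE.parameters(), netG.parameters()), lr=2e-4)

x_real = torch.randn(b, 3, 64, 64)
optEG.zero_grad()
x_rec = netG(netE(x_real))                            # reconstruction path
loss = bce(netD(x_rec).squeeze(1), torch.ones(b)) \
     + bce(netD(netG(torch.randn(b, nz))).squeeze(1), torch.ones(b)) \
     + 1e-2 * mse(x_rec, x_real)                      # weight in the 1e-3..1e-2 range
loss.backward()
optEG.step()
```

Note that in this sketch the pure-noise adversarial term only touches the generator; the encoder still receives an adversarial gradient through the reconstruction path.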

Thank you.

@tackgeun
Author

tackgeun commented Nov 4, 2017

Hi, I have just returned from ICCV... and my implementation still has problems.

  • Did you use the Euclidean (L2) loss as the reconstruction loss? I have heard that some works (pix2pix, CycleGAN) argue that an L1 loss produces finer details...
  • Did you disable the size_average option for the reconstruction loss? (See the sketch after this list.)
  • Did you use an additional generator for the auto-encoder part, or the same generator as in the random-noise path via weight sharing?
  • You said that you updated both the generator and the encoder with (1) and (2). How do you update the encoder parameters in part (1), which uses only random noise? Did you apply the reconstruction loss to random noise as well?
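
For the first two questions, the loss variants in question would look like this in PyTorch (in current versions, the deprecated size_average flag maps onto reduction; size_average=False corresponds to reduction='sum'):

```python
import torch.nn as nn

l2_mean = nn.MSELoss(reduction='mean')  # Euclidean loss, batch-averaged
l2_sum  = nn.MSELoss(reduction='sum')   # equivalent to size_average=False
l1_mean = nn.L1Loss(reduction='mean')   # L1; pix2pix/CycleGAN report finer details
```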

Thank you for your advice.
