
MASK LOSS to 0 #32

Closed
imlixinyang opened this issue Oct 1, 2019 · 20 comments

@imlixinyang

I applied your code to CelebA-HQ, but found that the attention loss goes to 0, so the networks do nothing. Why?
The d_cls is about 10 while g_cls goes to 30, but the networks never seem to optimize this.
Another question: why do you keep the parameters of DE fixed?

@affromero
Collaborator

Hello,
I need more details here.

  1. Are you using CelebA-HQ at 256x256?
  2. We discuss keeping DE fixed in Section 5.1 of our paper.

@imlixinyang
Author

Yes, I use 256x256, and I only fixed a few things in Celeba.py to account for the format differences. The output shows that the mask → 0 and G_cls is around 30.

@imlixinyang
Author

I've tried running for several epochs, but this problem never seems to improve.

@affromero
Collaborator

May I know what kind of labels you are using for that dataset?

@imlixinyang
Author

Yes, I use 'Black_Hair', 'Blond_Hair', 'Brown_Hair', 'Male', 'Young', 'Smiling', 'Eyeglasses', 'Goatee', 'Pale_Skin', and 'Bangs', the same setting as my model. I'm ready to cite your paper and do some comparisons, so it's a little urgent.

@affromero
Collaborator

Yes, but as far as I am aware, CelebA-HQ (at least the original implementation from here) does not have any labels.

@affromero
Collaborator

Can you send me a screenshot or something regarding the problem, so I can give you more detailed feedback?

@imlixinyang
Author

Someone has collected the mapping from CelebA-HQ to CelebA, so the labels are available somewhere.
I don't know how to upload pictures on GitHub. I can just see that after some iterations, the samples show that the mask is all white and the translation does nothing but keep the source image.

@imlixinyang
Author

imlixinyang commented Oct 1, 2019

Actually, because of the different format, I did manage to run successfully once, but the order of the labels was wrong (the csv begins at 1 while the txt begins at 0, i.e. the first line of all the labels). I fixed it, and then the training began to fail.

@affromero
Collaborator

affromero commented Oct 1, 2019

Now I see. So it is not that the attention loss goes to zero, but that it remains saturated: if λ_mask=0.1, then L_attn saturates at 0.2 and never goes down. This means that the mask is always 1.0, so from the attention equation we always get the real image.

Normally it is a problem of random initialization that should be easily fixed by setting --seed to a different number, or by trying a larger λ_mask=1.0. For fine-grained translations (tested on CelebA, EmotioNet, and BP4D), it is saturated mostly during the first epoch, and L_attn goes down before the end of the first epoch.
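
For reference, a minimal sketch of the saturation described above, not the repository's exact code: the blending convention, the tensor shapes, and the name `lambda_mask` are assumptions for illustration.

```python
import torch

# Assumed attention-style blending: mask = 1 copies the real image.
lambda_mask = 0.1                      # weight quoted above (assumption)

x_real = torch.rand(1, 3, 256, 256)    # input image
x_gen  = torch.rand(1, 3, 256, 256)    # raw generator output
mask   = torch.ones(1, 1, 256, 256)    # saturated attention mask (all 1.0)

# With a saturated mask the output is just the input, so the generator
# effectively does nothing.
x_fake = (1 - mask) * x_gen + mask * x_real
print(torch.allclose(x_fake, x_real))  # True

# The mask penalty then sits at lambda_mask * mean(mask) per mask term and
# cannot decrease while the mask stays saturated at 1.0.
loss_attn = lambda_mask * mask.mean()
print(loss_attn.item())                # 0.1
```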

Please try this and let me know.

@imlixinyang
Author

Thank you, I'll run it for one night and see if this problem can be solved.

@imlixinyang
Author

Hello, I've now run for one night, around 18 epochs with another seed, but this problem still has not improved. This time G_cls is about 40 and D_cls is about 4. Why?

@imlixinyang
Author

I refreshed all the files and tried to find the reason.
SMIT-master/solver.py", line 373, in label2embedding
assert target.max() == 1 and target.min() == 0
Last time, I just deleted this. What is it for? Does it matter?

@imlixinyang
Author

imlixinyang commented Oct 2, 2019

And I think the attention loss is not what I thought it was. In GANimation, the output is:
$$x_f = (1 - A) x_f + A x_r$$
and the attention loss is $|A|$,
so the changed part is forced to be small, but in your paper it seems to be the opposite. I don't know if there is something I missed.
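
For concreteness, the limiting cases of the blend quoted above (writing $x_g$ for the raw generator output, which the equation reuses $x_f$ for):

$$x_f = (1 - A)\,x_g + A\,x_r, \qquad A \equiv 1 \;\Rightarrow\; x_f = x_r, \qquad A \equiv 0 \;\Rightarrow\; x_f = x_g$$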

@affromero
Collaborator

I do not know what you are doing.

assert target.max() == 1 and target.min() == 0

This is to ensure you are passing a one-hot encoded vector. If you are not, may I know what kind of labels you are using? If that were the problem, it would have raised an error.
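
As a side note, here is a minimal illustration of what that assert checks; the tensors below are made up for illustration and this is not the repo's label2embedding.

```python
import torch

# A 0/1 attribute vector (e.g. CelebA-style binary labels): passes the check.
target = torch.tensor([1., 0., 0., 1., 0.])
assert target.max() == 1 and target.min() == 0

# Raw class indices instead of a 0/1 encoding: max() is not 1, so this raises.
bad = torch.tensor([3., 7., 1.])
assert bad.max() == 1 and bad.min() == 0  # AssertionError
```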

And I think the attention loss is not what I thought it was. In GANimation, the output is:
$$x_f = (1 - A) x_f + A x_r$$
and the attention loss is $|A|$.

Why do you say so? This is exactly what we do, as we mention in the paper in Section 3.2.1 and in the code here and here.

I do not know if you were able to fix it. Without screenshots or more detailed instructions to reproduce it, I cannot help you much.

@imlixinyang
Author

Yes, I know your code is doing what you show in the paper, but this is different from the attention loss in other models, e.g. GANimation. In their papers, the changed part should be minimized.
I have fixed this by setting $x_f = M x_f + (1 - M) x_r$, and the training seems to have improved.
Please check this point carefully.
Actually, when I first used this loss function, I made a mistake too, so it doesn't matter, and I really appreciate your work!
I will email you if I still cannot reproduce it on this dataset.
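
A minimal sketch of the flipped blend described above; the variable names and the L1 penalty are assumptions for illustration, not the repository's code.

```python
import torch

x_real = torch.rand(1, 3, 256, 256)   # source image
x_gen  = torch.rand(1, 3, 256, 256)   # raw generator output
mask   = torch.rand(1, 1, 256, 256)   # attention/mask values in [0, 1]

# Flipped convention: the mask now gates the *generated* pixels, so copying
# the source corresponds to mask = 0 rather than mask = 1.
x_fake = mask * x_gen + (1 - mask) * x_real

# An L1 penalty on the mask therefore shrinks the edited region instead of
# the copied one, so it can decrease without the output collapsing to x_real.
loss_mask = mask.abs().mean()
```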

@affromero
Collaborator

Perfect. Let me know if anything good or bad happens and feel free to close the issue or keep it open until we fix this.

@imlixinyang
Author

Okay, I will give feedback promptly. And thank you so much for your timely reply, too.

@imlixinyang
Author

Hello, this time the results are good enough for me to compare.
It's interesting that I changed the mask loss to minimize the covered part, yet the covered part becomes larger than when maximizing it.
And I do think it's necessary for you to correct this in your official code, too. It is simple but benefits the training a lot.

@affromero
Collaborator

Thanks for your nice feedback. I will keep that in mind from now on.
