The adv_loss curve is strange. #39

Closed

Pyten opened this issue Aug 21, 2019 · 7 comments

Comments

@Pyten

Pyten commented Aug 21, 2019

Hi! First, I'd like to thank you for the very helpful repo. While training with the same discriminator as yours, I ran into a problem that I can't figure out: the discriminator losses on the prediction and on the GT stay almost unchanged for most of the training, so I am wondering whether this is normal. By the way, I haven't added the semi-supervised data yet.

[figure: adv_loss curves]

@hfslyc
Owner

hfslyc commented Aug 26, 2019

Hi,

It seems that your discriminator has essentially converged. Usually, we would expect loss_D to be around 0.2-0.5 when adversarial training is working properly. What kind of data are you using?

@Pyten
Author

Pyten commented Aug 26, 2019

Hi, thank you for your reply. My dataset is synthetic document data. I found a probable cause of the curve: I had added an extra sigmoid before BCEWithLogitsLoss, which already applies a sigmoid internally. But when I remove that extra sigmoid, the result gets much worse than before. With lambda_adv = 0.01, batch size = 8, and the other parameters the same as yours, I got the figure below. I am still not sure whether this is normal; please give me some advice.
[figure: loss curves with lambda_adv = 0.01]
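
For reference, here is a minimal sketch of the double-sigmoid mistake described above (tensor shapes and names are illustrative stand-ins, not taken from the repo):

```python
import torch
import torch.nn as nn

# BCEWithLogitsLoss applies the sigmoid internally (fused for numerical
# stability), so it must be fed raw logits.
criterion = nn.BCEWithLogitsLoss()

d_out = torch.randn(8, 1, 40, 40)   # raw discriminator outputs (logits)
target = torch.ones_like(d_out)     # 1 = "looks like ground truth"

# Buggy: squashing the logits first means the loss sigmoids them a second
# time, compressing everything toward 0.5 and flattening the loss curve.
loss_buggy = criterion(torch.sigmoid(d_out), target)

# Correct: pass the raw logits directly.
loss_ok = criterion(d_out, target)
```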

@hfslyc
Owner

hfslyc commented Aug 26, 2019

The D_loss is way too low. There must be some asymmetric statistics between your GT and pred, so that D can easily tell them apart. The other possibility is that the adversarial loss is not being trained properly.

By the way, when D_loss is low, the adv_loss should be pretty high. That part is a bit weird.
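
For intuition, a tiny numeric check of why the two losses should trade off (pure arithmetic, not repo code):

```python
import math

# Suppose D confidently scores a prediction as fake: sigmoid(logit) ~= 0.01.
d_score = 0.01                     # D's probability that pred is "GT"

# D's own loss on that sample (target 0, "fake") is low...
d_loss = -math.log(1.0 - d_score)  # ~= 0.01
# ...but the generator's adversarial loss (target 1, "real") is high.
adv_loss = -math.log(d_score)      # ~= 4.6

print(d_loss, adv_loss)
```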

@Pyten
Author

Pyten commented Aug 26, 2019

Thank you! My GT and pred have the same format as in the usual segmentation task; the label is in one-hot format, obtained from a BxHxW label tensor with 4 classes. I'll check my code again. What is strange, though, is that although the loss in the first experiment was abnormal (as in the first figure above), it ended up with a better result than the experiment without adversarial training, while the second experiment ended up much worse than the experiments without adversarial training.
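
For reference, a minimal sketch of the one-hot conversion I mean, assuming labels is a BxHxW LongTensor with values in {0, 1, 2, 3} (names and shapes are illustrative):

```python
import torch
import torch.nn.functional as F

num_classes = 4
labels = torch.randint(0, num_classes, (8, 321, 321))  # B x H x W integer labels

# F.one_hot yields B x H x W x C; permute to B x C x H x W so the GT
# matches the layout of the segmentation network's softmax output
# before being fed to the discriminator.
gt_one_hot = F.one_hot(labels, num_classes).permute(0, 3, 1, 2).float()
```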

@hfslyc
Owner

hfslyc commented Aug 26, 2019

I see. I don't have any new suggestions other than looking into why adv_loss and D_loss are both low while they should be competing against each other.

@Pyten
Author

Pyten commented Aug 30, 2019

Thanks for your quick reply!
Since the adv_loss shown in the second figure has already been multiplied by 0.01, I am wondering whether the original value (about 1) is high enough compared with D_pred_loss (about 0.01); see the sketch below. I think I need to find out why D_pred_loss won't decrease as desired.
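
A minimal sketch of the scaling in question (all names and shapes are illustrative stand-ins, not the repo's code):

```python
import torch
import torch.nn.functional as F

lambda_adv = 0.01
pred   = torch.randn(8, 4, 321, 321)         # segmentation logits (stand-in)
labels = torch.randint(0, 4, (8, 321, 321))  # ground-truth labels (stand-in)
d_out  = torch.randn(8, 1, 40, 40)           # D's logits on the prediction (stand-in)

seg_loss = F.cross_entropy(pred, labels)
# The segmentation network tries to make D output "real" (1) on its predictions.
adv_loss = F.binary_cross_entropy_with_logits(d_out, torch.ones_like(d_out))

# Only lambda_adv * adv_loss enters the total (and is typically what gets
# plotted), so the raw adv_loss is 1 / lambda_adv = 100x the curve's value.
total_loss = seg_loss + lambda_adv * adv_loss
```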
Another issue I have run into is that when I train with more than one GPU, the results are much worse and often fail to converge. I noticed that you trained with one GPU in your paper. Have you tried more than one GPU, or do you have any clue about this?
Thanks again!

@hfslyc
Owner

hfslyc commented Jan 16, 2020

Hi, sorry for not following up sooner. I'm closing this issue for now. Feel free to shoot me an email if you have any more questions regarding this work.

@hfslyc hfslyc closed this as completed Jan 16, 2020