Unsupervised loss not back-propagated through the main decoder #43

Closed

QingyongHu opened this issue Jul 9, 2021 · 1 comment

Hi Yassine,

Thanks for your great work! I noticed that you mention in the paper that the unsupervised loss is not back-propagated through the main decoder. If I understand correctly, does this mean the main decoder's trainable parameters are optimized only through the supervised loss?

Could you please point me to where this is implemented in the code?

Many thanks,

yassouali (Owner) commented Jul 9, 2021

Hi @QingyongHu

Thank you for your interest.

For the main decoder, yes, you are correct: the main decoder is not optimized with the unsupervised loss, only with the supervised loss. To summarize: the main decoder is trained with the supervised loss, the auxiliary decoders are trained with the unsupervised loss, and the encoder is trained with both.

The stop gradient over the main decoder helps with two things: (1) it avoids collapsing solutions; if we backpropagated through both, the main decoder would collapse, since the unsupervised loss can be trivially minimized by degenerate predictions (e.g., outputs pushed toward zero); (2) the main decoder is only trained on clean inputs, which makes it better suited for test time, since the test-time inputs are also clean.

In the implementation, this is done by simply detaching the main decoder's outputs here:

```python
targets = F.softmax(output_ul.detach(), dim=1)
```
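To make the gradient split explicit, here is a minimal, self-contained sketch. This is not the repository's actual code: the module names (`encoder`, `main_decoder`, `aux_decoder`), the toy shapes, and the single auxiliary decoder are stand-ins for illustration, and the real training code also perturbs the encoder features before the auxiliary decoders and weights the unsupervised term.

```python
# Minimal sketch of the gradient split (not the repo's code; modules and shapes are toy stand-ins).
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder      = nn.Conv2d(3, 8, 3, padding=1)   # stand-in for the shared encoder
main_decoder = nn.Conv2d(8, 5, 1)              # trained with the supervised loss only
aux_decoder  = nn.Conv2d(8, 5, 1)              # trained with the unsupervised loss only

x_l  = torch.randn(2, 3, 16, 16)               # labeled batch
y_l  = torch.randint(0, 5, (2, 16, 16))        # ground-truth masks
x_ul = torch.randn(2, 3, 16, 16)               # unlabeled batch

# Supervised loss: gradients flow into the main decoder and the encoder.
loss_sup = F.cross_entropy(main_decoder(encoder(x_l)), y_l)

# Unsupervised consistency loss: the targets come from the main decoder,
# but .detach() stops gradients from reaching it, so this term only
# updates the aux decoder and the encoder.
z_ul      = encoder(x_ul)
output_ul = main_decoder(z_ul)
targets   = F.softmax(output_ul.detach(), dim=1)
loss_unsup = F.mse_loss(F.softmax(aux_decoder(z_ul), dim=1), targets)

(loss_sup + loss_unsup).backward()

print(main_decoder.weight.grad.abs().sum() > 0)  # non-zero, from loss_sup only
print(aux_decoder.weight.grad.abs().sum() > 0)   # non-zero, from loss_unsup only
print(encoder.weight.grad.abs().sum() > 0)       # non-zero, from both terms
```

The prints after `backward()` illustrate the split described above: the main decoder receives gradients only from the supervised term, the auxiliary decoder only from the unsupervised term, and the shared encoder from both. Scaling the unsupervised term by a ramp-up weight, as done during actual training, does not change where the gradients flow.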

Hope this helps.
