
FreeMatch SAF loss is a negative value? #182

Closed
hzhz2020 opened this issue Dec 17, 2023 · 4 comments
@hzhz2020

Bug

In your FreeMatch paper, the SAF loss in Eq. 11 is a negative value, and this SAF loss is then added to the total loss. Isn't this causing $\mathrm{SumNorm}(\tilde{p}_t/\tilde{h}_t)$ and $\mathrm{SumNorm}(\bar{p}/\bar{h})$ to become more dissimilar?
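For reference, the loss in question, Eq. 11 of the paper (with $\mathrm{SumNorm}(\cdot) = (\cdot)/\sum(\cdot)$), is

$$\mathcal{L}_f = -\,H\!\left(\mathrm{SumNorm}\!\left(\frac{\tilde{p}_t}{\tilde{h}_t}\right),\ \mathrm{SumNorm}\!\left(\frac{\bar{p}}{\bar{h}}\right)\right),$$

where $\tilde{p}_t$ and $\tilde{h}_t$ are EMA estimates of the average prediction and the pseudo-label histogram, and $\bar{p}$, $\bar{h}$ are their counterparts computed on the current (strongly-augmented) batch.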


@Hhhhhhao
Collaborator

A negative loss value doesn't affect the gradient.
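A quick generic PyTorch check of this point (my own sketch, not from the repo): offsetting a loss by a constant so that its value goes negative leaves the gradient, and therefore the optimization, unchanged. Optimizers only ever see $\partial L/\partial\theta$, never the sign of $L$ itself.

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 10, requires_grad=True)
targets = torch.randint(0, 10, (4,))

# Plain cross-entropy: loss value is positive.
loss = F.cross_entropy(logits, targets)
loss.backward()
g_plain = logits.grad.clone()

logits.grad = None
# Same loss shifted by a constant so its value is negative:
# the gradients are identical, because the constant has zero gradient.
(F.cross_entropy(logits, targets) - 100.0).backward()

print(torch.allclose(g_plain, logits.grad))  # True
```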

@hzhz2020
Author

Thank you for your fast reply!
I'm just curious about the intuition behind this loss. By minimizing $L_{f}$ in Eq. 11, aren't you pushing $\mathrm{SumNorm}(\tilde{p}_t/\tilde{h}_t)$ and $\mathrm{SumNorm}(\bar{p}/\bar{h})$ further apart? I thought you wanted to make them closer?

@Hhhhhhao
Collaborator

> Thank you for your fast reply! I'm just curious about the intuition behind this loss. By minimizing $L_f$ in Eq. 11, aren't you pushing $\mathrm{SumNorm}(\tilde{p}_t/\tilde{h}_t)$ and $\mathrm{SumNorm}(\bar{p}/\bar{h})$ further apart? I thought you wanted to make them closer?

The loss encourages fairness of the average predictions: we expect the average prediction to be close to uniform. In terms of entropy, minimizing $L_f = -H(\cdot, \cdot)$ corresponds to maximizing the cross-entropy, and since the target is just a momentum-smoothed copy of the prediction statistics, this approximately maximizes the entropy of the (histogram-corrected) average prediction, pushing it toward uniform rather than toward the target. We replace the target term with a momentum-smoothed average prediction for stability. That's the intuition behind this loss; the paper and the related work it cites discuss this in more detail.
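To make that concrete, here is a minimal PyTorch sketch of the SAF loss as I read Eq. 11. This is not the repo's actual implementation: it omits the confidence mask the paper applies, and the helper names (`sum_norm`, `saf_loss`) are made up for illustration.

```python
import torch

def sum_norm(x, eps=1e-8):
    # SumNorm(x) = x / sum(x): renormalize a non-negative vector into a distribution
    return x / (x.sum(dim=-1, keepdim=True) + eps)

def saf_loss(probs_strong, p_ema, h_ema, eps=1e-8):
    """Sketch of the self-adaptive fairness loss (Eq. 11, confidence mask omitted).

    probs_strong: (B, C) softmax predictions on strongly-augmented unlabeled data
    p_ema: (C,) EMA of the average model prediction (p~_t, no gradient)
    h_ema: (C,) EMA of the pseudo-label histogram   (h~_t, no gradient)
    """
    num_classes = probs_strong.size(1)

    # Batch statistics: mean prediction p_bar and histogram h_bar of hard labels.
    # Gradient flows only through p_bar; argmax/bincount are non-differentiable.
    p_bar = probs_strong.mean(dim=0)
    h_bar = torch.bincount(probs_strong.argmax(dim=-1),
                           minlength=num_classes).float()
    h_bar = h_bar / h_bar.sum()

    target = sum_norm(p_ema / (h_ema + eps)).detach()
    pred = sum_norm(p_bar / (h_bar + eps))

    # L_f = -H(target, pred) = sum(target * log(pred)) <= 0.
    # Minimizing this maximizes the cross-entropy H(target, pred); since the
    # target is a smoothed copy of the prediction statistics, this approximately
    # maximizes the entropy of the histogram-corrected average prediction,
    # pushing it toward uniform.
    return (target * torch.log(pred + eps)).sum()
```

Note that the returned value is always non-positive, which is why the SAF term in the total loss is negative; minimizing it is the same as maximizing $H$, so the negative value is harmless.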

@hzhz2020
Author

Thank you for explaining it!
