energy_new in CAM #9
Prevent loss divergence during training.
To prevent overflow or underflow, it should be x - max(x) rather than max(x) - x.
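The overflow point above can be checked directly. A minimal NumPy sketch (the values are hypothetical, not taken from the DANet code): a naive softmax blows up on large inputs, while subtracting max(x) first keeps every exponent at or below zero.

```python
import numpy as np

def naive_softmax(x):
    e = np.exp(x)            # overflows to inf once x is large
    return e / e.sum()

def stable_softmax(x):
    e = np.exp(x - x.max())  # x - max(x): exponents <= 0, no overflow
    return e / e.sum()

big = np.array([1000.0, 1001.0, 1002.0])
print(naive_softmax(big))    # inf / inf -> [nan nan nan]
print(stable_softmax(big))   # well-defined probabilities summing to 1
```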
Yes. I am also wondering why max(x) - x is used; it seems the correlation will favor the most different channel information instead of the most similar one.
I have the same opinion: using max(x) - x gives bigger weights to the features with less similarity after performing the softmax.
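The ordering flip described in the two comments above can be sketched in NumPy (hypothetical energy values, not the actual CAM tensors): softmax(x - max(x)) matches plain softmax, while softmax(max(x) - x) puts the largest weight on the smallest energy.

```python
import numpy as np

def softmax(x):
    e = np.exp(x)
    return e / e.sum()

x = np.array([1.0, 2.0, 5.0])   # hypothetical channel energies

plain   = softmax(x)            # largest weight on the largest energy
shifted = softmax(x - x.max())  # stable form: same result as plain
flipped = softmax(x.max() - x)  # energy_new form: ordering is reversed

print(plain.argmax())    # index of the largest energy
print(flipped.argmax())  # index of the *smallest* energy
```

So the subtraction direction is not just a stability detail: max(x) - x changes which channels the attention emphasizes.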
I think it should be x - max(x), and this operation is actually redundant: PyTorch's softmax already subtracts the max internally.
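The redundancy claim rests on softmax being shift-invariant, which a one-line NumPy check confirms (again with hypothetical values): subtracting any constant, including max(x), leaves the output unchanged, which is why a numerically stable softmax can do it internally without affecting results.

```python
import numpy as np

def softmax(x):
    e = np.exp(x)
    return e / e.sum()

x = np.array([0.5, 1.5, 3.0])
# exp(x - c) / sum(exp(x - c)) == exp(x) / sum(exp(x)) for any constant c,
# so subtracting max(x) changes nothing in the output.
same = np.allclose(softmax(x), softmax(x - x.max()))
print(same)
```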
Hi @junfu1115,
why do you use energy_new in attention.py?
Correct me if I am missing something.