Hi @davda54. Thank you for open-sourcing this implementation.
What is the reason for adding 1e-12 in https://github.com/davda54/sam/blob/main/sam.py#L18?
Hi, thanks! :) A small positive number is added to the denominator for numerical stability — to avoid division by zero when grad_norm == 0.0.
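To make the effect of the small constant concrete, here is a minimal sketch in plain Python. It is not the code from sam.py: the helper name `sam_perturbation`, the flat list of gradient values, and the default `rho=0.05` are assumptions made for illustration only.

```python
import math

def sam_perturbation(grads, rho=0.05, eps=1e-12):
    """Rescale a flat list of gradient values to L2 length rho (hypothetical helper)."""
    grad_norm = math.sqrt(sum(g * g for g in grads))
    # eps keeps the denominator strictly positive, so an all-zero gradient
    # produces a zero perturbation instead of a division by zero.
    scale = rho / (grad_norm + eps)
    return [g * scale for g in grads]

print(sam_perturbation([3.0, 4.0]))  # roughly [0.03, 0.04]
print(sam_perturbation([0.0, 0.0]))  # all zeros, no exception
```

Without `eps`, the second call would raise a ZeroDivisionError in plain Python; with tensors, the same 0/0 division would typically produce NaN values that then propagate into the weights.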
I see. That is what I had thought too. Thank you for confirming.
I am also assuming that your e_w calculation uses the L2 norm, since the authors report getting their best results with it?
Exactly, I assume that p == q == 2 (similarly to the paper), which simplifies the equations.
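For reference, this is how the simplification works out, as far as I read the general perturbation formula in the SAM paper. The notation here is my own shorthand: $g = \nabla_w L(w)$ is the gradient, $\rho$ the neighborhood size, and $\hat{\epsilon}(w)$ corresponds to e_w in the code.

```latex
% General form, with 1/p + 1/q = 1 (powers and sign applied elementwise):
\hat{\epsilon}(w) = \rho \, \operatorname{sign}(g) \,
    \frac{|g|^{q-1}}{\left( \|g\|_q^q \right)^{1/p}}

% With p = q = 2: sign(g)|g| = g and (\|g\|_2^2)^{1/2} = \|g\|_2, so
\hat{\epsilon}(w) = \rho \, \frac{g}{\|g\|_2}

% Adding the small constant from the code for numerical stability:
e_w = \rho \, \frac{g}{\|g\|_2 + 10^{-12}}
```

So with p == q == 2 the perturbation is just the gradient rescaled to L2 length $\rho$, which is why only the gradient norm is needed.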