In the following code for computing the (unnormalized) log probability, the network output is divided by `b0`:

recovery_likelihood/model.py, line 154 in c77cc05

I wonder if there is a legitimate explanation for this division. `b0` is supposed to be `step_size_square`, which usually has a very small value:

recovery_likelihood/model.py, line 184 in c77cc05

I wonder if dividing by this `b0` makes the gradient too large and harms the training in some settings.
I think that's the scaling trick explained in "On the Anatomy of MCMC-Based Maximum Likelihood Learning of Energy-Based Models"; see Appendix A there.
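If that is indeed the trick, the intuition is that the Langevin drift term `(step_size**2 / 2) * grad(log p)` cancels the `1/b0` scaling when `b0 = step_size**2`, so sampling dynamics stay well-behaved even though the raw gradient looks large. A minimal numerical sketch of this cancellation, using a toy quadratic energy and hypothetical names (not the repo's actual code):

```python
import numpy as np

step_size = 0.01          # a small Langevin step size, as in the issue
b0 = step_size ** 2       # corresponds to step_size_square

def f(x):
    # toy stand-in for the network output (a quadratic "energy")
    return -0.5 * np.sum(x ** 2)

def grad_f(x):
    # gradient of the toy energy
    return -x

x = np.array([1.0, -2.0])

# Langevin drift when the log probability is defined as f(x) / b0:
drift_scaled = (step_size ** 2 / 2) * (grad_f(x) / b0)

# Langevin drift for the unscaled energy with a unit step size:
drift_unscaled = 0.5 * grad_f(x)

# The division by b0 exactly cancels the step size in the drift term,
# so the effective update does not blow up as step_size shrinks.
assert np.allclose(drift_scaled, drift_unscaled)
```

Note this only argues the MCMC drift is well-scaled; whether the `1/b0` factor also interacts badly with the training gradient, as the issue asks, is a separate question.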