Correction for Adjoint Gradient Calculations #35
Thanks for opening such a detailed issue! I'm not familiar with the implementation, but is this a bug due to
Great! Thank you for editing my post. There is another issue: when the adjoint equations are evaluated at only a few time points, the gradients they produce are inaccurate compared to those from the forward sensitivities. Following the example above, if I evaluate the adjoint equations at time = 0, 5, 10 with grads = np.ones_like(yout), then I get that
However, as in my original post, if I evaluate the adjoint equations at np.linspace(0, 10, 21), which includes time = 0, 5, 10, and zero-pad grads at the times not equal to 0, 5, 10, then I get
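As a rough sketch, the zero-padding described above might look like the following; the (n_times, n_states) layout of grads and the specific helper names are assumptions for illustration, not copied from the attached script.

import numpy as np

# Sketch of the zero-padding described above; the (n_times, n_states)
# layout of grads is an assumption for illustration.
n_states = 2                                      # e.g. Hares and Lynx from the readme example
tvals_expanded = np.linspace(0, 10, 21)           # dense evaluation grid
obs_times = np.array([0.0, 5.0, 10.0])            # times that actually carry gradient information

grads = np.zeros((len(tvals_expanded), n_states))
grads[np.isin(tvals_expanded, obs_times), :] = 1.0  # ones at the observed times, zeros elsewhere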
Thank you for looking into these issues!
@aarcher07 Thank you for reporting this, and sorry for the very late reply... I think the problem you are seeing is due to a small mistake in the arguments to solve_backward. If I replace it with this, I get the same results as the forward solver:

# Instead of this
# solver.solve_backward(t0=tvals_expanded[-1], tend=tvals_expanded[0], tvals=tvals_expanded[1:-1],
#                        grads=grads, grad_out=grad_out, lamda_out=lambda_out)
# It should be this
solver.solve_backward(
t0=tvals_expanded[-1],
tend=tvals_expanded[0],
tvals=tvals_expanded,
grads=grads,
grad_out=grad_out,
lamda_out=lambda_out
)
grad_out_adj = -sens0 @ lambda_out + grad_out
print(grad_out_adj)
# Output
# from forward
# [29.633367875233063, -8.63361922455043, 10.2485995824757]
# from adjoint
# [29.63336772 -8.63361915 10.24859955]

The problem is that by passing in tvals=tvals_expanded[1:-1], the gradient entries at the first and last time points are never applied during the backward solve.
I'm closing this because I think it was a problem in the example code, but feel free to reopen or comment if you don't agree or have questions.
When computing the gradient with respect to initial conditions, I find that the adjoint and forward gradient computations differ slightly. Following equation 14 of the CVODES manual, the discrepancy is in the adjoint equation and can be corrected by adding the constant -np.matmul(sens0, lambda_out - grads[0, :]) to the grad_out returned by solve_backward. Here sens0 is the initial sensitivities, lambda_out is the adjoint variables at time 0, and grads[0, :] is the derivative of the likelihood with respect to the state variables at time 0.
I have to adjust lambda at 0 by -grads[0, :] because line 691 of sunode/solver.py appears to not loop over the initial time point of grads.
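Concretely, the correction described above amounts to something like the following sketch; the array shapes and the dummy values are assumptions made explicit for illustration, not taken from the attached script.

import numpy as np

# Sketch of the correction described above, with dummy arrays only to make
# the assumed shapes explicit; in practice sens0, lambda_out, grads and
# grad_out would come from the forward and backward solves.
n_params, n_states, n_times = 3, 2, 21
sens0 = np.zeros((n_params, n_states))    # initial sensitivities dy(0)/dp
lambda_out = np.zeros(n_states)           # adjoint variables at time 0
grads = np.zeros((n_times, n_states))     # derivative of the likelihood w.r.t. the states
grad_out = np.zeros(n_params)             # gradient returned by solve_backward

corrected_grad = grad_out - np.matmul(sens0, lambda_out - grads[0, :])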
Does SolveODEAdjointBackward in sunode/sunode/wrappers/as_aesara.py implement a similar correction constant when computing the gradient wrt initial conditions?
I attached some code below as an example. It is based on the example script in the README. It computes the gradient of the sum of Hares and Lynx at times 0, 5 and 10 with respect to alpha, beta and hares0, where hares0 is log10(Hares(0)).