adjoint vector-Jacobian product form of precomputed gradients for reverse #876
Comments
Here's a sketch that should work once it's debugged. All the work's done in the
Yes, that's the n-th dimension of the result (f maps R^N -> R^M). If we store the Jacobian, then the code is the naive loop that propagates each of the M output adjoints to each of the N input adjoints. The goal's not to cut down on arithmetic. It might save memory by being lazy about the product compared to a simpler implementation. But the main goal is just to make it easier to write efficient multivariate functions without having to deal with a ton of fiddly memory operations.
My mistake. I just always get confused about the indexing of Jacobians because it's R^N -> R^M. The loop should be over m in 1:M and index f(x)[m].
adjoint-vector product, #876 Thanks!
Was this fixed by #924?
Yeah, I think this closes. Good catch.
Summary:
Add a precomputed-gradients class for multivariate functions that reduces the user's burden from understanding autodiff stack management and expression templates to just computing the function and a simple vector-Jacobian product.
Description:
For a multivariate function

f : R^N -> R^M

and input x in R^N, the chain rule for reverse-mode autodiff unfolds to

x[n].adj() += SUM_{m in 1:M} f(x)[m].adj() * J[m, n]

for each n in 1:N, where J is the M x N Jacobian, i.e.,

J[m, n] = d f(x)[m] / d x[n]

Taken as a whole, this is neatly expressed as the adjoint-vector (vector-Jacobian) product

x.adj() += J' * f(x).adj()
So we should be able to take a user-defined functor:
From that, we can create an appropriate implementation of chain() for the first result and make the rest non-chainable, as we have done for other specialized functions. This chain() method will just call the multiply_adjoint_jacobian method of the object and do the increments. The user will never have to even think about our autodiff types.

The obvious next step would be to generalize to matrix arguments (not so hard given Eigen's underlying memory layout and access) and then to pairs of arguments in {matrix, vector, scalar}. That might not be too hard given Eigen's templating and underlying linear memory layout and indexing.

Current Version:
v2.17.0