Right now, if I have a function `f(a, b, c)` and I only want to create a function which returns the gradient w.r.t. `a` and `b`, I have two options:

1. `∇f(a, b, c) = ReverseDiff.gradient((x, y) -> f(x, y, c), (a, b))`
2. `∇f! = ReverseDiff.compile_gradient(f, (a, b, c))`, and just ignore the `c` gradient that will pop out

The former has to re-record the function for every call, while the latter wastes some computation differentiating w.r.t. `c`.
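For concreteness, here's a minimal sketch of both options with a toy `f` (assuming the `compile_gradient` API shown above, where the returned `∇f!` is called as `∇f!(results, inputs)`):

```julia
using ReverseDiff

f(a, b, c) = sum(a .* b .+ c)  # toy function for illustration
a, b, c = rand(3), rand(3), rand(3)

# Option 1: close over `c`, re-recording the tape on every call
∇f(a, b, c) = ReverseDiff.gradient((x, y) -> f(x, y, c), (a, b))
∇a, ∇b = ∇f(a, b, c)

# Option 2: pre-record once, differentiating w.r.t. all three
# arguments, then throw away the `c` gradient
∇f! = ReverseDiff.compile_gradient(f, (a, b, c))
results = map(similar, (a, b, c))
∇f!(results, (a, b, c))
∇a, ∇b = results[1], results[2]  # results[3] is wasted work
```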
We should support something akin to TensorFlow's placeholders for the pre-recorded API, allowing you to drop in updatable parameters that aren't differentiated against. This can be accomplished by recording the tape as normal, and then "turning off" differentiation on the selected parameters (the idiom for that currently is to set the tape to `NULL_TAPE`, but I'm going to play around with it). Some refactoring should probably be done to get the most out of this change performance-wise (e.g., allow the instantiation of a `TrackedArray` with `deriv == nothing`).
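To make the idea concrete, here's a self-contained toy tape (not ReverseDiff's actual internals; `Tracked`, `Instruction`, and `reverse_pass!` are invented names) showing how a `deriv == nothing` placeholder lets the reverse pass skip derivative propagation for a selected input:

```julia
# Toy model of the proposed mechanism: record a tape as normal, then
# leave `deriv == nothing` on selected inputs so the reverse pass
# skips propagating into them.

mutable struct Tracked
    value::Float64
    deriv::Union{Float64,Nothing}  # `nothing` marks a non-differentiated placeholder
end

struct Instruction
    inputs::Vector{Tracked}
    partials::Vector{Float64}  # ∂output/∂input, recorded at trace time
    output::Tracked
end

const TAPE = Instruction[]

function tracked_mul(x::Tracked, y::Tracked)
    out = Tracked(x.value * y.value, 0.0)
    push!(TAPE, Instruction([x, y], [y.value, x.value], out))
    return out
end

function reverse_pass!(output::Tracked)
    output.deriv = 1.0
    for instr in reverse(TAPE)
        for (inp, partial) in zip(instr.inputs, instr.partials)
            inp.deriv === nothing && continue  # placeholder: differentiation "turned off"
            inp.deriv += partial * instr.output.deriv
        end
    end
end

a, b = Tracked(2.0, 0.0), Tracked(3.0, 0.0)
c = Tracked(4.0, nothing)              # placeholder: updatable, not differentiated
y = tracked_mul(tracked_mul(a, b), c)  # y = a * b * c
reverse_pass!(y)
(a.deriv, b.deriv)  # (12.0, 8.0), i.e. (b*c, a*c); no work spent on c
```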
As for the API, I can think of two different paths we could take (a usage sketch follows the list):

1. Select which arguments *are* to be differentiated against using a `wrt` function, e.g. `ReverseDiff.compile_gradient(f, (wrt(a), wrt(b), c))`
2. Select which arguments are *not* to be differentiated against using a `param` function, e.g. `ReverseDiff.compile_gradient(f, (a, b, param(c)))`
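To illustrate the intended workflow, here's a hypothetical usage sketch under the second spelling; neither `param` nor the placeholder-aware compiled gradient exists yet, and `next_c` is just a stand-in for whatever updates the placeholder:

```julia
# Hypothetical: `param(c)` marks c as an updatable, non-differentiated input
∇f! = ReverseDiff.compile_gradient(f, (a, b, param(c)))
results = (similar(a), similar(b))  # gradients come out for a and b only
for step in 1:100
    c .= next_c(step)        # update the placeholder in place...
    ∇f!(results, (a, b, c))  # ...and reuse the pre-recorded tape: no
end                          # re-recording, no wasted work w.r.t. c
```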