-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Preparation for second order #86
Comments
So here's where I'm at. The typical structure of a second-order operator is: function second_order_operator(f, backend::SecondOrder, x)
function inner_operator_closure(z)
inner_extras = prepare_inner_operator(f, inner(backend), z)
return inner_operator(f, inner(backend), z, inner_extras)
end
outer_extras = prepare_outer_operator(inner_operator_closure, outer(backend), x)
return outer_operator(inner_operator_closure, outer(backend), x, outer_extras)
end It's hard to prepare the extras because
My current suggested workflow for preparation (disregarding the
|
Actually this will not work because
|
Partially solved by #135 where the outer differentiation is prepared, but not the inner one. I think it is close to optimal |
I see how it is difficult for us to provide default fallbacks for the inner preparation. How about allowing people to manually deal with the inner preparation by adding an |
Possibly but that would be a very advanced use, and my take is that plenty of things will fail when people first try out the HVP, so optimizing performance that way is not high-priority for me. Besides, for reverse mode backends which do not require preparation and work out of place (Zygote, Tracker), this is already optimal |
In the end I think the easiest approach is to have a mutable extras prepared on the first run, like so: https://discourse.julialang.org/t/second-order-autodiff-which-combinations-should-work/114892/12 |
This sounds reasonable to me. To play the devil's advocate: on which backends are mutable extras doable and more performant than allocating new extras? |
I really can't think of any scenario where modifying a field of a mutable struct is more costly than essentially re-creating that field from scratch |
Sure, but for which backends is it possible? (And while it might not be more costly, it should be strictly less costly to warrant the increase in code complexity.) |
It is doable on all backends. It's not the extras itself you mutate, it's just a field from a wrapper. Here's an example: mutable struct InnerGradientWrapper{F,B}
const f::F
const backend::B
extras::Union{Nothing,GradientExtras} # type-unstable
end
function (igw::InnerGradientWrapper)(x::AbstractVector)
if isnothing(igw.extras)
igw.extras = prepare_gradient(igw.f, igw.backend, x)
end
return gradient(igw.f, igw.backend, x, igw.extras)
end |
I'm just wondering how much the type instability will hurt us here |
Tried it in #291 but the problem is that changing this |
How about the following? mutable struct InnerGradientWrapper{F,B,E<:Union{Nothing,GradientExtras}}
const f::F
const backend::B
extras::E
end |
It's the same discussion that we have had for SparseConnectivityTracer and in #252. The |
In ForwardDiff-over-ReverseDiff, let |
Constructing the right extras becomes very tricky when different inner/outer backends must be called in various ways on closure functions
The text was updated successfully, but these errors were encountered: