Preparation for second order #86

Open
gdalle opened this issue Mar 21, 2024 · 15 comments
Labels
backend: Related to one or more autodiff backends · core: Related to the core utilities of the package

Comments

gdalle (Owner) commented Mar 21, 2024

Constructing the right extras becomes very tricky when different inner/outer backends must be called in various ways on closure functions.

gdalle added the backend and core labels on Mar 28, 2024
gdalle (Owner, Author) commented Apr 2, 2024

So here's where I'm at. The typical structure of a second-order operator is:

function second_order_operator(f, backend::SecondOrder, x)
    # inner differentiation, called on values z produced by the outer operator
    function inner_operator_closure(z)
        inner_extras = prepare_inner_operator(f, inner(backend), z)
        return inner_operator(f, inner(backend), z, inner_extras)
    end
    # outer differentiation through the inner closure
    outer_extras = prepare_outer_operator(inner_operator_closure, outer(backend), x)
    return outer_operator(inner_operator_closure, outer(backend), x, outer_extras)
end

It's hard to prepare the extras because

  • the inner operator is called on a variable z generated during the outer operator, so it may not have the same type as x. Typically, it might be a vector of Duals instead of a vector of numbers (see the sketch after this list).
  • the outer operator's extras depend on what is called, in this case a prepared inner operator closure (if the closure is not prepared, it might take a different code path, which means the outer operator's tape would be wrong).
  • in one case (reverse-over-forward HVP), the inner operator closure closes over the vector v in addition to the rest, so the preparation signature may need to look different.
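
A quick illustration of the first point, sketched with ForwardDiff as a hypothetical outer backend (the helper peek is made up for the example):

using ForwardDiff

function peek(z)
    @show eltype(z)  # a ForwardDiff.Dual type, not Float64
    return sum(abs2, z)
end

# differentiating through peek hands it a vector of Duals
ForwardDiff.derivative(t -> peek([t, 2t]), 1.0)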

My current suggested workflow for preparation (disregarding the v thing for now):

  1. Define a function wrapper InputCopier which deepcopies and stores the first thing on which it is called, so that we can see what z looks like inside the outer operator (sketched after this list)
  2. Define the inner operator closure
  3. Wrap it in an InputCopier
  4. Call the outer operator on this, now we have the type of z
  5. Prepare the inner operator closure with z
  6. Prepare the outer operator on the prepared inner operator closure
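
A minimal sketch of what step 1's InputCopier could look like (hypothetical code, not an actual implementation from the package):

# Hypothetical InputCopier: deepcopies and stores the first input it sees,
# then delegates to the wrapped function.
mutable struct InputCopier{F}
    const f::F
    stored_input::Any  # filled on the first call
end

InputCopier(f) = InputCopier(f, nothing)

function (ic::InputCopier)(z)
    if isnothing(ic.stored_input)
        ic.stored_input = deepcopy(z)
    end
    return ic.f(z)
end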

gdalle (Owner, Author) commented Apr 3, 2024

Actually this will not work because

  • some backends don't even call the underlying function as-is
  • some only call it during their own preparation step

gdalle (Owner, Author) commented Apr 3, 2024

Partially solved by #135, where the outer differentiation is prepared but not the inner one. I think it is close to optimal.
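
In sketch form, reusing the pseudocode names from the block above (a restatement of that sentence, not a quote from #135):

function second_order_operator(f, backend::SecondOrder, x)
    # the inner closure stays unprepared: preparing it would require knowing z's type
    inner_operator_closure(z) = inner_operator(f, inner(backend), z)
    # only the outer differentiation is prepared
    outer_extras = prepare_outer_operator(inner_operator_closure, outer(backend), x)
    return outer_operator(inner_operator_closure, outer(backend), x, outer_extras)
end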

adrhill (Collaborator) commented Apr 3, 2024

I see how it is difficult for us to provide default fallbacks for the inner preparation.

How about allowing people to manually deal with the inner preparation by adding an inner_XYZ_extras field to the HVPExtras and defaulting to NoXYZExtras?
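
Something like the following, with all names illustrative:

# Illustrative sketch only: an HVP extras type carrying an optional,
# user-supplied inner gradient preparation.
struct NoGradientExtras end  # placeholder default

struct HVPExtras{OE,IE}
    outer_extras::OE            # the outer preparation we already build
    inner_gradient_extras::IE   # manual inner preparation, NoGradientExtras() by default
end

HVPExtras(outer_extras) = HVPExtras(outer_extras, NoGradientExtras())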

gdalle (Owner, Author) commented Apr 5, 2024

Possibly, but that would be a very advanced use, and my take is that plenty of things will fail when people first try out the HVP, so optimizing performance that way is not high-priority for me.

Besides, for reverse-mode backends which do not require preparation and work out of place (Zygote, Tracker), this is already optimal.

gdalle (Owner, Author) commented May 30, 2024

In the end I think the easiest approach is to have a mutable extras prepared on the first run, like so: https://discourse.julialang.org/t/second-order-autodiff-which-combinations-should-work/114892/12

adrhill (Collaborator) commented May 30, 2024

This sounds reasonable to me. To play the devil's advocate: on which backends are mutable extras doable and more performant than allocating new extras?

gdalle (Owner, Author) commented May 30, 2024

I really can't think of any scenario where modifying a field of a mutable struct is more costly than essentially re-creating that field from scratch.

adrhill (Collaborator) commented May 30, 2024

Sure, but for which backends is it possible?

(And while it might not be more costly, it should be strictly less costly to warrant the increase in code complexity.)

gdalle (Owner, Author) commented May 30, 2024

It is doable on all backends. It's not the extras itself that you mutate, just a field of a wrapper. Here's an example:

mutable struct InnerGradientWrapper{F,B}
    const f::F
    const backend::B
    extras::Union{Nothing,GradientExtras}  # type-unstable; nothing until the first call
end

function (igw::InnerGradientWrapper)(x::AbstractVector)
    # prepare lazily on the first call, then reuse the stored extras
    if isnothing(igw.extras)
        igw.extras = prepare_gradient(igw.f, igw.backend, x)
    end
    return gradient(igw.f, igw.backend, x, igw.extras)
end
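
For instance (a hedged usage sketch, assuming AutoForwardDiff from ADTypes as an example inner backend):

# assumed imports; InnerGradientWrapper is the struct defined above
using DifferentiationInterface, ADTypes, ForwardDiff

f(x) = sum(abs2, x)
igw = InnerGradientWrapper(f, AutoForwardDiff(), nothing)

igw([1.0, 2.0, 3.0])  # first call: prepares and stores the extras
igw([4.0, 5.0, 6.0])  # later calls reuse them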

gdalle (Owner, Author) commented May 30, 2024

I'm just wondering how much the type instability will hurt us here

gdalle linked a pull request on May 31, 2024 that will close this issue
gdalle (Owner, Author) commented May 31, 2024

Tried it in #291, but the problem is that changing this extras object modifies the inner state of our gradient closure. As a result, outer preparation becomes invalid.

adrhill (Collaborator) commented May 31, 2024

> I'm just wondering how much the type instability will hurt us here

How about the following?

mutable struct InnerGradientWrapper{F,B,E<:Union{Nothing,GradientExtras}}
    const f::F
    const backend::B
    extras::E  # E is fixed per instance, so field access is type-stable
end

adrhill (Collaborator) commented May 31, 2024

> Tried it in #291 but the problem is that changing this extras object modifies the inner state of our gradient closure. As a result, outer preparation becomes invalid

Could you give an example? This is not clear to me from reading the diff in #291, and the PR contains no further comments.

gdalle (Owner, Author) commented May 31, 2024

It's the same discussion we have had for SparseConnectivityTracer and in #252. The InnerGradientWrapper is a closure that changes its state between calls, so reusing preparation is invalid for the outer backend, which differentiates through it.
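
To illustrate the hazard with a toy example (hypothetical, not actual package code): a taping outer backend records the code path of the first call, but a stateful closure takes a different path afterwards.

# A closure whose control flow differs between the first call (the
# "preparation" branch) and every later call. A tape recorded on call 1
# no longer matches the path taken on call 2, even though the values agree.
mutable struct Stateful
    prepared::Bool
end

function (s::Stateful)(x)
    if !s.prepared
        s.prepared = true
        return sum(abs2, x)  # path taken on the first (recorded) call
    end
    return x' * x            # same value, different code path
end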
