
Start using DifferentiationInterface #140

Conversation

@gdalle (Contributor) commented May 29, 2024

This PR is the beginning of a solution for #25. It shows how you can use DifferentiationInterface to compute derivatives in a backend-agnostic fashion.
It seems mergeable to me, but here are some more things you can do to increase performance:

  • Figure out if you need several backend objects or just one. For instance, derivatives are more efficient with a forward-mode backend, while gradients are usually more efficient with a reverse-mode backend.
  • Adjust the public API so that you can pass the backend object(s) from the user-facing functions all the way down to the utility functions.
  • Decide if you can reuse the extras that arise from the preparation step, and if so, adjust the code to initialize them once and then pass them around.
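For reference, here is a minimal sketch of the backend-agnostic calls involved (illustrative functions and values, not the actual CTBase code):

using DifferentiationInterface
import ForwardDiff, Zygote

f(x) = sum(abs2, x)

# the same call works with any backend object
gradient(f, AutoForwardDiff(), rand(3))
gradient(f, AutoZygote(), rand(3))

g(t) = cos(t) * exp(t)
derivative(g, AutoForwardDiff(), 1.0)   # scalar derivative: forward mode is a natural fit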

Ping @ocots @jbcaillau

@jbcaillau changed the base branch from main to differentiationinterface on May 29, 2024, 21:10
@jbcaillau (Member)

Hi @gdalle; many thanks for the PR!

This PR is the beginning of a solution for #25. It shows how you can use DifferentiationInterface to compute derivatives in a backend-agnostic fashion. It seems mergeable to me, but here are some more things you can do to increase performance:

  • Figure out if you need several backend objects or just one. For instance, derivatives are more efficient with a forward-mode backend, while gradients are usually more efficient with a reverse-mode backend.

Sure. Right now forward mode is OK for the internal AD we need (mostly taking gradients of functions with fewer than ~100 variables). There would be good reasons to switch to reverse mode, though. To be tested.

  • Adjust the public API so that you can pass the backend object(s) from the user-facing functions all the way down to the utility functions.

Yes. Right now, defining a single default __auto = AutoForwardDiff can do the job, but good point.
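A sketch of what threading the backend through the API could look like (hypothetical function names, only the pattern matters):

using DifferentiationInterface
import ForwardDiff

const __auto = AutoForwardDiff()   # single package-wide default backend

# user-facing function: accepts a backend and defaults to __auto
solve_something(f, x; backend = __auto) = _utility(f, x, backend)

# utility function: receives the backend instead of hard-coding ForwardDiff
_utility(f, x, backend) = gradient(f, backend, x)

solve_something(x -> sum(abs2, x), rand(3))                                          # default backend
solve_something(x -> sum(abs2, x), rand(3); backend = AutoForwardDiff(chunksize=1))  # override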

  • Decide if you can reuse the extras that arise from the preparation step, and if so, adjust the code to initialize them once and then pass them around.

Just had a look at the DifferentiationInterface docs: very nice mechanism. It will be very interesting to use when (i) solving an ADNLPModel in CTDirect.jl (iterative calls to gradients of the objective and constraints), and (ii) solving a shooting equation (Newton-like method).
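A minimal sketch of that reuse pattern for iterative calls (note: the exact place of the extras argument has changed across DifferentiationInterface versions; this follows the API current at the time of this thread):

using DifferentiationInterface
import ForwardDiff

f(x) = sum(abs2, x)
backend = AutoForwardDiff()
x = rand(50)

extras = prepare_gradient(f, backend, x)   # pay the setup cost once
for k in 1:1000                            # e.g. iterations of an NLP or Newton solver
    g = gradient(f, backend, x, extras)    # reuse extras on every call
    # ... use g in the solver update ...
end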

@gdalle (Contributor, Author) commented May 30, 2024

I made the changes. Not sure why CI failed the first time, but the tests pass locally (at least on Julia 1.10). I guess it's related to your local registry.

@jbcaillau (Member) left a comment

Thanks @gdalle, your suggestion to change the API is indeed the right one to get a full parametrisation by a single backend. The more sophisticated alternative would be to allow specifying and combining several backends (e.g. to compute second-order derivatives).

@gdalle (Contributor, Author) commented May 30, 2024

The more sophisticated alternative would be to allow specifying and combining several backends (e.g. to compute second-order derivatives).

Yes, in the general case you may want to let the user specify

  • a forward mode backend for scalar derivatives
  • a reverse mode backend for gradients
  • a (sparse) forward mode backend for Jacobians
  • a (sparse) forward-over-reverse DifferentiationInterface.SecondOrder backend for Hessians

But for problems with <100 variables, I think ForwardDiff is close to optimal in all of these settings, provided you reuse the preparation step.
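A sketch of mixing backends per operator (the sparse variants additionally need sparsity-detection and coloring packages, omitted here):

using DifferentiationInterface
import ForwardDiff, Zygote

f(x) = sum(abs2, x) + prod(x)
x = rand(5)

grad_backend = AutoZygote()                                   # reverse mode for gradients
hess_backend = SecondOrder(AutoForwardDiff(), AutoZygote())   # forward-over-reverse for Hessians

g = gradient(f, grad_backend, x)
H = hessian(f, hess_backend, x)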

@gdalle (Contributor, Author) commented May 30, 2024

Did you figure out why CI fails?

@jbcaillau merged commit 7440621 into control-toolbox:differentiationinterface on May 30, 2024
0 of 4 checks passed
@ocots (Member) commented May 30, 2024

Did you figure out why CI fails?

Maybe the file .github/workflows/CI.yml has to be updated:

  • uses: actions/checkout@v4
  • uses: julia-actions/add-julia-registry@v2

    steps:
      - uses: actions/checkout@v4
      - uses: julia-actions/setup-julia@latest
        with:
          version: ${{ matrix.version }}
          arch: ${{ matrix.arch }}
      - uses: julia-actions/add-julia-registry@v2
        with:
          key: ${{ secrets.SSH_KEY }}
          registry: control-toolbox/ct-registry
      - uses: julia-actions/julia-runtest@latest
      - uses: julia-actions/julia-uploadcodecov@latest
        env:
          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}

@ocots (Member) commented May 30, 2024

In the package CTFlows.jl, we build the right-hand side of an ODE by computing derivatives of a function. See here.

Here is the piece of code:

function rhs(h::AbstractHamiltonian)
    function rhs!(dz::DCoTangent, z::CoTangent, v::Variable, t::Time)
        n      = size(z, 1) ÷ 2
        foo(z) = h(t, z[rg(1,n)], z[rg(n+1,2n)], v)
        dh     = ctgradient(foo, z)
        dz[1:n]    =  dh[n+1:2n]   # ẋ =  ∂h/∂p
        dz[n+1:2n] = -dh[1:n]      # ṗ = -∂h/∂x
    end
    return rhs!
end

If we want to use the preparation step, should we keep track of extras inside rhs or ctgradient?

function rhs(h::AbstractHamiltonian)

    # compute preparation step here to get extras?

    function rhs!(dz::DCoTangent, z::CoTangent, v::Variable, t::Time)
        n      = size(z, 1) ÷ 2
        foo(z) = h(t, z[rg(1,n)], z[rg(n+1,2n)], v)

        dh     = ctgradient(foo, z)  # use extras as an argument to ctgradient?

        dz[1:n]    =  dh[n+1:2n]
        dz[n+1:2n] = -dh[1:n]
    end
    return rhs!
end

@jbcaillau (Member)

@ocots side note: any reason to write this

        foo(z) = h(t, z[rg(1,n)], z[rg(n+1,2n)], v)

instead of

        foo(z) = h(t, z[1:n], z[n+1:2n], v)

@jbcaillau (Member)

The more sophisticated alternative would be to allow specifying and combining several backends (e.g. to compute second-order derivatives).

Yes, in the general case you may want to let the user specify

  • a forward mode backend for scalar derivatives
  • a reverse mode backend for gradients
  • a (sparse) forward mode backend for Jacobians
  • a (sparse) forward-over-reverse DifferentiationInterface.SecondOrder backend for Hessians

✅ check this WIP

But for problems with <100 variables, I think ForwardDiff is close to optimal in all of these settings, provided you reuse the preparation step.

@jbcaillau (Member)

Checks passed, well done @ocots 👍🏽! Please document somewhere what you did to solve the CI issue (Error: Input required and not supplied: key).

Did you figure out why CI fails?

Maybe the file .github/workflows/CI.yml has to be updated:

@gdalle (Contributor, Author) commented May 30, 2024

If we want to use the preparation step, should we keep track of extras inside rhs or ctgradient?

The general rules of preparation are given in this section of the docs. See in particular the paragraph on reusing extras.

The trouble here is that you need the function foo to prepare the gradient operator. If it is generated inside rhs! as a closure, then it seems you cannot do much preparation at all.
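One possible restructuring, sketched under a backend-specific assumption: for AutoForwardDiff() the gradient extras essentially depend only on the input size, so preparing once with a representative closure (dummy t and v) and reusing the extras inside rhs! will typically work. DifferentiationInterface only guarantees preparation for the same function, so treat this as an assumption to verify, not an official recipe:

function rhs(h::AbstractHamiltonian, backend, z0, v0, t0)
    n = size(z0, 1) ÷ 2
    foo0(z) = h(t0, z[rg(1, n)], z[rg(n+1, 2n)], v0)
    extras = prepare_gradient(foo0, backend, z0)   # preparation done once

    function rhs!(dz::DCoTangent, z::CoTangent, v::Variable, t::Time)
        foo(z) = h(t, z[rg(1, n)], z[rg(n+1, 2n)], v)
        dh = gradient(foo, backend, z, extras)     # reuse extras (see assumption above)
        dz[1:n]    =  dh[n+1:2n]
        dz[n+1:2n] = -dh[1:n]
        return nothing
    end
    return rhs!
end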

@ocots (Member) commented May 30, 2024

@ocots side note: any reason to write this

        foo(z) = h(t, z[rg(1,n)], z[rg(n+1,2n)], v)

instead of

        foo(z) = h(t, z[1:n], z[n+1:2n], v)

I think it is because of the scalar case.

(Screenshot attached, 2024-05-30.)

@ocots (Member) commented May 30, 2024

Checks passed, well done @ocots 👍🏽! Please document somewhere what you did to solve the CI issue (Error: Input required and not supplied: key).

Did you figure out why CI fails?

Maybe the file .github/workflows/CI.yml has to be updated:

See here.

@jbcaillau (Member)

Checks passed, well done @ocots 👍🏽! Please document somewhere what you did to solve the CI issue (Error: Input required and not supplied: key).

Did you figure out why CI fails?

Maybe the file .github/workflows/CI.yml has to be updated:

See here.

🙏🏽 It reads: "If your package depends on private packages registered in a private registry..." Indeed, another reason for the upcoming move to the General registry 🤞🏾

@jbcaillau (Member)

@ocots side note: any reason to write this

        foo(z) = h(t, z[rg(1,n)], z[rg(n+1,2n)], v)

instead of

        foo(z) = h(t, z[1:n], z[n+1:2n], v)

I think it is because of the scalar case.

Oh right, again: there is no z[1:1] on a scalar.
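For the record, a plausible definition consistent with that remark (an assumption, not necessarily the actual CTBase code): rg returns a plain integer when the range has length one, so indexing still works when z is a scalar.

rg(i::Int, j::Int) = i == j ? i : (i:j)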

@ocots (Member) commented May 30, 2024

If we want to use the preparation step, should we keep track of extras inside rhs or ctgradient?

The general rules of preparation are given in this section of the docs. See in particular the paragraph on reusing extras.

The trouble here is that you need the function foo to prepare the gradient operator. If it is generated inside rhs! as a closure, then it seems you cannot do much preparation at all.

Thanks for the links. I see your point.

Is there no way to differentiate a parametric function f(x, p) = y with respect to x and still benefit from a preparation step?

@gdalle (Contributor, Author) commented May 30, 2024

There is no possibility to differentiate a parametric function f(x, p) = y with respect to x and do a preparation step also?

DifferentiationInterface was built to support 1-argument functions f(x) = y or f!(y, x). At the moment it does not support multiple arguments, and I don't think it ever will. The reason is that many AD backends themselves (like ForwardDiff) don't support multiple arguments.
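The usual workaround is to close over the parameter and hand DifferentiationInterface a one-argument function (illustrative sketch):

using DifferentiationInterface
import ForwardDiff

f(x, p) = p * sum(abs2, x)

backend = AutoForwardDiff()
p = 2.0
x = rand(4)

g = gradient(x -> f(x, p), backend, x)   # differentiate with respect to x only, p held fixed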
