Implement AbstractDifferentiation interface #550

Closed
wants to merge 7 commits

Conversation

@sethaxen (Member) commented Oct 4, 2021

This PR implements the AbstractDifferentiation interface.

It adapts the pushforward_function-based implementation in the AbstractDifferentiation tests and replaces several of the fallbacks with more efficient ForwardDiff-specific implementations.
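
For context on the approach, the pushforward_function route computes a Jacobian-vector product as a single scalar derivative: differentiate h -> f(xs .+ h .* vs) at h = 0. Below is a minimal sketch of what such a backend can look like, assuming the AD.@primitive and AD.AbstractForwardMode names from AbstractDifferentiation and eliding the extra handling this PR adds (efficient gradient/jacobian overloads, output handling for multiple arguments):

```julia
using ForwardDiff
import AbstractDifferentiation as AD

# Sketch only; the PR's actual implementation differs in details.
struct ForwardDiffBackend <: AD.AbstractForwardMode end

# The pushforward of f at xs along tangents vs is the derivative of the
# one-variable map h -> f(xs .+ h .* vs) at h = 0, which ForwardDiff
# evaluates in a single dual-number pass.
AD.@primitive function pushforward_function(::ForwardDiffBackend, f, xs...)
    return function (vs)
        vs = vs isa Tuple ? vs : (vs,)
        return (ForwardDiff.derivative(h -> f((xs .+ h .* vs)...), 0),)
    end
end
```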

@sethaxen (Member, Author) commented Oct 4, 2021

Test failure on nightly is unrelated to this PR. Test failure on v1.0 is due to JuliaDiff/AbstractDifferentiation.jl#18.

I noticed that several existing implementations of a ForwardDiff backend (e.g. Turing's and LogDensityProblems') allow specifying the chunk size. Perhaps we should do that here as well? The challenge is that the default chunk size is chosen based on the size of the inputs to gradient, derivative, etc., which we don't know in advance, so we would have to pick some global default chunk size.
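
To make that trade-off concrete, here is a hypothetical sketch (the type name, field, and helper are illustrative, not this PR's code) of a backend carrying an optional chunk size that falls back to ForwardDiff's input-length-based heuristic when none is given:

```julia
using ForwardDiff
import AbstractDifferentiation as AD

# Hypothetical backend with an optional, globally chosen chunk size.
struct ForwardDiffBackendWithChunk <: AD.AbstractForwardMode
    chunksize::Union{Nothing,Int}  # nothing => use ForwardDiff's per-input heuristic
end
ForwardDiffBackendWithChunk(; chunksize=nothing) = ForwardDiffBackendWithChunk(chunksize)

# The default chunk depends on length(x), which is only known at call time;
# a user-specified size overrides it.
_chunk(ba::ForwardDiffBackendWithChunk, x) =
    ba.chunksize === nothing ? ForwardDiff.Chunk(x) : ForwardDiff.Chunk{ba.chunksize}()

function chunked_gradient(ba::ForwardDiffBackendWithChunk, f, x::AbstractVector)
    cfg = ForwardDiff.GradientConfig(f, x, _chunk(ba, x))
    return ForwardDiff.gradient(f, x, cfg)
end
```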

Should ForwardDiffBackend be added to the API?

@andreasnoack (Member) commented

I think it would be useful to have some motivation for this. What are the benefits? Are there any potential drawbacks?

@sethaxen (Member, Author) commented Oct 6, 2021

Sure! All of the (many!) AD packages have different interfaces. As a result, when a package uses AD internally, its developers end up designing their own interface that unifies a subset of AD packages, along with some way for users to select one of the supported packages as a backend. A few examples are Turing, Manifolds, LogDensityProblems, and Optim.

AbstractDifferentiation cuts back on this common boilerplate by offering a unified interface for basic AD operations (derivatives, gradients, Jacobians, pushforwards, and pullbacks for numbers and arrays of numbers). The goal is for every AD package to implement this interface, so that packages needing these basic operations can use AD in an AD-agnostic manner. For this to be useful, the most popular AD packages need to implement the interface, hence this PR.
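
To illustrate what "AD-agnostic" means in practice, here is a small sketch using the AD.gradient and AD.jacobian entry points (the function and backend variable are placeholders; any backend implementing the interface could be passed in):

```julia
import AbstractDifferentiation as AD

f(x) = sum(abs2, x)

# Downstream code only talks to AbstractDifferentiation; switching AD packages
# just means passing a different backend object.
function gradient_and_jacobian(backend, x)
    g = AD.gradient(backend, f, x)[1]           # results are tuples, one entry per argument
    J = AD.jacobian(backend, y -> y .^ 2, x)[1]
    return g, J
end
```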

The only downside I see right now is that AbstractDifferentiation is new and the interface could change, which would require maintenance here. But since ForwardDiff is, I'm guessing, the most used AD package in the ecosystem, I'd expect any changes to be propagated here fairly quickly.

One of the AbstractDifferentiation devs (@mohamed82008, @frankschae) could say more.

@mohamed82008 (Member) commented

I see all the benefits of AbstractDifferentiation, but I don't know whether we want to open PRs to the individual AD packages in Julia or to AbstractDifferentiation itself, using Requires to load the glue code when the different AD packages are loaded. I think the former makes more sense once things are stable, but the latter probably makes more sense now?

@sethaxen (Member, Author) commented Oct 6, 2021

Good point. I'll close this for now in favor of a PR to AbstractDifferentiation.jl.
