Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Zygote.jacobian etc. #128

Merged
merged 1 commit into from
Feb 7, 2024
Merged

Use Zygote.jacobian etc. #128

merged 1 commit into from
Feb 7, 2024

Conversation

devmotion
Copy link
Member

Zygote already has an interface for pullbacks, jacobians, etc., hence IMO we should use it in AD.jacobian etc. as I guess users (arguably correctly) expect that AD.jacobian performs the same computations and is as efficient as Zygote.jacobian. That is, I think AbstractDifferentiation should not treat Zygote as "just" an example of ChainRules AD backend but rather as a separate entity as - I think - we should not redefine jacobian etc. for Zygote based on the ChainRules pullback with Zygote.ZygoteRuleConfig.

Fixes #54.

Copy link

codecov bot commented Feb 4, 2024

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (afec712) 82.32% compared to head (202c86f) 82.54%.

Files Patch % Lines
ext/AbstractDifferentiationZygoteExt.jl 90.90% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #128      +/-   ##
==========================================
+ Coverage   82.32%   82.54%   +0.22%     
==========================================
  Files           8        8              
  Lines         413      424      +11     
==========================================
+ Hits          340      350      +10     
- Misses         73       74       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mohamed82008
Copy link
Member

IIRC, one fundamental difference is that Zygote.jacobian preallocates the Jacobian matrix. This is not good for higher order derivatives with reverse mode because it is mutating.

@devmotion
Copy link
Member Author

My feeling is that this is a problem that should be addressed in Zygote (possibly they should use the default implementation in AbstractDifferentiation based on value_and_pullback_function but that's their decision). Based on the number of open (and closed) issues about AbstractDifferentiation taking different code paths, and how it leads to sometimes different behaviour and sometimes different (typically worse) performance, I think if a function such as jacobian, hessian etc. exists in an AD backend it should be used by AbstractDifferentiation.

@gdalle
Copy link
Member

gdalle commented Feb 5, 2024

I agree with this take, and I push it further here: #129

@mohamed82008
Copy link
Member

Zygote.hessian is arguably even more questionable because it actually uses ForwardDiff for the second derivative.

@devmotion
Copy link
Member Author

devmotion commented Feb 5, 2024

My view is the same there as well: If Zygote.hessian is broken/bad, an issue should be opened in Zygote and possibly they might want to switch to the hessian definition for pullbacks in AbstractDifferentiation. But we should not fix AD backends in AbstractDifferentiation.

@devmotion devmotion marked this pull request as ready for review February 5, 2024 12:16
@devmotion
Copy link
Member Author

Bump 🙂

For other AD backends such as ForwardDiff, ReverseDiff, and Tracker actually we already use jacobian, hessian, etc. in downstream packages when applicable.

return res.val, res.grad
end

AD.jacobian(::AD.ZygoteBackend, f, args...) = Zygote.jacobian(f, args...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we provide a way to opt out of this, perhaps a type parameter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't provide any opt-outs for any other backend, so my feeling is that we should not add one here. I think it would also unnecessarily complicate the backend (and if there's a problem with Zygote.jacobian I still think it should be fixed in Zygote). If a user really wants to use a different implementation of AD.jacobian I think it should be reasonably easy to define a custom MyZygoteBackend that forwards everything but AD.jacobian to a ZygoteBackend.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be the minority vote here in which case I will take the extra inconvenience of defining my own backend.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with David on this one

@devmotion devmotion merged commit 2bc18d0 into master Feb 7, 2024
8 checks passed
@devmotion devmotion deleted the dw/zygotebackend branch February 7, 2024 19:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

AD.jacobian much slower than Zygote.jacobian
3 participants