Add AutoReactant{AutoEnzyme} #127
Conversation
I understand where you're coming from, but I'm not sure this is the right abstraction either. As you said, Reactant is not an AD backend, and this package is for AD backends.
The argument for a distinct type is that there are Reactant-specific attributes (like device/sharding, not yet added here) that are relevant to Reactant compilation, but not to Enzyme. It can also be relevant for other AD tools, but those are presently unsupported, so it's best to keep the assurances limited until then.
Added documentation for forward, reverse, or sparse mode.
Can you give an example of where this would simplify things? Lux's API?
Lux already has a better way of calling this from the train API overall. However, in some places, for example in Turing/SciML/etc. packages where the adtype is passed through a long chain of configs before being dispatched on, this means Reactant can be tried without having to rewrite all the intermediate calls. It still might be better to do that rewriting to enable fusion of AD and the outside context, but this at least makes it easy to use or try out.
Co-authored-by: Avik Pal <avik.pal.2017@gmail.com>
In that case it might be time to revive my initial attempt JuliaDiff/DifferentiationInterface.jl#325 for the general case. I'm not fundamentally opposed to the principle of the current PR (haven't reviewed yet), but there is a documentation challenge in conveying what Reactant is: how it is not an AD backend, but can instead combine with AD backends.
Yeah, go for it; I think that is just waiting for someone to bring it over the line.
I think this is fine as an abstraction. It's kind of a detail that Reactant uses Enzyme in order to do its AD. The fact that the Reactant compiler passes use another AD system that is also exposed in Julia is just an implementation detail. To users, Reactant is just a choice, like ForwardDiff: it supports some codes and is faster in some cases, but has limitations. And importantly, those are different from Enzyme's. So to a user of Optimization.jl, SciMLSensitivity.jl, etc., Reactant is just another choice in the list. @gdalle if you're fine with the merge I think we just go for it.
I haven't read the code yet; I'll review it today to make sure.
gdalle
left a comment
Based on the previous conversation, I understand that the field should be an actual mode and not a full AutoEnzyme object? If so, there are a few necessary changes (or we name the field differently).
So things like chunk size or function annotation would equally apply to both. Also, I would expect folks to implement this by calling into gradient with AutoEnzyme or similar, so the adtype itself could make sense as the field. Tangentially, there's a debate at the moment about moving chunk size into the Enzyme mode directly.
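For readers following along, here is a minimal sketch of the shape being debated: a wrapper adtype whose field holds the inner AD choice (an `AutoEnzyme` object, per the PR title), rather than a bare Enzyme mode. The definitions below are illustrative stand-ins, not the actual ADTypes.jl code under review.

```julia
# Hypothetical stand-in for ADTypes.AutoEnzyme (the real type carries
# more configuration, e.g. mode and function annotation).
struct AutoEnzyme end

# Hypothetical sketch of the wrapper: "compile with Reactant,
# differentiate with the wrapped adtype". Reactant-specific attributes
# (device, sharding) could be added as extra fields later.
struct AutoReactant{M}
    mode::M  # the inner AD choice, e.g. AutoEnzyme()
end

# Downstream packages would pass this through their config chain like
# any other adtype, then dispatch on it when differentiating.
adtype = AutoReactant(AutoEnzyme())
```

Under this design, a package receiving `adtype` can unwrap `adtype.mode` and call into its usual Enzyme path inside a Reactant-compiled region, which is why the field naming (`mode` vs. a full adtype) matters for the review below.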
Right, sorry, the fact that you called it
@gdalle you all good?
Clarified the description of the 'mode' field in AutoReactant.
gdalle
left a comment
lgtm, thanks!
It's not really an AD backend, but this can help make things easier for folks to use throughout the ecosystem.
I'm not quite sure this is the right place/abstraction though; alternatively, we could add boolean flags to turn on Reactant etc. as applicable throughout.
cc @avik-pal