
Conversation

@wsmoses (Collaborator) commented Nov 7, 2025

It's not really an AD backend, but this can help make things easier for folks to use throughout the ecosystem.

I'm not quite sure this is the right place/abstraction, though; alternatively, we could add boolean flags to turn on Reactant etc. as applicable throughout.

cc @avik-pal

@gdalle (Collaborator) commented Nov 7, 2025

I understand where you're coming from but I'm not sure this is the right abstraction either. As you said, Reactant is not an AD backend, and this package is for AD backends.
Is there any reason to use Reactant with another AD than Enzyme? If not, maybe a parameter inside AutoEnzyme would make more sense?

@wsmoses (Collaborator, Author) commented Nov 7, 2025

The argument for a distinct type is that there are Reactant-specific attributes (like device/sharding, not yet added here) that are relevant to Reactant compilation, but not to Enzyme.

It can also be relevant for other AD tools, but that is presently unsupported, so it's best to keep the assurances limited until then.
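As a rough sketch of that argument (hypothetical field names; `device` and `sharding` mirror the not-yet-added attributes mentioned above and are not part of this PR):

```julia
using ADTypes: AbstractADType, AutoEnzyme

# Hypothetical sketch, not the merged implementation: Reactant-specific
# compilation options live alongside the wrapped AD backend. They matter
# to the Reactant compiler, but would mean nothing on a bare AutoEnzyme.
struct AutoReactant{B,D,S} <: AbstractADType
    mode::B      # the inner AD backend to compile, e.g. AutoEnzyme()
    device::D    # hypothetical: target accelerator for compilation
    sharding::S  # hypothetical: array sharding across devices
end
```

The point is that these options parameterize compilation, not differentiation, so they have no natural home inside `AutoEnzyme` itself.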

Added documentation for forward, reverse, or sparse mode.
@gdalle (Collaborator) commented Nov 7, 2025

Can you give an example of where this would simplify things? Lux's API?

@wsmoses (Collaborator, Author) commented Nov 7, 2025

Lux already has a better way of calling this from the training API overall.

However, in places like some Turing/SciML/etc. packages, where the adtype is passed through a long chain of configs before being dispatched on, this means Reactant can be tried without having to rewrite all the intermediate calls. It still might be better to do that rewriting to enable fusion of the AD with the surrounding context, but this at least makes it easy to use/try out.
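To illustrate the point (a sketch with made-up names; `solve_with` and `inner_solve` stand in for any SciML/Turing-style entry point that threads an adtype through its configuration):

```julia
using ADTypes

# Illustrative only: the adtype travels through intermediate layers
# untouched and is only dispatched on at the innermost AD call site.
function solve_with(prob; adtype = AutoForwardDiff())
    # ...config layers that merely forward `adtype` along...
    return inner_solve(prob, adtype)
end

# Trying Reactant then needs no rewrite of the intermediate plumbing:
solve_with(prob; adtype = AutoReactant())
```

Since only the innermost call interprets the adtype, swapping in `AutoReactant` is a one-line change at the user-facing entry point.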

Co-authored-by: Avik Pal <avik.pal.2017@gmail.com>
@gdalle (Collaborator) commented Nov 7, 2025

In that case it might be time to revive my initial attempt JuliaDiff/DifferentiationInterface.jl#325 for the general case.

I'm not fundamentally opposed to the principle of the current PR (I haven't reviewed it yet), but there is a documentation challenge: conveying what Reactant is, and how it is not an AD backend itself but instead combines with AD backends.

@wsmoses (Collaborator, Author) commented Nov 7, 2025

> In that case it might be time to revive my initial attempt JuliaDiff/DifferentiationInterface.jl#325 for the general case.

Yeah, go for it; I think that is just waiting for someone to bring it over the line.

@ChrisRackauckas (Member) commented

I think this is fine as an abstraction. It's kind of a detail that Reactant uses Enzyme in order to do its AD. The fact that the Reactant compiler passes use another AD system that is also exposed in Julia is just a detail. To users, Reactant is just a choice, like ForwardDiff: it supports some codes and is faster in some cases, but it has limitations. And importantly, those are different from Enzyme's. So to a user of Optimization.jl, SciMLSensitivity.jl, etc., Reactant is just another choice in the list.

@gdalle if you're fine with the merge I think we just go for it.

@gdalle (Collaborator) commented Nov 14, 2025

I haven't yet read the code; I'll review it today to make sure.

@gdalle (Collaborator) left a comment

Based on the previous conversation, I understand that the field should be an actual mode and not a full `AutoEnzyme` object? If so, there are a few necessary changes (or we name the field differently).

@wsmoses (Collaborator, Author) commented Nov 14, 2025

> Based on the previous conversation, I understand that the field should be an actual mode and not a full `AutoEnzyme` object? If so, there are a few necessary changes (or we name the field differently).

So things like chunk size or function annotation would equally apply to both. Also, I would expect folks to implement this by calling into `gradient` with `AutoEnzyme` or similar, so the adtype itself could make sense.

Tangentially, there's a debate at the moment about moving chunk size into the Enzyme mode directly.

@gdalle (Collaborator) commented Nov 14, 2025

> So things like chunk size or function annotation would equally apply to both. Also, I would expect folks to implement this by calling into `gradient` with `AutoEnzyme` or similar, so the adtype itself could make sense.

Right, sorry, the fact that you called it `mode` confused me a little. Can we call the field `backend` instead, and add constructors that take the mode as an argument?
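A minimal sketch of this naming suggestion (hypothetical, not what was merged; it assumes the wrapped backend is a full `AutoEnzyme` and that `EnzymeCore.Mode` is the relevant mode type):

```julia
using ADTypes: AbstractADType, AutoEnzyme
import EnzymeCore

# Hypothetical sketch: the field holds a complete inner AD backend...
struct AutoReactant{B} <: AbstractADType
    backend::B  # e.g. AutoEnzyme(; mode = EnzymeCore.Reverse)
end

# ...and a convenience constructor accepts a bare mode instead,
# wrapping it into AutoEnzyme on the user's behalf.
AutoReactant(mode::EnzymeCore.Mode) = AutoReactant(AutoEnzyme(; mode))
```

With this shape, `AutoReactant(EnzymeCore.Forward)` and `AutoReactant(AutoEnzyme(; mode = EnzymeCore.Forward))` would construct equivalent objects.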

@wsmoses (Collaborator, Author) commented Nov 15, 2025

@gdalle you all good?

Clarified the description of the 'mode' field in AutoReactant.
@gdalle (Collaborator) left a comment

lgtm, thanks!

@gdalle gdalle merged commit 0eeb716 into SciML:main Nov 15, 2025
8 checks passed
