[MLIR] Integrated BackpropOp lowering with Enzyme #193
Conversation
Great work! Just a high level comment regarding memory management functions.
Great work @pengmai 💯 Some questions and comments, happy to approve after that!
"""Run optimizations on the LLVM IR prior to being run through Enzyme.""" | ||
|
||
_executable = get_executable_path("llvm", "opt") | ||
_default_flags = ["-O2", "-S"] |
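As a rough sketch of how such a pre-Enzyme optimization step might be driven (the helper names `build_opt_command` and `run_pre_enzyme_opt` are illustrative, not from this PR):

```python
import subprocess

# Flags mirroring the snippet above: optimize at -O2 and emit textual IR (-S).
DEFAULT_FLAGS = ["-O2", "-S"]

def build_opt_command(input_ll, output_ll, opt_path="opt"):
    """Assemble the `opt` invocation for the pre-Enzyme optimization stage."""
    return [opt_path, *DEFAULT_FLAGS, input_ll, "-o", output_ll]

def run_pre_enzyme_opt(input_ll, output_ll, opt_path="opt"):
    """Run the optimization; requires an `opt` binary on the PATH."""
    subprocess.run(build_opt_command(input_ll, output_ll, opt_path), check=True)
```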
Maybe @erick-xanadu and @dime10 know more about this: can it negatively impact compilation time for other programs (ones without gradients), or is it negligible in general?
It is possible, but if it is necessary for correctness, then it is necessary. What we should do first is measure compilation/execution time on a set of representative programs to have certainty of how compiler options affect compilation time and execution time. But that is a work in progress...
Thanks @erick-xanadu !
The optimization is necessary for correctness? 😮
Maybe correctness is the wrong word, but I believe it is necessary for feasibility (it might even improve the compile time of gradients compared to compiling gradients without `-O2`)? @pengmai can correct me if I'm wrong. One example I keep in mind from a previous conversation is the `-mem2reg` optimization: instead of tracking the indirection due to pointers, pointers get converted to registers. Same with inlining: `-O2` might inline some functions, which makes the compilation (and execution) of gradients perform better. However, the cost is that we do need to optimize first.
I believe a lot of Enzyme presentations start with a diagram showing "optimize first" -> "enzyme" -> "optimize again".
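The "optimize first" -> "Enzyme" -> "optimize again" pipeline can be sketched as a sequence of `opt` invocations. This is a hypothetical sketch: the plugin path and intermediate file names are made up, though `-load-pass-plugin` with `-passes=enzyme` follows Enzyme's documented new-pass-manager usage.

```python
def enzyme_pipeline(input_ll, output_ll, enzyme_plugin="LLVMEnzyme.so"):
    """Return the three `opt` commands of the optimize/differentiate/optimize
    pipeline. Illustrative only; file names and plugin path are assumptions."""
    pre, post = "pre.ll", "post.ll"
    return [
        ["opt", "-O2", "-S", input_ll, "-o", pre],             # optimize first
        ["opt", f"-load-pass-plugin={enzyme_plugin}",
         "-passes=enzyme", "-S", pre, "-o", post],             # run Enzyme
        ["opt", "-O2", "-S", post, "-o", output_ll],           # optimize again
    ]
```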
Right, Enzyme's main research thesis is to improve performance by differentiating optimized code, so getting good performance is dependent on first optimizing the code before running Enzyme.
Separately, we see some issues where Enzyme fails to generate code for our tests if we don't optimize the code first. These are likely bugs or unsupported cases within Enzyme, so a workaround is running this optimization first.
[sc-41510]
Co-authored-by: Romain Moyard <rmoyard@gmail.com>
Codecov Report
```
@@            Coverage Diff             @@
##   enzyme_integration    #193   +/-  ##
==========================================
- Coverage       99.95%  99.10%   -0.85%
==========================================
  Files              21      37      +16
  Lines            4096    6462    +2366
  Branches            0     333     +333
==========================================
+ Hits             4094    6404    +2310
- Misses              2      33      +31
- Partials            0      25      +25
```
Thanks @pengmai, great PR 🥇
Awesome!
* Draft integration
* Shape
* optional
* Add Enzyme
* Draft
* more
* Checkout 0.0.69
* Update
* Right versions
* Compile works
* Wheels and tests
* Typo
* Black
* Change hash
* Add dep version
* Typo
* Remove cache llvm
* Update lit
* Remove bin
* Remove
* Add ls
* Change key
* Enzyme path
* test
* Change
* Add tests
* Black
* Typo
* Compiling
* Revive param count
* Revive param count
* Update
* Remove comment
* Compute backprop res type
* Readd tests
* Update buff
* Change tests
* Changes
* Working insert slice
* Add diff arg indices
* Add copyright
* Update
* Fix tests buff and hybrid
* Test param shift
* Fix classical and adjoint
* Fix parameter shift tests
* Fix tests
* Pylint disable
* Add globals
* Typos
* Update tests openqasm
* Comment from review
* More
* Correct enzyme path
* Add free
* Typos
* Explicitly call preserve-nvvm with Enzyme
* Add loose types
* Remove tests and revive pcount in argmap
* Update to 0.0.74
* Update Enzyme
* Update format
* Fix wheels
* Attempt to fix failing upload to codecov by bumping Enzyme to working build
* [MLIR] Integrated BackpropOp lowering with Enzyme (#193)
  * Rework Enzyme calling convention to unpack pointers to fix incorrect zero gradients
  * Exclude Enzyme from autoformatting
  * Update MemSetOp builder API
  * Switch metadata approach from enzyme_function_like to enzyme_allocation_like to avoid double free error with generated derivative code
  * Remove Enzyme wrapper function, switch argmap to destination-passing style
  * Remove parameter count from argmap parameters
  * Fix bug with Enzyme calling convention around inactive memrefs
  * Update backprop lowering test
  * Add index dialect to dependent dialects of gradient_to_llvm
  * Update pinned version of Enzyme to one that builds with LLVM 17, only use 'auto' when clear from context
  * Get generic alloc/free names from MLIR utilities, only register them when use-generic-functions option is set
  * Update changelog
  * Remove enzyme-loose-types flag (co-authored by Romain Moyard <rmoyard@gmail.com>)
  * Add tests for PreEnzymeOpt pipeline, add explicit enzyme_const annotation for floats
  * Remove index -> llvm and memref -> llvm patterns from gradient -> llvm, run those passes separately afterwards
  * Autoformatting

  Co-authored-by: Romain Moyard <rmoyard@gmail.com>
* Enzyme integration + Deallocation (#201)
  * Re-enable buffer deallocation passes.
  * Register _mlir_memref_to_llvm_free as "like" free. Using the allocation registration mechanism for Enzyme fails to automatically register the free-like function as free, likely due to a possible difference in arity.
  * Explicitly name the post-Enzyme file ".postenzyme.ll"
  * Comments.
* Readd tests
* Fix adjoint
* Fix parameter shift
* Update classical jac
* format
* Smaller fixes to the backprop operation & lowering (#224)
  * Fix bug with grad on a classical qjit function. If the grad target is a qjit object, the current implementation assumes that the wrapped function is already a "JAX function", that is, a `QNode` or `Function` object. This is not the case, however, when a bare Python function is wrapped in qjit.
  * Refactor `compDiffArgIndices` and add `computeDiffArgs` as a convenience method to filter the callee argument list.
  * Tighten backprop type declarations & add a verifier. For example, the op didn't check whether the declared number of gradient results matches the number of differentiable arguments, a bug in one of the bufferization tests. The `quantum_jacobian` operand was renamed to `cotangents`, since that more generally describes the purpose of the argument, and as currently used it only receives a slice of the quantum jacobian anyway.
  * Allocate primal output buffers during bufferization. Reorganizes the backprop op arguments into: primal `args`; `diffArgShadows`, or gradient result buffers; `calleeResults`, or primal output buffers; `cotangents`, or callee result shadows.
  * Update mlir/lib/Gradient/Transforms/BufferizationPatterns.cpp

  Co-authored-by: Jacob Mai Peng <jacobmpeng@gmail.com>
* Remove too many free
* Format
* Right indentation
* Change function like
* Update from review
* [MLIR] Add support for differentiating scalars with the backprop op (#225)
  * Add docstring explaining computeMemRefSizeInBytes
  * Update changelog
  * Support scalars in BackpropOp
  * WIP: Adding back removed tests
  * Add float type returns to backprop op
  * Bug fixes with scalar differentiation support
  * Add scalar support to verifier, revert changes to test
  * Add back more tests
  * Try to clarify role of scalarIndices
  * Add verifier for shaped differentiable types, fix bug from merge commit
  * Update comment (co-authored by David Ittah <dime10@users.noreply.github.com>)
  * Add back more deleted tests, comment explaining scalar unpacking of the enzyme_autodiff call
  * Clarify terminology around quantum gradients/cotangents
  * Update shadow vs. data terminology in comments

  Co-authored-by: David Ittah <dime10@users.noreply.github.com>
* Update doc/changelog.md (co-authored by David Ittah <dime10@users.noreply.github.com>)
* More tests
* Add tests
* Changelog.

Co-authored-by: Jacob Mai Peng <jacobmpeng@gmail.com>
Co-authored-by: erick-xanadu <110487834+erick-xanadu@users.noreply.github.com>
Co-authored-by: David Ittah <dime10@users.noreply.github.com>
Co-authored-by: Erick <erick.ochoalopez@xanadu.ai>
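Several commits above switch `argmap` to destination-passing style, where the caller allocates the output buffer and passes it in rather than having the callee allocate and return a fresh one. A minimal, hypothetical Python illustration of the convention (not the actual MLIR code; the computation is made up):

```python
def argmap_returning(params):
    """Conventional style: the callee allocates and returns its result."""
    return [2.0 * p for p in params]  # hypothetical computation

def argmap_dps(params, out):
    """Destination-passing style: the caller owns `out` and the callee fills
    it in place. A fixed caller-owned buffer is convenient for tools like
    Enzyme that attach shadow (derivative) buffers to existing allocations."""
    for i, p in enumerate(params):
        out[i] = 2.0 * p
    return out
```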
Context: The existing Enzyme integration has issues that prevent Enzyme from compiling our code and producing the correct results.
Description of the Change:
Benefits: The integration tests around hybrid gradients produce the correct results using Enzyme. Enzyme is computing exact classical gradients via backpropagation instead of estimating them with finite differences.
Note that although the hybrid gradients are producing correct results, the complete hybrid gradient work is still considered WIP. This PR primarily focuses on the lowering of the `BackpropOp`.
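The difference between the exact gradients computed via backpropagation and finite-difference estimates can be seen on a toy function (purely illustrative; this is not code from the PR):

```python
def f(x):
    return x ** 3

def grad_exact(x):
    """Analytic derivative of x**3, standing in for a backprop-computed
    gradient: exact up to floating-point rounding."""
    return 3.0 * x ** 2

def grad_finite_diff(func, x, h=1e-6):
    """Central finite-difference estimate: only accurate to O(h**2) and
    sensitive to the choice of step size h."""
    return (func(x + h) - func(x - h)) / (2.0 * h)
```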