Add BLAS triangular solve derivatives #2825
|
This is intended as the first step in solving EnzymeAD/Enzyme.jl#3039. Full disclosure: this is heavily LLM-generated because I do not understand the LLVM/etc. side. It is beyond my capability, but given the comment from @vchuravy, this approach seemed to be the only feasible way to get support. I tried to test the usage with downstream Enzyme.jl by flipping a fallback flag, etc., but I might have made a mistake. |
|
Thanks @wsmoses, I was about to ping you. This is a test to see if I can get an LLM to generate the C++ rules (impenetrable for someone like me) to get the triangular (and eventually LU) stuff up to date. I want to push out some communications on Enzyme to the economics community, but right now it is failing on some basic examples. If what the LLM generated is complete gibberish, then I should just close this entirely. If it isn't, then keep adding comments and I will tell the AI agent to implement anything you suggest and push again. |
|
Updated this to reduce the helper surface and address the review comments. Main changes:
Validation:
|
@wsmoses OK, take two with the AI. It changed the code to simplify given your feedback. I feel like this is worth at most one more iteration. If this is nonsense and a waste of your time just let me know and I will kill it. If you think the AI is getting pretty close to the right design then we can continue the experiment. |
wsmoses
left a comment
I don't understand why we would need any changes to enzyme/Enzyme or tools/enzyme-tblgen other than adding def trsv, or adding the forward mode rules to things like potrf/etc.
Concurrently, it would be a lot easier to separate these into individual PRs for each rule added.
|
@wsmoses Splitting this up per your suggestion. First of the split is #2828, which is just |
Summary
Adds native BLAS/LAPACK derivative support for real triangular solve routines:
- `trsv`
- `trsm`
- `potrs`

This covers forward and reverse mode for real `s`/`d` routines, with tests for Fortran and selected CBLAS entry points.

Implementation Notes
The `trsv` rule follows the existing `trtrs` shape with a single RHS/vector solve. Reverse mode needs to solve the `x` cotangent before forming the `A` cotangent, so the generated reverse rewrite order is extended for `trsv` like `trtrs`.

The `trsm` rule adds active `A`/`B` support with inactive `alpha`. It also adds `trsm` to BLAS extraction. Supporting both left/right side solves and CBLAS layout required a few small tablegen extensions rather than hand-written derivative code.

The `potrs` rule handles differentiation through the Cholesky solve using triangular BLAS operations. This does not add pivot support and does not address LU `getrf`/`getrs`.

Generator/Infrastructure Changes
Most changes are local extensions to the existing BLAS tablegen machinery:
- `trans` to all BLAS char arguments such as `uplo`, `diag`, and `side`.
- `mat_ld` support for temporary leading dimensions that depend on CBLAS row-major vs column-major layout.
- `side_square` temporary shape for `trsm`, where the triangular factor dimension depends on `side`.
- `Dep` DAG helper so generated operands can express data/cache dependencies while emitting the intended BLAS argument.

Tests
Adds focused lit tests for:
- `dtrsv_64_`
- `dtrsm_64_`
- `cblas_dtrsm64_`
- `dpotrs_64_`

Extends BLAS integration tracing and integration tests for forward and reverse triangular solve coverage.
I also validated the Julia integration locally by building Enzyme.jl against this branch with `deps/build_local.jl`, temporarily disabling fallback for `trsv`, `trsm`, and `potrs`, and running:
- `julia --project=test test/runtests.jl --verbose blas`
- `julia --project=test test/runtests.jl --verbose rules/internal_rules/linear_algebra_rules`

The Julia checks passed without fallback warnings for `BLAS.trsv!` and `BLAS.trsm!`. LU `getrf`/`getrs` and high-level `A \ b` remain separate follow-up work.
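
For reviewers who want the math behind the `trsv` ordering note, here is a minimal sketch of the reverse-mode rule for a triangular solve. It is not the generated Enzyme code; it is plain Python, and the function names (`solve_lower`, `solve_lower_transposed`, `trsv_reverse`, `loss`) are hypothetical. It assumes a lower-triangular `A` with a non-unit diagonal and no transpose. For `x = A^{-1} b` with cotangent `x_bar`, the RHS cotangent is `b_bar = A^{-T} x_bar`, and the matrix cotangent is `A_bar = -b_bar x^T` restricted to the stored triangle, which is why the cotangent solve has to run before `A_bar` is formed.

```python
def solve_lower(A, b):
    """Forward substitution: solve A x = b for lower-triangular A."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        s = b[i] - sum(A[i][j] * x[j] for j in range(i))
        x[i] = s / A[i][i]
    return x


def solve_lower_transposed(A, y):
    """Back substitution: solve A^T z = y for lower-triangular A."""
    n = len(y)
    z = [0.0] * n
    for i in reversed(range(n)):
        s = y[i] - sum(A[j][i] * z[j] for j in range(i + 1, n))
        z[i] = s / A[i][i]
    return z


def trsv_reverse(A, x, x_bar):
    """Reverse-mode sketch for x = A^{-1} b (assumed: lower, non-unit, no transpose).

    Returns (A_bar, b_bar) given the primal solution x and its cotangent x_bar.
    """
    n = len(x)
    # Step 1: solve for the right-hand-side cotangent first.
    b_bar = solve_lower_transposed(A, x_bar)
    # Step 2: form A_bar = -b_bar x^T, only on the triangle the solve reads.
    A_bar = [
        [-b_bar[i] * x[j] if j <= i else 0.0 for j in range(n)]
        for i in range(n)
    ]
    return A_bar, b_bar


def loss(A, b, w):
    """Scalar test function w^T x with x = A^{-1} b, so that x_bar = w."""
    x = solve_lower(A, b)
    return sum(wi * xi for wi, xi in zip(w, x))
```

A quick way to sanity-check a rule like this is to compare `A_bar` and `b_bar` against central finite differences of `loss` with `x_bar = w`.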