An LLVM Fused Multiply-Add (FMA) Optimization Pass
An LLVM pass that detects sequences of separate FMUL and FADD instructions and replaces each with a single fused multiply-add (FMA) instruction, if the target hardware supports it.
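As a rough illustration of the pattern involved, a test input such as the `test.c` used in the commands of this README might look like the sketch below. The function name `mul_then_add` is hypothetical and not taken from the project. The multiply and the add are written as separate statements on the assumption that this keeps clang from contracting them itself, so the IR should typically contain a distinct `fmul` followed by an `fadd`.

```c
/* Hypothetical test input (not from the project's test suite).
   The product is stored in a local first, so the multiply and the add
   are separate statements rather than one expression. */
double mul_then_add(double a, double b, double c) {
    double product = a * b;  /* expected to lower to an fmul */
    return product + c;      /* expected to lower to an fadd */
}
```

Writing `return a * b + c;` as a single expression may behave differently: depending on the clang version and the `-ffp-contract` setting, the front end can emit an `llvm.fmuladd` intrinsic call directly, leaving no separate `fmul`/`fadd` pair to match.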
To compile the FMA optimization pass, run:
clang++ -std=c++17 -fPIC \
-shared FusedMultiplyAdd.cpp -o libFMA.dylib \
$(llvm-config --cxxflags --ldflags) -lLLVM

To compile the C sources (tests) to LLVM IR:
clang -O0 -Xclang -disable-O0-optnone -S -emit-llvm \
test.c -o test.ll

Optionally, run the mem2reg pass with opt to promote stack allocations to registers:
opt -passes=mem2reg test.ll -S -o test_simplified.ll

To run the FMA optimization pass on a test file:
opt -load-pass-plugin=./libFMA.dylib \
-passes=fma \
your_ir.ll -S -o your_ir_optimized.ll