[Flang] Canonicalize divdc3 calls into arithmetic-based complex division #146017

Open
wants to merge 17 commits into
base: main

Hanwen-ece

This patch introduces a new MLIR-based frontend pass in Flang to optimize complex division by rewriting calls to __divdc3 into arithmetic-based FIR operations. The pass uses the complex division formula (a + bi)/(c + di) = ((ac + bd)/(c² + d²)) + ((bc - ad)/(c² + d²))i, eliminating external library calls and enabling further optimization in the LLVM pipeline.

Motivation
The __divdc3 helper function is typically used to perform double-precision complex division. However, such calls are opaque to the optimizer and can inhibit inlining and other optimization opportunities. By rewriting these calls into plain floating-point operations (arith dialect), we enable better mid-level IR analysis, improve code generation, and reduce runtime dependencies.

Implementation
A new OpRewritePattern<fir::CallOp> is added to match calls to __divdc3 with four floating-point operands representing the real and imaginary parts of the numerator and denominator. The transformation applies the standard complex division formula: (a + bi)/(c + di) = ((ac + bd)/(c² + d²)) + ((bc - ad)/(c² + d²))i

This is implemented using the arith.{MulFOp, AddFOp, SubFOp, DivFOp} ops, followed by construction of the complex result via fir.insert_value.
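As a rough illustration of the arithmetic the rewrite emits, the formula can be sketched on plain scalars. This is a minimal sketch, not the pass itself: the function name `div_basic` is hypothetical, and the comments only indicate which arith operation each step would correspond to.

```python
def div_basic(a, b, c, d):
    """Divide (a + bi) by (c + di) with the textbook formula.

    Hypothetical helper for illustration only; it mirrors the
    scalar operations, not the actual FIR/MLIR rewrite.
    """
    denom = c * c + d * d            # c^2 + d^2      (mulf, mulf, addf)
    real = (a * c + b * d) / denom   # (ac + bd)/den  (mulf, mulf, addf, divf)
    imag = (b * c - a * d) / denom   # (bc - ad)/den  (mulf, mulf, subf, divf)
    return real, imag


# Compare against Python's built-in complex division for an ordinary input.
re, im = div_basic(1.0, 2.0, 3.0, 4.0)
ref = complex(1.0, 2.0) / complex(3.0, 4.0)
print(re, im, ref)
```

Note that this straightforward form does not rescale its operands, so `c * c + d * d` can overflow or underflow for denominators of extreme magnitude. Runtime helpers such as __divdc3 are written to handle those cases, which is part of the trade-off in replacing the call.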

Key details

  • Only active when -mllvm --flang-complex-div-converter is enabled (disabled by default)
  • Integrated with Flang’s FIR-to-LLVM pipeline, ensuring compatibility with downstream optimizations.
  • Handles real and imaginary parts explicitly
  • Ensures type consistency and uses fir.undef + fir.insert_value to build the result

Tests

  • LIT test (flang/test/Fir/target-rewrite-complex-division.fir) added to confirm that calls to __divdc3 are replaced by explicit arithmetic in the IR
  • Verified the correctness of the transformation by matching the expected IR output
  • Ensured no regressions in existing complex arithmetic lowering
  • SPEC2017 benchmark results will be shared in comments

Future Work

  • Extend to support __divsc3, __divxc3, and other complex intrinsic calls
  • Investigate rewrites for __muldc3, __adddc3, etc.
  • Enable backend-specific optimizations on the rewritten complex patterns

Authors
The XSCC compiler team developed this implementation.


Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@Hanwen-ece
Author

SPEC2017 benchmark results have been shared in comments below:

X86 Intel (i9-11900K; L1 cache 384 KB, L2 cache 4 MB, L3 cache 16 MB, cache line 64 B)

Benchmark  | Runtime with __divdc3 function calls | Runtime with arithmetic-based complex division | Speedup
627.cam4_s | 1688 s                               | 1529 s                                         | 1.1039
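For readers checking the table, the speedup column is taken here to be the ratio of the two runtimes (an assumption, since the comment does not define it). A minimal sketch of that arithmetic, with hypothetical variable names:

```python
# Runtimes in seconds, copied from the 627.cam4_s row above.
runtime_divdc3 = 1688.0   # with __divdc3 function calls
runtime_arith = 1529.0    # with arithmetic-based complex division

# Speedup as old runtime over new runtime (assumed definition).
speedup = runtime_divdc3 / runtime_arith

# The same result expressed as a percentage reduction in runtime.
reduction_pct = (1.0 - runtime_arith / runtime_divdc3) * 100.0

print(f"speedup = {speedup:.4f}, runtime reduction = {reduction_pct:.1f}%")
```

Under that definition the ratio comes out at roughly 1.104, i.e. a runtime reduction of a little over 9%.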

Hanwen-ece changed the title from "Canonicalize divdc3 calls into arithmetic-based complex division" to "[Flang] Canonicalize divdc3 calls into arithmetic-based complex division" on Jun 27, 2025
@s-watanabe314
Contributor

Thank you for working on this!

Actually, we've been discussing a similar feature on Discourse. You can refer to it here if you're interested: (https://discourse.llvm.org/t/optimization-of-complex-number-division/83468)

I'm currently working on a feature based on the discussion at the end of that thread, and I plan to post the patch next week. In that patch, specifying the newly added driver option -fcomplex-arithmetic=basic will allow complex number division to be expanded into an algebraic formula instead of using a runtime function.

Hanwen-ece closed this on Jun 27, 2025
Hanwen-ece reopened this on Jun 27, 2025
@Hanwen-ece
Author

Thank you for the heads-up. It’s great to see that others have independently been thinking along similar lines; clearly, this is an area worth optimizing. I found the Discourse discussion quite illuminating, and it informed some aspects of the design in my implementation.

I’ve recently finished implementing and submitted a patch that addresses this very optimization. It supports expanding complex division algebraically without relying on the runtime function, and it could potentially serve the same goal as the one you’re working on.

Of course, I’d be happy to hear your thoughts or suggestions—collaboration always leads to better outcomes. But perhaps this patch could save you some effort and serve as a starting point or even a complete solution.

Thanks again for the discussion link, I’d be very glad to receive any suggestions or feedback. I really appreciate the collaborative spirit in this community! Have a wonderful day :)

@Hanwen-ece
Author

Hi @jeanPerier, this PR implements a new MLIR-based frontend pass in Flang to optimize complex division. As an external contributor, I am unable to assign reviewers directly. Would you be willing to review this change when you have the chance?
Many thanks in advance :)

Contributor

@clementval clementval left a comment

Why not use the complex dialect directly? There has been discussion on Discourse about that.

@jeanPerier
Contributor

jeanPerier commented Jul 1, 2025

Hi @Hanwen-ece, thanks for the patch! I would prefer using and improving MLIR's complex.div, as discussed in https://discourse.llvm.org/t/optimization-of-complex-number-division/83468/24, instead of lowering the division directly at the FIR level, because the latter duplicates the complex division implementation and its maintenance, IMHO.

5 participants