Proposal: Add fma_{mul,div} for FMA-based complex operations #146

@zhongyi51

Proposal

I propose adding fma_mul and fma_div methods to the Complex type. These methods would leverage fused multiply-add (FMA) operations for the calculation.
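To make the proposal concrete, here is a minimal sketch of what the two methods might look like. It uses a hypothetical standalone `Complex` struct fixed to `f64` rather than the crate's generic type, and the choice of which product in each component is fused is an assumption for illustration, not a settled design. Only `f64::mul_add` from the standard library is relied on.

```rust
/// Hypothetical stand-in for the crate's Complex type, fixed to f64 for brevity.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Complex {
    re: f64,
    im: f64,
}

impl Complex {
    /// (a + bi)(c + di) = (ac - bd) + (ad + bc)i, using one FMA per component.
    fn fma_mul(self, other: Complex) -> Complex {
        // re = a*c - b*d: the product b*d is rounded, then fused with a*c.
        let re = self.re.mul_add(other.re, -(self.im * other.im));
        // im = a*d + b*c: the product b*c is rounded, then fused with a*d.
        let im = self.re.mul_add(other.im, self.im * other.re);
        Complex { re, im }
    }

    /// (a + bi)/(c + di) = ((ac + bd) + (bc - ad)i) / (c^2 + d^2).
    /// Textbook formula only; no scaling against overflow or underflow.
    fn fma_div(self, other: Complex) -> Complex {
        let norm_sqr = other.re.mul_add(other.re, other.im * other.im);
        let re = self.re.mul_add(other.re, self.im * other.im);
        let im = self.im.mul_add(other.re, -(self.re * other.im));
        Complex { re: re / norm_sqr, im: im / norm_sqr }
    }
}
```

For example, multiplying 1 + 2i by 3 + 4i with this sketch gives -5 + 10i, and dividing that result by 3 + 4i gives back 1 + 2i. All intermediate values in that case are exactly representable, so it does not exercise the rounding differences discussed under Motivation.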

Motivation

Using FMA can offer significant performance benefits on hardware with native support, but it comes with important trade-offs:

  • Performance Variance: On targets where FMA instructions are available (e.g., AArch64), these methods can be faster. On targets without native support, or where the FMA target feature is not enabled at compile time, the compiler may fall back to a slow software library call (e.g., fmaf for f32).

  • Numerical Differences: FMA computes a * b + c with a single rounding, rather than rounding the product and the sum separately. As a result, an FMA-based method is not guaranteed to produce results that are bit-for-bit identical to the standard methods.
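The numerical point can be seen with plain `f64` values. The sketch below compares a separately rounded multiply and add against `f64::mul_add`; the helper names `unfused` and `fused` are made up for this illustration and are not part of any proposed API.

```rust
/// a * b + c with two roundings: the product is rounded, then the sum.
fn unfused(a: f64, b: f64, c: f64) -> f64 {
    a * b + c
}

/// a * b + c with a single rounding, via the standard library's fused multiply-add.
fn fused(a: f64, b: f64, c: f64) -> f64 {
    a.mul_add(b, c)
}

/// Inputs chosen so that the exact product x * x has a low-order term
/// (EPSILON^2) that is lost when the product is rounded on its own.
fn demo_inputs() -> (f64, f64) {
    let x = 1.0 + f64::EPSILON;
    let c = -(1.0 + 2.0 * f64::EPSILON);
    (x, c)
}
```

With `let (x, c) = demo_inputs();`, `unfused(x, x, c)` returns `0.0`, because the rounded product equals `1.0 + 2.0 * f64::EPSILON` exactly, while `fused(x, x, c)` returns `f64::EPSILON * f64::EPSILON`, the term that the intermediate rounding discards. The same effect is why fma_mul and fma_div could differ from the existing operators in the last bits.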

Implementation

This Compiler Explorer link shows how the generated code differs across architectures and compiler settings: https://godbolt.org/z/joW4eqvT9

If this approach is ok, I would be happy to implement it.
