Tracking compiler inefficiencies #357

mratsim · 2024-02-11T20:57:39Z

As mentioned in https://github.com/mratsim/constantine/blob/661a481/README-PERFORMANCE.md#compiler-caveats
Compilers have a hard time optimizing bigint operations, even as simple as an addition with carries.
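For context, here is a minimal sketch of the kind of code in question: a multi-limb addition with carry propagation written in portable C. The name limbs_add and the little-endian 64-bit limb layout are illustrative assumptions, not Constantine's actual API. Ideally a compiler would lower the loop body to a single add-with-carry instruction per limb (adc on x86-64, adcs on ARM64); in practice the carry is often materialized with compares and extra moves.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not Constantine's API):
 * r = a + b over n little-endian 64-bit limbs.
 * Returns the final carry out (0 or 1). */
static uint64_t limbs_add(uint64_t *r, const uint64_t *a,
                          const uint64_t *b, size_t n) {
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t s  = a[i] + carry;  /* wraps only if a[i] is all ones and carry is 1 */
        uint64_t c1 = s < carry;     /* carry out of the first addition */
        uint64_t t  = s + b[i];
        uint64_t c2 = t < s;         /* carry out of the second addition */
        r[i]  = t;
        carry = c1 | c2;             /* c1 and c2 can never both be 1 */
    }
    return carry;
}
```

Compiling this with optimizations and inspecting the assembly is a quick way to see whether a given compiler version recognizes the carry chain.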

This issue tracks their evolution and the quality of the code generated with compiler builtins for ISAs of interest.

Note that as of February 2024, we use:

2019, GCC 9.2 and Clang 9.0

Even with intrinsics, an operation as simple as addition-with-carry compiles to poor code in GCC.
The GMP folks mentioned this 30 years ago: https://gmplib.org/manual/Assembly-Carry-Propagation.html

GMP fixed the x86 intrinsics, but unfortunately the portable intrinsics have terrible codegen and hence make a terrible fallback for ARM.

GCC's codegen for __builtin_addcll is abysmal, so it is a non-starter.
Clang has decent codegen.
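To make the comparison concrete, here is a hedged sketch of the same carry chain expressed through the __builtin_addcll builtin. Availability differs by compiler and version, so the sketch probes for it with __has_builtin and falls back to plain C otherwise. The names addc64, limbs_add_builtin and HAVE_BUILTIN_ADDCLL are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Probe for the builtin; nested #if keeps compilers without
 * __has_builtin from choking on the expression. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_addcll)
#    define HAVE_BUILTIN_ADDCLL 1
#  endif
#endif

/* Hypothetical wrapper: returns a + b + carry_in, stores the carry out. */
static inline uint64_t addc64(uint64_t a, uint64_t b,
                              uint64_t carry_in, uint64_t *carry_out) {
#ifdef HAVE_BUILTIN_ADDCLL
    unsigned long long c;
    unsigned long long s = __builtin_addcll(a, b, carry_in, &c);
    *carry_out = (uint64_t)c;
    return (uint64_t)s;
#else
    /* Portable fallback: detect wraparound after each addition. */
    uint64_t s = a + b;
    uint64_t c1 = s < a;
    uint64_t t = s + carry_in;
    *carry_out = c1 | (uint64_t)(t < s);
    return t;
#endif
}

/* r = a + b over n little-endian 64-bit limbs; returns the final carry. */
static uint64_t limbs_add_builtin(uint64_t *r, const uint64_t *a,
                                  const uint64_t *b, size_t n) {
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        r[i] = addc64(a[i], b[i], carry, &carry);
    }
    return carry;
}
```

Whether this produces a clean adc/adcs chain depends on the compiler and target, which is exactly what this issue tracks; when it does not, hand-written assembly remains the fallback.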

Assembly is still very much needed.

This also explains the bad ARM performance on Apple M1, M2, M3, mentioned by @agnxsh (#354 (comment)) and @bkomuves

mratsim added upstream 🐉 performance 🏁 labels Feb 11, 2024

mratsim mentioned this issue May 1, 2024: EIP-2537 - BLS12-381 precompiles for the EVM #368 (merged)

mratsim mentioned this issue May 9, 2024: C API for BLS signature aggregate_verify & batch_verify (incl parallel) #381 (merged)