As mentioned in https://github.com/mratsim/constantine/blob/661a481/README-PERFORMANCE.md#compiler-caveats, compilers have a hard time optimizing bigint operations, even ones as simple as an addition with carries.
This issue tracks their evolution and the quality of the code generated with compiler builtins for ISAs of interest.
Note that as of February 2024, we use:
_addcarry_u64 on x86-64
2019, GCC 9.2 and Clang 9.0
The original problem:
https://gcc.godbolt.org/z/2h768y
Even with intrinsics, an operation as simple as addition-with-carry gets ugly codegen from GCC.
The GMP folks already mentioned this 30 years ago: https://gmplib.org/manual/Assembly-Carry-Propagation.html
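As a rough illustration of the kind of carry chain discussed here, the sketch below adds two 256-bit numbers stored as four little-endian 64-bit limbs using `_addcarry_u64`. The names `limb_t`, `LIMBS` and `add256` are made up for this example and are not taken from Constantine or from the Godbolt links; the non-x86 branch is a plain-C fallback added only so the snippet compiles elsewhere.

```c
#include <assert.h>

#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>   /* _addcarry_u64 */
#endif

/* Hypothetical limb type: _addcarry_u64 works on unsigned long long. */
typedef unsigned long long limb_t;
static_assert(sizeof(limb_t) == 8, "this sketch assumes 64-bit limbs");

enum { LIMBS = 4 };      /* 4 x 64 bits = 256 bits */

/* r = a + b over LIMBS little-endian limbs; returns the final carry (0 or 1). */
static inline unsigned char add256(limb_t r[LIMBS],
                                   const limb_t a[LIMBS],
                                   const limb_t b[LIMBS]) {
    unsigned char carry = 0;
    for (int i = 0; i < LIMBS; i++) {
#if defined(__x86_64__) || defined(_M_X64)
        /* Carry-in and carry-out are threaded through the intrinsic. */
        carry = _addcarry_u64(carry, a[i], b[i], &r[i]);
#else
        /* Plain-C fallback (assumption, only for non-x86 builds). */
        limb_t t = a[i] + carry;
        unsigned char c1 = (unsigned char)(t < carry);  /* a[i] + carry wrapped */
        r[i] = t + b[i];
        carry = (unsigned char)(c1 | (r[i] < t));       /* t + b[i] wrapped */
#endif
    }
    return carry;
}
```

Ideally the x86-64 path compiles down to one `add` followed by three `adc` instructions; whether it actually does depends on the compiler and its version, which is what this issue tracks.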
2024, GCC 13.2 and Clang 17.0
https://gcc.godbolt.org/z/jdecvffaP
GCC fixed the x86 intrinsics, but unfortunately the portable intrinsics still have terribad codegen and hence make a terrible fallback for ARM.
Current status
GCC's codegen for __builtin_addcll is abysmal, which makes the portable builtin a non-starter there.
Clang has decent codegen.
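For reference, a portable carry chain built on `__builtin_addcll` might look like the sketch below. The names `limb_t`, `PORTABLE_LIMBS`, `addc64` and `add256_portable` are hypothetical and not taken from Constantine. On compilers that do not provide `__builtin_addcll`, the sketch assumes `__builtin_add_overflow` is available and falls back to it.

```c
#include <assert.h>

#ifndef __has_builtin
#define __has_builtin(x) 0   /* older compilers: assume the builtin is absent */
#endif

/* Hypothetical limb type: __builtin_addcll works on unsigned long long. */
typedef unsigned long long limb_t;
static_assert(sizeof(limb_t) == 8, "this sketch assumes 64-bit limbs");

enum { PORTABLE_LIMBS = 4 };

/* Returns x + y + cin and stores the carry-out (0 or 1) in *cout. */
static inline limb_t addc64(limb_t x, limb_t y, limb_t cin, limb_t *cout) {
#if __has_builtin(__builtin_addcll)
    return __builtin_addcll(x, y, cin, cout);
#else
    /* Fallback: two overflow-checked adds; at most one of them can carry. */
    limb_t t, sum;
    limb_t c1 = __builtin_add_overflow(x, y, &t);
    limb_t c2 = __builtin_add_overflow(t, cin, &sum);
    *cout = c1 | c2;
    return sum;
#endif
}

/* r = a + b over PORTABLE_LIMBS little-endian limbs; returns the final carry. */
static inline limb_t add256_portable(limb_t r[PORTABLE_LIMBS],
                                     const limb_t a[PORTABLE_LIMBS],
                                     const limb_t b[PORTABLE_LIMBS]) {
    limb_t carry = 0;
    for (int i = 0; i < PORTABLE_LIMBS; i++) {
        r[i] = addc64(a[i], b[i], carry, &carry);
    }
    return carry;
}
```

Comparing the assembly for this function on an ARM64 target under GCC and Clang is one way to reproduce the gap described above.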
Assembly is still very much needed.
This also explains the bad ARM performance on Apple M1, M2, M3, mentioned by @agnxsh (#354 (comment)) and @bkomuves