LibCrypto/Curves/SECPxxxr1: Review/issues/areas for improvement #23918

Yawning · 2024-04-10T12:54:44Z

Preamble:

I do not use this project, but it came up in a conversation and I looked through the LibCrypto library out of intellectual curiosity.
My understanding of anything outside the scope at discussion is non-existent.
I will use SEC 1 ver 1.9 (https://www.secg.org/sec1-v2.pdf) and FIPS 186-5 as the definitions of "correct", with a preference towards the NIST spec if there is a difference of opinion.
I understand this is a hobbyist project, but doing this correctly is hard, so please don't take this the wrong way, and I understand some of this is highly subjective.

Notes:

Internal representation uses Jacobian coordinates ( x=X/Z^2 y=Y/Z^3) with the incomplete addition/doubling formulas.
ECDH, ECDSA verify (no sign).
Field elements are internally represented in Montgomery form.

Bugs:

ECDSA verify accepts non-canonical representations of (r, s). Step 1 of FIPS 186-5 6.4.2, SEC 1 4.1.4 is missing entirely. ECDSA by definition has malleable signatures, however this implementation will accept signatures that correct implementations will reject.
The check for "is the point on the curve" is done after the scalar-point multiply for the ECDSA. The impact of this somewhat depends on what VERIFY means (I didn't check, sorry). It is somewhat moot as r, s get reduced modulo the group order when converted to Montgomery-form, but this is spooky. ECDH checks that the point is on the curve early (compute_cooordinate_internal) so is not affected.
As noted by the compute_coordinate_internal FIXME comment, reducing the scalar modulo group order when doing a scalar-point multiply results in bias, particularly for P-256 (the bias is insignificant for P-384/P-521). This currently has fairly minimal impact, but is a case that MUST be fixed when ECDSA sign is implemented (https://github.com/C2SP/CCTV/tree/main/RFC6979 has a test case), and SHOULD be fixed for ECDH.

Performance issues:

Rescaling (convert_jacobian_to_affine) does 2 inversions. Computing 1/Z and then multiplying repeatedly as necessary is significantly cheaper.
A 4-bit window is easy to implement and would significantly increase performance of the scalar-point multiply.

General recommendations:

read_uncompressed_point should be where the is_point_on_curve check happens. (generate_public_key_internal should bypass the check, and just cover it with test cases as s * G is guaranteed to be valid). "Best practice" these days leans heavily toward "there is no way to create a point that is not on the curve", so early-rejection is better.
Instead of using the Jacobian coordinates and incomplete formulas, I strongly recommend using projective coordiantes (x = X/Z, y = Y/Z), and the complete (exception free) formulas, especially for a hobbyist project as the performance difference is minimal, and it is more "obviously correct". Algorithm 4 and 6 from https://eprint.iacr.org/2015/1060.pdf will save you a lot of grief.
Unless there are substantial reasons otherwise, strongly consider using fiat-crypto (https://github.com/mit-plv/fiat-crypto) and addchain (https://github.com/mmcloughlin/addchain) for the low level field-arithmetic, as the artifacts are available under a series of very liberal licenses, and are provably correct.

ps: On an unrelated note, Cipher/AES.h and Cipher/AES.cpp need to be shot into the sun and burned. Please use AES-NI (or a bitsliced implementation cribbed from BearSSL), since the current code leaks the symmetric key via cache-timing sidechannels.

The text was updated successfully, but these errors were encountered:

alimpfard · 2024-04-10T13:58:04Z

Cipher/AES.h and Cipher/AES.cpp need to be shot into the sun and burned

As the one who wrote them, yes, I'll even sign the space gun sending them to the sun.

Yawning · 2024-04-10T14:34:05Z

Cipher/AES.h and Cipher/AES.cpp need to be shot into the sun and burned

As the one who wrote them, yes, I'll even sign the space gun sending them to the sun.

😺

If someone wants to implement this the "right way" for fun from first principles, https://eprint.iacr.org/2009/129 and https://eprint.iacr.org/2009/191 are good starting points.

MarekKnapek · 2024-04-12T10:24:04Z

For improved efficiency I would also suggest to change interface of Crypto::Cipher::Cipher::encrypt_block function and corresponding decrypt function. This is because hardware provided accelerated instructions are capable of working on more than one block at a time. I would change the interface of these functions to take span of blocks rather than single block. There is AVX512VL and AVX512F extensions to the i386 / amd64 architectures. This could be useful for the ECB and CTR modes of operations as they are capable of operating on crypt operation (both enc and dec) with multiple blocks independently (in parallel) (no need to crypt previous block to be able to crypt following block). I would also add 128 bit / 16 byte alignment to the AES block data type. Edit: Also the key and round key should be aligned to 128 bits / 16 bytes.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LibCrypto/Curves/SECPxxxr1: Review/issues/areas for improvement #23918

LibCrypto/Curves/SECPxxxr1: Review/issues/areas for improvement #23918

Yawning commented Apr 10, 2024 •

edited

alimpfard commented Apr 10, 2024

Yawning commented Apr 10, 2024

MarekKnapek commented Apr 12, 2024 •

edited

LibCrypto/Curves/SECPxxxr1: Review/issues/areas for improvement #23918

LibCrypto/Curves/SECPxxxr1: Review/issues/areas for improvement #23918

Comments

Yawning commented Apr 10, 2024 • edited

alimpfard commented Apr 10, 2024

Yawning commented Apr 10, 2024

MarekKnapek commented Apr 12, 2024 • edited

Yawning commented Apr 10, 2024 •

edited

MarekKnapek commented Apr 12, 2024 •

edited