Add library for compile time instantiation of elliptic curves #3979
base: master
Awesome! I'll certainly have a look. Not before tomorrow, though.
No rush! This is certainly not going into 3.4, so there is plenty of time.
Most interesting thing about the sea of red in CI: GCC 11.4 on x86 fails with a constexpr timeout, while the same version of GCC on, say, AArch64 or S390 accepts the code. So constexpr limits are architecture dependent [*] :( and thus intrinsically flaky, since you can be near some limit without knowing it.
[*] And also version dependent, since GCC 13 on my machine is fine with the code without increasing the constexpr limit.
Clang seems to have a nasty bug where
It works fine for me in Clang 17 on my machines, so I'm assuming this was a Clang bug that was fixed subsequently. I'd vote for just bumping the Clang minimum version (that's an absolutely insane bug and nearly impossible to work around), but unfortunately the version in the Android NDK also has the bug, and we are stuck with that version at least until the next NDK release.
Downside of this approach is that (due to name mangling being somewhat horribly thought out for this case) object sizes are really big. Even with just 3 curves, on my machine the object file is one of the largest in the whole library, and that will get a lot worse with 27. Compile times may also prove untenable. I'm sure 90% of this can be salvaged, but I suspect in the end the string-based instantiation won't work in practice 😭 which is a shame since it's quite pretty imo.
Clang bug might be llvm/llvm-project#55638
This error
Looks like llvm/llvm-project#51182
I can repro the
Force-pushed from 3496464 to cd40575.
Interesting! 😄
I merely skimmed through the changes and left a few suggestions and comments here and there. No thorough review whatsoever.
*/

#ifndef BOTAN_PCURVES_UTIL_H_
#define BOTAN_PCURVES_UTIL_H_
General comment for this file (and perhaps other places):
I'd suggest using std::span<const W, N> instead of std::array<>& for the parameters of those utilities. No need to restrict those functions to arrays, IMO. And a static-length span should provide the same guarantees and optimization opportunities as an array, no?
Main problem is that, for some technical reason I'm not bothering to learn the details of, C++ can't deduce the conversion from a statically sized array to a statically sized span. https://stackoverflow.com/questions/70983595
This can be worked around with some additional noise, but the additional flexibility doesn't buy us anything in this context, so I'd rather keep things as simple as possible.
auto v = bigint_monty_redc(m_val, Self::P, Self::P_dash);
std::reverse(v.begin(), v.end());
auto bytes = store_be(v);

if constexpr(Self::BYTES == Self::N * WordInfo<W>::bytes) {
   return bytes;
} else {
   // Remove leading zero bytes
   const size_t extra = Self::N * WordInfo<W>::bytes - Self::BYTES;
   std::array<uint8_t, Self::BYTES> out;
   copy_mem(out.data(), &bytes[extra], Self::BYTES);
   return out;
}
Perhaps we could get away without the if constexpr and the potential memory copy entirely?
- auto v = bigint_monty_redc(m_val, Self::P, Self::P_dash);
- std::reverse(v.begin(), v.end());
- auto bytes = store_be(v);
- if constexpr(Self::BYTES == Self::N * WordInfo<W>::bytes) {
-    return bytes;
- } else {
-    // Remove leading zero bytes
-    const size_t extra = Self::N * WordInfo<W>::bytes - Self::BYTES;
-    std::array<uint8_t, Self::BYTES> out;
-    copy_mem(out.data(), &bytes[extra], Self::BYTES);
-    return out;
- }
+ auto v = bigint_monty_redc(m_val, Self::P, Self::P_dash);
+ std::reverse(v.begin(), v.end());
+ std::array<uint8_t, Self::BYTES> out = {0};
+ static_assert(Self::N * WordInfo<W>::bytes >= Self::BYTES); // is this guaranteed somehow anyway?
+ const size_t zero_padding_offset = Self::N * WordInfo<W>::bytes - Self::BYTES;
+ store_be(std::span{out}.template subspan<zero_padding_offset>(), v);
+ return out;
... this is untested. Just a sketch of the idea. store_be with a statically-sized out-param will static_assert that the byte buffer matches the input range exactly.
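For readers who want to see the "drop the leading zero bytes" step in isolation, here is a standalone sketch that does not use Botan's store_be or copy_mem. The name `words_to_be_bytes` and the fixed 64-bit word type are assumptions made for illustration. It writes the big-endian encoding of a little-endian word array straight into the shorter output, skipping the first `extra` bytes, and it assumes (without checking) that those skipped bytes are zero.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: big-endian encoding of little-endian 64-bit words,
// truncated to the low BYTES bytes. Assumes the dropped high bytes are zero.
template <std::size_t BYTES, std::size_t N>
constexpr std::array<std::uint8_t, BYTES> words_to_be_bytes(const std::array<std::uint64_t, N>& words_le) {
   constexpr std::size_t WORD_BYTES = sizeof(std::uint64_t);
   static_assert(N * WORD_BYTES >= BYTES, "output cannot be longer than the word array");
   constexpr std::size_t extra = N * WORD_BYTES - BYTES;

   std::array<std::uint8_t, BYTES> out = {};
   for(std::size_t i = extra; i != N * WORD_BYTES; ++i) {
      // Byte i of the full big-endian encoding: most significant word first.
      const std::size_t word = N - 1 - i / WORD_BYTES;
      const std::size_t shift = 8 * (WORD_BYTES - 1 - i % WORD_BYTES);
      out[i - extra] = static_cast<std::uint8_t>(words_le[word] >> shift);
   }
   return out;
}

// Two words (16 bytes) holding a value that fits in 10 bytes.
inline constexpr std::array<std::uint64_t, 2> example_value = {0x0102030405060708, 0x000000000000AABB};
inline constexpr auto example_bytes = words_to_be_bytes<10>(example_value);
static_assert(example_bytes[0] == 0xAA && example_bytes[1] == 0xBB);
static_assert(example_bytes[2] == 0x01 && example_bytes[9] == 0x08);
```

When BYTES equals N * 8, `extra` is zero and the loop degenerates to a plain big-endian store, so no separate branch is needed.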
constexpr std::array<uint8_t, Self::BYTES> serialize() const {
   std::array<uint8_t, Self::BYTES> r = {};
   BufferStuffer pack(r);
   pack.append(0x04);
   pack.append(m_x.serialize());
   pack.append(m_y.serialize());
   return r;
}
It would be nice if we could use the concat helpers for this. Currently, that doesn't work, because concat requires the output type to have .insert(). This could then also (statically) assert that the output array was filled entirely.
#3994 is work towards this. It makes concat() constexpr and also lets it automagically infer the output array size. So this should actually work as:
- constexpr std::array<uint8_t, Self::BYTES> serialize() const {
-    std::array<uint8_t, Self::BYTES> r = {};
-    BufferStuffer pack(r);
-    pack.append(0x04);
-    pack.append(m_x.serialize());
-    pack.append(m_y.serialize());
-    return r;
- }
+ constexpr std::array<uint8_t, Self::BYTES> serialize() const {
+    return concat(
+       store_le<uint8_t>(0x04),  // store_le is just to make it look like a byte-array
+       m_x.serialize(),
+       m_y.serialize());
+ }
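As a rough picture of what a constexpr concatenation with an inferred output size can look like, here is a standalone sketch. It is not Botan's concat() and not the code from #3994; `concat_arrays` is a hypothetical name, and it only handles byte arrays.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: concatenates byte arrays at compile time.
// The output length is the sum of the input lengths, so the result
// is always filled completely by construction.
template <std::size_t... Ns>
constexpr std::array<std::uint8_t, (Ns + ... + 0)> concat_arrays(const std::array<std::uint8_t, Ns>&... parts) {
   std::array<std::uint8_t, (Ns + ... + 0)> out = {};
   std::size_t pos = 0;
   auto append = [&](const auto& part) {
      for(const std::uint8_t b : part) {
         out[pos++] = b;
      }
   };
   (append(parts), ...);  // appends the parts in argument order
   return out;
}

// Shape of an uncompressed point encoding: header byte, then x, then y.
inline constexpr auto example_encoding = concat_arrays(std::array<std::uint8_t, 1>{0x04},
                                                       std::array<std::uint8_t, 2>{0xAA, 0xBB},
                                                       std::array<std::uint8_t, 2>{0xCC, 0xDD});
static_assert(example_encoding.size() == 5);
static_assert(example_encoding[0] == 0x04 && example_encoding[4] == 0xDD);
```

Because the return type is computed from the inputs, assigning the result to an array of a different size is a compile error, which gives the static "filled entirely" check mentioned earlier in this thread.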
Something seems to be missing still
build/include/internal/botan/internal/loadstor.h: In instantiation of ‘constexpr auto Botan::store_le(ParamTs&& ...) [with ModifierT = unsigned char; ParamTs = {int}]’:
build/include/internal/botan/internal/pcurves_impl.h:351:30: required from here
build/include/internal/botan/internal/loadstor.h:699:67: error: no matching function for call to ‘store_any<Botan::detail::Endianness::Little, unsigned char>(int)’
699 | return detail::store_any<detail::Endianness::Little, ModifierT>(std::forward<ParamTs>(params)...);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
void conditional_assign(bool cond, const Self& pt) {
   m_x.conditional_assign(cond, pt.x());
   m_y.conditional_assign(cond, pt.y());
   m_z.conditional_assign(cond, pt.z());
}
I'm assuming this is needed for a constant-time implementation? So far, I saw it as an unspoken rule that all methods that are "supposed to be constant-time" are prefixed with ct_.... I found this a helpful convention that might be applicable here as well.
In this code I'm going for the opposite convention, namely that everything is constant time by default and anything that is intentionally variable time has a _vartime suffix.
(There are some current exceptions, e.g. in the point multiplication, that will be fixed before merging.)
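For context, a branch-free conditional assignment is commonly written with a mask, as in the sketch below. This is an illustration only and not the PR's implementation: `conditional_assign_words` is a hypothetical name, and real constant-time code usually needs extra care (for example value barriers) to stop the compiler from turning the mask back into a branch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: sets dst = src if cond is true, leaves dst unchanged
// otherwise, performing the same operations in both cases.
template <std::size_t N>
constexpr void conditional_assign_words(std::array<std::uint64_t, N>& dst,
                                        bool cond,
                                        const std::array<std::uint64_t, N>& src) {
   // All-ones if cond is true, all-zeros if cond is false.
   const std::uint64_t mask = std::uint64_t{0} - static_cast<std::uint64_t>(cond);
   for(std::size_t i = 0; i != N; ++i) {
      // XOR in the difference only where the mask is set.
      dst[i] ^= mask & (dst[i] ^ src[i]);
   }
}
```

A point-level conditional_assign like the one quoted above would then apply such a helper to each coordinate in turn.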
GCC 11 miscompilation, excellent
Force-pushed from 2feb181 to 3bde391.
This cuts the size of the object file by more than half, since we avoid embedding literal strings of the params into the final typenames.
Also some multiplication related improvements.
This drastically speeds up the projective->affine conversion since we can use a batch operation.
Improves point doubling performance by over 20% with Clang
It slices, it dices.
_vartime or comment to similar effect.
Still todo