Refactor: Final iteration on load/store #3869
Conversation
Rebased after #3888.
This refactors the load/store helpers to avoid duplicating all overloads for big-endian and little-endian functionality. We hope to reduce the complexity and maintenance burden of this code. As a side effect, this also centralizes some of the preprocessor defines into low-level helper methods; the actual logic now uses `if constexpr` based on those helpers. Also, the support for legacy overloads (using pointers or C-arrays) produces much less clutter now. Additionally, the load/store helpers can now handle strong types that wrap unsigned integer values. Co-Authored-By: Fabian Albert <fabian.albert@rohde-schwarz.com>
This allows loading and storing of entire collections of unsigned integers from/into byte ranges. That can be used to extract a digest from the hash function's internal state array, for instance. Co-Authored-By: Fabian Albert <fabian.albert@rohde-schwarz.com>
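As a rough sketch of what such a range-based store enables (the function name `store_words_le` and its signature are illustrative only, not the actual API added in this PR):

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative only: serialize a range of 64-bit words into a byte
// buffer, least-significant byte first. The real helpers in this PR
// work on ranges; plain pointers are used here to keep the sketch
// self-contained.
inline void store_words_le(uint8_t* out, const uint64_t* words, size_t count) {
   // The caller must provide exactly 8 output bytes per input word.
   for(size_t i = 0; i < count; ++i) {
      for(size_t b = 0; b < 8; ++b) {
         out[8 * i + b] = static_cast<uint8_t>(words[i] >> (8 * b));
      }
   }
}
```

A hash implementation could use something like this to copy its whole `uint64_t` state into the output digest in one call instead of word by word.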
Rebased and resolved conflicts after #3908 got merged.
I’m a little skeptical of this change simply because it’s +900/-300 just to keep feature parity, and the new code is rather more complicated than what came before, which does not seem like a simplification so much to me 🙂 but I can see the benefits re strong types and consistent `constexpr`.

As a smoke test, `botan speed` seems fine to me, but 7% for AES-NI seems quite surprising! For one, because AES-NI uses `SIMD_4x32::load_le`, which doesn’t use this load infrastructure at all; it is basically just a `_mm_loadu_si128`. (And likewise for stores.)

I’ll look into benchmarks on my machine ASAP. Approving insofar as, assuming we can clearly demonstrate no significant regressions, the change seems OK.
I'd like to argue that this adds almost 400 lines of test code. 🙂 However, admittedly, the code is denser than before, that's for sure.
I checked the timings of this PR vs. master on some algorithms that seemed likely to be affected by load/store performance (especially hash functions); nothing notable, so OK to merge.
Description
This should be the final iteration on the big/little endian helpers for now (modulo the eventual removal of the legacy ptr-based overloads).
The patch fulfills three main objectives:

1. Avoiding duplicated overloads for big- and little-endian functionality: this is achieved by introducing `detail::load_any<>` and `detail::store_any<>` with a template parameter `Endianness`. By passing this all the way down to the lowest-level overload, we can make the distinction once and implement the convenience overloads without duplicating them each time (plus further duplication for the legacy overloads).
2. More convenient handling of strong types and enums: this allows for things like `load_be<MyIntStrongType>(some_bytes)` or `store_be(some_enum_value)` when loading/storing them.
3. Loading/storing entire ranges of integers: before, one could load/store multiple variables at once (passed as variadic parameters). Now it is possible to pass ranges of integers (similarly to `typecast_copy()`). That is useful for initializing and extracting internal hash states (which are collections of `uint64_t`), for instance.

The whole range of features described above may also be used at compile time, for instance to initialize algorithm parameters that need a certain endianness.
Quick and Dirty Benchmark
Running `./botan speed` both on the feature branch and on the `master` commit it is based on, both compiled with clang-15 on Ubuntu 22.04 (running in WSL). Is that a reasonable smoke test?

Many algorithms actually seem to benefit from this patch (without modifying the load/store overloads they use). For instance, AES encryption/decryption (using the NI extensions) clocks in at a 7% speedup. Is that even realistic? 😲 Keccak seems to suffer slightly; for instance, SHAKE throughput is about 2% lower.
Here's the full data set: bench_loadstore_refactoring.txt