serialization: c++20 endian/byteswap/clz modernization #29263

Merged: 4 commits into bitcoin:master on Mar 1, 2024

theuni
Member

@theuni theuni commented Jan 17, 2024

This replaces #28674, #29036, and #29057. Now ready for testing and review.

Replaces platform-specific endian and byteswap functions. This is especially useful for kernel, as it means that our deep serialization code no longer requires bitcoin-config.h.

I apologize for the size of the last commit, but it's hard to avoid making those changes at once.

All platforms now use our internal functions rather than libc or platform-specific ones, with the exception of MSVC.

Sadly, benchmarking showed that not all compilers are capable of detecting and optimizing byteswap functions, so compiler builtins are instead used where possible. However, they're now detected via macros rather than autoconf checks.

This matches how libc++ implements std::byteswap for c++23.
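As a rough illustration of the macro-based detection described above, here is a minimal sketch. It is not the code from this PR: the `SKETCH_`/`sketch_` names are hypothetical, only the GCC/Clang `__builtin_bswap*` builtins are handled, and the MSVC path is left out.

```cpp
#include <cstdint>

// Detect the compiler builtins with a preprocessor check instead of a
// configure-time test. SKETCH_BSWAP32/64 are hypothetical names.
#if defined(__has_builtin)
#  if __has_builtin(__builtin_bswap32)
#    define SKETCH_BSWAP32(x) __builtin_bswap32(x)
#  endif
#  if __has_builtin(__builtin_bswap64)
#    define SKETCH_BSWAP64(x) __builtin_bswap64(x)
#  endif
#endif

constexpr std::uint32_t sketch_bswap32(std::uint32_t x)
{
#ifdef SKETCH_BSWAP32
    return SKETCH_BSWAP32(x);
#else
    // Portable fallback: move each byte to its mirrored position.
    return ((x & 0xff000000U) >> 24) | ((x & 0x00ff0000U) >> 8) |
           ((x & 0x0000ff00U) << 8)  | ((x & 0x000000ffU) << 24);
#endif
}

constexpr std::uint64_t sketch_bswap64(std::uint64_t x)
{
#ifdef SKETCH_BSWAP64
    return SKETCH_BSWAP64(x);
#else
    // Portable fallback: byte-reverse each 32-bit half and swap the halves.
    return (static_cast<std::uint64_t>(sketch_bswap32(static_cast<std::uint32_t>(x))) << 32) |
           sketch_bswap32(static_cast<std::uint32_t>(x >> 32));
#endif
}
```

Compilers that do not provide `__has_builtin` at all simply take the fallback branch, so no build-system check is needed either way.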

I suggest we move/rename compat/endian.h, but I left that out of this PR to avoid bikeshedding.

#29057 pointed out some irregularities in benchmarks. After messing with various compilers and configs for a few weeks with these changes, I'm of the opinion that we can't win on every platform every time, so we should take the code that makes sense going forward. That said, if any real-world slowdowns are caused here, we should obviously investigate.

@DrahtBot
Contributor

DrahtBot commented Jan 17, 2024

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK maflcko
Concept ACK fanquake

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #29494 (build: Assume HAVE_CONFIG_H, Add IWYU pragma keep to bitcoin-config.h includes by maflcko)
  • #29450 (build: replace custom MAC_OSX macro with standard __APPLE__ for consistent macOS detection by paplorinc)
  • #21590 (Safegcd-based modular inverses in MuHash3072 by sipa)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@theuni
Member Author

theuni commented Jan 19, 2024

Here's a godbolt test that shows how various compilers perform: https://gcc.godbolt.org/z/nTadqEP83

To test with/without builtins, comment/un-comment the DISABLE_BUILTIN_BSWAPS define at line 14.

From my tests:

  • clang: all c++20 supporting versions (>= 10.0) optimize without the need for builtins
  • gcc: no version optimizes without the builtins
  • msvc: versions >= 19.37 optimize without builtins

@theuni
Member Author

theuni commented Jan 23, 2024

Added a quick note about std::byteswap and c++23.

@DrahtBot
Contributor

Guix builds (on x86_64)

File    commit e69796c (master)    commit ea6e27c (master and this pull)
SHA256SUMS.part e3cbeb9e75c9ac85... 99dcf4de09376e75...
*-aarch64-linux-gnu-debug.tar.gz 49b7d884d40ea63b... be15822b12796d24...
*-aarch64-linux-gnu.tar.gz 7ba0932cedca0b55... e73549352dae7a29...
*-arm-linux-gnueabihf-debug.tar.gz ff1cc0114023daa7... 047df3ec91266906...
*-arm-linux-gnueabihf.tar.gz c9f35eb8d6b70596... aeb1de17fe3d8b73...
*-arm64-apple-darwin-unsigned.tar.gz c7b8e86e6f571427... 117f1a412fcdf132...
*-arm64-apple-darwin-unsigned.zip 64263106e26d5f38... 698d08ae6a55eb6b...
*-arm64-apple-darwin.tar.gz c451a701b31666a6... 160ddcae70fd28b7...
*-powerpc64-linux-gnu-debug.tar.gz 8eaed3b25e35a166... 91eb43803645a409...
*-powerpc64-linux-gnu.tar.gz 92192d5dcd2abc5d... 0e4f93325d2c1d4c...
*-powerpc64le-linux-gnu-debug.tar.gz 7ba12a5e155f1f31... 865994169509c581...
*-powerpc64le-linux-gnu.tar.gz 783da5e58254e1c7... 338e87d38582f88d...
*-riscv64-linux-gnu-debug.tar.gz 123752c651ab537b... 36fde55fe604836a...
*-riscv64-linux-gnu.tar.gz 6816d44ba62480e0... d06f75cb9639c743...
*-x86_64-apple-darwin-unsigned.tar.gz 651a9ae2176cf7e1... 7da6af545e133a49...
*-x86_64-apple-darwin-unsigned.zip 32787903bc08d809... e6f37321a45e076f...
*-x86_64-apple-darwin.tar.gz 878b96f61a83b5d5... 8fad7f75ceced2b9...
*-x86_64-linux-gnu-debug.tar.gz 7a7b930773e6723f... 7d4987ee307d625a...
*-x86_64-linux-gnu.tar.gz d012051198112116... 637d9dce06441d0f...
*.tar.gz 05ffca50f20c6f8c... f1e9711f7f4b1d8b...
guix_build.log b307741814adae42... 3d60205e28f729b9...
guix_build.log.diff 6b4c07847d5b2078...

@fanquake
Member

Concept ACK - I think moving to using the builtins is ok, with the eventual plan to migrate to std::byteswap.

@aureleoules every time I visit the coverage/benchmarks for this PR, I get a 500 error. Can you take a look?

@aureleoules
Member

@fanquake thanks for reaching out, I fixed the issue.

@theuni
Member Author

theuni commented Jan 24, 2024

@aureleoules Thanks! These numbers really don't make much sense to me and don't match what I've seen locally, but I'll work on trying to reproduce.

@theuni
Member Author

theuni commented Jan 24, 2024

From some more local tests, it looks like it's the clz changes slowing things down for whatever reason. Still investigating.

@theuni
Member Author

theuni commented Feb 1, 2024

I've dropped the clz changes as a test and kept only the endian/byteswap change. Locally on my machine the benchmarks look the same before and after.

If the corecheck benchmarks look better I'll update the title/description here.

Edit: looks good.

@theuni
Member Author

theuni commented Feb 1, 2024

Aha, I finally tracked down the culprit! No wonder the benchmarks weren't making sense!

The problem was the removal of this from crypto/common.h:

#if defined(HAVE_CONFIG_H)
#include <config/bitcoin-config.h>
#endif

Turns out, sha256.cpp was relying on that. Adding that include to sha256.cpp (where it belongs) fixes my local benchmarks.

I'm going to put the clz changes back to match the PR title/description, but this time with the include fixed up. Hopefully after that the benchmarks will make sense and all will be good :)

I suppose there might be other compilation units that were depending on the include indirectly as well. I'll figure out a way to check for that.
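To illustrate why this kind of breakage is silent, here is a minimal sketch. It assumes a hypothetical macro `SKETCH_HAVE_FAST_IMPL` that would normally come from a generated config header; it is not one of the project's real defines, and the function is illustrative only.

```cpp
// In a real build this macro would come from a generated config header.
// It is deliberately NOT defined here, to mimic a translation unit that
// no longer pulls in that header (directly or indirectly).
// #define SKETCH_HAVE_FAST_IMPL 1

enum class SketchImpl { Generic, Fast };

constexpr SketchImpl sketch_selected_impl()
{
#if defined(SKETCH_HAVE_FAST_IMPL)
    return SketchImpl::Fast;
#else
    // Compiles without any warning, but this is the slow path.
    return SketchImpl::Generic;
#endif
}
```

Because `#if defined(...)` is simply false for an unknown macro, the file still builds and passes its tests. Only a benchmark, or a check on which branch was selected, reveals the difference.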

@theuni
Member Author

theuni commented Feb 1, 2024

Converted to a draft while I'm still messing with this.

I added a new commit to add the bitcoin-config.h include where it's necessary. I'm not done investigating that yet, but it's at least needed for chacha20poly1305.cpp as well.

The benchmarks still show some regressions, though it's not nearly as bad as it was before. Will continue poking.

@maflcko
Member

maflcko commented Feb 5, 2024

I've created #26972 to track ideas on how to detect those silent build issues, or at least make them easier to spot.

@theuni
Member Author

theuni commented Feb 5, 2024

I've pushed a commit which temporarily puts the includes back in the low-level headers for the sake of addressing the benchmarks first, setting aside the possibility of missing defines.

@aureleoules's benchmarks show 2 small regressions, though I am unable to reproduce those locally.

My tests show:

./bench/bench_bitcoin -min-time=5000 -filter="AddrManGetAddr|PrevectorDeserializeNontrivial"

master (3d52ced):

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|          224,138.83 |            4,461.52 |    0.1% |      5.50 | `AddrManGetAddr`
|              155.96 |        6,411,753.11 |    0.1% |      5.49 | `PrevectorDeserializeNontrivial`

This PR:

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|          226,531.50 |            4,414.40 |    0.2% |      5.50 | `AddrManGetAddr`
|              156.91 |        6,372,901.08 |    0.1% |      5.50 | `PrevectorDeserializeNontrivial`

Essentially no change. I'm curious if @maflcko sees the same?

@DrahtBot DrahtBot removed the CI failed label Feb 5, 2024
@aureleoules
Member

@theuni yeah, the AddrManGetAddr and PrevectorDeserializeNontrivial benchmarks are known to be flaky. I'll filter them out from the UI while I find a better solution.
Some benchmarks depend on I/O or randomness, so it's hard to have consistent results between runs.

You can see ignored benchmarks here in the meantime: https://github.com/corecheck/frontend/blob/master/src/routes/%5Bowner%5D/%5Brepo%5D/pulls/%5Bnumber%5D/Benchmarks.svelte#L334

@theuni
Member Author

theuni commented Feb 5, 2024

@aureleoules Thanks, that's helpful.

@theuni
Member Author

theuni commented Feb 27, 2024

Updated to split the last commit as suggested by @maflcko. I also replaced static inline with just inline in places where I had introduced that.

@DrahtBot
Contributor

Guix builds (on x86_64)

File    commit 6a7ed5e (master)    commit 88c6098 (master and this pull)
SHA256SUMS.part 1c34f1e5d39e6e9f... acb7161ac7364ecc...
*-aarch64-linux-gnu-debug.tar.gz 24e23354d1be9fc8... a20b13fcce939673...
*-aarch64-linux-gnu.tar.gz 358ad6c1b0dd8ead... f6fc3fe4b27e5ee7...
*-arm-linux-gnueabihf-debug.tar.gz 60042e298ffdd3c3... 00c47060ecb3abf4...
*-arm-linux-gnueabihf.tar.gz fed6a8b2bdf2a303... 63d75b669ff14561...
*-arm64-apple-darwin-unsigned.tar.gz 4c9a8d9d5b89a020... 8ebdbacf70159fe2...
*-arm64-apple-darwin-unsigned.zip 2cfdf3c5b0a0db30... de2f41b558ff6626...
*-arm64-apple-darwin.tar.gz 7960a667b50e1e1f... 9ca99330356a62d5...
*-powerpc64-linux-gnu-debug.tar.gz 25e57cb4560928de... 9d2312f9cba01279...
*-powerpc64-linux-gnu.tar.gz fccab49cbdd030c6... 623776d643b66d7a...
*-powerpc64le-linux-gnu-debug.tar.gz 0dad8bf978d0ac93... 14d5019dca08f8bd...
*-powerpc64le-linux-gnu.tar.gz 561cdb98080d9fe7... 5054cf4bbbe05f01...
*-riscv64-linux-gnu-debug.tar.gz bf9f2e103904eb7f... 769f2383f6b80be5...
*-riscv64-linux-gnu.tar.gz 4f166d86b4597cbd... 0d2dc190d20b4fa3...
*-x86_64-apple-darwin-unsigned.tar.gz 7ed87d2ffa2a3860... e23e5a8d91623708...
*-x86_64-apple-darwin-unsigned.zip 999c3327d8d208c2... 7135d75a120bccf8...
*-x86_64-apple-darwin.tar.gz 99c857d9cd288193... 107e29285c059475...
*-x86_64-linux-gnu-debug.tar.gz c4aa5e9c880bc333... 65794f06ace89f5d...
*-x86_64-linux-gnu.tar.gz 63f3a64c51429d91... b9b4b601b9369cb3...
*.tar.gz d9e5a8d2f97b2900... e882728fa5d23050...
guix_build.log eacd4c991584db23... f88d51ce2cfa8976...
guix_build.log.diff da8f5ab257e07d01...

Member

@maflcko maflcko left a comment

ACK 1d051e8 📛

Signature:

untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
trusted comment: ACK 1d051e86098e721a47cebf7cfd73d82aa414c05e 📛
6Obp6pvdrMxN7MXEPLAuvZe0UCy9RzNtRu/VNpLvSGe6P6thBW12J6z7gacvZBa1MGOdmgJFjbeCHc8MjWTyDA==

Rather than a complicated set of tests to decide which bswap functions to
use, always prefer the compiler built-ins when available.

These builtins and fallbacks can all be removed once we're using c++23, which
adds std::byteswap.

These replace our platform-specific mess in favor of c++20 endian detection
via std::endian and internal byteswap functions when necessary.

They no longer rely on autoconf detection.
@maflcko
Member

maflcko commented Feb 28, 2024

ACK 86b7f28 📘

Signature:

untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
trusted comment: ACK 86b7f28d6c507155a9d3a15487ee883989b88943 📘
kQ4T5XBTsSlLG85FEzyZHq8zuOHIKVd6qy22hl3UGR99D8vEpfnGCcvGnQHmd2MS2GYEuSKzOYeFDt9w2Ch8Bw==

fanquake added a commit that referenced this pull request Feb 28, 2024
…n with c++20 concept

ad7584d serialization: replace char-is-int8_t autoconf detection with c++20 concept (Cory Fields)

Pull request description:

  Doesn't depend on #29263, but it's really only relevant after that one's merged.

  This removes the only remaining autoconf macro in our serialization code (after #29263), so it can now be used trivially and safely out-of-tree.

  ~Our code does not currently contain any concepts, but couldn't find any discussion or docs about avoiding them. I guess we'll see if this blows up our c-i.~
  Edit: Ignore this. ajtowns pointed out that we're already using a few concepts.

  This was introduced in #13580. Please check my logic on this as I'm unable to test on a SmartOS system. Even better would be a confirmation from someone who can build there.

ACKs for top commit:
  Empact:
    Code review ACK ad7584d

Tree-SHA512: 1faf65c900700efb1cf3092c607a2230321b393cb2f029fbfb94bc8e50df1dabd7a9e4b91e3b34f0d2f3471aaf18ee7e56d91869db5c5f4bae84da95443e1120
Member

@fanquake fanquake left a comment

ACK 86b7f28 - we can finish pruning out the __builtin_clz* checks/usage once the minisketch code has been updated. This is more good cleanup pre-CMake & for the kernel.

@fanquake fanquake merged commit 8da62a1 into bitcoin:master Mar 1, 2024
16 checks passed
Christewart added a commit to Christewart/bitcoin that referenced this pull request Mar 6, 2024
…ESSTHANOREQUAL64, OP_GREATERTHAN64, OP_GREATERTHANOREQUAL64, OP_SCRIPTNUMTOLE64, OP_LE64TOSCRIPTNUM, OP_LE32TOLE64

Remove liquid args

WIP

Get simple OP_1 functional test case working

Get tests for arithmetic and comparison opcodes working

Get all functional tests passing

Rename test case to Arithmetic64bitTest

Rename file to feature_64bit_arithmetic_opcodes.py, add it to test_runner.py

Get tests passing in feature_taproot.py

Remove unused push_le4

Revert test fixture setup

Cleanup

Fix linting

test: Add leaf_version parameter to taproot_tree_helper()

Fix bug

Fix bugs

Fix compile

Fix missing sigversion checks

Fix htole64 -> htole64_internal due to bitcoin#29263
hebasto added a commit to hebasto/bitcoin that referenced this pull request Mar 8, 2024
This change mirrors changes from bitcoin#29263.
fanquake added a commit that referenced this pull request Mar 8, 2024
8e17f00 build, msvc: Cleanup `bitcoin_config.h.in` (Hennadii Stepanov)

Pull request description:

  This PR mirrors changes from #29263 into the MSVC build system.

ACKs for top commit:
  fanquake:
    ACK 8e17f00

Tree-SHA512: b8e5cca015ff112c2969a60436524e97007ff2c559b3c12425d0549af694b16248311cc3e7c33f798bc095a679933641496836bb846eee6a2a377956ef53f56e
kevkevinpal pushed a commit to kevkevinpal/bitcoin that referenced this pull request Mar 13, 2024
hebasto added a commit to hebasto/bitcoin that referenced this pull request Mar 17, 2024
33a454e fixup! cmake: Check system symbols (Hennadii Stepanov)
3cb2e65 fixup! cmake: Check system headers (Hennadii Stepanov)

Pull request description:

  This PR backports build system changes from bitcoin#29263.

ACKs for top commit:
  pablomartin4btc:
    ACK 33a454e

Tree-SHA512: 1793c6504a7190134c0ce075e959d22c4a3640d54a4d141f5117975bed267952cc8c7da488426e48022eba1eb77d3353783d77a20907b0cfa183e0b68d824133
janus pushed a commit to BitgesellOfficial/bitgesell that referenced this pull request Apr 6, 2024