Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: use -muse-unaligned-vector-move for Windows builds #28151

Merged
merged 1 commit into from Sep 5, 2023

Conversation

fanquake
Copy link
Member

@fanquake fanquake commented Jul 25, 2023

We currently work around a longstanding GCC issue with aligned vector instructions, by patching the behaviour we want into GCC (see discussion in #24736). Possibly in response to the GCC thread (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412#c40), a new option was introduced into the binutils assembler with the 2.38 release:

x86: Add -muse-unaligned-vector-move to assembler

Unaligned load/store instructions on aligned memory or register are as
fast as aligned load/store instructions on modern Intel processors.  Add
a command-line option, -muse-unaligned-vector-move, to x86 assembler to
encode encode aligned vector load/store instructions as unaligned
vector load/store instructions.

@DrahtBot
Copy link
Contributor

DrahtBot commented Jul 25, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK theuni

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

@DrahtBot DrahtBot changed the title [WIP] guix: use -muse-unaligned-vector-move for Windows builds [WIP] guix: use -muse-unaligned-vector-move for Windows builds Jul 25, 2023
@fanquake
Copy link
Member Author

cc @theuni

@theuni
Copy link
Member

theuni commented Jul 25, 2023

Concept ACK. The patch we've taken from Debian is described as:

replacing the aligned instruction with the unaligned one the hacky patch below got created, just replacing vmovapd by vmovupd. Not considering any side effects and maybe other instructions with alignment requirements.

Which appears to be what the new assembler option does as well. Using the supported switch is much better than the hacky patch.

What's up with the single aligned case though? Is that maybe coming from somewhere where we explicitly specify alignment?

@theuni
Copy link
Member

theuni commented Jul 25, 2023

Any reason not to add it to our configure directly rather than guix?

@DrahtBot
Copy link
Contributor

Guix builds

File commit e35fb7b
(master)
commit e590bc2
(master and this pull)
SHA256SUMS.part 85ece518a8ab8bb8... d0a52d153d899e55...
*-aarch64-linux-gnu-debug.tar.gz ec06911cbfa43bcb... f5d30fe5bc686d43...
*-aarch64-linux-gnu.tar.gz 9cf832869ea5eec6... 74547f380b8ebb65...
*-arm-linux-gnueabihf-debug.tar.gz ce0894bf1cb47527... a9c25bd6f128ba38...
*-arm-linux-gnueabihf.tar.gz 3260f94e70f77d08... 876aef59b552b1ff...
*-arm64-apple-darwin-unsigned.dmg eb924e766c7c030b...
*-arm64-apple-darwin-unsigned.tar.gz 1aaa58cb290326c5...
*-arm64-apple-darwin.tar.gz ce1ee32821b949a9...
*-powerpc64-linux-gnu-debug.tar.gz e0072e370893cd22...
*-powerpc64-linux-gnu.tar.gz 69b80b8ce06f5c8e...
*-powerpc64le-linux-gnu-debug.tar.gz 36160eed83f577ac... 374aeccc6c19c180...
*-powerpc64le-linux-gnu.tar.gz ea7524ac63244d44... 2b71c4aadb7837fe...
*-riscv64-linux-gnu-debug.tar.gz 278b77020585377a... 691bdddee95f8ba2...
*-riscv64-linux-gnu.tar.gz fe51f12924ee6954... fd2665f478718a25...
*-x86_64-apple-darwin-unsigned.dmg 3c96ead660b80011...
*-x86_64-apple-darwin-unsigned.tar.gz 5d1a5aca6189a136...
*-x86_64-apple-darwin.tar.gz cdcf1d209ecadba4...
*-x86_64-linux-gnu-debug.tar.gz 6679dd3d862109bc... 4ec1c2b4508b2d2a...
*-x86_64-linux-gnu.tar.gz 2de809754a1009c5... f550c8f278e0a492...
*.tar.gz 33df1d211b914d3a... c7cfd1278124b629...
guix_build.log 551a5b95262a29c5... 799518d1848f5118...
guix_build.log.diff aef8e939042efead...

@hebasto
Copy link
Member

hebasto commented Aug 2, 2023

Tested Guix build on Windows:

C:\Users\hebasto\Desktop\pr28151\bitcoin-6c3f6a9d015c-win64\bitcoin-6c3f6a9d015c>bin\bitcoind.exe -signet
*** stack smashing detected ***:  terminated

@fanquake
Copy link
Member Author

fanquake commented Aug 3, 2023

Any reason not to add it to our configure directly rather than guix?

I think that's actually all we can do here, adding this (to configure) as a best-effort for anyone using an unpatched compiler (where it'd be most beneficial when building bitcoind.exe), while also retaining our GCC patching in Guix. Anything else would potentially leave us with unwanted code from dependencies/the rest of the toolchain.

@theuni
Copy link
Member

theuni commented Aug 4, 2023

Any reason not to add it to our configure directly rather than guix?

I think that's actually all we can do here, adding this (to configure) as a best-effort for anyone using an unpatched compiler (where it'd be most beneficial when building bitcoind.exe), while also retaining our GCC patching in Guix.

Are you sure those two aren't mutually exclusive? If so, yeah, that sounds reasonable. Though I think we should make an effort to produce working binaries without the patch first, just to make sure the feature is actually working as intended. Then re-add the patch for belt-and-suspenders if possible.

Thinking this through further, surely depends needs to be built with the flag as well? And possibly worse if toolchain objects need it too, but I guess depends might be enough?

Edit: Ok, right, this is avx-only. Ignore most of the above.

@fanquake
Copy link
Member Author

Edit: Ok, right, this is avx-only. Ignore most of the above.

Yea. It seems hard to (generally) produce working Windows binaries outside of Guix, because I assume most Windows toolchains are broken with the same bug (unless they are all being patched similar to Debian/Ubuntu).

So the best we can do is retain our GCC patching, so the entire toolchain/libs/bins are built correctly, and then in configure, we could pass the assembler flags as a best-effort, to try and produce working binaries when building outside of Guix (and I don't think addition of the flags to the Guix build (from configure) would be an issue, as it would only be swapping out instructions that shouldn't be appearing in any case).

@fanquake fanquake force-pushed the win_muse_unaligned_vector_move branch from 6c3f6a9 to 83d8592 Compare August 23, 2023 13:49
@fanquake fanquake changed the title [WIP] guix: use -muse-unaligned-vector-move for Windows builds build: use -muse-unaligned-vector-move for Windows builds Aug 23, 2023
@fanquake fanquake force-pushed the win_muse_unaligned_vector_move branch from 83d8592 to 77f33df Compare August 26, 2023 07:57
We currently work around a longstanding GCC issue with aligned vector
instructions, in our release builds, by patching the behaviour we want
into GCC (see discussion in bitcoin#24736).

A new option now exists in the binutils assembler,
`-muse-unaligned-vector-move`, which should also achieve the behaviour
we want (at least for our code). This was added in the 2.38 release,
see
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=c8480b58e1968f209b6365af7422678f348222c2.
```bash
x86: Add -muse-unaligned-vector-move to assembler

Unaligned load/store instructions on aligned memory or register are as
fast as aligned load/store instructions on modern Intel processors.  Add
a command-line option, -muse-unaligned-vector-move, to x86 assembler to
encode encode aligned vector load/store instructions as unaligned
vector load/store instructions.
```

Even if we introduce this option into our build system, we'll have to
maintain our GCC patching, as we want all code that ends up in the
binary, to avoid these instructions. However, there may be some value in
adding the option, as it could be an improvement for someone building
(bitcoind.exe) with an unpatched compiler.
@fanquake fanquake force-pushed the win_muse_unaligned_vector_move branch from 77f33df to 96f2cf8 Compare August 30, 2023 15:34
@theuni
Copy link
Member

theuni commented Aug 30, 2023

ACK 96f2cf8.

I verified locally that trying to use an assembler that doesn't understand the new option results in a compile failure:

 $ g++ -I. -std=c++17 -c test.cpp -o test.o -Wa,-muse-unaligned-vector-move || echo failure
as: unrecognized option '-muse-unaligned-vector-move'
failure

I also considered suggesting moving this to avx flags, but I guess it doesn't hurt to use it everywhere. It may end up having an effect on sha-ni code as well, for example.

@fanquake
Copy link
Member Author

fanquake commented Sep 5, 2023

TODO:
Note that there is usage of vmovapd in this branch and the current release binaries. Need to follow up.

I might have either imagined this, or been looking at binaries that I'd broken, as I cannot seem to find any instructions in a Guix built ecab855. In any case, opened #28413, because we could add a test for the non-existence, while we continue to patch this behaviour out.

@fanquake fanquake merged commit 9d3b216 into bitcoin:master Sep 5, 2023
15 checks passed
@fanquake fanquake deleted the win_muse_unaligned_vector_move branch September 5, 2023 12:45
Frank-GER pushed a commit to syscoin/syscoin that referenced this pull request Sep 8, 2023
…dows builds

96f2cf8 build: use -muse-unaligned-vector-move for Windows (fanquake)

Pull request description:

  We currently work around a longstanding GCC issue with aligned vector instructions, by patching the behaviour we want into GCC (see discussion in bitcoin#24736). Possibly in response to the GCC thread (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412#c40), a new option was [introduced into the binutils assembler](https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=c8480b58e1968f209b6365af7422678f348222c2) with the 2.38 release:
  ```
  x86: Add -muse-unaligned-vector-move to assembler

  Unaligned load/store instructions on aligned memory or register are as
  fast as aligned load/store instructions on modern Intel processors.  Add
  a command-line option, -muse-unaligned-vector-move, to x86 assembler to
  encode encode aligned vector load/store instructions as unaligned
  vector load/store instructions.
  ```

ACKs for top commit:
  theuni:
    ACK 96f2cf8.

Tree-SHA512: f5aaa125922bb17bbb51454103b3394d293184214b0dea554c36c2f130488a3ede2f39678044ec846fa0fdf4cd441d4227f4565c29d380f5a73b50bf6f3b9a67
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants