Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workaround a miscompilation in clang 12 (XCode 13) #2803

Merged
merged 1 commit into from Oct 20, 2021

Conversation

reneme
Copy link
Collaborator

@reneme reneme commented Sep 24, 2021

This works around the issue investigated in #2802. For optimization levels -O2 and higher, clang 12 (and XCode 13) miscompile the SHA-3 implementation. Anecdotal evidence suggests that this happens only on macOS.

The workaround pulls the initial XOR operations in SHA3_round() into an extra function and blocks inlining for the affected compilers. Without the __attribute__(noinline) the issue would persist, despite the refactoring. Therefore, I hope that other compilers won't see a performance penalty due to this workaround.

We cannot rule out that other platforms suffer from the same (presumed) clang bug, so I suggest to apply the workaround solely based on the clang version and not the platform.

Side note: To me, the implementation in sha3_bmi2.cpp looks exactly equal to the standard implementation. I'm I missing something?

@codecov-commenter
Copy link

codecov-commenter commented Sep 24, 2021

Codecov Report

Merging #2803 (b0182cc) into master (d17ffc7) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2803      +/-   ##
==========================================
- Coverage   92.33%   92.32%   -0.01%     
==========================================
  Files         568      568              
  Lines       63091    63098       +7     
  Branches     6179     6178       -1     
==========================================
+ Hits        58255    58256       +1     
- Misses       4804     4810       +6     
  Partials       32       32              
Impacted Files Coverage Δ
src/lib/hash/sha3/sha3.cpp 100.00% <100.00%> (ø)
src/lib/hash/sha3/sha3_bmi2/sha3_bmi2.cpp 100.00% <100.00%> (ø)
src/fuzzer/ressol.cpp 85.71% <0.00%> (-7.15%) ⬇️
src/lib/pk_pad/emsa_raw/emsa_raw.cpp 71.42% <0.00%> (-2.86%) ⬇️
src/lib/asn1/der_enc.cpp 80.74% <0.00%> (-2.49%) ⬇️
src/lib/pubkey/elgamal/elgamal.cpp 95.94% <0.00%> (-1.36%) ⬇️
src/lib/pubkey/mce/mceliece_key.cpp 83.76% <0.00%> (-1.05%) ⬇️
src/lib/math/numbertheory/numthry.cpp 94.59% <0.00%> (-0.55%) ⬇️
src/fuzzer/mode_padding.cpp 100.00% <0.00%> (+6.34%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d17ffc7...b0182cc. Read the comment docs.

mouse07410 added a commit to mouse07410/botan that referenced this pull request Sep 24, 2021
@randombit
Copy link
Owner

Side note: To me, the implementation in sha3_bmi2.cpp looks exactly equal to the standard implementation. I'm I missing something?

There should be a comment about this as there is for the SHA-2 BMI2 versions. Basically we just compile the same code twice once with BMI2 enabled and that is sufficient for GCC/Clang to do the right thing.

@randombit
Copy link
Owner

First: thanks for tracking down this issue as well as reporting it upstream to LLVM. Compiler bugs like this are never fun.

I am not so happy adding this complicated mess but unfortunately I cannot find a cleaner way outside of the very big hammer of __attribute__((optnone)) - unfortunately Clang does not support GCC's optimize function attribute so there is no other way to disable the vectorizer for a single function than to just turn off all optimizations. (I had hoped min_size attribute would turn off the vectorizer as a side effect, but alas no).

@randombit randombit merged commit 2dd45b5 into randombit:master Oct 20, 2021
@reneme
Copy link
Collaborator Author

reneme commented Oct 22, 2021

I am not so happy adding this complicated mess unfortunately I cannot find a cleaner way outside of the very big hammer [...]

In the context of the LLVM ticket, they found that -mno-sse4.1 also produces a working binary. It can also be reproduced on Compiler Explorer. As an alternative solution we might get away with disabling SSE 4.1 on this one compilation unit when using Clang 12. I'm not sure how much work would be require to get this into the build scripts though.

Nevertheless, do we need this back ported for 2.18.2, as well? I suppose people on latest macOS building botan via conan would be grateful... Other clang 12 users as welll, of course.

@randombit
Copy link
Owner

@reneme Yes we should backport this. Can you create a PR for this? I'd like to include this in 2.18.2 if possible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants