Workaround a miscompilation in clang 12 (XCode 13) #2803

reneme · 2021-09-24T13:40:52Z

This works around the issue investigated in #2802. For optimization levels -O2 and higher, clang 12 (and XCode 13) miscompile the SHA-3 implementation. Anecdotal evidence suggests that this happens only on macOS.

The workaround pulls the initial XOR operations in SHA3_round() into an extra function and blocks inlining for the affected compilers. Without the __attribute__(noinline) the issue would persist, despite the refactoring. Therefore, I hope that other compilers won't see a performance penalty due to this workaround.

We cannot rule out that other platforms suffer from the same (presumed) clang bug, so I suggest to apply the workaround solely based on the clang version and not the platform.

Side note: To me, the implementation in sha3_bmi2.cpp looks exactly equal to the standard implementation. I'm I missing something?

codecov-commenter · 2021-09-24T14:27:03Z

Codecov Report

Merging #2803 (b0182cc) into master (d17ffc7) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #2803      +/-   ##
==========================================
- Coverage   92.33%   92.32%   -0.01%     
==========================================
  Files         568      568              
  Lines       63091    63098       +7     
  Branches     6179     6178       -1     
==========================================
+ Hits        58255    58256       +1     
- Misses       4804     4810       +6     
  Partials       32       32

Impacted Files	Coverage Δ
src/lib/hash/sha3/sha3.cpp	`100.00% <100.00%> (ø)`
src/lib/hash/sha3/sha3_bmi2/sha3_bmi2.cpp	`100.00% <100.00%> (ø)`
src/fuzzer/ressol.cpp	`85.71% <0.00%> (-7.15%)`	⬇️
src/lib/pk_pad/emsa_raw/emsa_raw.cpp	`71.42% <0.00%> (-2.86%)`	⬇️
src/lib/asn1/der_enc.cpp	`80.74% <0.00%> (-2.49%)`	⬇️
src/lib/pubkey/elgamal/elgamal.cpp	`95.94% <0.00%> (-1.36%)`	⬇️
src/lib/pubkey/mce/mceliece_key.cpp	`83.76% <0.00%> (-1.05%)`	⬇️
src/lib/math/numbertheory/numthry.cpp	`94.59% <0.00%> (-0.55%)`	⬇️
src/fuzzer/mode_padding.cpp	`100.00% <0.00%> (+6.34%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d17ffc7...b0182cc. Read the comment docs.

… fixes the problem.

randombit · 2021-10-20T12:22:14Z

Side note: To me, the implementation in sha3_bmi2.cpp looks exactly equal to the standard implementation. I'm I missing something?

There should be a comment about this as there is for the SHA-2 BMI2 versions. Basically we just compile the same code twice once with BMI2 enabled and that is sufficient for GCC/Clang to do the right thing.

randombit · 2021-10-20T12:58:38Z

First: thanks for tracking down this issue as well as reporting it upstream to LLVM. Compiler bugs like this are never fun.

I am not so happy adding this complicated mess but unfortunately I cannot find a cleaner way outside of the very big hammer of __attribute__((optnone)) - unfortunately Clang does not support GCC's optimize function attribute so there is no other way to disable the vectorizer for a single function than to just turn off all optimizations. (I had hoped min_size attribute would turn off the vectorizer as a side effect, but alas no).

reneme · 2021-10-22T11:59:06Z

I am not so happy adding this complicated mess unfortunately I cannot find a cleaner way outside of the very big hammer [...]

In the context of the LLVM ticket, they found that -mno-sse4.1 also produces a working binary. It can also be reproduced on Compiler Explorer. As an alternative solution we might get away with disabling SSE 4.1 on this one compilation unit when using Clang 12. I'm not sure how much work would be require to get this into the build scripts though.

Nevertheless, do we need this back ported for 2.18.2, as well? I suppose people on latest macOS building botan via conan would be grateful... Other clang 12 users as welll, of course.

randombit · 2021-10-22T13:03:30Z

@reneme Yes we should backport this. Can you create a PR for this? I'd like to include this in 2.18.2 if possible.

reneme force-pushed the fix/workaround_clang12_bug branch from e0212ca to b0182cc Compare September 24, 2021 13:43

mouse07410 added a commit to mouse07410/botan that referenced this pull request Sep 24, 2021

Remove low-optimization workaround, as there's PR randombit#2803 that…

195bb5c

… fixes the problem.

workaround a miscompilation issue in clang 12 (XCode 13)

cb078d8

reneme force-pushed the fix/workaround_clang12_bug branch from b0182cc to cb078d8 Compare October 19, 2021 12:03

randombit merged commit 2dd45b5 into randombit:master Oct 20, 2021

reneme mentioned this pull request Oct 22, 2021

Backport: Workaround a miscompilation in clang 12 (XCode 13) #2826

Merged

tobiasbrunner mentioned this pull request Oct 26, 2021

Redefinition of xor_CNs due to SHA-3 Workaround #2829

Closed

reneme mentioned this pull request Oct 28, 2021

FIX: amalgamation build (SHA3 BMI2) #2834

Merged

reneme deleted the fix/workaround_clang12_bug branch June 7, 2022 08:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Workaround a miscompilation in clang 12 (XCode 13) #2803

Workaround a miscompilation in clang 12 (XCode 13) #2803

reneme commented Sep 24, 2021

codecov-commenter commented Sep 24, 2021 •

edited

randombit commented Oct 20, 2021

randombit commented Oct 20, 2021

reneme commented Oct 22, 2021

randombit commented Oct 22, 2021

Workaround a miscompilation in clang 12 (XCode 13) #2803

Workaround a miscompilation in clang 12 (XCode 13) #2803

Conversation

reneme commented Sep 24, 2021

codecov-commenter commented Sep 24, 2021 • edited

Codecov Report

randombit commented Oct 20, 2021

randombit commented Oct 20, 2021

reneme commented Oct 22, 2021

randombit commented Oct 22, 2021

codecov-commenter commented Sep 24, 2021 •

edited