Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

misuse of __AVX2__ etc., likely to cause miscompilation with GCC14 #2163

Closed
mixal-iirec opened this issue Apr 5, 2024 · 3 comments · Fixed by #2164
Closed

misuse of __AVX2__ etc., likely to cause miscompilation with GCC14 #2163

mixal-iirec opened this issue Apr 5, 2024 · 3 comments · Fixed by #2164

Comments

@mixal-iirec
Copy link

singleheader/simdjson.cpp contains:

#define SIMDJSON_CAN_ALWAYS_RUN_HASWELL ((SIMDJSON_IMPLEMENTATION_HASWELL) && (SIMDJSON_IS_X86_64) && (__AVX2__) && (__BMI__) && (__PCLMUL__) && (__LZCNT__))

/* ... */

#if !SIMDJSON_CAN_ALWAYS_RUN_HASWELL
SIMDJSON_TARGET_REGION("avx2,bmi,pclmul,lzcnt,popcnt")
#endif

/* Code to be compiled with haswell instruction sets. */

#if !SIMDJSON_CAN_ALWAYS_RUN_HASWELL
SIMDJSON_UNTARGET_REGION
#endif

Expanded:

#if ! ((SIMDJSON_IMPLEMENTATION_HASWELL) && (SIMDJSON_IS_X86_64) && (__AVX2__) && (__BMI__) && (__PCLMUL__) && (__LZCNT__))
_Pragma("GCC push_options")
_Pragma("GCC target(\"avx2,bmi,pclmul,lzcnt,popcnt\")")
#endif

/* Code to be compiled with haswell instruction sets. */

#if ! ((SIMDJSON_IMPLEMENTATION_HASWELL) && (SIMDJSON_IS_X86_64) && (__AVX2__) && (__BMI__) && (__PCLMUL__) && (__LZCNT__))
_Pragma("GCC pop_options")
#endif

Simplified:

#if ! __AVX2__
#pragma GCC push_options
#pragma GCC target("avx2")
#endif

/* Code to be compiled with haswell instruction sets. */

#if ! __AVX2__
#pragma GCC pop_options
#endif

GCC 13 and below defines __AVX2__ only when specified on command line.
This has been considered a bug, and was fixed for GCC14 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87299

In GCC 14, __AVX2__ is also defined with #pragma GCC target("avx2"). Thus the second __AVX2__ check will be always false and pop_options will never be called.
So all functions in the rest of the source file will be compiled with this instruction set added.

This is the underlying issue behind #2116


The fix should hopefully be straightforward (though I do not plan to fix it myself).
SIMDJSON_CAN_ALWAYS_RUN_HASWELL and any value guarding SIMDJSON_UNTARGET_REGION should not change in time:

#if ((SIMDJSON_IMPLEMENTATION_HASWELL) && (SIMDJSON_IS_X86_64) && (__AVX2__) && (__BMI__) && (__PCLMUL__) && (__LZCNT__))
#define SIMDJSON_CAN_ALWAYS_RUN_HASWELL 1
#else
#define SIMDJSON_CAN_ALWAYS_RUN_HASWELL 0
#endif
@lemire
Copy link
Member

lemire commented Apr 5, 2024

Thanks for the thorough analysis. We will apply your recommendation.

Note that GCC 14 is not released. Even so, I find that simdjson builds fine with GCC 14 under the current Fedora Rawhide:

$ gcc --version
gcc (GCC) 14.0.1 20240328 (Red Hat 14.0.1-0)
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gcc -c simdjson.cpp  -std=c++17

I have removed the 'bug' marker. We only support released compilers.

Thanks again for the excellent analysis.

@mixal-iirec
Copy link
Author

Note that GCC 14 is not released. Even so, I find that simdjson builds fine with GCC 14 under the current Fedora

Sorry that I haven't made it clear. Current version builds which is the bad case.
#2116 only made it easier to find.

But the resulting binary is wrong, for example after

gcc -c simdjson.cpp -O2 -std=c++17
objdump -S simdjson.o

<_ZNK8simdjson8westmere14implementation13validate_utf8EPKcm>
contains vpternlogq which is avx512 instruction, unsuported by westmere.

@lemire
Copy link
Member

lemire commented Apr 6, 2024

Thanks. The issue should be solved in any case in the current release. If not, just open a new pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants