Fix cooperlake and sapphire rapids march flags on clang #4192

Merged (1 commit) on Aug 21, 2023

imciner2
Contributor

The -march=cooperlake and -march=sapphirerapids flags were never added when building with Clang for those architectures; the build fell back to the Skylake AVX512 flags instead. This broke configurations such as

CC=clang make TARGET=COOPERLAKE BUILD_BFLOAT16=1

because the compiler command line had either no -march flag at all or -march=skylake-avx512 -mavx512f. Neither enables the bfloat16 intrinsics these kernels need, leading to errors like

../kernel/x86_64/stobf16_microk_cooperlake.c:49:20: error: always_inline function '_mm512_maskz_loadu_ps' requires target feature 'avx512f', but would be inlined into function 'tobf16_accl_kernel' that is compiled without support for 'avx512f'
        __m512 a = _mm512_maskz_loadu_ps(*((__mmask16*) &align_mask16), &in[0]);
                   ^

or

In file included from ../kernel/x86_64/tobf16.c:42:
../kernel/x86_64/stobf16_microk_cooperlake.c:46:84: error: always_inline function '_mm512_cvtneps_pbh' requires target feature 'avx512bf16', but would be inlined into function 'tobf16_accl_kernel' that is compiled without support for 'avx512bf16'
        _mm256_mask_storeu_epi16(&out[0], *((__mmask16*) &align_mask16), (__m256i) _mm512_cvtneps_pbh(a));
                                                                                   ^
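
Both errors come down to which target features the compiler flags enable. A minimal sketch for checking this from the command line follows. It assumes the compiler predefines `__AVX512BF16__` when avx512bf16 is enabled, as Clang and GCC do. The helper name `cc_defines_macro` is hypothetical and not part of OpenBLAS.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenBLAS): succeed if the given compiler,
# invoked with the given flags, predefines the named preprocessor macro.
# Usage: cc_defines_macro <compiler> <macro> [flags...]
cc_defines_macro() {
    cc=$1
    macro=$2
    shift 2
    # -dM -E dumps the predefined macros for an empty C translation unit.
    "$cc" "$@" -dM -E -x c /dev/null 2>/dev/null | grep -q "^#define $macro "
}

# Compare the flag the build fell back to with the flag it should have used.
# Skipped when clang is not installed.
if command -v clang >/dev/null 2>&1; then
    for flag in -march=skylake-avx512 -march=cooperlake; do
        if cc_defines_macro clang __AVX512BF16__ "$flag"; then
            echo "$flag: avx512bf16 enabled"
        else
            echo "$flag: avx512bf16 not enabled (or flag not accepted)"
        fi
    done
fi
```

A compiler that rejects the flag outright is reported the same way as one that accepts it without enabling the feature, which is enough for a quick check.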

Clang added support for these two architectures in Clang 9 (cooperlake) and Clang 12 (sapphirerapids), so this change introduces checks for those versions to enable the appropriate -march flag, and falls back to Skylake otherwise.
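
The selection rule described above can be sketched as a small shell function. This illustrates the version thresholds only, not the Makefile logic actually merged in this PR. The function name `pick_march` is hypothetical, and the fallback is assumed to be `-march=skylake-avx512`.

```shell
#!/bin/sh
# Hypothetical sketch of the version check (not the actual OpenBLAS Makefile code).
# Usage: pick_march <TARGET> <clang major version>
pick_march() {
    target=$1
    clang_major=$2
    case "$target" in
        COOPERLAKE)
            # -march=cooperlake is available from Clang 9 onwards.
            if [ "$clang_major" -ge 9 ]; then
                echo "-march=cooperlake"
            else
                echo "-march=skylake-avx512"
            fi
            ;;
        SAPPHIRERAPIDS)
            # -march=sapphirerapids is available from Clang 12 onwards.
            if [ "$clang_major" -ge 12 ]; then
                echo "-march=sapphirerapids"
            else
                echo "-march=skylake-avx512"
            fi
            ;;
        *)
            # Other targets are outside the scope of this sketch.
            echo ""
            ;;
    esac
}

pick_march COOPERLAKE 9
pick_march SAPPHIRERAPIDS 11
```

The major version passed in could be taken from Clang's predefined `__clang_major__` macro.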

@martin-frbg
Collaborator

Thanks - seems I completely forgot to push the x86_64 changes when I "fixed" Arm64 compilation with clang recently, and your fix is cleaner anyway.

@martin-frbg martin-frbg merged commit 12ede72 into OpenMathLib:develop Aug 21, 2023
60 checks passed