Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow detecting SIMD types based on compiler macros #6188

Merged
merged 6 commits into from
Oct 4, 2023

Conversation

masterleinad
Copy link
Contributor

@masterleinad masterleinad commented Jun 6, 2023

We have been noticing problems after merging SIMD pull requests that show that our CI support is pretty slim.
This pull request proposes allows Kokkos_ARCH_NATIVE add a CMake option that allows enabling SIMD support macro-based.

@masterleinad
Copy link
Contributor Author

Retest this please.

@masterleinad masterleinad force-pushed the simd_native branch 2 times, most recently from 0260101 to 57a8502 Compare June 13, 2023 18:08
@masterleinad masterleinad marked this pull request as ready for review June 14, 2023 03:22
@masterleinad masterleinad requested a review from ldh4 June 14, 2023 03:22
@masterleinad masterleinad added this to the Release 4.1 milestone Jun 14, 2023
simd/src/Kokkos_SIMD_AVX2.hpp Outdated Show resolved Hide resolved
using host_abi_set = abi_set<simd_abi::scalar, simd_abi::avx512_fixed_size<8>>;
using data_type_set = data_types<std::int32_t, std::uint32_t, std::int64_t,
std::uint64_t, double>;
#elif defined(KOKKOS_ARCH_AVX2)
#elif defined(KOKKOS_ARCH_AVX2) || \
(defined(KOKKOS_ENABLE_IMPL_SIMD_NATIVE) && defined(__AVX2__))
using host_abi_set = abi_set<simd_abi::scalar, simd_abi::avx2_fixed_size<4>>;
using data_type_set =
data_types<std::int32_t, std::int64_t, std::uint64_t, double>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does KOKKOS_ENABLE_IMPL_SIMD_NATIVE not work for NEON> (line 162)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are already auto-detecting the architecture for NEON. We could discuss if we would hide that behind KOKKOS_ENABLE_IMPL_SIMD_NATIVE or not given that we don't have an explicit architecture option for NEON.

The focus at the moment is on all the changes not related to the new option and I will split the pull request/remove the changes adding testing before merging so that we can review the proposed option separately.

@masterleinad masterleinad requested a review from ibaned June 15, 2023 02:23
dalg24
dalg24 previously requested changes Jun 15, 2023
Copy link
Member

@dalg24 dalg24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Split things up:
Move the fix the host device annotation on simd_mask operations to their own PR. This patch may be a candidate for cherry-picking into the 4.1 RC branch

Make sure you discuss the Impl::{value,mask} friend functions being replaced by impl_* member functions. (Probably belongs to its own PR as well)

simd/unit_tests/TestSIMD.cpp Outdated Show resolved Hide resolved
simd/src/Kokkos_SIMD_AVX512.hpp Outdated Show resolved Hide resolved
simd/src/Kokkos_SIMD_Common.hpp Outdated Show resolved Hide resolved
@ibaned
Copy link
Contributor

ibaned commented Jun 15, 2023

It's not super clear what this PR does, but I also don't see anything wrong with it

@masterleinad
Copy link
Contributor Author

Depends on #6223.

@ajpowelsnl
Copy link
Contributor

@ldh4 , @ndellingwood #6232 is relevant to this issue.

Copy link
Member

@dalg24 dalg24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just occurred to me that multiple of these macros might be defined.
What if both __AVX__ and __AVX2__ are defined?
Currently that would define both macros KOKKOS_ARCH_AVX and KOKKOS_ARCH_AVX2 which would never happen before.

@masterleinad
Copy link
Contributor Author

Disabling support for AVX512 when using NVHPC must happen after the new code for ARCH_NATIVE (which might enable it again), see the failures in https://cloud.cees.ornl.gov/jenkins-ci/blue/organizations/jenkins/Kokkos/detail/Kokkos/14078/pipeline:

/usr/bin/ld: CMakeFiles/Kokkos_UnitTest_SIMD.dir/TestSIMD.cpp.o: in function `void host_check_math_op_all_loaders<Kokkos::Experimental::simd_abi::avx512_fixed_size<8>, plus, int, int>(plus, unsigned long, int const*, int const*)':

/var/jenkins/workspace/Kokkos/simd/unit_tests/include/SIMDTesting_Utilities.hpp:122: undefined reference to `__builtin_ia32_kandqi'

/usr/bin/ld: /var/jenkins/workspace/Kokkos/simd/unit_tests/include/SIMDTesting_Utilities.hpp:122: undefined reference to `__builtin_ia32_kandqi'

/usr/bin/ld: CMakeFiles/Kokkos_UnitTest_SIMD.dir/TestSIMD.cpp.o: in function `void host_check_math_op_all_loaders<Kokkos::Experimental::simd_abi::avx512_fixed_size<8>, plus, int, int>(plus, unsigned long, int const*, int const*)':

/var/jenkins/workspace/Kokkos/simd/src/Kokkos_SIMD_AVX512.hpp:72: undefined reference to `__builtin_ia32_kxorqi'

/usr/bin/ld: /var/jenkins/workspace/Kokkos/simd/src/Kokkos_SIMD_AVX512.hpp:72: undefined reference to `__builtin_ia32_kxorqi'
[...]

@masterleinad
Copy link
Contributor Author

Currently that would define both macros KOKKOS_ARCH_AVX and KOKKOS_ARCH_AVX2 which would never happen before.

Right, but we only use these preprocessor definitions in Kokkos_SIMD.hpp and the CI shows that defining multiple of these doesn't cause issues. In fact, it allows users to select more SIMD types than before which should help some with #5677.
I don't think defining multiple of these would cause issues for users. We can of course take a step back and only define one preprocessor definition for more uniformity.

@masterleinad
Copy link
Contributor Author

I noticed that we didn't set the internal macros when rerunning CMake and the compiler options hadn't changed. The last commit makes sure that we do that now again (like in my initial version).

@masterleinad
Copy link
Contributor Author

Retest this please.

@dalg24
Copy link
Member

dalg24 commented Sep 5, 2023

Retest this please.
Not sure about the ROCm 5.6 failure

@dalg24
Copy link
Member

dalg24 commented Sep 6, 2023


[ 94%] Building CXX object core/unit_test/CMakeFiles/Kokkos_CoreUnitTest_HIP.dir/hip/TestHIP_SubView_c10.cpp.o
fatal error: error in backend: Instruction Combining seems stuck in an infinite loop after 1000 iterations.
clang-16: error: clang frontend command failed with exit code 70 (use -v to see invocation)
AMD clang version 16.0.0 (https://github.com/RadeonOpenCompute/llvm-project roc-5.6.0 23243 be997b2f3651a41597d7a41441fff8ade4ac59ac)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/rocm-5.6.0/llvm/bin
clang-16: note: diagnostic msg: Error generating preprocessed source(s).
make[2]: *** [simd/unit_tests/CMakeFiles/Kokkos_UnitTest_SIMD.dir/build.make:79: simd/unit_tests/CMakeFiles/Kokkos_UnitTest_SIMD.dir/TestSIMD.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:5168: simd/unit_tests/CMakeFiles/Kokkos_UnitTest_SIMD.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[ 95%] Building CXX object core/unit_test/CMakeFiles/Kokkos_CoreUnitTest_HIP.dir/hip/TestHIP_SubView_c11.cpp.o

@masterleinad
Copy link
Contributor Author

Not sure about the ROCm 5.6 failure

Let's see if we can get the failure reproduced (and print the detected architecture in that case).

@Rombur
Copy link
Member

Rombur commented Sep 12, 2023

Retest this please

@masterleinad
Copy link
Contributor Author

Looks like the problem with ROCm 5.6.0 not being able to compile the SIMD tests with AVX2 persists. I disabled AVX2 support in that case now as a workaround. Unfortunately, I don't have access to that compiler version to easily debug this further.

@masterleinad
Copy link
Contributor Author

#6449 should be merged first.

@masterleinad masterleinad marked this pull request as draft September 18, 2023 12:33
@masterleinad masterleinad dismissed dalg24’s stale review September 28, 2023 15:22

Requested changes were addressed, the respective changes were separated form this pull request.

@crtrott crtrott merged commit 4ce289b into kokkos:develop Oct 4, 2023
28 checks passed
@masterleinad masterleinad mentioned this pull request Oct 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants