Added AVX10_2 and AVX10_2_512 targets #2395
copybara-service[bot] merged 1 commit into google:master from
Here is the CMake command line to configure to build only the HWY_AVX10_2 target (but not any other targets) with Clang 19 (where
Here is the CMake command line to configure to build only the HWY_AVX10_2_512 target (but not any other targets) with Clang 19 (where
Here is the CMake command line to configure to build the HWY_AVX10_2 and HWY_AVX10_2_512 targets (but not any other targets) with Clang 19 (where
jan-wassenberg
left a comment
Very nice! Thanks for adding this already. Just one small typo, but we'll go ahead and try to import/run CI in case other changes are required.
// includes base.h and shared-inl.h.
#include "hwy/ops/x86_256-inl.h"
#else
// For AVX3/AVX10 targets that support 512-byte vectors. Already includes base.h
Added preliminary support for the HWY_AVX10_2 (AVX10.2 with 256-bit vectors) and HWY_AVX10_2_512 (AVX10.2 with 512-bit vectors) targets.
Added CPUID detection for AVX10.1 and AVX10.2 support in hwy/targets.cc.
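As a rough illustration of what AVX10 CPUID detection involves, here is a minimal sketch. It is not the code in hwy/targets.cc: the names Avx10Info, DecodeAvx10, SupportsAvx10_2_256, SupportsAvx10_2_512 and QueryAvx10 are hypothetical. The bit positions are an assumption based on the Intel AVX10 architecture specification as I understand it (AVX10 enumerated in CPUID.(EAX=07H,ECX=01H):EDX[19]; version in CPUID.(EAX=24H,ECX=00H):EBX[7:0]; 256-bit and 512-bit vector support in EBX[17] and EBX[18]); later revisions of the specification may redefine the vector-length bits. OS support for the extended register state (XGETBV) also has to be checked in practice and is not covered here.

```cpp
#include <cstdint>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>  // __get_cpuid_count (GCC/Clang only)
#define SKETCH_HAVE_CPUID 1
#else
#define SKETCH_HAVE_CPUID 0
#endif

// Hypothetical result type for this sketch; not part of Highway's API.
struct Avx10Info {
  bool supported;         // assumed: CPUID.(EAX=07H,ECX=01H):EDX[19]
  std::uint32_t version;  // assumed: CPUID.(EAX=24H,ECX=00H):EBX[7:0]
  bool has_256;           // assumed: EBX[17]
  bool has_512;           // assumed: EBX[18]
};

// Pure decoding of the two raw register values, so it can be checked
// without AVX10 hardware. If the AVX10 feature bit is clear, leaf 0x24 is
// not meaningful and everything is reported as unsupported.
constexpr Avx10Info DecodeAvx10(std::uint32_t leaf7_sub1_edx,
                                std::uint32_t leaf24_sub0_ebx) {
  return ((leaf7_sub1_edx >> 19) & 1u)
             ? Avx10Info{true, leaf24_sub0_ebx & 0xFFu,
                         ((leaf24_sub0_ebx >> 17) & 1u) != 0,
                         ((leaf24_sub0_ebx >> 18) & 1u) != 0}
             : Avx10Info{false, 0u, false, false};
}

// AVX10.2 with 256-bit vectors (corresponds conceptually to HWY_AVX10_2).
constexpr bool SupportsAvx10_2_256(const Avx10Info& info) {
  return info.supported && info.version >= 2 && info.has_256;
}

// AVX10.2 with 512-bit vectors (corresponds conceptually to HWY_AVX10_2_512).
constexpr bool SupportsAvx10_2_512(const Avx10Info& info) {
  return info.supported && info.version >= 2 && info.has_512;
}

#if SKETCH_HAVE_CPUID
// Reads the two CPUID leaves on the current CPU. __get_cpuid_count returns 0
// when the requested leaf is above the maximum supported leaf.
inline Avx10Info QueryAvx10() {
  unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
  if (!__get_cpuid_count(7, 1, &eax, &ebx, &ecx, &edx)) {
    return Avx10Info{false, 0u, false, false};
  }
  const std::uint32_t leaf7_edx = edx;
  if (((leaf7_edx >> 19) & 1u) == 0) {
    return Avx10Info{false, 0u, false, false};
  }
  if (!__get_cpuid_count(0x24, 0, &eax, &ebx, &ecx, &edx)) {
    return Avx10Info{false, 0u, false, false};
  }
  return DecodeAvx10(leaf7_edx, ebx);
}
#endif  // SKETCH_HAVE_CPUID
```

Keeping the decoding separate from the CPUID read means the bit logic can be exercised with synthetic register values; on GCC or Clang for x86, SupportsAvx10_2_512(QueryAvx10()) would then report the result for the running CPU under the assumptions above.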
Also added a new hwy/ops/x86_avx3-inl.h header and moved into it some of the AVX3/AVX10-specific ops that depend on hwy/ops/x86_512-inl.h when HWY_MAX_BYTES == 64.
Also moved some of the AVX3/AVX3_DL-specific ops that operate on 256-bit or smaller vectors into hwy/ops/x86_128-inl.h and hwy/ops/x86_256-inl.h to support AVX10.2 targets that do not support 512-bit vectors.
Also refactored some of the AVX3 target macros in hwy/ops/set_macros-inl.h as follows:
- … -mevex512 option, and is otherwise defined as an empty macro
- … -mevex512 option)
- … -mevex512 option)
- … -mevex512 option)

To compile and run the Google Highway unit tests for the HWY_AVX10_2 and HWY_AVX10_2_512 targets, Clang 19 or later and Intel SDE 9.44 or later are needed.
There are some issues compiling the HWY_AVX10_2 and HWY_AVX10_2_512 targets with Clang 18 and GCC 14, even though both compilers support the -mno-evex512 option. These include a compiler crash when compiling matvec_test.cc for the HWY_AVX10_2 target with Clang 18, and compiler warnings emitted by GCC 14 when casting an int16_t to __bf16 when compiling with -march=sapphirerapids.

The HWY_AVX10_2 and HWY_AVX10_2_512 targets are not included in HWY_ATTAINABLE_TARGETS_X86 by default due to compiler issues with GCC or Clang 18.