avx512_int32.hpp uses _mm512_mask_i64scatter_epi32 incorrectly #1294

jeffhammond · 2022-07-12T06:00:40Z

You are passing a 512-bit register when a 256-bit register is required.

void _mm512_mask_i64scatter_epi32 (void* base_addr, 
                                   __mmask8 k,
                                   __m512i vindex,
                                   __m256i a,        // <- NOTICE THIS ONE
                                   int scale)

(from docs)
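
To make the width mismatch concrete, here is a minimal scalar sketch of what the signature above implies. The function name `model_mask_i64scatter_epi32` and the array-based parameters are illustrative assumptions, not Intel's or RAJA's code. Eight 64-bit indices fill the 512-bit `vindex`, so only eight 32-bit values (256 bits) are written per call.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Scalar model of the scatter (illustrative only, not Intel's implementation).
// vindex holds 8 x 64-bit indices (512 bits); a holds 8 x 32-bit values (256 bits).
inline void model_mask_i64scatter_epi32(void* base_addr,
                                        std::uint8_t k,
                                        const std::int64_t (&vindex)[8],
                                        const std::int32_t (&a)[8],
                                        int scale)
{
  // The intrinsic only accepts these byte scales.
  assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);

  auto* base = static_cast<unsigned char*>(base_addr);
  for (int lane = 0; lane < 8; ++lane) {
    // A lane is stored only if its mask bit is set.
    if (k & (1u << lane)) {
      std::memcpy(base + vindex[lane] * scale, &a[lane], sizeof(std::int32_t));
    }
  }
}
```

With 16 `int32` lanes held in a `__m512i`, a single call of this form can cover at most half of the register, which is why the fourth argument is a `__m256i`.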

It is used incorrectly in policy/tensor/arch/avx512/avx512_int32.hpp here:

      /*!
       * @brief Store partial register to consecutive memory locations
       *
       */
      RAJA_INLINE
      self_type const &store_strided_n(element_type *ptr, camp::idx_t stride, camp::idx_t N) const{
        // AVX512F
        _mm512_mask_i64scatter_epi32(ptr,
                                     createMask(N),
                                     createStridedOffsets(stride),
                                     m_value,        // <- NOTICE THIS ONE
                                     sizeof(element_type));
        return *this;
      }

You declare the argument here:

    public:
      using register_type = __m512i;
    private:
      register_type m_value;

This bug was found by NVC++, which refuses to compile this code.

jeffhammond · 2022-07-12T06:01:51Z

Also, please expand tabs in this file. Not all of us use tab = 2 spaces, and this file is unreadable with the defaults of some editors.

ajkunen · 2022-07-12T15:54:35Z

@jeffhammond Thanks for the bug report! We'll take care of both of those issues.

eoseret · 2022-08-10T11:41:26Z

Hi,
If I understand correctly, the fix is to do what the gathers already do (those compile correctly): replace i64 with i32 in the intrinsic name (i32 is the size of the index elements), and then compilation succeeds.
There are 4 occurrences: store_strided and store_strided_n in both avx512_float.hpp and avx512_int32.hpp.
A Git patch is attached; feel free to look at it.
raja_AVX512_scatter_error_git_patch.txt
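
The i32 variant suggested above can be sketched as follows. This is a hypothetical free function, `store_strided_n_sketch`, written for illustration; it is not the attached patch and not RAJA's actual code, and it builds the mask and offsets inline rather than through RAJA's `createMask` and `createStridedOffsets`. It assumes `_mm512_mask_i32scatter_epi32`, which takes a 16-bit mask, 16 32-bit indices, and a full `__m512i` of values. A scalar fallback is used when the file is not compiled with AVX512F.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

// Hypothetical stand-in for store_strided_n (not RAJA's code).
// Writes the first n of 16 int32 lanes to ptr[0], ptr[stride], ptr[2*stride], ...
inline void store_strided_n_sketch(std::int32_t* ptr,
                                   const std::int32_t (&lanes)[16],
                                   std::int32_t stride,
                                   int n)
{
  assert(n >= 0 && n <= 16);  // precondition of this sketch
#if defined(__AVX512F__)
  const __m512i value = _mm512_loadu_si512(lanes);
  // 16 x 32-bit element offsets: 0, stride, 2*stride, ...
  const __m512i lane_ids = _mm512_set_epi32(15, 14, 13, 12, 11, 10, 9, 8,
                                            7, 6, 5, 4, 3, 2, 1, 0);
  const __m512i offsets = _mm512_mullo_epi32(lane_ids, _mm512_set1_epi32(stride));
  // One mask bit per 32-bit lane; the low n bits are set.
  const __mmask16 mask =
      (n == 16) ? static_cast<__mmask16>(0xFFFF)
                : static_cast<__mmask16>((1u << n) - 1u);
  // Index and value registers are both 512 bits wide, so the types match.
  _mm512_mask_i32scatter_epi32(ptr, mask, offsets, value, sizeof(std::int32_t));
#else
  // Scalar fallback with the same result.
  for (int i = 0; i < n; ++i) {
    ptr[i * stride] = lanes[i];
  }
#endif
}
```

One trade-off of this approach: with 32-bit indices, every offset `lane * stride` has to fit in a signed 32-bit integer, which is narrower than the `camp::idx_t` stride in the original signature.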

rchen20 · 2023-05-16T22:11:41Z

Closing this issue because it is fixed in #1339.

ajkunen self-assigned this Jul 12, 2022

ajkunen added bug compilation vectorization API labels Jul 12, 2022

rchen20 self-assigned this Aug 11, 2022

rchen20 closed this as completed May 16, 2023
