Build fails on GCC 5.4 with "invalid register operand for `vmovdqu'" #58

erijo · 2020-02-10T22:23:35Z

When trying to build the C implementation of BLAKE3 on Ubuntu 16.04 LTS (used by Travis) the build fails when compiling blake3_avx512.c:

    gcc -O3 -Wall -Wextra -std=c11 -pedantic  -c blake3_avx512.c -o blake3_avx512.o -mavx512f -mavx512vl
    /tmp/ccaq5stm.s: Assembler messages:
    /tmp/ccaq5stm.s:3763: Error: invalid register operand for `vmovdqu'
    /tmp/ccaq5stm.s:3765: Error: invalid register operand for `vmovdqu'
    Makefile:40: recipe for target 'blake3_avx512.o' failed
    make: *** [blake3_avx512.o] Error 1

Looking at the generated assembly, GCC emits vmovdqu %ymm17, (%rax), which as far as I can tell is an invalid instruction: vmovdqu is VEX-encoded rather than EVEX-encoded and can thus only access ymm0-ymm15. So it looks like a compiler bug. However, if I add -mavx512bw, GCC instead emits vmovdqu8 %ymm17, (%rax), which assembles.

I assume that get_cpu_features() would need to be updated if -mavx512bw is to be used, but other than that, is there any downside to using it?
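
To make the question concrete, here is a minimal sketch of what the extra runtime check might look like. It is not BLAKE3's actual get_cpu_features(); the helper name avx512_path_usable and its parameters are made up for illustration. It only decodes the EBX value returned by CPUID leaf 7, subleaf 0, where bit 16 is AVX512F, bit 30 is AVX512BW and bit 31 is AVX512VL. A real check would also have to confirm via XGETBV that the OS saves the AVX-512 register state, which is left out here.

```c
#include <stdint.h>

/* Feature bits in EBX of CPUID leaf 7, subleaf 0. */
#define CPUID7_EBX_AVX512F  (UINT32_C(1) << 16)
#define CPUID7_EBX_AVX512BW (UINT32_C(1) << 30)
#define CPUID7_EBX_AVX512VL (UINT32_C(1) << 31)

/* Hypothetical helper (assumption, not part of BLAKE3): returns 1 if the
 * AVX-512 code path may be used on a CPU reporting `cpuid7_ebx`.
 * `built_with_avx512bw` is nonzero when the object file was compiled with
 * -mavx512bw, in which case the CPU must also report AVX512BW. */
static int avx512_path_usable(uint32_t cpuid7_ebx, int built_with_avx512bw) {
  uint32_t required = CPUID7_EBX_AVX512F | CPUID7_EBX_AVX512VL;
  if (built_with_avx512bw) {
    required |= CPUID7_EBX_AVX512BW;
  }
  return (cpuid7_ebx & required) == required;
}
```

The downside this illustrates: a CPU that reports AVX512F and AVX512VL but not AVX512BW would have to fall back to a narrower implementation once the file is built with -mavx512bw.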

This error is seen in the CI of ccache, see ccache/ccache#519.


sneves · 2020-02-10T23:58:53Z

This is indeed a bug in GCC, present up to 6.4. Another way to fix it, without invoking newer instruction sets, is to replace the stores here by:

    _mm256_mask_storeu_epi32(&out[0 * sizeof(__m256i)], (__mmask8)-1, _mm512_castsi512_si256(padded[0]));
    _mm256_mask_storeu_epi32(&out[1 * sizeof(__m256i)], (__mmask8)-1, _mm512_castsi512_si256(padded[1]));
    _mm256_mask_storeu_epi32(&out[2 * sizeof(__m256i)], (__mmask8)-1, _mm512_castsi512_si256(padded[2]));
    _mm256_mask_storeu_epi32(&out[3 * sizeof(__m256i)], (__mmask8)-1, _mm512_castsi512_si256(padded[3]));
    _mm256_mask_storeu_epi32(&out[4 * sizeof(__m256i)], (__mmask8)-1, _mm512_castsi512_si256(padded[4]));
    _mm256_mask_storeu_epi32(&out[5 * sizeof(__m256i)], (__mmask8)-1, _mm512_castsi512_si256(padded[5]));
    _mm256_mask_storeu_epi32(&out[6 * sizeof(__m256i)], (__mmask8)-1, _mm512_castsi512_si256(padded[6]));
    _mm256_mask_storeu_epi32(&out[7 * sizeof(__m256i)], (__mmask8)-1, _mm512_castsi512_si256(padded[7]));
    _mm256_mask_storeu_epi32(&out[8 * sizeof(__m256i)], (__mmask8)-1, _mm512_castsi512_si256(padded[8]));
    _mm256_mask_storeu_epi32(&out[9 * sizeof(__m256i)], (__mmask8)-1, _mm512_castsi512_si256(padded[9]));
    _mm256_mask_storeu_epi32(&out[10 * sizeof(__m256i)], (__mmask8)-1, _mm512_castsi512_si256(padded[10]));
    _mm256_mask_storeu_epi32(&out[11 * sizeof(__m256i)], (__mmask8)-1, _mm512_castsi512_si256(padded[11]));
    _mm256_mask_storeu_epi32(&out[12 * sizeof(__m256i)], (__mmask8)-1, _mm512_castsi512_si256(padded[12]));
    _mm256_mask_storeu_epi32(&out[13 * sizeof(__m256i)], (__mmask8)-1, _mm512_castsi512_si256(padded[13]));
    _mm256_mask_storeu_epi32(&out[14 * sizeof(__m256i)], (__mmask8)-1, _mm512_castsi512_si256(padded[14]));
    _mm256_mask_storeu_epi32(&out[15 * sizeof(__m256i)], (__mmask8)-1, _mm512_castsi512_si256(padded[15]));
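
For readers unfamiliar with masked stores: the mask (__mmask8)-1 has all eight bits set, so every 32-bit lane is written and the result is the same as a plain 256-bit store. The snippet below is a scalar model of that behaviour in portable C, not the intrinsic itself; the names model_mask_storeu_epi32 and mask_is_trivial are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model (illustrative only) of an 8-lane, 32-bit masked store:
 * lane i of `src` is copied to `dst` only if bit i of `mask` is set.
 * memcpy is used so that `dst` may be unaligned, like the "storeu" forms. */
static void model_mask_storeu_epi32(void *dst, uint8_t mask,
                                    const uint32_t src[8]) {
  uint8_t *p = (uint8_t *)dst;
  for (int i = 0; i < 8; i++) {
    if ((mask >> i) & 1) {
      memcpy(p + 4 * (size_t)i, &src[i], sizeof(uint32_t));
    }
  }
}

/* A mask is "trivial" when it enables every lane, i.e. equals (uint8_t)-1. */
static int mask_is_trivial(uint8_t mask) {
  return mask == (uint8_t)-1;
}
```

With a trivial mask the model writes all 32 bytes; with any other mask the lanes whose bit is clear keep their previous contents.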

dlegaultbbry · 2020-02-11T12:39:08Z

In case it helps anyone, I had something similar happen which I think is this bug:

https://www.mail-archive.com/bug-binutils@gnu.org/msg30569.html
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=97ed31ae00ea83410f9daf61ece8a606044af365

In any case, using the -mavx512bw option also resolved my issue until I can update the toolchain we use.

    /tmp/ccBsfqcL.s: Assembler messages:
    /tmp/ccBsfqcL.s:49783: Error: unsupported instruction `vmovdqu'
    /tmp/ccBsfqcL.s:49819: Error: unsupported instruction `vmovdqu'
    /tmp/ccBsfqcL.s:49871: Error: unsupported instruction `vmovdqu'
    /tmp/ccBsfqcL.s:49878: Error: unsupported instruction `vmovdqu'
    /tmp/ccBsfqcL.s:49885: Error: unsupported instruction `vmovdqu'

erijo · 2020-02-12T20:48:53Z

The new assembly implementations added in b6b3c27 also seem to work.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85328 Fixes #58.

…n CentOS 8 GCC 8.3.1 20191121 (Red Hat 8.3.1-5) was erroring out with: Error: unsupported instruction `vmovdqu'. BLAKE3-team/BLAKE3#58 advised that this was previously caused by a compiler bug up to 6.3, but that clearly isn't applicable here. However, the fix/workaround posted in comment BLAKE3-team/BLAKE3#58 (comment) works.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85328 Fixes BLAKE3-team#58.

erijo mentioned this issue Feb 12, 2020

Build fails with clang 7: invalid operand for instruction #60

Closed

sneves added a commit that referenced this issue Feb 13, 2020

Work around GCC bug 85328 by forcing trivially masked stores.

63937c2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85328 Fixes #58.

sneves mentioned this issue Feb 13, 2020

Work around compiler bugs #62

Merged

oconnor663 closed this as completed in #62 Feb 13, 2020

oconnor663 closed this as completed in 207915a Feb 13, 2020

kevingoh pushed a commit to ITS-AT-dev/BLAKE3 that referenced this issue Oct 23, 2023

Work around GCC bug 85328 by forcing trivially masked stores.

6593f82

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85328 Fixes BLAKE3-team#58.
