Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simd setup: amd64 target doesn't account for AVX support only #7222

Closed
BGazotti opened this issue Mar 14, 2024 · 8 comments
Closed

simd setup: amd64 target doesn't account for AVX support only #7222

BGazotti opened this issue Mar 14, 2024 · 8 comments
Labels

Comments

@BGazotti
Copy link

Building kitty 0.33.0 on a FX-6300 CPU (which supports AVX but not AVX2) fails.

Error message:

Compiling kitty/simd-string-256.c ...
x86_64-pc-linux-gnu-gcc -MMD -DNDEBUG -DHAS_COPY_FILE_RANGE -march=bdver2 -pipe -O2 -ftree-vectorize -fsched-pressure -DPRIMARY_VE
RSION=4000 -DSECONDARY_VERSION=33 -DXT_VERSION="0.33.0" -Wextra -Wfloat-conversion -Wno-missing-field-initializers -Wall -Wstrict-
prototypes -std=c11 -fwrapv -fstack-protector-strong -pipe -fvisibility=hidden -D_FORTIFY_SOURCE=2 -fno-plt -fPIC -march=bdver2 -p
ipe -O2 -ftree-vectorize -fsched-pressure -fcf-protection=full -pthread -I/usr/include/libpng16 -I/usr/include/freetype2 -I/usr/in
clude/harfbuzz -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -I/usr/include/harfbuzz -I/usr/include/freetype2 -I/usr/inclu
de/glib-2.0 -I/usr/lib64/glib-2.0/include -I/usr/include/python3.12 -mavx2 -mno-vzeroupper -c kitty/simd-string-256.c -o build/fas
t_data_types-kitty-simd-string-256.c.o
In file included from /usr/include/simde/arm/neon/dot.h:38,
                 from /usr/include/simde/arm/neon.h:71,
                 from kitty/simd-string-impl.h:36,
                 from kitty/simd-string-256.c:9:
/usr/include/simde/arm/neon/paddl.h: In function ‘simde_vpaddlq_u16’:
/usr/include/simde/arm/neon/paddl.h:289:9: error: ‘simde_uint32x4_private’ has no member named ‘
sse_m128i’
  289 |       r_.sse_m128i = _mm_haddd_epu16(a_.sse_m128i);
      |         ^
/usr/include/simde/arm/neon/paddl.h:289:40: error: ‘simde_uint16x8_private’ has no member named ‘
sse_m128i’
  289 |       r_.sse_m128i = _mm_haddd_epu16(a_.sse_m128i);

Editing setup.py to remove the assumption that amd64 arches have AVX2, falling back to AVX instead, and removing the #include <simde/arm/neon.h> in simd-string-impl.hseem to "fix" it - as in allow the build to finish.

Should I just stick to SSE? If so, how to specify that at configure time?

Thanks!

@BGazotti BGazotti added the bug label Mar 14, 2024
@kovidgoyal
Copy link
Owner

I am confused, the CPU your are using is irrelevant here, this is an
error during build not run. SIMD variants are chosen at runtime not
buildtime. At build time both SSE4.2, and AVX2 code is built which one
is used is determined at runtime. That error comes from a simde include,
so likely some issue between your build toolchain and simde.

@ionenwks
Copy link

ionenwks commented Mar 14, 2024

Sounds like a simde bug to me, it fails when -mxop is enabled (which is done with -march=bdver2, or -march=x86-64-v2 -mxop) -- Edit: or perhaps due to something else xop enables, haven't looked too closely

simde-features.h ends up hitting some different path and I think(?) it ends up never setting SIMDE_X86_SSE2_NATIVE which leads to m128i being missing (in simde/arm/neon/types.h):

  #if defined(SIMDE_X86_SSE2_NATIVE)
    __m128i m128i;
  #endif

@ionenwks
Copy link

fwiw CPPFLAGS="-DSIMDE_X86_XOP_NO_NATIVE=1" seems to work as a workaround with -march=x86-64-v2 -mxop (haven't tried bdver2 given that makes me hit illegal instructions).

@kovidgoyal
Copy link
Owner

@BGazotti maybe just remove the -march=bdver2 from your cflags and you should be fine. Do you have any idea where thats coming from? And of course report the issue to simde and hopefully they will fix it.

@ionenwks
Copy link

@BGazotti maybe just remove the -march=bdver2 from your cflags and you should be fine. Do you have any idea where thats coming from? And of course report the issue to simde and hopefully they will fix it.

I came here because I got a downstream bug report on Gentoo, but then they closed it (after seeing the unrelated -mavx2) and came here before I could reply, and most users on Gentoo will do -march=native or =something by default for their entire system.

That aside I applied a workaround for Gentoo, ideally should be looked at in simde though.

@mr-c
Copy link

mr-c commented Mar 25, 2024

@BGazotti Hello, I'm the SIMDe maintainer and I came across this report.

I have a fix at simd-everywhere/simde#1154 but I was unable to test it using qemu 8.2 ; can you confirm that it works for you? Using -march=native and/or -march=bdver2; thanks!

@ionenwks
Copy link

ionenwks commented Mar 25, 2024

I have a fix at simd-everywhere/simde#1154 but I was unable to test it using qemu 8.2 ; can you confirm that it works for you? Using -march=native and/or -march=bdver2; thanks!

From my end I was able to reproduce with -march=x86-64-v2 -mxop (bdver2 gave illegal instructions at build time without qemu) and can confirm that the PR applied to 0.8.0 fixes it for that scenario, thanks. I could apply it to our Gentoo package until next release.

@ionenwks
Copy link

...not that i can confirm that the resulting kitty binary works fine given that gives me illegal instructions without qemu too (I do not have xop, so not that's not surprising). But I'm not too worried there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants