Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replace configure's ARCH... with Q_PROCESSOR_... v2 #487

Merged
merged 10 commits into from Feb 18, 2022

Conversation

ulmus-scott
Copy link
Contributor

Was in #470

Separated from other changes.

HAVE_INTRINSICS_NEON

check_code cc arm_neon.h "int16x8_t test = vdupq_n_s16(0)" && enable intrinsics_neon

with 16 registers, instead of the original 8 for the x86 extension.

I have not investigated whether or not the extra registers are used.
It's only used once, so it could be a function static.
Q_PROCESSOR_ARM_64 is undocumented, but has existed since 2013, so I have duplicated
the checks for GCC/clang and MSVC.
This should be equivalent to the check from configure.

x86_64 assumes <emmintrin.h> exists; however, I'm not sure if it would also be acceptable
to assume <arm_neon.h> exists when compiling for ARM.
@linuxdude42 linuxdude42 merged commit b074c2a into MythTV:master Feb 18, 2022
@ulmus-scott ulmus-scott deleted the ARCHv2 branch February 18, 2022 21:38
@linuxdude42
Copy link
Contributor

Commit 93de0f8 with the HAVE_INTRINSICS_NEON change is causing build failures on an rpi3. I'm seeing the following errors:

In file included from mythdeinterlacer.cpp:25:
/usr/lib/gcc/arm-linux-gnueabihf/10/include/arm_neon.h: In function ‘void BlendSIMD8x4(unsigned char*, int, int, int, int, unsigned char*, int, bool)’:
/usr/lib/gcc/arm-linux-gnueabihf/10/include/arm_neon.h:903:1: error: inlining failed in call to ‘always_inline’ ‘uint16x8_t vrhaddq_u16(uint16x8_t, uint16x8_t)’: target specific option mismatch
  903 | vrhaddq_u16 (uint16x8_t __a, uint16x8_t __b)
      | ^~~~~~~~~~~
mythdeinterlacer.cpp:617:32: note: called from here
  617 |                     vrhaddq_u16(*reinterpret_cast<uint16x8_t*>(&below[col]), mid);
      |                     ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from mythdeinterlacer.cpp:25:
/usr/lib/gcc/arm-linux-gnueabihf/10/include/arm_neon.h:903:1: error: inlining failed in call to ‘always_inline’ ‘uint16x8_t vrhaddq_u16(uint16x8_t, uint16x8_t)’: target specific option mismatch
  903 | vrhaddq_u16 (uint16x8_t __a, uint16x8_t __b)
      | ^~~~~~~~~~~
mythdeinterlacer.cpp:615:32: note: called from here
  615 |                     vrhaddq_u16(*reinterpret_cast<uint16x8_t*>(&above[col]), mid);
      |                     ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
make[2]: *** [Makefile:18542: obj/mythdeinterlacer.o] Error 1

This change doesn't have the additional compilation test that configure does on line 5568, which is apparently necessary.

@garybuhrmaster
Copy link
Contributor

Commit 93de0f8 with the HAVE_INTRINSICS_NEON change is causing build failures on an rpi3.

And was first reported failed by the armv7 buildbot worker on irc on 2022-02-18.

@ulmus-scott
Copy link
Contributor Author

The builder @garybuhrmaster referenced uses -march=armv7-a.

GCC may want -march=armv7-a+simd or -mfpu=neon. It won't auto-vectorize NEON, so should be safe.
https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html#ARM-Options

The used intrinsic is valid: https://developer.arm.com/architectures/instruction-sets/intrinsics/#q=vrhaddq_u16

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants