Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(simd.h): Address NEON issues #4143

Merged
merged 1 commit into from
Feb 8, 2024

Conversation

lgritz
Copy link
Collaborator

@lgritz lgritz commented Feb 7, 2024

Primarily, recent changes (PR #4071) to vint4::store for the NEON case appear to have some type mismatches, which apple clang on ARM-based Mac (including our CI) seems ok with, but which is generating type errors on other ARM Linux platforms.

I think the types were weird here, so I tightened it up to get the types right for temporary variables in that function. That's the primary fix here.

Secondarily, I modified simd.h and the CMake setup so that build option USE_SIMD=0 will disable NEON in the same way that it disables SSE. (I realized that USE_SIMD=0 was not disabling NEON, so there was no way for a NEON platform to completely disable SIMD if they needed to.)

Fixes #4111

Primarily, recent changes (PR 4071) to vint4::store for the NEON case
appear to have some type mismatches, which apple clang on ARM-based
Mac (including our CI) seems ok with, but which is generating type
errors on other ARM Linux platforms.

I think the types were weird here, so I tightened it up to get the
types right for temporary variables in that function. That's the
primary fix here.

Secondarily, I modfied simd.h and the CMake setup so that build option
USE_SIMD=0 will disable NEON in the same way that it disables SSE. (I
realized that USE_SIMD=0 was not disabling NEON, so there was no way
for a NEON platform to completely disable SIMD if they needed to.)

Signed-off-by: Larry Gritz <lg@larrygritz.com>
@lgritz
Copy link
Collaborator Author

lgritz commented Feb 7, 2024

Fixes #4111

@lgritz
Copy link
Collaborator Author

lgritz commented Feb 7, 2024

@aras-p Could you take a look and see what you think? This is addressing code you recently added. I think that functionally your code was doing the right operations (and I believe it is tested by simd_test in the testsuite), but it seems that some compilers accepted your odd typing, whereas other compilers seem distressed by the implied bitcasts and wanted the right neon intrinsics to make that explicit.

@lgritz lgritz merged commit 94f1c49 into AcademySoftwareFoundation:master Feb 8, 2024
24 of 25 checks passed
lgritz added a commit to lgritz/OpenImageIO that referenced this pull request Feb 8, 2024
Primarily, recent changes (PR AcademySoftwareFoundation#4071) to vint4::store for the NEON case
appear to have some type mismatches, which apple clang on ARM-based Mac
(including our CI) seems ok with, but which is generating type errors on
other ARM Linux platforms.

I think the types were weird here, so I tightened it up to get the types
right for temporary variables in that function. That's the primary fix
here.

Secondarily, I modified simd.h and the CMake setup so that build option
USE_SIMD=0 will disable NEON in the same way that it disables SSE. (I
realized that USE_SIMD=0 was not disabling NEON, so there was no way for
a NEON platform to completely disable SIMD if they needed to.)

Fixes AcademySoftwareFoundation#4111

Signed-off-by: Larry Gritz <lg@larrygritz.com>
@aras-p
Copy link
Contributor

aras-p commented Feb 8, 2024

Right, sorry about that! Clang is indeed more relaxed wrt NEON intrinsics compared to Gcc or Msvc. I should have checked on those. (but also, I guess, would be useful to have those compilers for Arm in OIIO CI?)

@lgritz lgritz deleted the lg-noneon branch February 10, 2024 16:27
1div0 pushed a commit to 1div0/OpenImageIO that referenced this pull request Feb 24, 2024
Primarily, recent changes (PR AcademySoftwareFoundation#4071) to vint4::store for the NEON case
appear to have some type mismatches, which apple clang on ARM-based Mac
(including our CI) seems ok with, but which is generating type errors on
other ARM Linux platforms.

I think the types were weird here, so I tightened it up to get the types
right for temporary variables in that function. That's the primary fix
here.

Secondarily, I modified simd.h and the CMake setup so that build option
USE_SIMD=0 will disable NEON in the same way that it disables SSE. (I
realized that USE_SIMD=0 was not disabling NEON, so there was no way for
a NEON platform to completely disable SIMD if they needed to.)

Fixes AcademySoftwareFoundation#4111

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Peter Kovář <peter.kovar@reflexion.tv>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] OIIO 2.5.7.0 fails to build on Fedora only on aarch64
2 participants