Fix AVX-512 round function #4119

Merged
merged 1 commit into from
Jan 20, 2024

Conversation

AngryLoki
Contributor

Description

This PR fixes the vfloat16 round function. The intrinsic _mm512_roundscale_ps
was used incorrectly, which caused a test failure on a Zen4 CPU.

/var/tmp/portage/media-libs/openimageio-2.5.5.0-r1/work/OpenImageIO-2.5.5.0/src/libutil/simd_test.cpp:1579:
FAILED: round(F) == mkvec<VEC>(std::round(F[0]), std::round(F[1]), std::round(F[2]), std::round(F[3]))
	values were '-1.5 0 1.5 4 -1.5 0 1.5 4 -1.5 0 1.5 4 -1.5 0 1.5 4' and '-2 0 2 4 -2 0 2 4 -2 0 2 4 -2 0 2 4'

In the old code, _mm512_roundscale_ps (a, (1<<4) | 3) meant the following:

[0001] - Number of fraction bits to preserve (M = 1)
[0] - Use MXCSR exception mask
[0] - Select rounding mode from imm
[11] - Truncate mode

This effectively truncated each value to a multiple of 0.5 instead of rounding it to an integer.
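To make the bit layout concrete, here is a minimal scalar sketch of how the immediate is decoded and what the operation computes per lane. The names `RoundScaleImm`, `decode_imm`, and `roundscale_scalar` are hypothetical helpers for illustration, not part of OpenImageIO or the intrinsics API. The sketch ignores exception behavior, the MXCSR rounding-mode selection, and special values, and it assumes the default floating-point environment (round to nearest even).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Fields of the 8-bit immediate taken by _mm512_roundscale_ps (VRNDSCALEPS).
// Hypothetical helper type, for illustration only.
struct RoundScaleImm {
    unsigned fraction_bits;   // imm8[7:4]: M, number of fraction bits to keep
    bool suppress_precision;  // imm8[3]: 1 = suppress the precision exception
    bool use_mxcsr_rounding;  // imm8[2]: 1 = take rounding mode from MXCSR.RC
    unsigned rounding_mode;   // imm8[1:0]: 0 nearest-even, 1 down, 2 up, 3 truncate
};

constexpr RoundScaleImm decode_imm(std::uint8_t imm) {
    return { unsigned(imm >> 4) & 0xFu, (imm & 0x8) != 0, (imm & 0x4) != 0,
             unsigned(imm & 0x3) };
}

// Scalar model of one lane: result = 2^-M * round_to_integer(2^M * x).
// Assumption: finite inputs and a default FP environment.
float roundscale_scalar(float x, std::uint8_t imm) {
    const RoundScaleImm f = decode_imm(imm);
    assert(!f.use_mxcsr_rounding && "sketch does not model MXCSR.RC selection");
    const float scale = std::ldexp(1.0f, int(f.fraction_bits));  // 2^M
    const float scaled = x * scale;
    float r;
    switch (f.rounding_mode) {
    case 0: r = std::nearbyint(scaled); break;  // nearest even in default mode
    case 1: r = std::floor(scaled); break;      // toward -infinity
    case 2: r = std::ceil(scaled); break;       // toward +infinity
    default: r = std::trunc(scaled); break;     // toward zero
    }
    return r / scale;
}
```

With the immediate `(1<<4) | 3`, `roundscale_scalar` leaves -1.5, 0, 1.5 and 4 unchanged, which matches the left-hand values in the failing test. With an immediate of 0 (M = 0, round to nearest even), the same inputs become -2, 0, 2 and 4, which matches the expected values. An immediate of 0 is only one plausible choice here, not necessarily what the merged commit uses. Round to nearest even also differs from `std::round` at ties such as 0.5 and 2.5, a difference these test values do not expose.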

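References:

  • https://www.felixcloutier.com/x86/vrndscalepd#fig-5-29
  • https://stackoverflow.com/questions/50854991/instrinsic-mm512-round-ps-is-missing-for-avx512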

Tests

  • This fixes test_simd

Checklist:

  • I have read the contribution guidelines.
  • I have updated the documentation, if applicable.
  • I have ensured that the change is tested somewhere in the testsuite
    (adding new test cases if necessary).
  • If I added or modified a C++ API call, I have also amended the
    corresponding Python bindings (and if altering ImageBufAlgo functions, also
    exposed the new functionality as oiiotool options).
  • My code follows the prevailing code style of this project. If I haven't
    already run clang-format before submitting, I definitely will look at the CI
    test that runs clang-format and fix anything that it highlights as being
    nonconforming.

linux-foundation-easycla bot commented Jan 19, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

AngryLoki added a commit to AngryLoki/gentoo that referenced this pull request Jan 19, 2024
Signed-off-by: Sv. Lockal <lockalsash@gmail.com>
Collaborator

@lgritz lgritz left a comment

This looks right to me, thanks for the fix!

@lgritz lgritz merged commit b850a07 into AcademySoftwareFoundation:master Jan 20, 2024
25 checks passed
lgritz pushed a commit to lgritz/OpenImageIO that referenced this pull request Jan 21, 2024
This PR fixes vfloat16 round function. Intrinsic `_mm512_roundscale_ps`
was used incorrectly, and caused failure on Zen4 CPU.

```
/var/tmp/portage/media-libs/openimageio-2.5.5.0-r1/work/OpenImageIO-2.5.5.0/src/libutil/simd_test.cpp:1579:
FAILED: round(F) == mkvec<VEC>(std::round(F[0]), std::round(F[1]), std::round(F[2]), std::round(F[3]))
	values were '-1.5 0 1.5 4 -1.5 0 1.5 4 -1.5 0 1.5 4 -1.5 0 1.5 4' and '-2 0 2 4 -2 0 2 4 -2 0 2 4 -2 0 2 4'
```

In old code `_mm512_roundscale_ps (a, (1<<4) | 3)` meant the following:
```
[0001] - Number of fixed points to preserve
[0] - Use MXCSR exception mask
[0] - Select mode from imm
[11] - Truncate mode
```
Effectively enabling rounding to nearest 0.5, not to integer.

References:
* https://www.felixcloutier.com/x86/vrndscalepd#fig-5-29
* https://stackoverflow.com/questions/50854991/instrinsic-mm512-round-ps-is-missing-for-avx512


Signed-off-by: Sv. Lockal <lockalsash@gmail.com>
DarkDefender pushed a commit to DarkDefender/gentoo that referenced this pull request Jan 26, 2024
DarkDefender pushed a commit to DarkDefender/gentoo that referenced this pull request Jan 26, 2024
1div0 pushed a commit to 1div0/OpenImageIO that referenced this pull request Feb 24, 2024
Signed-off-by: Sv. Lockal <lockalsash@gmail.com>
Signed-off-by: Peter Kovář <peter.kovar@reflexion.tv>