Fix AVX-512 round function #4119

AngryLoki · 2024-01-19T15:44:58Z

Description

This PR fixes the vfloat16 round function. The intrinsic _mm512_roundscale_ps
was called with the wrong immediate, which caused a test failure on a Zen 4 CPU.

/var/tmp/portage/media-libs/openimageio-2.5.5.0-r1/work/OpenImageIO-2.5.5.0/src/libutil/simd_test.cpp:1579:
FAILED: round(F) == mkvec<VEC>(std::round(F[0]), std::round(F[1]), std::round(F[2]), std::round(F[3]))
	values were '-1.5 0 1.5 4 -1.5 0 1.5 4 -1.5 0 1.5 4 -1.5 0 1.5 4' and '-2 0 2 4 -2 0 2 4 -2 0 2 4 -2 0 2 4'

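In the old code, _mm512_roundscale_ps (a, (1<<4) | 3) passed the immediate 0b00010011, which decodes as follows:

[0001] - imm8[7:4]: number of fraction bits to preserve (M = 1)
[0] - imm8[3]: use the MXCSR exception mask (precision exception not suppressed)
[0] - imm8[2]: select the rounding mode from imm8[1:0]
[11] - imm8[1:0]: truncate mode (round toward zero)

With M = 1, each value was truncated toward zero to a multiple of 2^-1 = 0.5 instead of being rounded to an integer, which is why -1.5 and 1.5 came back unchanged in the test output above.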
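
The decoding above can be checked without AVX-512 hardware. The following is a minimal scalar sketch, not code from this PR: `RoundscaleImm`, `decode_imm8` and `roundscale_model` are hypothetical names, and the model assumes each lane computes 2^-M * round_to_int(2^M * x) as documented for VRNDSCALEPS, ignoring NaN, infinity and denormal handling.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Fields of the roundscale immediate, in the order of the list above.
// (Hypothetical helper type, not part of OpenImageIO.)
struct RoundscaleImm {
    unsigned fraction_bits;   // imm8[7:4]: M, fraction bits to preserve
    bool suppress_precision;  // imm8[3]: 1 = suppress precision exception
    bool mode_from_mxcsr;     // imm8[2]: 1 = take rounding mode from MXCSR.RC
    unsigned rounding_mode;   // imm8[1:0]: 0 nearest-even, 1 down, 2 up, 3 truncate
};

inline RoundscaleImm decode_imm8(std::uint8_t imm)
{
    RoundscaleImm f;
    f.fraction_bits      = static_cast<unsigned>(imm >> 4);
    f.suppress_precision = (imm & 0x08) != 0;
    f.mode_from_mxcsr    = (imm & 0x04) != 0;
    f.rounding_mode      = static_cast<unsigned>(imm & 0x03);
    return f;
}

// Scalar model of one lane: 2^-M * round_to_int(2^M * x).
// Assumption: only immediate-selected rounding modes are modelled.
inline float roundscale_model(float x, std::uint8_t imm)
{
    const RoundscaleImm f = decode_imm8(imm);
    assert(!f.mode_from_mxcsr && "model covers immediate-selected modes only");

    const float scale  = std::ldexp(1.0f, static_cast<int>(f.fraction_bits));  // 2^M
    const float scaled = x * scale;
    float r;
    switch (f.rounding_mode) {
    case 0:  r = std::nearbyint(scaled); break;  // nearest-even in the default FP environment
    case 1:  r = std::floor(scaled);     break;  // toward -infinity
    case 2:  r = std::ceil(scaled);      break;  // toward +infinity
    default: r = std::trunc(scaled);     break;  // toward zero
    }
    return r / scale;
}
```

In this model, `roundscale_model(-1.5f, (1 << 4) | 3)` returns -1.5, matching the left-hand values in the failure log, while an immediate with M = 0 and rounding mode 00 (for example 0x00) returns -2. Whether that is the exact immediate the fix uses is not shown in this thread.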

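References:

https://www.felixcloutier.com/x86/vrndscalepd#fig-5-29
https://stackoverflow.com/questions/50854991/instrinsic-mm512-round-ps-is-missing-for-avx512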

Tests

This fixes the failing test_simd test.

Checklist:

I have read the contribution guidelines.
I have updated the documentation, if applicable.
I have ensured that the change is tested somewhere in the testsuite
(adding new test cases if necessary).
If I added or modified a C++ API call, I have also amended the
corresponding Python bindings (and if altering ImageBufAlgo functions, also
exposed the new functionality as oiiotool options).
My code follows the prevailing code style of this project. If I haven't
already run clang-format before submitting, I definitely will look at the CI
test that runs clang-format and fix anything that it highlights as being
nonconforming.

linux-foundation-easycla · 2024-01-19T15:45:02Z

The committers listed above are authorized under a signed CLA.

✅ login: AngryLoki (e114063)

lgritz

This looks right to me, thanks for the fix!

This PR fixes vfloat16 round function. Intrinsic `_mm512_roundscale_ps` was used incorrectly, and caused failure on Zen4 CPU.

```
/var/tmp/portage/media-libs/openimageio-2.5.5.0-r1/work/OpenImageIO-2.5.5.0/src/libutil/simd_test.cpp:1579:
FAILED: round(F) == mkvec<VEC>(std::round(F[0]), std::round(F[1]), std::round(F[2]), std::round(F[3]))
	values were '-1.5 0 1.5 4 -1.5 0 1.5 4 -1.5 0 1.5 4 -1.5 0 1.5 4' and '-2 0 2 4 -2 0 2 4 -2 0 2 4 -2 0 2 4'
```

In old code `_mm512_roundscale_ps (a, (1<<4) | 3)` meant the following:

```
[0001] - Number of fixed points to preserve
[0] - Use MXCSR exception mask
[0] - Select mode from imm
[11] - Truncate mode
```

Effectively enabling rounding to nearest 0.5, not to integer.

References:

* https://www.felixcloutier.com/x86/vrndscalepd#fig-5-29
* https://stackoverflow.com/questions/50854991/instrinsic-mm512-round-ps-is-missing-for-avx512

Signed-off-by: Sv. Lockal <lockalsash@gmail.com>

Signed-off-by: Sv. Lockal <lockalsash@gmail.com>
Signed-off-by: Peter Kovář <peter.kovar@reflexion.tv>

AngryLoki added a commit to AngryLoki/gentoo that referenced this pull request Jan 19, 2024

media-libs/openimageio: Fix AVX-512 round function

d9aed20

See also: AcademySoftwareFoundation/OpenImageIO#4119
Signed-off-by: Sv. Lockal <lockalsash@gmail.com>

AngryLoki mentioned this pull request Jan 19, 2024

media-libs/openimageio: Fix AVX-512 round function DarkDefender/gentoo#2

Merged

Fix AVX-512 round function

e114063

Signed-off-by: Sv. Lockal <lockalsash@gmail.com>

AngryLoki force-pushed the fix-avx512-round branch from 171c342 to e114063 on January 19, 2024 16:06

lgritz approved these changes Jan 20, 2024

lgritz merged commit b850a07 into AcademySoftwareFoundation:master Jan 20, 2024
25 checks passed

DarkDefender pushed a commit to DarkDefender/gentoo that referenced this pull request Jan 26, 2024

media-libs/openimageio: Fix AVX-512 round function

07f2ff7

See also: AcademySoftwareFoundation/OpenImageIO#4119
Signed-off-by: Sv. Lockal <lockalsash@gmail.com>

DarkDefender pushed a commit to DarkDefender/gentoo that referenced this pull request Jan 26, 2024

media-libs/openimageio: Fix AVX-512 round function

40cfe53

See also: AcademySoftwareFoundation/OpenImageIO#4119
Signed-off-by: Sv. Lockal <lockalsash@gmail.com>
