Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Fix underflow error in AVX512 implementation of ufunc exp/f64 #18933

Merged
merged 2 commits into from
May 7, 2021

Conversation

seiko2plus
Copy link
Member

@seiko2plus seiko2plus commented May 7, 2021

closes #18932, related to #18920, #18916.

@seiko2plus seiko2plus changed the title BUG: Fix underflow error in AVX512 of ufunc exp BUG: Fix underflow error in AVX512 implementation of ufunc exp/f64 May 7, 2021
@seiko2plus seiko2plus added the component: SIMD Issues in SIMD (fast instruction sets) code or machinery label May 7, 2021
@seiko2plus
Copy link
Member Author

seiko2plus commented May 7, 2021

@seberg, CI(use_wheel (pull_request)) ufunc tests complaining about:

 >               assert_array_equal(res_num.astype("O"), res_obj)
E               AssertionError: 
E               Arrays are not equal
E               
E               Mismatched elements: 1 / 1 (100%)
E               Max absolute difference: 2.7755575615628914e-17
E               Max relative difference: 1.1489924412333298e-16
E                x: array([-0.2415644752704905], dtype=object)
E                y: array([-0.24156447527049046], dtype=object)

MyFloat    = <class 'numpy.core.tests.test_ufunc.TestUfuncGenericLoops.test_unary_PyUFunc_O_O_method_full.<locals>.MyFloat'>
num_arr    = array([0.78539816])
obj_arr    = array([0.7853981633974483], dtype=object)
res_num    = array([-0.24156448])
res_obj    = array([-0.24156447527049046], dtype=object)
self       = <numpy.core.tests.test_ufunc.TestUfuncGenericLoops object at 0x7f897731c950>
ufunc      = <ufunc 'log'>
val        = 0.7853981633974483

why we don't use assert_array_almost_equal in here:

res_obj = ufunc(obj_arr)
assert_array_equal(res_num.astype("O"), res_obj)

@mattip
Copy link
Member

mattip commented May 7, 2021

Yes, that seems too strong a requirement.

@seiko2plus
Copy link
Member Author

@mattip, Should I replace it with assert_array_almost_equal? I don't think this error is related to this patch.

@seiko2plus
Copy link
Member Author

close/open for another round.

@seiko2plus seiko2plus closed this May 7, 2021
@seiko2plus seiko2plus reopened this May 7, 2021
@charris
Copy link
Member

charris commented May 7, 2021

LGTM. Go ahead and fix the failing test so everything starts working again.

@seberg
Copy link
Member

seberg commented May 7, 2021

Oh, you changed the test to almost-equal. I had problems elsewhere and changed it to 0-D, but this is just as well. EDIT: Sorry, that was unclear. That was in a work-in-progress, lets go with this! :)

(I think I may have changed 0-stride vs. contiguous-stride for 0-D, which probably can change whether the SIMD loop is used sometimes?)

@seiko2plus seiko2plus merged commit cd73ab7 into numpy:main May 7, 2021
@seberg
Copy link
Member

seberg commented May 7, 2021

Thanks for the quick fix, Sayed!

@seiko2plus
Copy link
Member Author

seiko2plus commented May 7, 2021

@seberg,

which probably can change whether the SIMD loop is used sometimes?)

That probably what happened, but avoid using almost_equal is kinda strict especially for CPUs that don't have native support of FMA/Rounding.

@@ -776,7 +776,7 @@ AVX512F_exp_DOUBLE(npy_double * op,
nearzero_mask = _mm512_kxor(nearzero_mask, nan_mask);
overflow_mask = _mm512_kor(overflow_mask,
_mm512_kxor(xmax_mask, inf_mask));
underflow_mask = _mm512_kor(underflow_mask, xmax_mask);
underflow_mask = _mm512_kor(underflow_mask, xmin_mask);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yikes! /me hides in shame.
Thanks @seiko2plus for fixing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
00 - Bug component: SIMD Issues in SIMD (fast instruction sets) code or machinery
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New CI failures.
5 participants