-
-
Notifications
You must be signed in to change notification settings - Fork 9.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Add max values comparison for floating point #18863
Conversation
See |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your code here is following x86 behavior rather than IEEE, since If "one" of the two arguments is NaN, the value of the other argument is returned.
Our SIMD interface tries not to differentiate between one architecture and another. in other words, it tries to provide the maximum possible performance for all architectures.
Therefore, I suggest the following:
- make
npyv_max_[f32, f64]
map directly to vmaxq_[f32, f64], _mm[,256,512]max[ps, pd], vec_max(VSX) which not oby the standards and also not provide a standardized behavior for NaN. - add new intrinics npyv_maxp_[f32, f64] that follow IEEE behavior.
note: extra p for precise
You will also need to add a testing unit case for both intrinsic which requires the following:
- bring the new intrinics to Python world through SIMD module, check _simd/_simd.dispatch.c.src
- add the new test cases into core/tests/test_simd.py
Thank you!
6f2bf34
to
190f2b9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
several fixes and improvements
Thank you @seiko2plus ! I learned a lot from your reviews. I have fixed them, please take a look. |
8c861e2
to
49948f9
Compare
95bcd72
to
7e5669b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, Thank you!
SIMD intrinsics for comparing double/single precision has been added. Now only NEON and SSE ones are added.
Thank you @howjmay |
SIMD intrinsics for comparing double/single precision has been
added. Now only NEON and SSE ones are added.