OE-27 "Wide Universal Intrinsics" discussion #11022
Comments
@seiko2plus, I've added a link to your #10708.
* core:OE-27 prepare universal intrinsics to expand (#11022)
* core: Add universal intrinsics for AVX2
* updated implementation of wide univ. intrinsics; converted several OpenCV HAL functions: sqrt, invsqrt, magnitude, phase, exp to the wide universal intrinsics.
* converted log to universal intrinsics; cleaned up the code a bit; added v_lut_deinterleave intrinsics.
* core: Add universal intrinsics for AVX2
* fixed multiple compile errors
* fixed many more compile errors and hopefully some test failures
* fixed some more compile errors
* temporarily disabled IPP to debug exp & log; hopefully fixed Doxygen complains
* fixed some more compile errors
* fixed v_store(short*, v_float16&) signatures
* trying to fix the test failures on Linux
* fixed some issues found by alalek
* restored IPP optimization after the patch with AVX wide intrinsics has been properly tested
@vpisarev I plan to port my assembly SSEx and intrinsic AVX-2 implementations of some FLANN distances to the OpenCV repo.
@pemmanuelviel There is a page about universal intrinsics in the OpenCV documentation.
@alalek Thank you for the link. This is the doc for the "not-wide" universal intrinsics I was mentioning.
Wide universal intrinsics (WUI) are implemented for the AVX2 and AVX512 architectures and are already used in the core and imgproc modules. Unfortunately, there is no dedicated documentation for wide universal intrinsics. However, they were implemented in accordance with OE-27. There are just a few changes to the universal intrinsics idea:
WUI always use the widest vector size available for the selected instruction set (i.e., if AVX512 support is enabled, the vector length for WUI will be 512 bits).
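The "widest available vector" rule can be illustrated with a small self-contained model. This is only a sketch and not OpenCV code: `kWideBits`, `wide_lanes` and `add_wide` are hypothetical names invented here, and the inner scalar loop merely stands in for one vector instruction. In OpenCV itself the corresponding pieces are, roughly, width-agnostic types such as `v_float32` together with `vx_load` and `v_store`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the compile-time width selection (not OpenCV code):
// pick the widest register size the enabled instruction set offers.
#if defined(__AVX512F__)
constexpr std::size_t kWideBits = 512;
#elif defined(__AVX2__)
constexpr std::size_t kWideBits = 256;
#else
constexpr std::size_t kWideBits = 128;  // 128-bit fallback (SSE/NEON-sized)
#endif

// Number of lanes of type T that fit in one "wide" register.
template <typename T>
constexpr std::size_t wide_lanes() { return kWideBits / (8 * sizeof(T)); }

// Adds two arrays block by block. Each full block models a single
// vector add; the loop bound is never hard-coded to 4, 8 or 16 lanes.
inline std::vector<float> add_wide(const std::vector<float>& a,
                                   const std::vector<float>& b) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    const std::size_t step = wide_lanes<float>();
    std::vector<float> out(n);
    std::size_t i = 0;
    for (; i + step <= n; i += step)            // full "vector" blocks
        for (std::size_t k = 0; k < step; ++k)  // stands in for one SIMD add
            out[i + k] = a[i + k] + b[i + k];
    for (; i < n; ++i)                          // scalar tail
        out[i] = a[i] + b[i];
    return out;
}
```

The point of the design is that the same source compiles to 128-, 256- or 512-bit blocks depending on the build flags, so the lane count must be queried rather than assumed.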
the feature request about evolution proposal OE-27