New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP - Convert scalars-to-vector constructors in SIMD backend to lookup table constructors #20688
Conversation
…okup table constructors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this work @Nicholas-Ho-arm. I only have a minor comment about the docs for the new interface. BTW, I think you should mark the description of the PR with the usual tags that enable more than the standard configurations in CI:
force_builders=linux, docs, ARMv8, ARMv7, Custom
build_image:Custom=javascript
buildworker:Custom=linux-4, linux-6
Please add these lines in your original message.
@alalek / @vpisarev / @asmorkalov - this is the approach we could take to make the SIMD abstraction in HAL more generic to support VLA SVE. Of course, there are other things that needs to be done.
Please let us know if this is an acceptable approach. As we have shown in https://godbolt.org/z/sohTbGE3a, this is generating the same code as the original lane-based constructors.
Kind regards,
Francesco and Nicholas
@@ -412,6 +412,14 @@ template<typename _Tp, int n> struct v_reg | |||
s[12] = s12; s[13] = s13; s[14] = s14; s[15] = s15; | |||
} | |||
|
|||
v_reg(_Tp* data, int* idx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a comment here describing the new interface?
v_uint16x16(ushort v0, ushort v1, ushort v2, ushort v3, | ||
ushort v4, ushort v5, ushort v6, ushort v7, | ||
ushort v8, ushort v9, ushort v10, ushort v11, | ||
ushort v12, ushort v13, ushort v14, ushort v15) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is not acceptable.
You break already existed API.
Again, OpenCV SIMD UI is C++ based with helper custom register types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is not acceptable.
You break already existed API.
Hi @alalek - yes, I understand that there are concerns in doing this. However, I think I have provided enough data with examples in 20562 and 20640 that shows that the change will not affect code generation.
Again, OpenCV SIMD UI is C++ based with helper custom register types.
Yes, and that should be changed too. I think that we can make the C++ UI to use fixed size vectors and scalable size vectors depending on the target architecture. I totally get your concerns: it is a big change that seems to be too far away from the architectures that OpenCV supports.
Is there a way we could agree an exploratory development of a C++ vector API for scalable vectors? I'd like to propose we create a development branch where to do this. We will merge it into the main product only when we think we have reached a level of stability and performance that is good for the library and the developers using it. Do you think this would be a good approach?
I am happy also to explore other proposals. I'd like to avoid the one you suggested to create a version of OpenCV that uses SVE for fixed lengths vectors. The cost of this approach is definitely lower on the short term, but it will be more expensive on the long term. We'll have to change the C++ UI anyway if it turns out we want to create a version of OpenCV that runs on every SVE implementation. Also, please notice that the size of the vector registers in SVE can also be set by the OS/hypervisor at program start: this means that in theory, for hardware that have an Nx128 bit SVE implementation, we could end up having to support all vector sizes from 128x1, 128x2, .... 128xN.
Finally, I think that the concept of scalable vector is not as niche as it might look. There are several players other than Arm that are using scalable vectors in their architectures.
@alalek @asmorkalov @vpisarev - I'd like to hear what you think on these topics and on the approach I am suggesting.
Kind regards,
Francesco
This is a work in progress patch which showcases the changes suggested by @fpetrogalli in #20562. The current SIMD types make use of scalars-to-vector constructors to build a SIMD vector. To prepare for Scalable Vector Extension(SVE), they are now replaced with lookup table constructors. This facilitates vector length agnostic code while making sure that current compilers can still build them. Currently only SIMD types with 16 lanes have been changed as the nature of this patch is exploratory and not intended for merging.
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.