Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP - Convert scalars-to-vector constructors in SIMD backend to lookup table constructors #20688

Closed

Conversation

Nicholas-Ho-arm
Copy link
Contributor

@Nicholas-Ho-arm Nicholas-Ho-arm commented Sep 10, 2021

This is a work in progress patch which showcases the changes suggested by @fpetrogalli in #20562. The current SIMD types make use of scalars-to-vector constructors to build a SIMD vector. To prepare for Scalable Vector Extension(SVE), they are now replaced with lookup table constructors. This facilitates vector length agnostic code while making sure that current compilers can still build them. Currently only SIMD types with 16 lanes have been changed as the nature of this patch is exploratory and not intended for merging.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or other license that is incompatible with OpenCV
  • The PR is proposed to proper branch
  • There is reference to original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake
force_builders=linux, docs, ARMv8, ARMv7, Custom
build_image:Custom=javascript
buildworker:Custom=linux-4, linux-6

Copy link
Contributor

@fpetrogalli fpetrogalli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this work @Nicholas-Ho-arm. I only have a minor comment about the docs for the new interface. BTW, I think you should mark the description of the PR with the usual tags that enable more than the standard configurations in CI:

force_builders=linux, docs, ARMv8, ARMv7, Custom
build_image:Custom=javascript
buildworker:Custom=linux-4, linux-6

Please add these lines in your original message.

@alalek / @vpisarev / @asmorkalov - this is the approach we could take to make the SIMD abstraction in HAL more generic to support VLA SVE. Of course, there are other things that needs to be done.

Please let us know if this is an acceptable approach. As we have shown in https://godbolt.org/z/sohTbGE3a, this is generating the same code as the original lane-based constructors.

Kind regards,

Francesco and Nicholas

@@ -412,6 +412,14 @@ template<typename _Tp, int n> struct v_reg
s[12] = s12; s[13] = s13; s[14] = s14; s[15] = s15;
}

v_reg(_Tp* data, int* idx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a comment here describing the new interface?

v_uint16x16(ushort v0, ushort v1, ushort v2, ushort v3,
ushort v4, ushort v5, ushort v6, ushort v7,
ushort v8, ushort v9, ushort v10, ushort v11,
ushort v12, ushort v13, ushort v14, ushort v15)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is not acceptable.

You break already existed API.


Again, OpenCV SIMD UI is C++ based with helper custom register types.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is not acceptable.

You break already existed API.

Hi @alalek - yes, I understand that there are concerns in doing this. However, I think I have provided enough data with examples in 20562 and 20640 that shows that the change will not affect code generation.

Again, OpenCV SIMD UI is C++ based with helper custom register types.

Yes, and that should be changed too. I think that we can make the C++ UI to use fixed size vectors and scalable size vectors depending on the target architecture. I totally get your concerns: it is a big change that seems to be too far away from the architectures that OpenCV supports.

Is there a way we could agree an exploratory development of a C++ vector API for scalable vectors? I'd like to propose we create a development branch where to do this. We will merge it into the main product only when we think we have reached a level of stability and performance that is good for the library and the developers using it. Do you think this would be a good approach?

I am happy also to explore other proposals. I'd like to avoid the one you suggested to create a version of OpenCV that uses SVE for fixed lengths vectors. The cost of this approach is definitely lower on the short term, but it will be more expensive on the long term. We'll have to change the C++ UI anyway if it turns out we want to create a version of OpenCV that runs on every SVE implementation. Also, please notice that the size of the vector registers in SVE can also be set by the OS/hypervisor at program start: this means that in theory, for hardware that have an Nx128 bit SVE implementation, we could end up having to support all vector sizes from 128x1, 128x2, .... 128xN.

Finally, I think that the concept of scalable vector is not as niche as it might look. There are several players other than Arm that are using scalable vectors in their architectures.

@alalek @asmorkalov @vpisarev - I'd like to hear what you think on these topics and on the approach I am suggesting.

Kind regards,

Francesco

@asmorkalov asmorkalov added feature platform: arm ARM boards related issues: RPi, NVIDIA TK/TX, etc RFC labels Sep 12, 2021
@asmorkalov
Copy link
Contributor

#22179 brought new Universal Intrinsics API for scalable vector types. The PR is not relevant any more I suppose. @vpisarev Please add details, if I missed something important.

@asmorkalov asmorkalov closed this Oct 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature platform: arm ARM boards related issues: RPi, NVIDIA TK/TX, etc RFC
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants