[DISCUSS] hyperscan support ARM #197
Hi team, @xiangwang1 , @fatchanghao , @Nor7th Thanks |
Hyperscan is currently designed and optimized specifically for Intel CPUs, including its choice of algorithms and its use of SIMD instructions. I think there could be a performance hit if the work only ports x86 instructions to their ARM NEON equivalents. From Intel's perspective, we are not in a position to port Hyperscan to ARM. We may consider it only if there is common interest in the community, with developers other than us pushing this forward and proving it a viable path to take. |
Hi @xiangwang1, it would be good if you could consider it. I will explain more in response to your feedback.
So we need the community to help review the plan (design) and suggest how to get this done, including a small modification to the platform-detection scripts and anything else we have not yet realized. Thanks. |
Hi @xiangwang1 , Thanks |
We have successfully ported hyperscan to the ARM platform (also MIPS; no SIMD instruction support, so the performance improvement is not high), and it turned out not to be difficult. But we didn't do much optimization work; you can go deeper. |
@codecat007 Hi, I'm interested in the hyperscan port and have made some attempts over the past few days. |
Thanks for concern here. ;-) |
@zzqcn You can use a SIMD library (such as simde: https://github.com/nemequ/simde) to implement a middle layer for SIMD function calls. |
@codecat007 Thanks for your reply. I converted the SSE intrinsics to NEON via sse2neon, but hyperscan compiled this way has runtime bugs on ARM. I will try simde instead. |
I think we need to wait for a member of the maintainer team to consider this and reply on the next steps. I hope a hyperscan team member can give some good advice. Maybe @xiangwang1 |
@codecat007 With the help of sse2neon and simde, I ported hyperscan 4.6.0 to ARM. It basically works, with some bugs. I built and ran the unit tests (just those in unit/hyperscan/): 3476 test cases PASSED and 169 FAILED. The failed cases: hyperscan_test_result.txt Did you run any tests on your port? Thanks for any suggestions. |
I have ported hyperscan v5.2.0 to ARMv7 with simde. All 3746 unit test cases PASSED (run and tested with qemu-arm). My fork: https://github.com/zzqcn/hyperscan, and my commit: zzqcn@249178a I don't know much about SSE, NEON, etc., so any suggestions or code review would be helpful. |
Hi guys, there seems to be a post, #212, in hyperscan whose author is willing to support both x86 and aarch64. @xiangwang1 @fatchanghao @Nor7th |
cc author to join this discussion. @tqltech |
@zzqcn Hi, I wonder about the performance of the ported hyperscan. Does the added middle layer (simde) have a heavy impact on performance? I hope for your reply, thanks. |
@daveMmd On native x86 processors with AVX (for example) the usage of the SIMD Everywhere header-only library is optimized out by the compiler into the existing direct calls to the AVX intrinsics. On non-X86 platforms, the SIMD Everywhere headers enable the code to run where it wouldn't before, and often using the SIMD intrinsics of that non-X86 processor. |
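To make the "middle layer" idea concrete, here is a minimal sketch of what such a shim can look like. It is not taken from SIMD Everywhere, sse2neon, or Hyperscan: the helper name `first_match16` is hypothetical, the `__builtin_ctz`/`__builtin_ctzll` calls assume GCC or Clang, and the NEON branch uses a narrowing-shift trick as one possible stand-in for the x86 movemask instruction.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#elif defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not a Hyperscan or SIMDe function): returns the index
 * of the first byte equal to c in a 16-byte block, or -1 if there is none. */
static int first_match16(const uint8_t *block, uint8_t c) {
#if defined(__SSE2__)
    /* x86 path: compare 16 bytes at once, then collapse to a 16-bit mask. */
    __m128i v = _mm_loadu_si128((const __m128i *)block);
    __m128i eq = _mm_cmpeq_epi8(v, _mm_set1_epi8((char)c));
    int mask = _mm_movemask_epi8(eq);
    return mask ? __builtin_ctz((unsigned)mask) : -1;
#elif defined(__ARM_NEON)
    /* NEON path: there is no movemask, so narrow each 16-bit lane by 4 bits.
     * The resulting 64-bit value holds 4 mask bits per input byte. */
    uint8x16_t eq = vceqq_u8(vld1q_u8(block), vdupq_n_u8(c));
    uint8x8_t narrowed = vshrn_n_u16(vreinterpretq_u16_u8(eq), 4);
    uint64_t mask = vget_lane_u64(vreinterpret_u64_u8(narrowed), 0);
    return mask ? (int)(__builtin_ctzll(mask) >> 2) : -1;
#else
    /* Scalar fallback for targets without either instruction set. */
    for (int i = 0; i < 16; i++) {
        if (block[i] == c) {
            return i;
        }
    }
    return -1;
#endif
}
```

Callers only see `first_match16`, so on x86 the compiler emits the SSE2 instructions directly, while other targets get a NEON or scalar version of the same interface.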
Thanks! Although the code can now run with SIMD intrinsics on both x86 and non-x86 processors, the data structures are tailored for x86 processors. So I think there can be a performance penalty on non-x86 processors. I just wonder how big the penalty is. |
Good point, "SIMD Everywhere" doesn't prevent the addition of architecture specific variations later, but means you get a functional version today, which is nice for applications that have a hard dependency on hyperscan. |
@mr-c @daveMmd We have modified hyperscan for ARMv8 processors, improving performance by using NEON instructions, inline assembly, data alignment, instruction alignment, memory data prefetching, static branch prediction, code structure optimization, etc. The optimized hyperscan's performance is about 80% of x86. The repository: https://github.com/kunpengcompute/hyperscan |
@tqltech Awesome! It must have been a lot of work! |
@tqltech I am curious to know if you have measured your optimizations against what was done in the Marvell port for aarch64? |
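For readers unfamiliar with two of the techniques listed above, memory prefetching and static branch prediction hints, the following is a small illustrative sketch. It is not code from the kunpengcompute repository: `count_byte`, `PREFETCH_DISTANCE`, and the macro names are hypothetical, and the builtins assume GCC or Clang (other compilers fall back to no-ops).

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
/* Tell the compiler the condition is rarely true (static branch hint). */
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
/* Request a read prefetch with high temporal locality. */
#define PREFETCH(p) __builtin_prefetch((p), 0, 3)
#else
#define UNLIKELY(x) (x)
#define PREFETCH(p) ((void)0)
#endif

/* Assumed prefetch distance in bytes; a real port would tune this per CPU. */
#define PREFETCH_DISTANCE 256

/* Hypothetical example: count occurrences of c in buf, prefetching ahead
 * once per 64 bytes and hinting that a match is the uncommon case. */
static size_t count_byte(const uint8_t *buf, size_t len, uint8_t c) {
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if ((i & 63) == 0 && i + PREFETCH_DISTANCE < len) {
            PREFETCH(buf + i + PREFETCH_DISTANCE);
        }
        if (UNLIKELY(buf[i] == c)) {
            n++;
        }
    }
    return n;
}
```

Whether hints like these help depends heavily on the target core and the data, so they are usually validated with a benchmark rather than assumed.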
@hulksmaaash I used the performance test tool hsbench that comes with hyperscan to measure the optimization results. |
Hi team, @xiangwang1 , @fatchanghao , @Nor7th For now, does the team have any plan on aarch64 support of hyperscan upstream? |
@Yikun I believe the answer is still the same from last year (#197 (comment)). I am curious, what is your interest in having aarch64 support for hyperscan? If there is enough external interest then I may be able to gather internal engineering support to justify the work and on-going maintenance. |
For our use case, we use Hyperscan on Linux, macOS, and Windows. With Macs beginning the transition to Apple silicon based on ARM, we are obviously interested in support for the architecture so we can continue our cross platform work using Hyperscan. It is understandable why Intel may not be interested in supporting the architecture, but I would counter that adding support will ensure that the project continues to be a viable option for people that must support multiple platforms, instead of having them look for alternatives that they can use across the platforms they must support. |
@hulksmaaash Thanks for the reply. I got some info from our product team: some friends are using Hyperscan on Linux on Kunpeng servers (which are aarch64-based). We also know of some cases on Amazon EC2 A1 instances. So we think aarch64 support is really necessary. |
FYI, there is an Arm sponsored effort (see below) now to port and optimize hyperscan for Arm. The work has only just begun, but the end goal is to work with the maintainers to have the updates merged, and then continue to provide support for the aarch64 architecture as both the project and architecture progress. |
For those interested, the first PR has been submitted that separates the architecture-specific code to pave the way for adding aarch64 support, and any other future architecture-specific code. |
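As a rough illustration of what separating architecture-specific code can mean at the source level, here is a minimal sketch of compile-time architecture selection. It does not reflect the layout of the actual PR: the `ARCH_*` macros and `arch_backend` are hypothetical names, and only widely used predefined compiler macros are tested.

```c
#include <string.h>

/* Map predefined compiler macros onto one project-level selector. */
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#define ARCH_X86 1
#elif defined(__aarch64__) || defined(_M_ARM64)
#define ARCH_AARCH64 1
#elif defined(__arm__) || defined(_M_ARM)
#define ARCH_ARM32 1
#else
#define ARCH_GENERIC 1
#endif

/* Hypothetical helper: names the backend this build was compiled for.
 * A real project would use the same selector to pick which
 * architecture-specific headers or source files to include. */
static const char *arch_backend(void) {
#if defined(ARCH_X86)
    return "x86";
#elif defined(ARCH_AARCH64)
    return "aarch64";
#elif defined(ARCH_ARM32)
    return "arm32";
#else
    return "generic";
#endif
}
```

Keeping the detection in one place means common code never tests raw compiler macros, which makes adding another architecture a local change.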
FYI - aarch64 port has been completed here: with further NEON SIMD optimizations to come. PR will be submitted soon. |
PR for ARMv8 support submitted here: #287 |
FYI, we have been informed that the project maintainers have
and will
We will consider the best path forward to ensure Hyperscan will work for users who desire support for non-x86 architectures, and update those who express interest. |
Oh well, what a surprise. Progress train moves forward, RIP Intel. |
I was hoping this would get ported to ARM too :/ |
It did ;-) https://github.com/VectorCamp/vectorscan |
Hi hyperscan team,
I'm a newbie to the hyperscan project, and I'm excited to have a conversation with you.
We plan to make hyperscan support ARM64, and we will propose a series of PRs to make this happen, including hardware platform detection code, ARM NEON instruction set support, etc. We won't propose intrusive changes to existing code. For now the detailed design is still uncertain, just a draft. We hope the community can take part in the detailed feature design from the beginning.
But before the work begins, we want to know the community's attitude toward this. We hope for kind feedback from your side.
Thanks very much.