Optimize SM2 on aarch64 #20754
Any benchmark data?
Hi @zzl360, I've updated the benchmark data. Hi @paulidale @t8m @tom-cosgrove-arm, could you please review it?
Some nits.
That's quite a performance boost and one large file.
Hi @InfoHunter, could you please review this patch?
Ping for review.
I wonder whether this optimization should be excluded when OPENSSL_SMALL_FOOTPRINT is defined, given the large table size.
It would be a bonus, but aarch64 isn't normally associated with small devices.
This PR is in a state where it requires action by @openssl/otc but the last update was 30 days ago.
(As I understand it, @docularxu is still reviewing, and there should be an update to the PR from the OP.)
7cdb799 to 4898f5c
(removed the |
I'm from Linaro and specialize in Arm platform optimization. The latest version from @xu-yi-zhou effectively addresses the concerns I had previously raised. I also verified this version on Apple silicon M2, and the results align with the performance improvements mentioned earlier:
With these, I would recommend merging this patch into the master branch.
Hi all, I'm considering whether to implement constant-time point multiplication and modular inversion to avoid side-channel attacks. I'm trying the following to address the side channel:
After these changes, the performance of … If side-channel protection is necessary, I still need to implement a … Welcome to discuss.
By the way, thank you @docularxu for investing your time and sharing your valuable opinion. Your help is very much appreciated!
Yes, I consider side-channel protection a must that takes priority over performance.
OK, I will do that as soon as possible.
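For readers following the side-channel discussion above, here is a minimal sketch of what a constant-time lookup in a precomputed table can look like. It is not the code from this PR: the type `demo_entry` and the functions `ct_eq_mask` and `ct_select` are hypothetical names chosen for illustration, and the 4-limb entry only stands in for one 256-bit coordinate. The idea is to touch every table entry and combine them with masks, so that neither the memory access pattern nor the control flow depends on the secret index.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry type: four 64-bit limbs stand in for one coordinate. */
typedef struct {
    uint64_t limb[4];
} demo_entry;

/* Returns all-ones if a == b and zero otherwise, without branching. */
static uint64_t ct_eq_mask(uint64_t a, uint64_t b)
{
    uint64_t x = a ^ b;
    /* (x | -x) has its top bit set exactly when x is non-zero. */
    uint64_t nonzero = (x | (0 - x)) >> 63;

    return nonzero - 1; /* 0 becomes all-ones, 1 becomes 0 */
}

/*
 * Copies table[idx] into out while reading every entry of the table,
 * so the access pattern does not depend on idx.
 */
static void ct_select(demo_entry *out, const demo_entry *table,
                      size_t n, size_t idx)
{
    size_t i, j;

    for (j = 0; j < 4; j++)
        out->limb[j] = 0;

    for (i = 0; i < n; i++) {
        uint64_t mask = ct_eq_mask((uint64_t)i, (uint64_t)idx);

        for (j = 0; j < 4; j++)
            out->limb[j] |= table[i].limb[j] & mask;
    }
}
```

The cost is a full pass over the table per lookup, which is one reason constant-time protection tends to reduce throughput compared with a direct indexed load.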
Hi all, I'm trying to add an option …
Yes, please. Add …
My understanding of this request is that the large table would be included in default AArch64 builds, but can be disabled for smaller builds by configuring with … This seems to only use the large table if …
@tom-cosgrove-arm I think you are right. I also forgot to mention the increased size of libcrypto in CHANGES.md. I will fix these.
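The opt-out behaviour requested above (table included by default, removable at configure time) can be sketched with a preprocessor guard. This is an illustration under assumptions, not the merged code: the macro name `OPENSSL_NO_SM2_PRECOMP` is assumed from OpenSSL's usual convention of mapping a `no-<feature>` configure option to an `OPENSSL_NO_<FEATURE>` define, and `DEMO_SM2_USE_PRECOMP` and `demo_uses_precomp` are hypothetical names.

```c
/*
 * Assumption: a build configured with the opt-out option defines
 * OPENSSL_NO_SM2_PRECOMP. When it is absent (the default), the
 * precomputed-table path is compiled in.
 */
#if !defined(OPENSSL_NO_SM2_PRECOMP)
# define DEMO_SM2_USE_PRECOMP 1
#else
# define DEMO_SM2_USE_PRECOMP 0
#endif

/* Hypothetical helper reporting which base-point path was compiled in. */
static int demo_uses_precomp(void)
{
    return DEMO_SM2_USE_PRECOMP;
}
```

Guarding on the absence of a `NO` macro makes the table the default; guarding on the presence of an enabling macro would make it opt-in, which appears to be the distinction raised in the comment above.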
A few more nits
Signed-off-by: Xu Yizhou <xuyizhou1@huawei.com>
LGTM
This pull request is ready to merge.
Merged to master branch. Thank you for your contribution.
Signed-off-by: Xu Yizhou <xuyizhou1@huawei.com>
Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from #20754)
Hello, I tested with the default compilation options on Kunpeng-920 2.6GHz hardware and measured 25057 signatures/s and 7042 signature verifications/s. Did you use any other options at compile time to obtain your data?
No other options. Try using the …
This patch optimizes SM2 for ARM processors using A64 instructions and a precomputation table, which speeds up SM2 sign by about 10 times and SM2 verify by about 3 times. A new configure option, no-sm2-precomp, has been added to disable the precomputed table for point multiplication of the base point.
Perf data on Kunpeng-920 2.6GHz hardware looks like this:
Signed-off-by: Xu Yizhou <xuyizhou1@huawei.com>