Optimize AES-CTR for ARM Neoverse V1 and V2. #22733

Closed
wants to merge 1 commit into from

Conversation

fisheryu-arm
Contributor

Unroll AES-CTR loops to a maximum of 12 blocks for ARM Neoverse V1 and
V2, to fully utilize their AES pipeline resources.

Improvement on ARM Neoverse V1.

Package Size (Bytes)   16     32     64     128    256    1024
Improvement (%)        3.93   -0.45  11.30  4.31   12.48  37.66

Package Size (Bytes)   1500   8192   16384  61440  65536
Improvement (%)        37.16  38.90  39.89  40.55  40.41

Checklist
  • documentation is added or updated
  • tests are added or updated

@github-actions github-actions bot added the severity: fips change The pull request changes FIPS provider sources label Nov 15, 2023
@fisheryu-arm
Contributor Author

According to the optimization guides for Neoverse V1 and V2, at least 8 data chunks should be interleaved for maximum AES performance.
In practice, an implementation that interleaves 12 data chunks performs best.
This patch therefore interleaves 12 data chunks for AES-CTR, to fully utilize the AES pipeline resources of Neoverse V1 and V2.
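
The interleaving idea can be sketched in C as below. This is only an illustration under stated assumptions, not the assembly from this patch: `toy_round` is a hypothetical stand-in for an AES round (it is not AES and not secure), and `ctr_serial`, `ctr_interleaved`, `make_counter_block`, and `LANES` are names invented for the sketch. The point is the loop order: the interleaved version applies each round to up to 12 independent counter blocks before moving to the next round, so consecutive operations do not depend on each other.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK  16   /* block size in bytes */
#define ROUNDS 10   /* number of toy rounds (assumption) */
#define LANES  12   /* blocks kept in flight, matching the 12 in this PR */

/* Hypothetical stand-in for one AES round; NOT AES and NOT secure. */
static void toy_round(uint8_t s[BLOCK], const uint8_t *rk)
{
    for (int i = 0; i < BLOCK; i++)
        s[i] = (uint8_t)((s[i] ^ rk[i]) * 5u + 1u);
    for (int i = 0; i < BLOCK; i++)
        s[i] ^= s[(i + 1) % BLOCK];
}

/* Copy iv and advance its trailing 32-bit big-endian counter by n. */
static void make_counter_block(uint8_t out[BLOCK], const uint8_t iv[BLOCK],
                               uint32_t n)
{
    uint32_t c = ((uint32_t)iv[12] << 24) | ((uint32_t)iv[13] << 16) |
                 ((uint32_t)iv[14] << 8) | (uint32_t)iv[15];
    c += n;
    memcpy(out, iv, 12);
    out[12] = (uint8_t)(c >> 24);
    out[13] = (uint8_t)(c >> 16);
    out[14] = (uint8_t)(c >> 8);
    out[15] = (uint8_t)c;
}

/* Reference: one block at a time, so every round waits on the previous one. */
static void ctr_serial(const uint8_t *in, uint8_t *out, size_t blocks,
                       const uint8_t *rk, const uint8_t iv[BLOCK])
{
    for (size_t b = 0; b < blocks; b++) {
        uint8_t ks[BLOCK];
        make_counter_block(ks, iv, (uint32_t)b);
        for (int r = 0; r < ROUNDS; r++)
            toy_round(ks, rk + r * BLOCK);
        for (int i = 0; i < BLOCK; i++)
            out[b * BLOCK + i] = in[b * BLOCK + i] ^ ks[i];
    }
}

/* Interleaved: up to LANES independent blocks advance round by round. */
static void ctr_interleaved(const uint8_t *in, uint8_t *out, size_t blocks,
                            const uint8_t *rk, const uint8_t iv[BLOCK])
{
    size_t b = 0;
    while (b < blocks) {
        size_t n = (blocks - b < LANES) ? blocks - b : LANES;
        uint8_t ks[LANES][BLOCK];

        for (size_t l = 0; l < n; l++)
            make_counter_block(ks[l], iv, (uint32_t)(b + l));

        /* Round loop outside, lane loop inside: adjacent operations are
         * independent, which is what lets a deep AES pipeline stay busy. */
        for (int r = 0; r < ROUNDS; r++)
            for (size_t l = 0; l < n; l++)
                toy_round(ks[l], rk + r * BLOCK);

        for (size_t l = 0; l < n; l++)
            for (int i = 0; i < BLOCK; i++)
                out[(b + l) * BLOCK + i] = in[(b + l) * BLOCK + i] ^ ks[l][i];
        b += n;
    }
}
```

Both functions produce the same output for the same inputs; only the order of the work differs. Inputs shorter than 12 blocks take the partial-batch path, which may be one reason the gains in the table above are small for small sizes.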

@t8m t8m added branch: master Merge to master branch approval: review pending This pull request needs review by a committer approval: otc review pending This pull request needs review by an OTC member tests: exempted The PR is exempt from requirements for testing triaged: performance The issue/pr reports/fixes a performance concern labels Nov 15, 2023
@t8m t8m removed approval: review pending This pull request needs review by a committer approval: otc review pending This pull request needs review by an OTC member labels Nov 15, 2023
@t8m
Member

t8m commented Nov 15, 2023

The CI failure indicates that the code won't work on arm-linux-gnueabihf. Can you please exclude it appropriately?

@fisheryu-arm
Contributor Author

> The CI failure indicates that the code won't work on arm-linux-gnueabihf. Can you please exclude it appropriately?

Yes, I'm modifying my code.

    Unroll AES-CTR loops to a maximum of 12 blocks for ARM Neoverse V1 and
    V2, to fully utilize their AES pipeline resources.

    Improvement on ARM Neoverse V1.

    Package Size (Bytes)   16     32     64     128    256    1024
    Improvement (%)        3.93   -0.45  11.30  4.31   12.48  37.66

    Package Size (Bytes)   1500   8192   16384  61440  65536
    Improvement (%)        37.16  38.90  39.89  40.55  40.41

Change-Id: Ifb8fad9af22476259b9ba75132bc3d8010a7fdbd
@zorrorffm
Contributor

We (Tom and I) have reviewed this patch internally; it looks good to us.

@t8m t8m added approval: review pending This pull request needs review by a committer approval: otc review pending This pull request needs review by an OTC member labels Nov 20, 2023
@t8m
Member

t8m commented Nov 20, 2023

It would be nice if @tom-cosgrove-arm approved this formally here.

@zorrorffm
Contributor

zorrorffm commented Nov 21, 2023

Sure. Just a heads-up: @tom-cosgrove-arm is on vacation, so we expect he will comment next week.

Contributor

@tom-cosgrove-arm tom-cosgrove-arm left a comment

This has been reviewed and approved internally (apologies, I have been away on vacation).

@tom-cosgrove-arm tom-cosgrove-arm removed the approval: review pending This pull request needs review by a committer label Nov 27, 2023
@t8m t8m added approval: done This pull request has the required number of approvals and removed approval: otc review pending This pull request needs review by an OTC member labels Nov 28, 2023
@openssl-machine openssl-machine added approval: ready to merge The 24 hour grace period has passed, ready to merge and removed approval: done This pull request has the required number of approvals labels Nov 29, 2023
@openssl-machine
Collaborator

This pull request is ready to merge

@t8m
Member

t8m commented Nov 29, 2023

Merged to the master branch. Thank you for your contribution.

@t8m t8m closed this Nov 29, 2023
openssl-machine pushed a commit that referenced this pull request Nov 29, 2023
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from #22733)
wanghao75 pushed a commit to openeuler-mirror/openssl that referenced this pull request Dec 7, 2023
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from openssl/openssl#22733)

Signed-off-by: fly2x <fly2x@hitls.org>
wbeck10 pushed a commit to wbeck10/openssl that referenced this pull request Jan 8, 2024
Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from openssl#22733)