Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x86_64 assembly pack: "optimize" for Knights Landing. #4006

Closed

Conversation

dot-asm
Copy link
Contributor

@dot-asm dot-asm commented Jul 24, 2017

"Optimize" is in quotes because it's rather a "salvage operation"
for now. Idea is to identify processor capability flags that
drive Knights Landing to suboptimial code paths and mask them.
Two flags were identified, XSAVE and ADCX/ADOX. Former affects
choice of AES-NI code path specific for Silvermont (Knights Landing
is of Silvermont "ancestry"). And 64-bit ADCX/ADOX instructions are
effectively mishandled at decode time. In both cases we are looking
at ~2x improvement.

Hardware used for benchmarking courtesy of Atos, experiments run by
Romain Dolbeau romain.dolbeau@atos.net. Kudos!

This is minimalistic backpoint of 64d92d7

1.1.0 and 1.0.2-sprecific follow-up to #3972.

"Optimize" is in quotes because it's rather a "salvage operation"
for now. Idea is to identify processor capability flags that
drive Knights Landing to suboptimial code paths and mask them.
Two flags were identified, XSAVE and ADCX/ADOX. Former affects
choice of AES-NI code path specific for Silvermont (Knights Landing
is of Silvermont "ancestry"). And 64-bit ADCX/ADOX instructions are
effectively mishandled at decode time. In both cases we are looking
at ~2x improvement.

Hardware used for benchmarking courtesy of Atos, experiments run by
Romain Dolbeau <romain.dolbeau@atos.net>. Kudos!

This is minimalistic backpoint of 64d92d7
@dot-asm dot-asm added branch: 1.0.2 Merge to OpenSSL_1_0_2-stable branch 1.1.0 labels Jul 24, 2017
Thanks to David Benjamin for spotting typo in Knights Landing detection!
@dot-asm
Copy link
Contributor Author

dot-asm commented Jul 24, 2017

This needs re-approval because of typo fixed in #4009.

@richsalz
Copy link
Contributor

re-approval

@levitte levitte added the approval: done This pull request has the required number of approvals label Jul 25, 2017
levitte pushed a commit that referenced this pull request Jul 25, 2017
"Optimize" is in quotes because it's rather a "salvage operation"
for now. Idea is to identify processor capability flags that
drive Knights Landing to suboptimial code paths and mask them.
Two flags were identified, XSAVE and ADCX/ADOX. Former affects
choice of AES-NI code path specific for Silvermont (Knights Landing
is of Silvermont "ancestry"). And 64-bit ADCX/ADOX instructions are
effectively mishandled at decode time. In both cases we are looking
at ~2x improvement.

Hardware used for benchmarking courtesy of Atos, experiments run by
Romain Dolbeau <romain.dolbeau@atos.net>. Kudos!

This is minimalistic backpoint of 64d92d7

Thanks to David Benjamin for spotting typo in Knights Landing detection!

Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from #4006)
levitte pushed a commit that referenced this pull request Jul 25, 2017
"Optimize" is in quotes because it's rather a "salvage operation"
for now. Idea is to identify processor capability flags that
drive Knights Landing to suboptimial code paths and mask them.
Two flags were identified, XSAVE and ADCX/ADOX. Former affects
choice of AES-NI code path specific for Silvermont (Knights Landing
is of Silvermont "ancestry"). And 64-bit ADCX/ADOX instructions are
effectively mishandled at decode time. In both cases we are looking
at ~2x improvement.

Hardware used for benchmarking courtesy of Atos, experiments run by
Romain Dolbeau <romain.dolbeau@atos.net>. Kudos!

This is minimalistic backpoint of 64d92d7

Thanks to David Benjamin for spotting typo in Knights Landing detection!

Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from #4006)

(cherry picked from commit 738a9dd)
@dot-asm dot-asm closed this Jul 25, 2017
pracj3am pushed a commit to cdn77/openssl that referenced this pull request Aug 22, 2017
"Optimize" is in quotes because it's rather a "salvage operation"
for now. Idea is to identify processor capability flags that
drive Knights Landing to suboptimial code paths and mask them.
Two flags were identified, XSAVE and ADCX/ADOX. Former affects
choice of AES-NI code path specific for Silvermont (Knights Landing
is of Silvermont "ancestry"). And 64-bit ADCX/ADOX instructions are
effectively mishandled at decode time. In both cases we are looking
at ~2x improvement.

Hardware used for benchmarking courtesy of Atos, experiments run by
Romain Dolbeau <romain.dolbeau@atos.net>. Kudos!

This is minimalistic backpoint of 64d92d7

Thanks to David Benjamin for spotting typo in Knights Landing detection!

Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from openssl#4006)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approval: done This pull request has the required number of approvals branch: 1.0.2 Merge to OpenSSL_1_0_2-stable branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants