x86_64 assembly pack: "optimize" for Knights Landing. #4006
Closed
"Optimize" is in quotes because it's rather a "salvage operation" for now. The idea is to identify processor capability flags that drive Knights Landing to suboptimal code paths and mask them. Two flags were identified, XSAVE and ADCX/ADOX. The former affects the choice of an AES-NI code path specific to Silvermont (Knights Landing is of Silvermont "ancestry"), and 64-bit ADCX/ADOX instructions are effectively mishandled at decode time. In both cases we are looking at ~2x improvement. Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos! This is a minimalistic backport of 64d92d7.
richsalz approved these changes Jul 24, 2017
Thanks to David Benjamin for spotting typo in Knights Landing detection!
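For readers who want to see the shape of the idea, here is a minimal C sketch of "detect Knights Landing, then mask the two flags". It is not the patch itself (the real change lives in the x86_64 assembly pack), and the names `is_knights_landing`, `mask_knl_caps` and `struct cpu_caps` are hypothetical. The constants are assumptions based on the usual CPUID layout: XSAVE as bit 26 of leaf 1 ECX, ADCX/ADOX (ADX) as bit 19 of leaf 7 EBX, and Knights Landing as family 6, model 0x57, i.e. signature 0x00050670 once stepping and processor type are masked off.

```c
#include <stdint.h>

/* Assumed CPUID bit positions (standard layout, not taken from the patch). */
#define CPUID1_ECX_XSAVE (1u << 26) /* leaf 1, ECX: XSAVE */
#define CPUID7_EBX_ADX   (1u << 19) /* leaf 7, EBX: ADCX/ADOX */

/* Hypothetical container for the two capability words of interest. */
struct cpu_caps {
    uint32_t leaf1_ecx;
    uint32_t leaf7_ebx;
};

/* Compare family/model fields of the leaf 1 EAX signature, ignoring
 * stepping and processor type. 0x00050670 is assumed to identify
 * Knights Landing (family 6, model 0x57). */
static int is_knights_landing(uint32_t leaf1_eax)
{
    return (leaf1_eax & 0x0fff0ff0u) == 0x00050670u;
}

/* Clear XSAVE and ADCX/ADOX on Knights Landing so that later code-path
 * selection no longer sees them; leave other processors untouched. */
static struct cpu_caps mask_knl_caps(uint32_t leaf1_eax, struct cpu_caps caps)
{
    if (is_knights_landing(leaf1_eax)) {
        caps.leaf1_ecx &= ~CPUID1_ECX_XSAVE;
        caps.leaf7_ebx &= ~CPUID7_EBX_ADX;
    }
    return caps;
}
```

The functions take the CPUID values as plain arguments rather than executing CPUID, so the masking logic can be exercised on any machine. Note that the flags are only hidden from code-path selection; the processor still supports the instructions.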
This needs re-approval because of typo fixed in #4009.
re-approval
levitte added the "approval: done" label Jul 25, 2017
levitte pushed a commit that referenced this pull request Jul 25, 2017
"Optimize" is in quotes because it's rather a "salvage operation" for now. The idea is to identify processor capability flags that drive Knights Landing to suboptimal code paths and mask them. Two flags were identified, XSAVE and ADCX/ADOX. The former affects the choice of an AES-NI code path specific to Silvermont (Knights Landing is of Silvermont "ancestry"), and 64-bit ADCX/ADOX instructions are effectively mishandled at decode time. In both cases we are looking at ~2x improvement. Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos! This is a minimalistic backport of 64d92d7.
Thanks to David Benjamin for spotting typo in Knights Landing detection!
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from #4006)
levitte pushed a commit that referenced this pull request Jul 25, 2017
"Optimize" is in quotes because it's rather a "salvage operation" for now. The idea is to identify processor capability flags that drive Knights Landing to suboptimal code paths and mask them. Two flags were identified, XSAVE and ADCX/ADOX. The former affects the choice of an AES-NI code path specific to Silvermont (Knights Landing is of Silvermont "ancestry"), and 64-bit ADCX/ADOX instructions are effectively mishandled at decode time. In both cases we are looking at ~2x improvement. Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos! This is a minimalistic backport of 64d92d7.
Thanks to David Benjamin for spotting typo in Knights Landing detection!
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from #4006)
(cherry picked from commit 738a9dd)
pracj3am pushed a commit to cdn77/openssl that referenced this pull request Aug 22, 2017
"Optimize" is in quotes because it's rather a "salvage operation" for now. The idea is to identify processor capability flags that drive Knights Landing to suboptimal code paths and mask them. Two flags were identified, XSAVE and ADCX/ADOX. The former affects the choice of an AES-NI code path specific to Silvermont (Knights Landing is of Silvermont "ancestry"), and 64-bit ADCX/ADOX instructions are effectively mishandled at decode time. In both cases we are looking at ~2x improvement. Hardware used for benchmarking courtesy of Atos, experiments run by Romain Dolbeau <romain.dolbeau@atos.net>. Kudos! This is a minimalistic backport of 64d92d7.
Thanks to David Benjamin for spotting typo in Knights Landing detection!
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from openssl#4006)
Labels
approval: done
This pull request has the required number of approvals
branch: 1.0.2
Merge to OpenSSL_1_0_2-stable branch
1.1.0 and 1.0.2-specific follow-up to #3972.