Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8259065: Optimize MessageDigest.getInstance #1933

Closed
wants to merge 10 commits into from

Conversation

cl4es
Copy link
Member

@cl4es cl4es commented Jan 4, 2021

By caching default constructors used in java.security.Provider::newInstanceUtil in a ClassValue, we can reduce the overhead of allocating instances in a variety of places, e.g., MessageDigest::getInstance, without compromising thread-safety or security.

On the provided microbenchmark MessageDigest.getInstance(digesterName) improves substantially for any digesterName - around -90ns/op and -120B/op:

Benchmark                                                                  (digesterName)  Mode  Cnt     Score     Error   Units
GetMessageDigest.getInstance                                                          md5  avgt   30   293.929 ±  11.294   ns/op
GetMessageDigest.getInstance:·gc.alloc.rate.norm                                      md5  avgt   30   424.028 ±   0.003    B/op
GetMessageDigest.getInstance                                                        SHA-1  avgt   30   322.928 ±  16.503   ns/op
GetMessageDigest.getInstance:·gc.alloc.rate.norm                                    SHA-1  avgt   30   688.039 ±   0.003    B/op
GetMessageDigest.getInstance                                                      SHA-256  avgt   30   338.140 ±  13.902   ns/op
GetMessageDigest.getInstance:·gc.alloc.rate.norm                                  SHA-256  avgt   30   640.037 ±   0.002    B/op
GetMessageDigest.getInstanceWithProvider                                              md5  avgt   30   312.066 ±  12.805   ns/op
GetMessageDigest.getInstanceWithProvider:·gc.alloc.rate.norm                          md5  avgt   30   424.029 ±   0.003    B/op
GetMessageDigest.getInstanceWithProvider                                            SHA-1  avgt   30   345.777 ±  16.669   ns/op
GetMessageDigest.getInstanceWithProvider:·gc.alloc.rate.norm                        SHA-1  avgt   30   688.040 ±   0.003    B/op
GetMessageDigest.getInstanceWithProvider                                          SHA-256  avgt   30   371.134 ±  18.485   ns/op
GetMessageDigest.getInstanceWithProvider:·gc.alloc.rate.norm                      SHA-256  avgt   30   640.039 ±   0.004    B/op

Patch:

Benchmark                                                                  (digesterName)  Mode  Cnt     Score     Error   Units
GetMessageDigest.getInstance                                                          md5  avgt   30   210.629 ±   6.598   ns/op
GetMessageDigest.getInstance:·gc.alloc.rate.norm                                      md5  avgt   30   304.021 ±   0.002    B/op
GetMessageDigest.getInstance                                                        SHA-1  avgt   30   229.161 ±   8.158   ns/op
GetMessageDigest.getInstance:·gc.alloc.rate.norm                                    SHA-1  avgt   30   568.030 ±   0.002    B/op
GetMessageDigest.getInstance                                                      SHA-256  avgt   30   260.013 ±  15.032   ns/op
GetMessageDigest.getInstance:·gc.alloc.rate.norm                                  SHA-256  avgt   30   520.030 ±   0.002    B/op
GetMessageDigest.getInstanceWithProvider                                              md5  avgt   30   231.928 ±  10.455   ns/op
GetMessageDigest.getInstanceWithProvider:·gc.alloc.rate.norm                          md5  avgt   30   304.020 ±   0.002    B/op
GetMessageDigest.getInstanceWithProvider                                            SHA-1  avgt   30   247.178 ±  11.209   ns/op
GetMessageDigest.getInstanceWithProvider:·gc.alloc.rate.norm                        SHA-1  avgt   30   568.029 ±   0.002    B/op
GetMessageDigest.getInstanceWithProvider                                          SHA-256  avgt   30   265.625 ±  10.465   ns/op
GetMessageDigest.getInstanceWithProvider:·gc.alloc.rate.norm                      SHA-256  avgt   30   520.030 ±   0.003    B/op

See: https://cl4es.github.io/2021/01/04/Investigating-MD5-Overheads.html#reflection-overheads for context.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/1933/head:pull/1933
$ git checkout pull/1933

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 4, 2021

👋 Welcome back redestad! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jan 4, 2021

@cl4es The following label will be automatically applied to this pull request:

  • security

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the security security-dev@openjdk.org label Jan 4, 2021
@cl4es cl4es marked this pull request as ready for review January 5, 2021 10:14
@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 5, 2021
@mlbridge
Copy link

mlbridge bot commented Jan 5, 2021

Webrevs

…x inefficient synchronization in ProviderConfig. Store EngineDescriptor in Service instead of looking it up every time.
@cl4es
Copy link
Member Author

cl4es commented Jan 6, 2021

I refactored and optimized the lookup code further, getting rid of a number of bottlenecks:

  • Cache Constructors in Provider.Service instead of via a ClassValue.
  • Also cache the impl Class, wrap Class and Constructor in WeakReference if not loaded by the null classloader (many builtins will be)
  • Cache EngineDescription in Service, avoiding a lookup on the hot path
  • We were hitting a synchronized method in ProviderConfig.getProvider(). The provider field is volatile already, so I used the double-check idiom here to avoid synchronization on the hot path
  • ServiceKey.hashCode using Objects.hash was cause for allocation, simplified and optimized it.
Benchmark                                                      (digesterName)  Mode  Cnt     Score    Error   Units
GetMessageDigest.getInstance                                              MD5  avgt   30   143.803 ±  5.431   ns/op
GetMessageDigest.getInstance:·gc.alloc.rate.norm                          MD5  avgt   30   280.015 ±  0.001    B/op

@cl4es
Copy link
Member Author

cl4es commented Jan 6, 2021

Since much of the cost is now the creation of the MessageDigest itself, I added a microbenchmark to stat this overhead:

Benchmark                                                        (digesterName)  Mode  Cnt     Score     Error   Units
GetMessageDigest.cloneInstance                                              MD5  avgt   30   124.922 ±   5.412   ns/op
GetMessageDigest.cloneInstance:·gc.alloc.rate.norm                          MD5  avgt   30   280.015 ±   0.001    B/op

That means there's no added allocation overhead of calling MessageDigest.getInstance(digesterName) compared to cloning an existing instance - which means we get almost all of the benefits without resorting to tricks as caching and cloning an instance at call sites such as the one in UUID::nameUUIDFromBytes. The remaining 20ns/op difference should be negligible.

@cl4es cl4es changed the title 8259065: java.security.Provider should cache default constructors 8259065: Optimize MessageDigest.getInstance Jan 6, 2021
@schlosna
Copy link
Contributor

schlosna commented Jan 6, 2021

Nice speedup for all MessageDigest.getInstance and Provider.getService invocations and improves on removing the scalability bottlenecks of Provider.getService beyond JDK-7092821 (and related JDK-8080273 & JDK-8172827). This will also allow removing some longstanding MessageDigest#clone workarounds such as in Guava google/guava#1197 .

@valeriepeng
Copy link

Thanks for looking into this. I will take a look.


return md;
GetInstance.Instance instance = GetInstance.getInstance("MessageDigest",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is another Security.getImpl call inside getInstance(String algorithm, String provider) method. For consistency sake, we should update it also?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, why not. The Security.getImpl has a branch checking null provider which is pointless here since we guard against a null provider before that call, so we can reuse the same pattern here.

@@ -1106,16 +1069,15 @@ private ServiceKey(String type, String algorithm, boolean intern) {
this.algorithm = intern ? algorithm.intern() : algorithm;
}
public int hashCode() {
return Objects.hash(type, algorithm);
return type.hashCode() ^ algorithm.hashCode();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change really necessary? It's faster to compute with ^, but does the generated hash values are as distinct as using Objects.hash()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid any suspicion we'd generate a worse hash here I've reverted this to inline what would be generated going through Objects.hash: 31*31 + type.hashCode()*31 + algorithm.hashCode()

}
Class<?> clazz = null;
if (cache instanceof WeakReference<?> ref){
clazz = (ref == null) ? null : (Class<?>)ref.get();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ref should not be null if the instanceof check is true?

}
Constructor<?> con = null;
if (cache instanceof WeakReference<?> ref){
con = (ref == null) ? null : (Constructor<?>)ref.get();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ref should not be null if the instanceof check is true?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catches - these 2 null checks can be removed

Copy link
Member Author

@cl4es cl4es left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing, Valerie. I've addressed all your comments.


return md;
GetInstance.Instance instance = GetInstance.getInstance("MessageDigest",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, why not. The Security.getImpl has a branch checking null provider which is pointless here since we guard against a null provider before that call, so we can reuse the same pattern here.

@@ -1106,16 +1069,15 @@ private ServiceKey(String type, String algorithm, boolean intern) {
this.algorithm = intern ? algorithm.intern() : algorithm;
}
public int hashCode() {
return Objects.hash(type, algorithm);
return type.hashCode() ^ algorithm.hashCode();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid any suspicion we'd generate a worse hash here I've reverted this to inline what would be generated going through Objects.hash: 31*31 + type.hashCode()*31 + algorithm.hashCode()

}
Constructor<?> con = null;
if (cache instanceof WeakReference<?> ref){
con = (ref == null) ? null : (Constructor<?>)ref.get();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catches - these 2 null checks can be removed

cl4es added a commit to cl4es/jdk that referenced this pull request Jan 7, 2021
@openjdk
Copy link

openjdk bot commented Jan 7, 2021

@cl4es This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8259065: Optimize MessageDigest.getInstance

Reviewed-by: valeriep

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 64 new commits pushed to the master branch:

  • 56a354e: Merge
  • 677802d: 8258484: AIX build fails in Harfbuzz with XLC 16.01.0000.0006
  • 1973fbe: 8039278: console.sh failed Automatically with exit code 1
  • acdd90b: 8258972: unexpected compilation error with generic sealed interface
  • c1fb521: 8259227: C2 crashes with SIGFPE due to a division that floats above its zero check
  • 484e23b: 8258657: Doc build is broken by use of new language features
  • 4a478b8: 8250903: jdk/jfr/javaagent/TestLoadedAgent.java fails with Mismatch in TestEvent count
  • b996ccc: 8259373: c1 and jvmci runtime code use ResetNoHandleMark incorrectly
  • 555641e: Merge
  • 4f914e2: 8249633: doclint reports missing javadoc for JavaFX property methods that have a property description
  • ... and 54 more: https://git.openjdk.java.net/jdk/compare/a6c088131bfeec1ce44e43b513ce3f857fa6e44f...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 7, 2021
@cl4es
Copy link
Member Author

cl4es commented Jan 8, 2021

Thanks for reviewing, Valerie!

/integrate

@openjdk openjdk bot closed this Jan 8, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jan 8, 2021
@openjdk
Copy link

openjdk bot commented Jan 8, 2021

@cl4es Since your change was applied there have been 65 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

Pushed as commit fc1d2a1.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integrated Pull request has been integrated security security-dev@openjdk.org
3 participants