New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8276660: Scalability bottleneck in java.security.Provider.getService() #6513
Conversation
Removed the synchronized block inside Provider.getService() method which becomes a performance bottleneck when queried with unsupported services under high contention situations.
👋 Welcome back valeriep! A progress list of the required criteria for merging this PR into |
@valeriepeng The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
if (!checkLegacy(key)) return null; | ||
|
||
Object o = super.remove(key); | ||
if (o != null && o instanceof String && key instanceof String) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
o != null
seems redundant here. o instanceof String
will be false in case of o == null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, removed.
|
||
boolean result = super.remove(key, value); | ||
if (result && key instanceof String && value instanceof String) { | ||
parseLegacy((String)key, (String)value, OPType.REMOVE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to use pattern matching for instanceof to avoid manual casts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will do so. Thanks for the suggestion.
if (!legacyMap.isEmpty()) { | ||
Service s = legacyMap.get(key); | ||
if (s != null && !s.isValid()) { | ||
legacyMap.remove(key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it's better to use legacyMap.remove(key, value);
here. What if another thread put some other value by the same key?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can do that.
if (legacyMap != null && !legacyMap.isEmpty()) { | ||
return legacyMap.get(key); | ||
|
||
if (!legacyMap.isEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this check is redundant. get() will return null
anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I suppose so.
} | ||
if (serviceSet == null) { | ||
ensureLegacyParsed(); | ||
if (serviceSet == null || legacyChanged || servicesChanged) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
serviceSet
should be at least made volatile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments. I'm more concerned about the parseLegacy()
method which is called everywhere. Without the synchronized keyword, is it safe to call into this method by multiple threads at the same time? Do we have tests around this?
// Map<ServiceKey,Service> | ||
// used for services added via putService(), initialized on demand | ||
private transient Map<ServiceKey,Service> serviceMap; | ||
|
||
// For backward compatibility, the registration ordering of | ||
// SecureRandom (RNG) algorithms needs to be preserved for | ||
// "new SecureRandom()" calls when this provider is used | ||
// NOTE: may need extra mechanism for providers to indicate their | ||
// preferred ordering of SecureRandom algorithms since registration | ||
// ordering info is lost once serialized |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the ordering info lost once serialized? Weren't all entries re-added again in their original order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The serialized bytes are just the mappings, i.e. key + value pairs. There are no ordering info associated with the key + value pair. IIRC, the particular thing about SecureRandom is that the first registration of SecureRandom is deemed to be the most preferred. However, if given only the serialized bytes, the entries are added based on the resulting order stored by the parent class, not necessarily the ordering of the initial insertion/add calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. How about serializing prngAlgos
as a list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's one way to address this particular issue. However, it also breaks the serialization compatibility and we are somewhat bound to having 'prngAlgos' as part of the serialized fields. Since no one reported this as being an issue (yet), I wonder if it's worthwhile to address it this way and break the serialization compatibility. There are other possible approaches such as defining an alias or attribute for default SecureRandom algorithm/impl, so I just add a note here instead of serializing 'prngAlgos'.
if (legacyStrings == null) { | ||
legacyStrings = new LinkedHashMap<>(); | ||
} else { | ||
legacyChanged = true; | ||
} | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better put this "return" line into the else block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
} | ||
parseLegacy(sk, sv, OPType.REPLACE); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are going through all the entries, should we also clean up the legacy sets and restart?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean simply wipe out the legacyMap and just do ADD instead of REPLACE?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, since you are simply iterating through all the entries instead of only the changed ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, will do.
Hmm, the parseLegacy() method just processes the key/value legacy String-String mapping into corresponding service object (or its update) and store into the legacy service object (now typed as ConcurrentHashMap). I don't see any particular field which would require the "synchronized" keyword. There is test/jdk/java/security/Provider/GetServiceRace.java testing the legacy put() and getService() race condition. Not sure if it covers the scenarios you have in mind. Under most if not all usages, the legacy put() calls take place in provider constructor, unlike the getService() calls which may be called by different threads. |
Consider this case, two threads are changing a value at the same time. Since the method is not synchonized, thread1 might finish the first part of the method ( |
Well, then the synchronized keyword should be put on the public methods instead of the internal parseLegacy() method, no? Otherwise, the super.xxx() and the internal legacyMap may not updated in sync. The public methods did have the synchronized keywords which I removed since the putService() call isn't synchronized either (and it updates the serviceMap first and then stores the String-String mapping into super.xxx()). The main performance bottleneck is in getService(), so I can add back the "synchronized" keywords to other public methods if you are concerned. |
Since all legacy registration are done eagerly, I assume the original |
Set<Service> set = new LinkedHashSet<>(); | ||
if (!serviceMap.isEmpty()) { | ||
set.addAll(serviceMap.values()); | ||
} | ||
if (legacyMap != null && !legacyMap.isEmpty()) { | ||
if (!legacyMap.isEmpty()) { | ||
set.addAll(legacyMap.values()); | ||
} | ||
serviceSet = Collections.unmodifiableSet(set); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will you reset legacyChanged
as well here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good catch!
private static final String ALIAS_PREFIX = "Alg.Alias."; | ||
private static final String ALIAS_PREFIX_LOWER = "alg.alias."; | ||
private static final int ALIAS_LENGTH = ALIAS_PREFIX.length(); | ||
|
||
private void parseLegacyPut(String name, String value) { | ||
private static enum OPType { | ||
ADD, REMOVE, REPLACE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see that REPLACE
can also be used for adding things, for example, it's always used by implPut()
. Do you want to add a comment on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, on second thought, I am inclined to consolidate ADD and REPLACE. Probably clearer this way. No comment needed if just keeping one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
@valeriepeng This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 228 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
/integrate |
Going to push as commit 9b74749.
Your commit was automatically rebased without conflicts. |
@valeriepeng Pushed as commit 9b74749. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
Can you tell me which version of jdk fixes these bugs |
You mean this bug? Click on the "Issue" link and you can it was included in JDK 18. |
serviceSet = null; | ||
} | ||
if (serviceSet == null) { | ||
ensureLegacyParsed(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @valeriepeng! I believe that with this change, getServices()
will return invalid legacy services. Before we called ensureLegacyParsed()
, which eventually called removeInvalidServices()
. In getService(String, String)
, we are now explicitly checking for isValid()
to keep the old behavior. Shouldn't we do something similar here as well? Am I missing something or is this an intended change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, could be. Let me check into it and I will have to file a separate bug to address this since the changes have already been integrated. Thanks for the comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking into this! Please let me if you open a new bug so we can track it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed https://bugs.openjdk.org/browse/JDK-8292739
A PR should be out in a day or two.
It is observed that when running crypto benchmark with large number of threads, a lot of time is spent on the synchronized block inside the Provider.getService() method. The cause for this is that Provider.getService() method first uses the 'serviceMap' field to find the requested service. However, when the requested service is not supported by this provider, e.g. requesting Cipher.RSA from SUN provider, the impl continues to try searching the legacy registrations whose processing is guarded by the "synchronized" keyword. When apps use getInstance() calls without the provider argument, Provider class has to iterate through existing providers trying to find one that supports the requested service.
Now that the parent class of Provider no longer synchronizes all of its methods, Provider class should follow suit and de-synchronize its methods. Parsing of the legacy registration is done eagerly (at the time of put(...) calls) instead of lazily (at the time of getService(...) calls). This also makes "legacyStrings" redundant as the registration is parsed and stored directly into "legacyMap".
The bug reporter has confirmed that the changes resolve the performance bottleneck and all regression tests pass.
Please review and thanks in advance,
Valerie
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/6513/head:pull/6513
$ git checkout pull/6513
Update a local copy of the PR:
$ git checkout pull/6513
$ git pull https://git.openjdk.java.net/jdk pull/6513/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 6513
View PR using the GUI difftool:
$ git pr show -t 6513
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/6513.diff