8276660: Scalability bottleneck in java.security.Provider.getService() #6513

Closed
wants to merge 7 commits into from

Conversation


@valeriepeng valeriepeng commented Nov 23, 2021

When running a crypto benchmark with a large number of threads, a lot of time is spent in the synchronized block inside the Provider.getService() method. The cause is that Provider.getService() first uses the 'serviceMap' field to find the requested service. However, when the requested service is not supported by this provider, e.g. requesting Cipher.RSA from the SUN provider, the implementation goes on to search the legacy registrations, whose processing is guarded by the "synchronized" keyword. When apps use getInstance() calls without the provider argument, the Provider class has to iterate through the existing providers to find one that supports the requested service.
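As a rough illustration of why unsupported services matter, the sketch below walks the installed providers and asks each one for a service, which approximates what a getInstance() call without a provider argument has to do. The class and method names are hypothetical, and the real JDK lookup path is more involved than this loop.

```java
import java.security.Provider;
import java.security.Security;

// Hypothetical helper, not part of the JDK or of this PR.
public class ProviderLookupDemo {
    // Returns the name of the first installed provider that offers the
    // requested service, or null if none does. Every provider that does
    // NOT support the service still pays for a getService() call.
    static String firstProviderFor(String type, String algorithm) {
        for (Provider p : Security.getProviders()) {
            if (p.getService(type, algorithm) != null) {
                return p.getName();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(firstProviderFor("MessageDigest", "SHA-256"));
        System.out.println(firstProviderFor("NoSuchType", "NoSuchAlgorithm"));
    }
}
```

With many threads running this kind of loop, any lock taken on the "not found" path of getService() is contended once per provider per call.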

Now that the parent class of Provider no longer synchronizes all of its methods, Provider class should follow suit and de-synchronize its methods. Parsing of the legacy registration is done eagerly (at the time of put(...) calls) instead of lazily (at the time of getService(...) calls). This also makes "legacyStrings" redundant as the registration is parsed and stored directly into "legacyMap".
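The eager-versus-lazy idea can be sketched with a much simplified registry. This is only an illustration under stated assumptions: EagerRegistry, its key format, and its String values are hypothetical and do not mirror the actual Provider fields or the parseLegacy() logic.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for a provider's legacy registrations.
public class EagerRegistry {
    // Parsed registrations keyed by upper-cased "TYPE.ALGORITHM".
    private final Map<String, String> parsed = new ConcurrentHashMap<>();

    // Parse at registration time, so lookups never need to take a lock.
    public void put(String key, String className) {
        int dot = key.indexOf('.');
        if (dot <= 0 || dot == key.length() - 1) {
            return; // not a "Type.Algorithm" entry; ignored in this sketch
        }
        parsed.put(key.toUpperCase(Locale.ENGLISH), className);
    }

    // Lock-free lookup; returns null when the service is not registered.
    public String getService(String type, String algorithm) {
        return parsed.get((type + "." + algorithm).toUpperCase(Locale.ENGLISH));
    }

    public static void main(String[] args) {
        EagerRegistry r = new EagerRegistry();
        r.put("MessageDigest.SHA-256", "example.Sha256");
        System.out.println(r.getService("messagedigest", "sha-256"));
        System.out.println(r.getService("Cipher", "RSA"));
    }
}
```

Because all parsing happens in put(), a miss in getService() costs one concurrent map lookup rather than a pass through a synchronized block.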

The bug reporter has confirmed that the changes resolve the performance bottleneck and all regression tests pass.

Please review and thanks in advance,
Valerie


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8276660: Scalability bottleneck in java.security.Provider.getService()

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/6513/head:pull/6513
$ git checkout pull/6513

Update a local copy of the PR:
$ git checkout pull/6513
$ git pull https://git.openjdk.java.net/jdk pull/6513/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 6513

View PR using the GUI difftool:
$ git pr show -t 6513

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/6513.diff

Removed the synchronized block inside Provider.getService() method which
becomes a performance bottleneck when queried with unsupported services
under high contention situations.

bridgekeeper bot commented Nov 23, 2021

👋 Welcome back valeriep! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Nov 23, 2021

openjdk bot commented Nov 23, 2021

@valeriepeng The following label will be automatically applied to this pull request:

  • security

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the security security-dev@openjdk.org label Nov 23, 2021

mlbridge bot commented Nov 23, 2021

Webrevs

if (!checkLegacy(key)) return null;

Object o = super.remove(key);
if (o != null && o instanceof String && key instanceof String) {
Member

o != null seems redundant here. o instanceof String will be false in case of o == null

Author

Yup, removed.


boolean result = super.remove(key, value);
if (result && key instanceof String && value instanceof String) {
parseLegacy((String)key, (String)value, OPType.REMOVE);
Member

I suggest using pattern matching for instanceof to avoid manual casts.

Author

Sure, will do so. Thanks for the suggestion.
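For reference, a minimal sketch of pattern matching for instanceof (available as a standard feature since Java 16) might look like the following. The class and method are hypothetical and only stand in for the key/value check discussed above. The same example also shows why a separate null check is unnecessary: null never matches an instanceof test.

```java
// Hypothetical demo class, not code from this PR.
public class InstanceofDemo {
    // Returns "key=value" only when both arguments are Strings; null otherwise.
    static String describe(Object key, Object value) {
        // k and v are bound only if both tests succeed, so no casts are needed.
        if (key instanceof String k && value instanceof String v) {
            return k + "=" + v;
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(describe("Alg.Alias.Cipher.X", "Y"));
        System.out.println(describe(null, "Y"));
    }
}
```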

if (!legacyMap.isEmpty()) {
Service s = legacyMap.get(key);
if (s != null && !s.isValid()) {
legacyMap.remove(key);
Member

I wonder if it's better to use legacyMap.remove(key, value); here. What if another thread put some other value by the same key?

Author

Sure, I can do that.
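The difference between remove(key) and remove(key, value) can be sketched as below. The concurrent update is simulated inline on a single thread so that the outcome is deterministic, and the class and method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class, not code from this PR.
public class ConditionalRemoveDemo {
    // Removes the entry only if it is still mapped to the value we observed.
    static boolean removeIfUnchanged(ConcurrentHashMap<String, String> map,
                                     String key, String seen) {
        return map.remove(key, seen);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        map.put("k", "stale");
        String seen = map.get("k");   // a reader observes the stale value
        map.put("k", "fresh");        // another thread replaces it (simulated here)
        boolean removed = removeIfUnchanged(map, "k", seen);
        System.out.println(removed + " " + map.get("k"));
    }
}
```

An unconditional remove(key) at the same point would have discarded the fresh value written by the other thread.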

if (legacyMap != null && !legacyMap.isEmpty()) {
return legacyMap.get(key);

if (!legacyMap.isEmpty()) {
Member

Looks like this check is redundant. get() will return null anyway.

Author

Hmm, I suppose so.

}
if (serviceSet == null) {
ensureLegacyParsed();
if (serviceSet == null || legacyChanged || servicesChanged) {
Member

serviceSet should be at least made volatile

Author

Sure, will fix.
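A minimal sketch of a cached, lazily rebuilt snapshot published through a volatile field is shown below. SnapshotCache and its members are hypothetical and much simpler than the Provider code. In particular, this sketch does not handle a put() that lands between building and publishing the snapshot, which could leave a stale snapshot cached.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class, not code from this PR.
public class SnapshotCache {
    private final Map<String, String> services = new ConcurrentHashMap<>();

    // volatile so that a snapshot published by one thread is visible to others.
    private volatile Set<String> snapshot;

    public void put(String key, String value) {
        services.put(key, value);
        snapshot = null; // invalidate the cached view
    }

    public Set<String> getServices() {
        Set<String> s = snapshot; // read the volatile field once
        if (s == null) {
            s = Collections.unmodifiableSet(new LinkedHashSet<>(services.values()));
            snapshot = s;
        }
        return s;
    }

    public static void main(String[] args) {
        SnapshotCache c = new SnapshotCache();
        c.put("a", "A");
        System.out.println(c.getServices());
    }
}
```

Without volatile, a reader on another thread would have no guarantee of ever seeing the newly published set or the invalidating null.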

Contributor

@wangweij wangweij left a comment

Some comments. I'm more concerned about the parseLegacy() method which is called everywhere. Without the synchronized keyword, is it safe to call into this method by multiple threads at the same time? Do we have tests around this?

// Map<ServiceKey,Service>
// used for services added via putService(), initialized on demand
private transient Map<ServiceKey,Service> serviceMap;

// For backward compatibility, the registration ordering of
// SecureRandom (RNG) algorithms needs to be preserved for
// "new SecureRandom()" calls when this provider is used
// NOTE: may need extra mechanism for providers to indicate their
// preferred ordering of SecureRandom algorithms since registration
// ordering info is lost once serialized
Contributor

Why is the ordering info lost once serialized? Weren't all entries re-added again in their original order?

Author

The serialized bytes are just the mappings, i.e. key + value pairs. There is no ordering info associated with the key + value pairs. IIRC, the particular thing about SecureRandom is that the first registration of SecureRandom is deemed to be the most preferred. However, if given only the serialized bytes, the entries are added based on the resulting order stored by the parent class, not necessarily the ordering of the initial insertion/add calls.

Contributor

I see. How about serializing prngAlgos as a list?

Author

That's one way to address this particular issue. However, it also breaks the serialization compatibility and we are somewhat bound to having 'prngAlgos' as part of the serialized fields. Since no one reported this as being an issue (yet), I wonder if it's worthwhile to address it this way and break the serialization compatibility. There are other possible approaches such as defining an alias or attribute for default SecureRandom algorithm/impl, so I just add a note here instead of serializing 'prngAlgos'.
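The ordering concern in this thread can be illustrated with a small sketch: a hash-based map of key + value pairs makes no promise about iteration order, so "first registered" has to be tracked separately if it matters. RegistrationOrderDemo and the class names passed to it are hypothetical and are not how Provider stores 'prngAlgos'.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class, not code from this PR.
public class RegistrationOrderDemo {
    // Plain mappings: iteration order is unspecified for a HashMap.
    private final Map<String, String> mappings = new HashMap<>();

    // Separate record of the order in which algorithms were registered.
    private final Set<String> registrationOrder = new LinkedHashSet<>();

    public void register(String algorithm, String className) {
        mappings.put(algorithm, className);
        registrationOrder.add(algorithm); // re-adding keeps the original position
    }

    // The first registered algorithm, or null if nothing was registered.
    public String preferred() {
        return registrationOrder.isEmpty() ? null : registrationOrder.iterator().next();
    }

    public static void main(String[] args) {
        RegistrationOrderDemo d = new RegistrationOrderDemo();
        d.register("NativePRNG", "example.NativePrng");
        d.register("DRBG", "example.Drbg");
        System.out.println(d.preferred());
    }
}
```

If only the mappings were written out and read back, the information held in registrationOrder would be gone, which is the situation described above.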

if (legacyStrings == null) {
legacyStrings = new LinkedHashMap<>();
} else {
legacyChanged = true;
}
return true;
Contributor

Better put this "return" line into the else block.

Author

Sure.

}
parseLegacy(sk, sv, OPType.REPLACE);
}
}
Contributor

If you are going through all the entries, should we also clean up the legacy sets and restart?

Author

Do you mean simply wipe out the legacyMap and just do ADD instead of REPLACE?

Contributor

Yes, since you are simply iterating through all the entries instead of only the changed ones.

Author

Ok, will do.

@valeriepeng
Author

Some comments. I'm more concerned about the parseLegacy() method which is called everywhere. Without the synchronized keyword, is it safe to call into this method by multiple threads at the same time? Do we have tests around this?

Hmm, the parseLegacy() method just processes the legacy String-to-String key/value mapping into the corresponding service object (or an update to it) and stores it in the legacy service map (now typed as ConcurrentHashMap). I don't see any particular field that would require the "synchronized" keyword. There is test/jdk/java/security/Provider/GetServiceRace.java, which tests the race between legacy put() and getService(). Not sure if it covers the scenarios you have in mind. Under most if not all usages, the legacy put() calls take place in the provider constructor, unlike the getService() calls, which may be made by different threads.

@wangweij
Contributor

Consider this case: two threads are changing a value at the same time. Since the method is not synchronized, thread1 might finish the first part of the method (super.replace) earlier than thread2, but finish the second part (parseLegacy) later than thread2. At the end, the internal entrySet has thread2's value but the legacy map has thread1's value.
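The interleaving described here can be replayed deterministically on a single thread. In the sketch below, the two maps and the stepOne/stepTwo methods are hypothetical stand-ins for the super.replace() call and the parseLegacy() call; no real threads are involved, only the order of the four steps.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not code from this PR.
public class TwoStepUpdateRace {
    static final Map<String, String> entries = new HashMap<>(); // stand-in for the parent class entries
    static final Map<String, String> legacy = new HashMap<>();  // stand-in for the legacy map

    static void stepOne(String key, String value) { entries.put(key, value); } // like super.replace
    static void stepTwo(String key, String value) { legacy.put(key, value); }  // like parseLegacy

    // Replays the described schedule; returns true if both maps agree afterwards.
    static boolean replayInterleaving() {
        entries.clear();
        legacy.clear();
        stepOne("k", "t1"); // thread1 finishes step one first
        stepOne("k", "t2"); // thread2 finishes step one second
        stepTwo("k", "t2"); // thread2 finishes step two first
        stepTwo("k", "t1"); // thread1 finishes step two last
        return entries.get("k").equals(legacy.get("k"));
    }

    public static void main(String[] args) {
        System.out.println(replayInterleaving()); // the two maps end up disagreeing
    }
}
```

Making each map individually thread-safe does not help here; the two steps would have to be performed as one atomic unit to rule this schedule out.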

@valeriepeng
Author

Consider this case: two threads are changing a value at the same time. Since the method is not synchronized, thread1 might finish the first part of the method (super.replace) earlier than thread2, but finish the second part (parseLegacy) later than thread2. At the end, the internal entrySet has thread2's value but the legacy map has thread1's value.

Well, then the synchronized keyword should be put on the public methods instead of the internal parseLegacy() method, no? Otherwise, the super.xxx() and the internal legacyMap may not be updated in sync. The public methods did have the synchronized keyword, which I removed since the putService() call isn't synchronized either (and it updates the serviceMap first and then stores the String-String mapping into super.xxx()). The main performance bottleneck is in getService(), so I can add back the "synchronized" keyword to the other public methods if you are concerned.

@wangweij
Contributor

wangweij commented Dec 1, 2021

Since all legacy registrations are done eagerly, I assume the original ensureLegacyParsed() method should be super fast now. Maybe we don't need to change any synchronized keyword.

Set<Service> set = new LinkedHashSet<>();
if (!serviceMap.isEmpty()) {
set.addAll(serviceMap.values());
}
if (legacyMap != null && !legacyMap.isEmpty()) {
if (!legacyMap.isEmpty()) {
set.addAll(legacyMap.values());
}
serviceSet = Collections.unmodifiableSet(set);
Contributor

Will you reset legacyChanged as well here?

Author

Yes, good catch!

private static final String ALIAS_PREFIX = "Alg.Alias.";
private static final String ALIAS_PREFIX_LOWER = "alg.alias.";
private static final int ALIAS_LENGTH = ALIAS_PREFIX.length();

private void parseLegacyPut(String name, String value) {
private static enum OPType {
ADD, REMOVE, REPLACE;
Contributor

I can see that REPLACE can also be used for adding things, for example, it's always used by implPut(). Do you want to add a comment on this?

Author

Hmm, on second thought, I am inclined to consolidate ADD and REPLACE. Probably clearer this way. No comment needed if just keeping one.

Contributor

@wangweij wangweij left a comment

Looks good.


openjdk bot commented Dec 8, 2021

@valeriepeng This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8276660: Scalability bottleneck in java.security.Provider.getService()

Reviewed-by: weijun

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 228 new commits pushed to the master branch:

  • fb6d611: 8278276: G1: Refine naming of G1GCParPhaseTimesTracker::_must_record
  • d7ad546: 8276422: Add command-line option to disable finalization
  • ec7cb6d: 8276447: Deprecate finalization-related methods for removal
  • 3c2951f: 8275771: JDK source code contains redundant boolean operations in jdk.compiler and langtools
  • 3d61372: 8278363: Create extented container test groups
  • 716c2e1: 8278368: ProblemList tools/jpackage/share/MultiNameTwoPhaseTest.java on macosx-x64
  • a8a1fbc: 8278068: Fix next-line modifier (snippet markup)
  • 061017a: 8273175: Add @since tags to the DocTree.Kind enum constants
  • d7c283a: 8275233: Incorrect line number reported in exception stack trace thrown from a lambda expression
  • 3955b03: 8277328: jdk/jshell/CommandCompletionTest.java failures on Windows
  • ... and 218 more: https://git.openjdk.java.net/jdk/compare/05a9a51dbfc46eb52bc28f1f9a618c75ee2597e9...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 8, 2021
@valeriepeng
Author

/integrate


openjdk bot commented Dec 8, 2021

Going to push as commit 9b74749.
Since your change was applied there have been 244 commits pushed to the master branch:

  • 2478158: 8277361: java/nio/channels/Channels/ReadXBytes.java fails with OOM error
  • 8af3b27: 8277850: C2: optimize mask checks in counted loops
  • 3e93e0b: 8276769: -Xshare:auto should tolerate problems in the CDS archive
  • 79165b7: 8278324: Update the --generate-cds-archive jlink plugin usage message
  • 40d726b: 8278310: Improve logging in CDS DynamicLoaderConstraintsTest.java
  • e4852c6: 8277998: runtime/cds/appcds/loaderConstraints/DynamicLoaderConstraintsTest.java#custom-cl-zgc failed "assert(ZAddress::is_marked(addr)) failed: Should be marked"
  • 37921e3: 8269258: java/net/httpclient/ManyRequestsLegacy.java failed with connection timeout
  • fd8cb2d: 8278346: java/nio/file/Files/probeContentType/Basic.java fails on Linux SLES15 machine
  • e5cb84e: 8278336: Use int64_t to represent byte quantities consistently in JfrObjectAllocationSample
  • 54993b1: 8278309: [windows] use of uninitialized OSThread::_state
  • ... and 234 more: https://git.openjdk.java.net/jdk/compare/05a9a51dbfc46eb52bc28f1f9a618c75ee2597e9...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Dec 8, 2021
@openjdk openjdk bot added the integrated Pull request has been integrated label Dec 8, 2021
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Dec 8, 2021

openjdk bot commented Dec 8, 2021

@valeriepeng Pushed as commit 9b74749.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@valeriepeng valeriepeng deleted the JDK-8276660 branch December 8, 2021 17:57

Gubaidan commented Jul 25, 2022

Consider this case: two threads are changing a value at the same time. Since the method is not synchronized, thread1 might finish the first part of the method (super.replace) earlier than thread2, but finish the second part (parseLegacy) later than thread2. At the end, the internal entrySet has thread2's value but the legacy map has thread1's value.

Well, then the synchronized keyword should be put on the public methods instead of the internal parseLegacy() method, no? Otherwise, the super.xxx() and the internal legacyMap may not be updated in sync. The public methods did have the synchronized keyword, which I removed since the putService() call isn't synchronized either (and it updates the serviceMap first and then stores the String-String mapping into super.xxx()). The main performance bottleneck is in getService(), so I can add back the "synchronized" keyword to the other public methods if you are concerned.

Can you tell me which version of the JDK fixes these bugs?

@wangweij
Contributor

Can you tell me which version of the JDK fixes these bugs?

You mean this bug? Click on the "Issue" link and you can see that it was included in JDK 18.

serviceSet = null;
}
if (serviceSet == null) {
ensureLegacyParsed();
Contributor

@zapster zapster Aug 19, 2022

Hi @valeriepeng! I believe that with this change, getServices() will return invalid legacy services. Before we called ensureLegacyParsed(), which eventually called removeInvalidServices(). In getService(String, String), we are now explicitly checking for isValid() to keep the old behavior. Shouldn't we do something similar here as well? Am I missing something or is this an intended change?
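One way to picture the behavior being asked for is a snapshot that skips invalid entries, as in the sketch below. This is only an illustration of the idea raised in this comment: the Entry record and validNames() are hypothetical, and the actual fix is tracked separately from this PR.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not code from this PR or from its follow-up.
public class ValidOnlySnapshot {
    // Stand-in for a legacy service entry with a validity flag.
    record Entry(String name, boolean valid) {}

    // Collects only the valid entries, preserving their original order.
    static Set<String> validNames(List<Entry> legacyEntries) {
        Set<String> out = new LinkedHashSet<>();
        for (Entry e : legacyEntries) {
            if (e.valid()) {
                out.add(e.name());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Entry> entries = List.of(new Entry("A", true), new Entry("B", false));
        System.out.println(validNames(entries));
    }
}
```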

Author

Hmm, could be. Let me check into it and I will have to file a separate bug to address this since the changes have already been integrated. Thanks for the comments.

Contributor

Thanks for looking into this! Please let me know if you open a new bug so we can track it.

Author

Filed https://bugs.openjdk.org/browse/JDK-8292739
A PR should be out in a day or two.

Labels
integrated Pull request has been integrated security security-dev@openjdk.org
5 participants