Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8241248: NullPointerException in sun.security.ssl.HKDF.extract(HKDF.java:93) #3664

Closed
wants to merge 5 commits into from

Conversation

alexeybakhtin
Copy link
Contributor

@alexeybakhtin alexeybakhtin commented Apr 23, 2021

Hello All,

Could you please review the fix for the JDK-8241248?
The issue happens during the TLSv1.3 handshake without server stateless session resumption in case of server receives several parallel requests with the same pre_shared_key.
The main idea of the fix is to remove resuming session from the session cache in the early stage.

JBS: https://bugs.openjdk.java.net/browse/JDK-8241248
Webrev 8u: http://cr.openjdk.java.net/~abakhtin/8241248/webrev.v0/

The test from the bug report using OpenSSL is passed ( -Djdk.tls.server.enableSessionTicketExtension=false )
javax/net/ssl and sun/security/ssl jtreg tests passed

Regards
Alexey


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8241248: NullPointerException in sun.security.ssl.HKDF.extract(HKDF.java:93)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/3664/head:pull/3664
$ git checkout pull/3664

Update a local copy of the PR:
$ git checkout pull/3664
$ git pull https://git.openjdk.java.net/jdk pull/3664/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 3664

View PR using the GUI difftool:
$ git pr show -t 3664

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/3664.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Apr 23, 2021

👋 Welcome back abakhtin! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Apr 23, 2021
@openjdk
Copy link

openjdk bot commented Apr 23, 2021

@alexeybakhtin The following label will be automatically applied to this pull request:

  • security

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the security security-dev@openjdk.org label Apr 23, 2021
@mlbridge
Copy link

mlbridge bot commented Apr 23, 2021

Webrevs

@normanmaurer
Copy link

normanmaurer commented Apr 26, 2021

Hi @normanmaurer, thanks for making a comment in an OpenJDK project!

All comments and discussions in the OpenJDK Community must be made available under the OpenJDK Terms of Use. If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please Use "Add GitHub user normanmaurer for the summary.

If you are not an OpenJDK Author, Committer or Reviewer, simply check the box below to accept the OpenJDK Terms of Use for your comments.

Your comment will be automatically restored once you have accepted the OpenJDK Terms of Use.

@jnimeh
Copy link
Member

jnimeh commented Apr 27, 2021

I think this looks good. Thank you.
Since you've done all the work on this one, it seems fitting that you'd become the owner of the issue in JBS. Also this might be a noreg-hard candidate since the failure is intermittent and requires putting load on a server in order to run into the issue, correct?

@openjdk
Copy link

openjdk bot commented Apr 27, 2021

@alexeybakhtin This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8241248: NullPointerException in sun.security.ssl.HKDF.extract(HKDF.java:93)

Reviewed-by: jnimeh, xuelei

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 265 new commits pushed to the master branch:

  • 0f925d1: 8266015: Implement AdapterHandlerLibrary lookup fast-path for common adapters
  • 69b96f9: 8266610: Method RandomAccessFile#length() returns 0 for block devices on linux.
  • 9b76955: 8266249: javax/swing/JPopupMenu/7156657/bug7156657.java fails on macOS
  • 3af4efd: 8265291: Error in Javadoc for doAccessibleAction API in AccessibleJSlider class
  • be4f25b: 8266369: (se) Add wepoll based Selector
  • ff77ca8: 8266675: Optimize IntHashTable for encapsulation and ease of use
  • 04fad70: 8266765: [BACKOUT] JDK-8255493 Support for pre-generated java.lang.invoke classes in CDS dynamic archive
  • 0790e60: 8196743: jstatd doesn't see new Java processes inside Docker container
  • c6aa8f1: 8232644: bugs in serialized-form.html
  • b5b3119: 8266589: (fs) Improve performance of Files.copy() on macOS using copyfile(3)
  • ... and 255 more: https://git.openjdk.java.net/jdk/compare/ac23870186e580079928cabf56efe6aa7d51ed63...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@jnimeh, @XueleiFan) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Apr 27, 2021
if (s == null &&
requestedId.identity.length > SessionId.MAX_LENGTH &&
sessionCache.statelessEnabled()) {
synchronized (sessionCache) {
Copy link
Member

@XueleiFan XueleiFan Apr 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you have a test if there is a performance regression by introducing the synchronization here?

I have to check the engineGetServerSessionContext() specification and implementation (if the sessionCache is a reference?), and the session cache implementation to make sure the synchronization works (if the synchronization on sessionCache is the same as the synchronization on the cache internal implementation) . Maybe, the improvement could in the cache implementation for readability?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I’ve made a test that calculates total time spent by server to receive "N" connections. Every server handshake is performed in a separate thread
The client starts "T" threads. Every thread sends one initial connection and "R-1" renegotiated connections. So, the total number of connections is "N"="T"*"R"

Results for the original and JDK-8241248 code are almost identical:
"T"=10 "R"=100
Original OpenJDK: 1140 ms
JDK-8241248 code: 1090 ms

"T"=50 "R"=100
Original OpenJDK: 1164 ms
JDK-8241248 code: 1108 ms

"T"=60 "R"=100
Original OpenJDK: 1461 ms
JDK-8241248 code: 1579 ms

"T"=70 "R"=100
Original OpenJDK: 2058 ms
JDK-8241248 code: 1999 ms

"T"=80 "R"=100
Original OpenJDK: 2148 ms
JDK-8241248 code: 2125 ms

"T"=90 "R"=100
Original OpenJDK: 2540 ms
JDK-8241248 code: 2514 ms

"T"=90 "R"=100
Original OpenJDK: 3011 ms
JDK-8241248 code: 3010 ms

I can confirm that the synchronization code works. Without it, I still catch exception from different threads trying to resume the same session from the cache

Copy link
Member

@XueleiFan XueleiFan Apr 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the update.

I also expect the code easy to read and maintain in the future. But please go ahead for the integration if you don't want to make the update now. We could file an enhancement later on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were some changes that @ascarpino did for RSA blinding to reduce lock contention within that cache. It doesn't look like we're running into a hot lock here based on the numbers above, but it might be worth looking at his solution down the road when we move synchronization into the Cache object.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@XueleiFan , @jnimeh, thank you for your suggestions. I have updated synchronization using ReentrantLock in the Cache object. The performance test showed no issues.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@XueleiFan , @jnimeh, thank you for your suggestions. I have updated synchronization using ReentrantLock in the Cache object. The performance test showed no issues.

Thank you for considering use ReentrantLock, the "synchronized" keyword is something we are trying to avoid in the JSSE implementation. However, the reentrant lock introduced is not the lock in the Cache implementation. They are overlapped and could be conflict. The performance test looks good with just a few threads, but I'm not very sure of the through put impact if the threads goes up to 1M or more (there were complaints previously for large volume traffics).

Maybe, you could keep it simple by adding a pull() method in the Cache class (find the session and remove it from the cache if found), by using the Cache class internal synchronize methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, simple pull() can not be used in this case. We have to check if the session found in the cache can be rejoined with parameters received in ClientHello and server context. Only rejoinable sessions should be removed from the session cache.
It is possible to use simple pull() and restore session in the cache if the session is not rejoinable, but I do not like this approach. Also, it will require extending Cache with get/setExpirationTime methods.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, simple pull() can not be used in this case. We have to check if the session found in the cache can be rejoined with parameters received in ClientHello and server context. Only rejoinable sessions should be removed from the session cache.

For TLS 1.3, I think it may be safe to remove the session from the cache even if it is no rejoinable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@XueleiFan Thank you for suggestion. The patch is updated using a simple pull()

@alexeybakhtin
Copy link
Contributor Author

I think this looks good. Thank you.
Since you've done all the work on this one, it seems fitting that you'd become the owner of the issue in JBS. Also this might be a noreg-hard candidate since the failure is intermittent and requires putting load on a server in order to run into the issue, correct?

@jnimeh ,
Thank you for review. I've assigned it to myself and added "noreg-hard" label as you suggested. The issue is intermittent and I don't know how to reproduce it without third-party libs.

Copy link
Member

@XueleiFan XueleiFan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for take my comments. The code looks nice to me, except a few trivial comments.

Comment on lines 181 to 183
if (id != null)
return sessionCache.pull(new SessionId(id));
return null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this is an internal method, it should be safe to assume that the id is non-null. I'm fine if you want to keep the non-null check, but please use braces for if-statement (see also https://www.oracle.com/java/technologies/javase/codeconventions-statements.html#449).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. Added braces but I'd like to keep non-null check.

if (entry == null) {
return null;
}
V value;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may add a blank line before this line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, added

Comment on lines 430 to 442
V value;
long time = (lifetime == 0) ? 0 : System.currentTimeMillis();
if (entry.isValid(time)) {
value = entry.getValue();
} else {
if (DEBUG) {
System.out.println("Ignoring expired entry");
}
value = null;
}
entry.invalidate();
return value;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may adjust the lines a little bit so as to avoid duplicated operations (see the implementation code of isValid()).

         long time = (lifetime == 0) ? 0 : System.currentTimeMillis();
         if (entry.isValid(time)) {
              V value = entry.getValue();
              entry.invalidate();
              return value;
          } else {
              if (DEBUG) {
                  System.out.println("Ignoring expired entry");
              }
              return null;
          }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to keep my code as-is. We still need invalidate() if entry is not valid (see remove() operation).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you notice that entry.isValid() implementation has already call invalidate() if the entry is not valid?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't. You are right. Updated code as you suggested. Thank you.

@@ -374,7 +374,7 @@ public void consume(ConnectionContext context,
for (PskIdentity requestedId : pskSpec.identities) {
// If we are keeping state, see if the identity is in the cache
if (requestedId.identity.length == SessionId.MAX_LENGTH) {
s = sessionCache.get(requestedId.identity);
s = sessionCache.pull(requestedId.identity);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you please add a comment here so as we know why a pull method could be used here? For example:

-                    // If we are keeping state, see if the identity is in the cache
+                    // If we are keeping state, see if the identity is in the
+                    // cache.  Note that for TLS 1.3, we would also clean
+                    // up the cached session if it is not rejoinable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, added comments as suggested.

Copy link
Member

@XueleiFan XueleiFan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thank you!

@alexeybakhtin
Copy link
Contributor Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label May 10, 2021
@openjdk
Copy link

openjdk bot commented May 10, 2021

@alexeybakhtin
Your change (at version 3f31dc2) is now ready to be sponsored by a Committer.

@VladimirKempik
Copy link

/sponsor

@openjdk openjdk bot closed this May 10, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed sponsor Pull request is ready to be sponsored ready Pull request is ready to be integrated rfr Pull request is ready for review labels May 10, 2021
@openjdk
Copy link

openjdk bot commented May 10, 2021

@VladimirKempik @alexeybakhtin Since your change was applied there have been 265 commits pushed to the master branch:

  • 0f925d1: 8266015: Implement AdapterHandlerLibrary lookup fast-path for common adapters
  • 69b96f9: 8266610: Method RandomAccessFile#length() returns 0 for block devices on linux.
  • 9b76955: 8266249: javax/swing/JPopupMenu/7156657/bug7156657.java fails on macOS
  • 3af4efd: 8265291: Error in Javadoc for doAccessibleAction API in AccessibleJSlider class
  • be4f25b: 8266369: (se) Add wepoll based Selector
  • ff77ca8: 8266675: Optimize IntHashTable for encapsulation and ease of use
  • 04fad70: 8266765: [BACKOUT] JDK-8255493 Support for pre-generated java.lang.invoke classes in CDS dynamic archive
  • 0790e60: 8196743: jstatd doesn't see new Java processes inside Docker container
  • c6aa8f1: 8232644: bugs in serialized-form.html
  • b5b3119: 8266589: (fs) Improve performance of Files.copy() on macOS using copyfile(3)
  • ... and 255 more: https://git.openjdk.java.net/jdk/compare/ac23870186e580079928cabf56efe6aa7d51ed63...master

Your commit was automatically rebased without conflicts.

Pushed as commit 1603ca2.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integrated Pull request has been integrated security security-dev@openjdk.org
5 participants