Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8274524: SSLSocket.close() hangs if it is called during the ssl handshake #7432

Closed
wants to merge 6 commits into from

Conversation

alexeybakhtin
Copy link
Contributor

@alexeybakhtin alexeybakhtin commented Feb 10, 2022

Please review the patch for the JDK-8274524

SSLSocket.close() could cause an intermittent hang of the socket read operation. It happens in case of SO_TIMEOUT is set to 0 (infinite timeout).
SSLSocket.close() reads from the socket as part of the skip() operation to prevent TCP Connection reset (see JDK-8268965). Socket reads are performed in a loop for small chunks. These read operations could cause a deadlock, in case of SO_TIMEOUT = 0
I suggest to force non-zero SO_TIMEOUT during the skip() operation to prevent such deadlock

This is a second iteration of review. Previous PR was closed without integration: #5760


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8274524: SSLSocket.close() hangs if it is called during the ssl handshake

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/7432/head:pull/7432
$ git checkout pull/7432

Update a local copy of the PR:
$ git checkout pull/7432
$ git pull https://git.openjdk.java.net/jdk pull/7432/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 7432

View PR using the GUI difftool:
$ git pr show -t 7432

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/7432.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Feb 10, 2022

👋 Welcome back abakhtin! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Feb 10, 2022
@openjdk
Copy link

openjdk bot commented Feb 10, 2022

@alexeybakhtin The following label will be automatically applied to this pull request:

  • security

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the security security-dev@openjdk.org label Feb 10, 2022
@mlbridge
Copy link

mlbridge bot commented Feb 10, 2022

Webrevs

try {
if (soTimeout == 0)
setSoTimeout(DEFAULT_SKIP_TIMEOUT);
Copy link
Member

@XueleiFan XueleiFan Feb 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This set will impact the socket overall behavior unexpectedly, not just the close() method.

Maybe, an input stream level synchronization is missed in the SSLSocketInputRecord method, so that logic of deplete could be interrupted.

Copy link
Contributor Author

@alexeybakhtin alexeybakhtin Feb 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @XueleiFan,
Not quite sure What could be wrong with setting SO_TIMEOUT during the socketClose() operation. We still close the socket and SO_TIMEOUT will be restored just after skip() operation is completed. These changes could affect the parallel read of the handshake records (we hold appInput.readLock during the skip) but we still close the socket and interrupt connection in this case.
If you still think these changes are incorrect, What do you think about the most first version of the patch: d1c2a4f
This version adds a new lock in the SSLSocketInputRecord and protects the parallel execution of the read and skip operations.

Copy link
Member

@XueleiFan XueleiFan Feb 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand the issue correctly, the problem is about that the skip method cannot get the right length if the input stream could be accessed in another thread.

  1. get the available input stream bytes length;
  2. a third thread read the input stream, and the input stream get changed;
  3. skip up to the bytes length in step 1.
  4. the skip() method hangs on waiting for more bytes.

I think we could focus on on address the synchronization problem, because the problem could result in other weird behaviors we don't know yet.

For the SO_TIMEOUT, I think it is a good workaround. But I'm not sure if it will impact other behaviors or not. Besides, I know you are trying to use a small timeout, but it is still blocked and it not easy to find a number fit all situation that does not impact the performance.

As I commented in the 1st version, I'm not sure of the locks logic in the patch, which changes the behavior of SSLSocket.

I may suggest to an input stream level synchronization. I know the update could take a while, and may not be able to integrate in time. If you want to fix the issue as soon as possible, I'm OK to move ahead with your current direction, but please set the SO_TIMEOUT to as minimal as possible, for example 1 ms; and have a comment that this is a temporary/workaround solution.

Copy link
Contributor Author

@alexeybakhtin alexeybakhtin Feb 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you again for your detailed response and comments.
Your assumption about this issue is right and I think SO_TIMEOUT should be an acceptable solution.
I've changed DEFAULT_SKIP_TIMEOUT to 1 ms and added comments about a temporary workaround.
If you don't mind, I'd like to commit it asap because this patch should be backported to the early versions.

No regressions were found on the sun/security/ssl tests.

@openjdk
Copy link

openjdk bot commented Feb 11, 2022

@alexeybakhtin This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8274524: SSLSocket.close() hangs if it is called during the ssl handshake

Reviewed-by: xuelei

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 1639 new commits pushed to the master branch:

  • 83ffbd2: 8277175: Add a parallel multiply method to BigInteger
  • 0786ddb: 8281535: Create a regression test for JDK-4670051
  • c5ff6e4: 8223077: module path support for dynamic CDS archive
  • 8886839: 8281622: JFR: Improve documentation of jdk.jfr.Relational
  • e75e8cd: 8279997: check_for_dynamic_dump should not exit vm
  • e73ee0c: 8281259: MutableBigInteger subtraction could be simplified
  • f399ae5: 8202836: [macosx] test java/awt/Graphics/TextAAHintsTest.java fails
  • 4ff5824: 8281100: Spurious "variable might not have been initialized" with sealed class switch
  • d254cf2: 8281638: jfr/event/allocation tests fail with release VMs after JDK-8281318 due to lack of -XX:+UnlockDiagnosticVMOptions
  • 4d64076: 8047749: javadoc for getPathBounds() in TreeUI and BasicTreeUI is incorrect
  • ... and 1629 more: https://git.openjdk.java.net/jdk/compare/79cebe2c1b1e7f43377633b62c970528cac0a786...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Feb 11, 2022
@alexeybakhtin
Copy link
Contributor Author

alexeybakhtin commented Feb 12, 2022

/integrate

@openjdk
Copy link

openjdk bot commented Feb 12, 2022

Going to push as commit 58dae60.
Since your change was applied there have been 1643 commits pushed to the master branch:

  • aa918a6: 8281033: Improve ImageCheckboxTest to test all available LaF
  • 6fdfe04: 8281674: tools/javac/annotations/typeAnnotations/classfile/AnonymousExtendsTest.java fails with AssertionError
  • c3179a8: 8281462: Annotation toString output for enum not reusable for source input
  • 4032fe7: 8281238: TYPE_USE annotations not printed in correct position in toString output
  • 83ffbd2: 8277175: Add a parallel multiply method to BigInteger
  • 0786ddb: 8281535: Create a regression test for JDK-4670051
  • c5ff6e4: 8223077: module path support for dynamic CDS archive
  • 8886839: 8281622: JFR: Improve documentation of jdk.jfr.Relational
  • e75e8cd: 8279997: check_for_dynamic_dump should not exit vm
  • e73ee0c: 8281259: MutableBigInteger subtraction could be simplified
  • ... and 1633 more: https://git.openjdk.java.net/jdk/compare/79cebe2c1b1e7f43377633b62c970528cac0a786...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Feb 12, 2022
@openjdk openjdk bot closed this Feb 12, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Feb 12, 2022
@openjdk
Copy link

openjdk bot commented Feb 12, 2022

@alexeybakhtin Pushed as commit 58dae60.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integrated Pull request has been integrated security security-dev@openjdk.org
2 participants