Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8237578: JDK-8214339 (SSLSocketImpl wraps SocketException) appears to not be fully fixed #1968

Closed

Conversation

cliveverghese
Copy link
Contributor

@cliveverghese cliveverghese commented Jan 6, 2021

This PR aims to revert some more cases where SocketExceptions are improperly being wrapped as SSLException. Some work for this was done in JDK-8235263, but that change did not cover all the cases.

As it was mentioned in JDK-8235263, some applications rely on receiving SocketException to decide if the connection should be retried. An example of this would be Apache HTTP client. This PR should ideally fix https://issues.apache.org/jira/browse/HTTPCLIENT-2032


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8237578: JDK-8214339 (SSLSocketImpl wraps SocketException) appears to not be fully fixed

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/1968/head:pull/1968
$ git checkout pull/1968

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 6, 2021

👋 Welcome back cverghese! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 6, 2021
@openjdk
Copy link

openjdk bot commented Jan 6, 2021

@cliveverghese The following label will be automatically applied to this pull request:

  • security

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the security security-dev@openjdk.org label Jan 6, 2021
@mlbridge
Copy link

mlbridge bot commented Jan 6, 2021

Webrevs

Copy link
Member

@XueleiFan XueleiFan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for take care of this issue. Looks good to me.

@openjdk
Copy link

openjdk bot commented Jan 6, 2021

@cliveverghese This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8237578: JDK-8214339 (SSLSocketImpl wraps SocketException) appears to not be fully fixed

Reviewed-by: xuelei, simonis

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 71 new commits pushed to the master branch:

  • 1bd015f: 8258407: Split up CompileJavaModules.gmk into make/modules/$M/Java.gmk
  • 2354882: 8250768: javac should be adapted to changes in JEP 12
  • 18a37f9: 8259368: Zero: UseCompressedClassPointers does not depend on UseCompressedOops
  • a03e22b: 8253910: UseCompressedClassPointers depends on UseCompressedOops in vmError.cpp
  • e0d748d: 8258606: os::print_signal_handlers() should resolve the function name of the handlers
  • bd34418: 8258445: Move make/templates to make/data
  • d21a0ea: 8258449: Move make/hotspot/symbols to make/data
  • 3974fd4: 8259451: Zero: skip serviceability/sa tests, set vm.hasSA to false
  • bb247b0: 8259392: Zero error reporting is broken after JDK-8255711
  • 2806bf2: 8259475: Fix bad merge in compilerOracle
  • ... and 61 more: https://git.openjdk.java.net/jdk/compare/80544e4d5f402f67171eb5aca11897f1c5c3d431...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@XueleiFan, @simonis) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 6, 2021
@michael-o
Copy link

michael-o commented Jan 7, 2021

Apache HttpComponents PMC here, thanks for the follow up!

@cliveverghese
Copy link
Contributor Author

While looking into the TrustTrustedCert.java test, we found out this change may cause the test to fails intermittently. This is because it's is not expecting a SocketException in Line 134. The root cause for this intermittent failure seems to be that when the server processes the the ClientHello message, it iterates through a list of producers to create the response message. These response messages are written to the socket by each producer, and the client can start processing these messages before all of them are sent.

If one of the earlier messages causes the client to reject the TLS handshake (in this example, a bad certificate), the client will send a TLS fatal alert and close the socket. In most scenarios, by the time the TLS fatal alert is received by the server, all the pending TLS handshake messages would´ve been written into the socket, but if the client response is fast enough, this TLS alert can reach the server before it has finished sending all the messages. The server will not wait to check if it has received a fatal alert from the client and it will attempt to send the next message anyway.

Because in this scenario the socket has already been closed by the client, a SocketException (Broken Pipe) will be generated, instead of the more appropriate SSLHandshakeException. Before this patch, those SocketExceptions were at least being wrapped as SSLException, but now, the SocketException will go all the way up. Most of the time, this might not be a problem, but there are scenarios that come to mind where this could become relevant. The simpler one to explain is MTLS, where a very similar scenario can happen but this time in the client side if the server sends the fatal while the client is still writing (for example, rejecting the client certificate before this has sent the Client Key Exchange). Again, instead of an SSLHanshakeException, we have SSLException before this patch and SocketException after. In this case, seeing a SocketException the client may decide to retry the connection (while for an SSLException or SSLHandshakeException, most likely it wouldn't). How bad this is, and how bad the potential for an extra retry is compared to the current situation when retries are not happening for transient errors, we'll leave for the discussion.

Additionally, although the step between Certificate and Server/Client Key Exchange is the most likely to trigger this, every time multiple producers are being called there is a chance for this issue to arise. The good news is the issue requires round-trip communication between server and client to be faster than processing all the producers in a batch. The aforementioned test triggers this behavior quite often as both sides of the TLS connection are on the same machine.

@XueleiFan
Copy link
Member

While looking into the TrustTrustedCert.java test, we found out this change may cause the test to fails intermittently. This is because it's is not expecting a SocketException in Line 134. The root cause for this intermittent failure seems to be that when the server processes the the ClientHello message, it iterates through a list of producers to create the response message. These response messages are written to the socket by each producer, and the client can start processing these messages before all of them are sent.

If one of the earlier messages causes the client to reject the TLS handshake (in this example, a bad certificate), the client will send a TLS fatal alert and close the socket. In most scenarios, by the time the TLS fatal alert is received by the server, all the pending TLS handshake messages would´ve been written into the socket, but if the client response is fast enough, this TLS alert can reach the server before it has finished sending all the messages. The server will not wait to check if it has received a fatal alert from the client and it will attempt to send the next message anyway.

Because in this scenario the socket has already been closed by the client, a SocketException (Broken Pipe) will be generated, instead of the more appropriate SSLHandshakeException. Before this patch, those SocketExceptions were at least being wrapped as SSLException, but now, the SocketException will go all the way up. Most of the time, this might not be a problem, but there are scenarios that come to mind where this could become relevant. The simpler one to explain is MTLS, where a very similar scenario can happen but this time in the client side if the server sends the fatal while the client is still writing (for example, rejecting the client certificate before this has sent the Client Key Exchange). Again, instead of an SSLHanshakeException, we have SSLException before this patch and SocketException after. In this case, seeing a SocketException the client may decide to retry the connection (while for an SSLException or SSLHandshakeException, most likely it wouldn't). How bad this is, and how bad the potential for an extra retry is compared to the current situation when retries are not happening for transient errors, we'll leave for the discussion.

Additionally, although the step between Certificate and Server/Client Key Exchange is the most likely to trigger this, every time multiple producers are being called there is a chance for this issue to arise. The good news is the issue requires round-trip communication between server and client to be faster than processing all the producers in a batch. The aforementioned test triggers this behavior quite often as both sides of the TLS connection are on the same machine.

Maybe, sending the fatal alter before throw the socket exception could be a safer update.

Comment on lines 1676 to 1678
// Don't close the Socket in case of timeouts, interrupts or SocketException.
if (cause instanceof InterruptedIOException ||
cause instanceof SocketException) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we still need to shutdown the connection with a fatal alter for socket exception, otherwise there might be socket leaks. Instead, the socket exception could be thrown after the fatal alert.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to shutdown the socket in case of a SocketException.

Comment on lines 450 to 452
} catch (SocketException se) {
// don't change exception in case of SocketException
throw se;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, the fatal alter could be sent before thrown the socket exception.

Copy link
Contributor Author

@cliveverghese cliveverghese Jan 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The client is sending the fatal,

However, the server, since it's producing the message, It's not reading from the socket to see that the client sent the bad_certificate

SERVER                                                                  CLIENT
*                                <------------                       CLIENT_HELLO
CLIENT_HELLO_CONSUMER
SERVER_HELLO_PRODUCER            ------------->                     SERVER_HELLO_CONSUMER
CERTIFICATE_PRODUCER             ------------->                     CERTIFICATE_CONSUMER
CERTIFICATE_STATUS               ------------->                     Still in CERTIFICATE_CONSUMER
START SERVER_KEY_EXCHANGE_PRODUCER
*                                <-------------                    CERTIFICATE_CONSUMER sends bad_certificate alert
*                                <-------------                    CLIENT_CLOSES_SOCKET
SERVER_KEY_EXCHANGE_PRODUCER
attempts to write to socket      --------||||
(broken_pipe exception)

Server throws a SocketException(broken_pipe) exception instead of SSLException(bad_certificate) or SSLHandshakeException(bad_certificate)

When in the producer, the server does not read from the socket, and hence does not process the bad_certificate alert from the client

The SERVER_KEY_EXCHANGE produce then attempts to write to the socket, which encounters the broken pipe.

We could, in the SSLSocketImpl::handleException, attempt to check if there is a message available in the socket. If so, read the message and throw the appropriate exception.

I could open a follow up JBS issue to address this.

A draft for the fix is
cliveverghese@93dba96

Copy link
Contributor Author

@cliveverghese cliveverghese Jan 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have created a follow-up JBS issue for the issue,
https://bugs.openjdk.java.net/browse/JDK-8259516

The fix for that is available cliveverghese@a1bc711

I will wait for this to be merged to create the pull request for JDK-8259516.

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Jan 7, 2021
@openjdk openjdk bot removed the rfr Pull request is ready for review label Jan 9, 2021
@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 9, 2021
Comment on lines 1698 to 1703
if (cause instanceof SocketException) {
conContext.teardownTransport(cause, alert, false);
throw (SocketException)cause;
}

throw conContext.fatal(alert, cause);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be not necessary to change the TransportContext by adding a new teardownTransport() method. It would be good to keep the fatal() behavior as if a fatal alter will be sent. Maybe, the exception thrown by fatal() could be replaced with the socket exception, like:

if (cause instanceof SocketException) {
     try {
        conContext.fatal(alert, cause);
     } catch (Exception) {
        // Just delivering the fatal alert, re-throw the socket exception instead.
     } finally {
        throw (SocketException)cause;
     }
} else {
      throw conContext.fatal(alert, cause);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the feedback, I have updated as recommended. I could not add a throw in the finally block as this generates a warning. Instead i have done this

            try {
                conContext.fatal(alert, cause);
            } catch (Exception e) {
                // Just delivering the fatal alert, re-throw the socket exception instead.
            }

            throw (SocketException)cause;

Comment on lines 56 to 59
static String pathToStores = "../../../../javax/net/ssl/etc";
static String keyStoreFile = "keystore";
static String trustStoreFile = "truststore";
static String passwd = "passphrase";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In JSSE testing, we are trying to avoid the dependency on the binary key store files for a while. Would you like to check out the new template, test/jdk/javax/net/ssl/templates/SSLSocketTemplate.java? You could refer to test/jdk/sun/security/ssl/ServerHandshaker/AnonCipherWithWantClientAuth.java, or search for "extends SSLSocketTemplate" about how to use the new template.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the test to use SSLSocketTemplate.

Copy link
Member

@XueleiFan XueleiFan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thank you!

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 10, 2021
@cliveverghese
Copy link
Contributor Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Jan 10, 2021
@openjdk
Copy link

openjdk bot commented Jan 10, 2021

@cliveverghese
Your change (at version 98f1587) is now ready to be sponsored by a Committer.

Copy link
Member

@simonis simonis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@simonis
Copy link
Member

simonis commented Jan 11, 2021

/sponsor

@openjdk openjdk bot closed this Jan 11, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed sponsor Pull request is ready to be sponsored ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jan 11, 2021
@openjdk
Copy link

openjdk bot commented Jan 11, 2021

@simonis @cliveverghese Since your change was applied there have been 71 commits pushed to the master branch:

  • 1bd015f: 8258407: Split up CompileJavaModules.gmk into make/modules/$M/Java.gmk
  • 2354882: 8250768: javac should be adapted to changes in JEP 12
  • 18a37f9: 8259368: Zero: UseCompressedClassPointers does not depend on UseCompressedOops
  • a03e22b: 8253910: UseCompressedClassPointers depends on UseCompressedOops in vmError.cpp
  • e0d748d: 8258606: os::print_signal_handlers() should resolve the function name of the handlers
  • bd34418: 8258445: Move make/templates to make/data
  • d21a0ea: 8258449: Move make/hotspot/symbols to make/data
  • 3974fd4: 8259451: Zero: skip serviceability/sa tests, set vm.hasSA to false
  • bb247b0: 8259392: Zero error reporting is broken after JDK-8255711
  • 2806bf2: 8259475: Fix bad merge in compilerOracle
  • ... and 61 more: https://git.openjdk.java.net/jdk/compare/80544e4d5f402f67171eb5aca11897f1c5c3d431...master

Your commit was automatically rebased without conflicts.

Pushed as commit 01b2804.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@simonis
Copy link
Member

simonis commented Jan 11, 2021

This change caused some new test failures (e.g. in java/net/httpclient/InvalidSSLContextTest.java or javax/net/ssl/SSLSession/TestEnabledProtocols.java). I'm currently working on a PR which will fix these tests...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integrated Pull request has been integrated security security-dev@openjdk.org
Development

Successfully merging this pull request may close these issues.

4 participants