Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add StacklessSSLHandshakeException for ClosedChannelException #13099

Merged
merged 6 commits into from Jan 24, 2023

Conversation

hyperxpro
Copy link
Contributor

@hyperxpro hyperxpro commented Jan 5, 2023

Motivation:
When a channel is closed during the SSL handshake then java.nio.channels.ClosedChannelException is thrown. This is correct when we see it from a low-level transport perspective but from a high-level, this error is not detailed enough to debug. And when we log ClosedChannelException, we get this line: java.nio.channels.ClosedChannelException: null in the stack trace.

Modification:
To combat this, we should throw StacklessSSLHandshakeException with the message "Connection closed while SSL/TLS handshake was in progress" as suppressed withClosedChannelException. This will explicitly declare that the handshake failed due to connection closure.

Result:
Fixes #12000

@normanmaurer
Copy link
Member

@hyperxpro can you please revert the formatting changes ?

@hyperxpro
Copy link
Contributor Author

lmao, I have no idea how they happened, reverting. XD

@hyperxpro
Copy link
Contributor Author

;/ Whats wrong with CI?

@normanmaurer
Copy link
Member

@hyperxpro can you rebase ?

@hyperxpro
Copy link
Contributor Author

Done

@hyperxpro
Copy link
Contributor Author

Finally :3 Tests passed.

@normanmaurer
Copy link
Member

@vietj any concerns ?

@normanmaurer normanmaurer added this to the 4.1.88.Final milestone Jan 16, 2023
@idelpivnitskiy
Copy link
Member

On the second though, can you please elaborate how knowledge of the fact that the connection was closed during handshake (not before or after) is helping with debugging this case?

I'm a bit concerned that this changes existing behavior and some users potentially may already have logic that handles it correctly. To me, throwing ChannelClosedException seems correct because this is what exactly happens and the current logic of SslHandler.channelInactive(...) simply doesn't care if it happened before, in the middle, or after the handshake. If this level of detail is necessary, can we consider setting more detailed exception message for ChannelClosedException instead of changing the exception type?

@hyperxpro
Copy link
Contributor Author

The reason is that when ClosedChannelException is logged in the console, it provides no information at all about the state of the handshake and why actually it was closed. Wrapping it with SSLHandshakeException will definitely increase the contrast of logs.

This is a snap from Spring Boot WebFlux which uses Reactor Netty under the food.

2022-01-21 15:21:27.687  WARN 32600 --- [ctor-http-nio-2] .s.ApplicationProtocolNegotiationHandler : [id: 0xdeb61fa1, L:/[0:0:0:0:0:0:0:1]:8080 ! R:/[0:0:0:0:0:0:0:1]:54463] Failed to select the application-level protocol:

java.nio.channels.ClosedChannelException: null
	at io.netty.handler.ssl.SslHandler.channelInactive(SslHandler.java:1063) ~[netty-handler-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) ~[netty-transport-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) ~[netty-transport-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241) ~[netty-transport-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405) ~[netty-transport-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) ~[netty-transport-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) ~[netty-transport-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:901) ~[netty-transport-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:813) ~[netty-transport-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:164) ~[netty-common-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java) ~[netty-common-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469) ~[netty-common-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) ~[netty-transport-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) ~[netty-common-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.73.Final.jar:4.1.73.Final]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.73.Final.jar:4.1.73.Final]
	at java.base/java.lang.Thread.run(Thread.java:833) ~[na:na]

I have already mentioned the issue with this log in my PR description.

@idelpivnitskiy
Copy link
Member

Thanks! I saw the issue before and looked at all the linked pages. Will it help if we replace null with more detailed message that will include the state of the handshake without changing the type?
The reasons why I'm hesitant are:

  1. Behavior change that may affect existing users.
  2. Importance of the fact that the channel is closed unexpectedly. I will appreciate if any exception generated by SslHandler tells me more about it's current state, but if the channel is closed it's more important than anything else bcz there is no way to recover it.
  3. Inconsistency with other states of SslHandler: after the proposed changes we still throw ClosedChannelException if the channel was closed before or after the handshake. It's inconsistent if we throw another type during the handshake for the same cause.
  4. Inconsistency with any other handlers that throws ClosedChannelException. We never throw Http2StreamException or WebSocketException or EncoderException with cause of ClosedChannelException in similar cases.

@hyperxpro
Copy link
Contributor Author

Changing null with the message was the first top choice but ClosedChannelException does not let me pass in a message. ;/

I agree it will break users who rely strongly on ClosedChannelException but I'm not able to find a better option. But I'm all ears for improving this.

@hyperxpro
Copy link
Contributor Author

How about stackless SSLHandshakeException was sub-cause of ClosedChannelException?

@idelpivnitskiy
Copy link
Member

I see, I just realized that ClosedChannelException doesn't let users to set a message 🤦
How about keeping the ClosedChannelException but adding a suppressed exception, describing the status of the handshake?

@hyperxpro
Copy link
Contributor Author

Yeah, that sounds better. A stackless exception cause ClosedChannelException should do the job.

Should I proceed with this?

@idelpivnitskiy
Copy link
Member

Just to double check we are on the same page:

ClosedChannelException exception = new ClosedChannelException();
if (we are in the middle of handshake) {
    exception.addSuppressed(new StacklessSSLHandshakeException(details));
}

I'm + 💯 for this

@normanmaurer
Copy link
Member

@idelpivnitskiy why not just use initCause(new ...) ? This will work as well.

@idelpivnitskiy
Copy link
Member

idelpivnitskiy commented Jan 20, 2023

It's hard to say that the cause of ClosedChannelException is SSLHandshakeException. It's the other way around. But to avoid behavior change and inconsistency for what exception is thrown, we can workaround inability to provide more info in the message by adding a suppressed exception

@normanmaurer
Copy link
Member

@idelpivnitskiy true... so @hyperxpro do what @idelpivnitskiy suggested for 4.1. I think for main we want to do the change of using SslHandshakeException

@hyperxpro
Copy link
Contributor Author

Ready @idelpivnitskiy @normanmaurer

@hyperxpro
Copy link
Contributor Author

Help help help ;/

[INFO] 
[INFO] --- animal-sniffer-maven-plugin:1.16:check (default) @ netty-handler ---
[INFO] Checking unresolved references to org.codehaus.mojo.signature:java16:1.1
Error:  /home/runner/work/netty/netty/handler/src/main/java/io/netty/handler/ssl/SslHandler.java:1070: Undefined reference: void java.nio.channels.ClosedChannelException.addSuppressed(Throwable)

@idelpivnitskiy
Copy link
Member

Ufff... Throwable#addSuppressed(...) method was added only in Java 7

@idelpivnitskiy
Copy link
Member

I see that there is io.netty.util.internal.ThrowableUtil.addSuppressed(...) utility that will help

Copy link
Member

@idelpivnitskiy idelpivnitskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for main we want to do the change of using SslHandshakeException

@normanmaurer I still feel like ClosedChannelException is more important than SslHandshakeException in this scenario. It means that the channel was closed and in most cases it doesn't matter if it was before/in-the-middle/after the handshake. I don't think my debugging flow would change in any of 3 cases.

If you would like to change that in Netty 5, we should do that consistently for any other case when we throw ClosedChannelException. There are a few related to HTTP/2, WebSockets, etc. If we go that route, that's probably worth making as a separate effort.

@hyperxpro
Copy link
Contributor Author

How about SslHandshakeException with initCause(ClosedChannelException) for Netty 5? This looks more correct because this describes that we caught a handshake error (considering we're handshaking) due to Channel closure.

WDYT?

@idelpivnitskiy
Copy link
Member

idelpivnitskiy commented Jan 23, 2023

Imho if we go that round, we have to change any other handler in the same situation. Those also have to throw Http2Exception, WebSocketHandshakeException, EncoderException (brotli), etc. with initCause(ClosedChannelException) instead of throwing just ClosedChannelException as they do today. Consistency for the win.
This is much bigger change with higher impact. Hence the question: will SslHandshakeException with initCause(ClosedChannelException) will be significantly more helpful for users to justify that effort and behavior change?

@hyperxpro
Copy link
Contributor Author

Alright, I get that now.

How about ClosedChannelException only when we're not handshaking? If we're handshaking, then SslHandshakeException with initCause(ClosedChannelException) as I described above.

But I am not sure how reliable is this dual approach.

Should we do some sort of poll and ask users what they think? Because this is really confusing. From a technical perspective, ClosedChannelException is definitely right but considering the fact that ClosedChannelException does not let us pass the Reason message.

If ClosedChannelException let us pass the Reason message then that would be best.

Should we play with reflections in this? Or actually, contribute the ability to pass the Reason message in Java upstream (Officially) and try to backport to older versions?

Contributing to Java seems overkill but actually feels right because null really makes no sense in logs.

@idelpivnitskiy
Copy link
Member

+1 on proposing more constructors for ClosedChannelException in Java. However, even if they agree, we won't be able to use it in Netty bcz it will require the latest JDK. For now, addSuppressed feels like ok-ish workaround to provide more context. But it's good proposal anyway, I would love to see that improved in Java long term.

How about ClosedChannelException only when we're not handshaking? If we're handshaking, then SslHandshakeException with initCause(ClosedChannelException) as I described above.

Will this be significantly more helpful for users compare to the approach in this PR?

To my knowledge (unless I miss anything) SslHandshakeException means there is something wrong with either an SSL configuration or SSL certificates. When this happens, the remote is required to send a fatal alert frame before closing the connection. If the channel is closing without receiving the fatal alert, unlikely changing SSL configuration will help to avoid ClosedChannelException.

Copy link
Member

@idelpivnitskiy idelpivnitskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bringing extra context for users!

handler/src/main/java/io/netty/handler/ssl/SslHandler.java Outdated Show resolved Hide resolved
Co-authored-by: Idel Pivnitskiy <idel.pivnitskiy@apple.com>
@hyperxpro
Copy link
Contributor Author

@franz1981 What do you feel? Can you call some of your friends who are core contributors of JDK?

I'd like some more suggestions and feedback before I actually go ahead with patch for JDK.

@hyperxpro hyperxpro changed the title Replace ChannelClosedException with SSLHandshakeException Add StacklessSSLHandshakeException for ChannelClosedException Jan 23, 2023
@hyperxpro hyperxpro changed the title Add StacklessSSLHandshakeException for ChannelClosedException Add StacklessSSLHandshakeException for ClosedChannelException Jan 24, 2023
@normanmaurer normanmaurer merged commit 52db529 into netty:4.1 Jan 24, 2023
@normanmaurer
Copy link
Member

@hyperxpro @idelpivnitskiy thanks a lot!

normanmaurer pushed a commit that referenced this pull request Jan 24, 2023
…3099)

Motivation:
When a channel is closed during the SSL handshake then `java.nio.channels.ClosedChannelException` is thrown. This is correct when we see it from a low-level transport perspective but from a high-level, this error is not detailed enough to debug. And when we log `ClosedChannelException`, we get this line: `java.nio.channels.ClosedChannelException: null` in the stack trace.

Modification:
To combat this, we should throw `StacklessSSLHandshakeException` with the message `"Connection closed while SSL/TLS handshake was in progress"` as suppressed with`ClosedChannelException`. This will explicitly declare that the handshake failed due to connection closure.

Result:
Fixes #12000

Co-authored-by: Idel Pivnitskiy <idel.pivnitskiy@apple.com>
@hyperxpro hyperxpro deleted the ssl_hsk_error branch January 24, 2023 16:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

exception when close channel before SSL
3 participants