Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cient connection establishment is not cancellable #13843

Closed
DerGuteMoritz opened this issue Feb 14, 2024 · 3 comments · Fixed by #13849
Closed

Cient connection establishment is not cancellable #13843

DerGuteMoritz opened this issue Feb 14, 2024 · 3 comments · Fixed by #13849
Milestone

Comments

@DerGuteMoritz
Copy link

Expected behavior

When connecting a client, it should be possible to cancel the connection establishment while it is in progress (if the underlying tansport supports it).

Actual behavior

The promise returned by Bootstrap#connect is marked as uncancellable by all transport implementations so invoking cancel() on it has no effect. This is so despite all¹ transport implementations having a listener on the promise which handles cancellation (see e.g. here for the NIO transport).

Marking connect promises as uncancellable was introduced by this patch. Oddly, it claims to "Make most outbound operations cancellable" but then goes on to say "Call setUncancellable() before performing an outbound operation" which appears to do the exact opposite. I already brought this up with @trustin on Discord and he figured he did this because it's not possible to interrupt a connect() syscall once invoked. However, this is not the case for all non-blocking socket types for which this can be achieved by simply calling close() on them (which is what the listeners mentioned would above do). I have in fact tested this in practice with both the NIO and epoll transports and it works just fine. I suspect that the patch was a bit too sweeping in disabling cancellation globally like that. Also note the existence of io.netty.testsuite.transport.socket.SocketConnectionAttemptTest#testConnectCancellation which indicates that cancellation of connection attempts is/was intended to be supported.

1: Except for the deprecated OIO transport which doesn't support it due to its blocking nature.

Steps to reproduce

See below

Minimal yet complete reproducer code (or URL to code)

My (very simple) patch for enabling cancellation of connection establishment for the NIO and epoll transports is available here - it can probably also be extended to kqueue but I have no system to test this on right now. I ensured that it works as expected in two ways:

  1. I modified io.netty.testsuite.transport.socket.SocketConnectionAttemptTest#testConnectCancellation to error out in the else branch (i.e. the one that says // Cancellation not supported by the transport). The test then failed without the patch and passed with the patch.
  2. I made a connection attempt to a TCP server which never calls accept(). Via ss I observed the connection attempt being stuck in SYN-SENT state. I then invoked cancel() on the promise returned by Bootstrap#connect. Without the patch, the connection would stick around until the connect timeout (default 30 seconds) expired. With the patch, the connection immediately disappeard.

Netty version

4.1.107

JVM version (e.g. java -version)

19.0.2

OS version (e.g. uname -a)

Linux 6.7.2 (NixOS)

@DerGuteMoritz
Copy link
Author

DerGuteMoritz commented Feb 14, 2024

FWIW, I would be happy to see this changed in Netty 4 still. Context: clj-commons/aleph#714

Also, it's worth noting that the current behavior somewhat masks a resource leak in the sense that cancelling is ineffective and connections will stick around until the connect timeout expires. An application which runs on the default of 30s (which I assume is quite likely to be the case) and establishes and quickly cancels a lot of connections in a short time will unknowingly tie up a lot of file descriptors this way.

normanmaurer added a commit that referenced this issue Feb 16, 2024
Motivation:

Whe using non-blocking IO we can in fact cancel a connect() operation by just closing the underyling fd.

Modifications:

- Don't mark the promise as uncancellable before issue the non-blocking connect() call
- Add comment to explain code
- Adjust testConnectCancellation(...) to test the possiblity for connection cancellation for various transport implementations

Result:

Fixes #13843
@normanmaurer
Copy link
Member

@DerGuteMoritz thanks for reporting... This should be fixed by #13849

@normanmaurer normanmaurer added this to the 4.1.108.Final milestone Feb 16, 2024
@DerGuteMoritz
Copy link
Author

@normanmaurer Great, thanks a lot for the quick response and glad you agree to include it in the next release! 🙏

normanmaurer added a commit to netty/netty-incubator-transport-io_uring that referenced this issue Feb 16, 2024
Motivation:

Whe using io_uring we can in fact cancel a connect() operation by just closing the underyling fd.

Modifications:

Don't mark the promise as uncancellable before issue the non-blocking connect() call
Add comment to explain code

Result:

Related to netty/netty#13843 and netty/netty#13849
normanmaurer added a commit to netty/netty-incubator-transport-io_uring that referenced this issue Feb 16, 2024
Motivation:

Whe using io_uring we can in fact cancel a connect() operation by just
closing the underyling fd.

Modifications:

Don't mark the promise as uncancellable before issue the non-blocking
connect() call Add comment to explain code

Result:

Related to netty/netty#13843 and
netty/netty#13849
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants