Close TcpTransport on RST in some Spots to Prevent Leaking TIME_WAIT Sockets #26764
Conversation
…connectToChannels
I don’t think we should make this change in the mock TCP transport. It’s not realistic that we retry connects like that.
@jasontedor fair point, but in production code, this kind of thing would just get retried further up the stack, right? => My reasoning was this:
If you're not binding the outgoing socket, is a port collision possible? I didn't think you got port collisions in the kernel without a `bind`.
I also don't think we are having a port collision here. This is the code that connects to a server, i.e. client code, @DaveCTurner; that's why we don't bind there. What happens here likely is that we are exhausting our resources and don't have any ports left. There might be tons of connections in `TIME_WAIT`. That said, I don't think we should retry in such a way. I am more curious why we get into this state?
@s1monw tracked down where we leak a lot of `TIME_WAIT` sockets. Activating linger with timeout `0` fixes it:
@@ -318,6 +303,7 @@ public void accept(Executor executor) throws IOException {
    MockChannel incomingChannel = null;
    try {
        configureSocket(incomingSocket);
        incomingSocket.setSoLinger(true, 0);
I am hesitating to set a 0 timeout for `SO_LINGER`. The reason is that the abnormal termination of a socket that causes it to move into `TIME_WAIT` can be caused by an exception, i.e. during request parsing on the server; then the server closes the connection, and in this case we should really stay in `TIME_WAIT`. The other case where the server closes the connection instead of the client is when we shut down, and then we can set our socket to `SO_LINGER=0`, since it's the right thing to do in this situation. I actually think we should do this in a broader scope. There is, for instance, a method `TcpTransport#closeChannels(List<Channels> channels, boolean blocking)`. I wonder if we can piggyback on this boolean and set `SO_LINGER` to `true, 0` if we close our transport. It might even be a better idea to add another boolean, `boolean closingTransport`; in this case we can do the right thing in other transport impls as well?
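The suggestion above could look roughly like the following minimal sketch, assuming a hypothetical `closeChannels` signature with the extra `closingTransport` flag (names and types are illustrative, not the actual Elasticsearch code):

```java
import java.io.IOException;
import java.net.Socket;
import java.util.List;

class TransportCloseSketch {
    /**
     * Closes the given channels. When closingTransport is true, the whole
     * transport is shutting down, so SO_LINGER is enabled with timeout 0:
     * close() then aborts the connection with an RST and the local socket
     * skips TIME_WAIT. During normal failure handling the flag stays false
     * and close() performs an orderly FIN shutdown, keeping TIME_WAIT.
     */
    static void closeChannels(List<Socket> channels, boolean blocking,
                              boolean closingTransport) throws IOException {
        for (Socket channel : channels) {
            if (closingTransport) {
                channel.setSoLinger(true, 0);
            }
            channel.close();
        }
    }
}
```

The point of the extra flag is that failure-path closes keep the standard orderly shutdown semantics, while only the transport-wide shutdown opts into the abortive close.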
> It might even be a better idea to add another boolean `boolean closingTransport`; in this case we can do the right thing also in other transport impls?

Makes sense to me to add the additional boolean. My reasoning would be that the `blocking` parameter currently seems to be more about the way we handle Netty's internal threading and not about any low-level `Socket` behavior/setting.
@s1monw done, added another parameter and only made it `RST` in `org.elasticsearch.transport.TcpTransport#doStop` and the actual `close` call of `TcpTransport`, leaving the failure handlers and such untouched. This still (after removing the `SO_LINGER` setting I added) resolves the issue just fine; tests still don't leak any `TIME_WAIT`s. Hope this is what you had in mind :)
    socket.setReuseAddress(TCP_REUSE_ADDRESS.get(settings));
    ByteSizeValue tcpReceiveBufferSize = TCP_RECEIVE_BUFFER_SIZE.get(settings);
    if (tcpReceiveBufferSize.getBytes() > 0) {
        socket.setReceiveBufferSize(tcpReceiveBufferSize.bytesAsInt());
    }
    socket.bind(address);
makes sense ++
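For context on why the ordering in the diff above matters: `SO_REUSEADDR` must be applied before `bind()`, since setting it after binding has no effect on some platforms. A minimal standalone sketch with the plain JDK (not the Elasticsearch code):

```java
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class ReuseAddressOrder {
    // Configure SO_REUSEADDR on an unbound socket, then bind; setting the
    // option only after bind() does not work reliably across systems.
    static boolean bindWithReuse() throws Exception {
        ServerSocket socket = new ServerSocket();            // unbound
        socket.setReuseAddress(true);                        // must precede bind()
        socket.bind(new InetSocketAddress("127.0.0.1", 0));  // port 0 = any free port
        boolean reuse = socket.getReuseAddress();
        socket.close();
        return reuse;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("reuseAddress=" + bindWithReuse());
    }
}
```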
This LGTM. @tbrooks8 please also take a look to see if this makes sense to you as well. @original-brownbear, can you update the description for this PR please? Thanks for working on this.
@s1monw description updated :)
LGTM
@tbrooks8 thanks, should I squash this?
You can if you want. But we often just use the GitHub "Squash and merge" button.
@tbrooks8 ah ok, doing that then. Thanks!
@original-brownbear I backported it to 6.x |
This is a follow-up to elastic#26764. That commit set `SO_LINGER` to 0 in order to fix a scenario where we were running out of resources during CI. We are primarily interested in setting this to 0 when stopping the transport. Allowing `TIME_WAIT` is standard for other failure scenarios during normal operation. Unfortunately, that commit set `SO_LINGER` to 0 every time we close `NodeChannels`. `NodeChannels` can be closed in case of an exception or other failures (such as parsing a response). We want to disable linger only when actually shutting down.
I think I tracked down #26701 here:

The problem is that we're using quite a few connections/sockets here, so with some (relatively low) probability this line will throw this exception because of running out of ports for the connecting sockets locally.

Eventually, this exception is just silently ignored by only `trace`-logging it in `org.elasticsearch.common.util.concurrent.AbstractRunnable#onFailure`: https://github.com/elastic/elasticsearch/blob/master/core/src/main/java/org/elasticsearch/discovery/zen/UnicastZenPing.java#L520 => which leads to pings silently not happening => counts are off.

I ran the test in a loop to reproduce this and upped the log level for that statement, and missed pings like in the issue always coincided with the exception from the screenshot.

I fixed this by fixing two types of spots where we were leaking local ports/sockets (in the form of `TIME_WAIT`):

* Setting `TCP_REUSE_ADDRESS` after a `bind` call, which will not work on all systems
* Leaking `TIME_WAIT` sockets when shutting down `TcpTransport` on a `FIN` instead of an `RST` after the other side went down for sure. This was leaking so many `TIME_WAIT` clients that we eventually ran out of local ports and saw the above `Exception` in tests.

Closes #26701
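The `FIN` vs. `RST` distinction can be demonstrated in isolation. In this minimal loopback sketch (plain JDK, not the Elasticsearch transport), setting `SO_LINGER` to `(true, 0)` makes `close()` abort the connection with an RST, so the closing side never enters `TIME_WAIT` and the peer typically observes a connection reset instead of an orderly end-of-stream:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class AbortiveCloseDemo {
    // Returns "reset" if the peer saw the abortive close as a connection
    // reset, or "fin" if it saw an orderly end-of-stream (read() == -1).
    static String abortiveClose() throws Exception {
        try (ServerSocket server = new ServerSocket()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            Socket client = new Socket("127.0.0.1", server.getLocalPort());
            try (Socket accepted = server.accept()) {
                client.setSoLinger(true, 0); // close() now sends RST, skipping TIME_WAIT
                client.close();
                try {
                    return accepted.getInputStream().read() == -1 ? "fin" : "data";
                } catch (IOException e) {
                    return "reset"; // e.g. "Connection reset" on Linux
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("peer observed: " + abortiveClose());
    }
}
```

Whether the peer reports the reset as an exception or an early end-of-stream can vary by platform and timing, which is why the test below accepts either outcome; the key point is that the closing side's socket is torn down immediately rather than lingering in `TIME_WAIT`.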