Fix socket leak in TCPConnection/TCPConnectionSsl #2255

Merged 10 commits on Jan 30, 2020

@shaan1337 commented Jan 27, 2020

Fixes #2010

Description of race condition

TcpConnection.Close() -> CloseInternal() can be called before the onConnectionEstablished callback runs. In that case _socket is still null (connection.InitSocket has not been called yet), so the null check below skips the close and the socket leaks:

Normal TCP connection / CloseInternal():

			if (_socket != null) {
				Helper.EatException(() => _socket.Shutdown(SocketShutdown.Both));
				Helper.EatException(() => _socket.Close());
				_socket = null;
			}

As mentioned in issue #2010, this leaves connections in the CLOSE_WAIT state until the client holding the socket is stopped.
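The race can be illustrated with a minimal, self-contained sketch. LeakyConnection and RaceDemo are hypothetical stand-ins written for this illustration, not the actual EventStore classes; only the null-check pattern is taken from the snippet above.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical stand-in for TcpConnection; not the actual EventStore class.
class LeakyConnection {
	private Socket _socket; // stays null until InitSocket runs

	// In the real code this happens as part of the onConnectionEstablished callback.
	public void InitSocket(Socket socket) => _socket = socket;

	// Returns true only if a socket was actually closed.
	public bool Close() {
		if (_socket == null) return false; // nothing to close: the connector's socket leaks
		try { _socket.Shutdown(SocketShutdown.Both); } catch (Exception) { }
		try { _socket.Close(); } catch (Exception) { }
		_socket = null;
		return true;
	}
}

static class RaceDemo {
	public static void Main() {
		// The connector has created the socket, but the connection does not know about it yet.
		var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
		var connection = new LeakyConnection();

		// Close() arrives before InitSocket has been called.
		bool closed = connection.Close();
		Console.WriteLine($"Socket closed by Close(): {closed}");

		socket.Dispose(); // manual cleanup so the demo itself does not leak
	}
}
```

In this sketch Close() returns false because it never sees the socket, which mirrors how the real socket stays open after the connection has been closed.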

The same applies to SSL TCP connections, except that in this case, _sslStream will still be null and the stream will not be disposed:

SSL TCP connection / CloseInternal():

			if (_sslStream != null)
				Helper.EatException(() => _sslStream.Close());

Reproduction steps

The following steps mirror what @Zetanova reported in #2010:

  • Start a single node
  • Run the following application (socket_pile_up.zip):
    dotnet run
  • Monitor the number of sockets: ss | grep 1113 | grep "CLOSE-WAIT" | wc -l. The count keeps increasing.

Regression tests

Four regression tests have been added, one for each connection type: TcpConnection and TcpConnectionSsl, each as client and server. All of them fail without the fix.

Fix description

The fix adds an onSocketAssigned callback, which is triggered as soon as the socket is created and before it is used to connect via ConnectAsync(). This ensures that _socket in TcpConnection is assigned before the created ITcpConnection object is returned to the caller. If Close() is called on the ITcpConnection at any point after this, _socket is guaranteed to be non-null, so the socket is properly disposed in CloseInternal.

The same applies to TcpConnectionSsl, except that it had no _socket property (it has an _sslStream property instead). A _socket property has been added to ensure that the socket is disposed if the connection is closed before the onConnectionEstablished callback is called (in this case, _sslStream will still be null, which would previously also have resulted in a socket leak).
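The ordering that the fix relies on can be sketched as follows. SketchConnection and SketchConnector are hypothetical names for this illustration, and the beginConnect delegate merely stands in for the ConnectAsync() call; the actual signatures in the pull request may differ.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical stand-in for TcpConnection; not the actual EventStore class.
class SketchConnection {
	private Socket _socket;

	// Invoked through onSocketAssigned, before the connect attempt starts.
	public void InitSocket(Socket socket) => _socket = socket;

	// Returns true only if a socket was actually closed.
	public bool Close() {
		if (_socket == null) return false;
		try { _socket.Shutdown(SocketShutdown.Both); } catch (Exception) { }
		try { _socket.Close(); } catch (Exception) { }
		_socket = null;
		return true;
	}
}

// Hypothetical stand-in for the connector that creates sockets.
static class SketchConnector {
	public static SketchConnection CreateConnection(Action<Socket> beginConnect) {
		var connection = new SketchConnection();
		Action<Socket> onSocketAssigned = connection.InitSocket;

		var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
		onSocketAssigned(socket); // the connection owns the socket from here on
		beginConnect(socket);     // stands in for the ConnectAsync() call
		return connection;        // returned only after the socket has been assigned
	}
}

static class FixDemo {
	public static void Main() {
		// The connect is still pending when the caller decides to close.
		var connection = SketchConnector.CreateConnection(_ => { });
		Console.WriteLine($"Socket closed by Close(): {connection.Close()}");
	}
}
```

Because the socket is handed over before the connection object is returned, Close() finds a non-null socket no matter how early the caller invokes it.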

Consequences of fixing the socket leak

With the leak fixed, the socket may already have been disposed by the time onConnectionEstablished is called. This can cause new exceptions to be thrown; these have been fixed in commits e90c3b5, e7e8eff and 3bc0187.
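The kind of failure described here can be sketched as follows. The commits themselves are not reproduced; TryGetRemoteEndPoint is a hypothetical helper that only illustrates how a late callback can tolerate a socket that has already been disposed.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

static class DisposedSocketDemo {
	// Hypothetical helper: reads the remote endpoint, tolerating a socket
	// that Close() has already disposed.
	public static bool TryGetRemoteEndPoint(Socket socket, out EndPoint endPoint) {
		try {
			endPoint = socket.RemoteEndPoint; // throws ObjectDisposedException on a disposed socket
			return endPoint != null;
		} catch (ObjectDisposedException) {
			endPoint = null;
			return false;
		}
	}

	public static void Main() {
		var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
		socket.Dispose(); // Close() won the race and disposed the socket first

		// A late onConnectionEstablished-style callback must not assume a live socket.
		if (!TryGetRemoteEndPoint(socket, out _))
			Console.WriteLine("Socket already disposed; skipping connection setup.");
	}
}
```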

As a consequence of the fix, the onConnectionFailed callback can now also be triggered in no_data_should_be_dispatched_after_tcp_connection_closed. This has been fixed in commit 032efe2.

@shaan1337 changed the title from "Fix socket leak in TCP connections" to "Fix socket leak in TCPConnection/TCPConnectionSsl" on Jan 27, 2020
…dd a callback to set its value in the ITcpConnection

This ensures that when TcpConnection/Ssl.CloseInternal() is called, the socket is not null and will be properly disposed, preventing socket leaks.
…if connection is closed before being initialized which can result in null reference exceptions
… for the socket to be disposed at this point (we now call _socket.Dispose() in CloseInternal())
@shaan1337 added this to the Event Store v6 Preview 3 milestone on Jan 28, 2020
@jageall merged commit 4c1b1a8 into master on Jan 30, 2020
Successfully merging this pull request may close these issues.

Connection/Socket leak under linux dotnet2.2