New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

npgsql leaving connections open on IOExceptions #2288

Closed
codeweaversdev opened this Issue Jan 11, 2019 · 3 comments

Comments

Projects
None yet
2 participants
@codeweaversdev
Copy link

codeweaversdev commented Jan 11, 2019

Steps to reproduce

Please see the https://github.com/codeweaversdev/npgsql-transaction-error-handling repo for a demonstation application

The issue

Commands that take longer than the command timeout to finish execution throw a System.IO.IOException because the connected party (pg) did not properly respond after a period of time.

This leaves an active process in PG:

SELECT * FROM pg_stat_activity WHERE state != 'idle' and client_addr = 'ip' AND usename = 'user';

When handling the thrown exception, there is no way to test the state of the transaction/command as the IsCompleted is false when there has been an exception and therefore we can't see how to write transaction code that is safe?

If you transaction.RollBack() in a finally then an InvalidOperationException is thrown.

In our actual use-case we are seeing UPDATE statements that perform bulk changes timeout, leaving the transaction hanging on production and requiring manual intervention from the DBA to resolve by using pg_terminate_backend

Exception message:

System.InvalidOperationException: This NpgsqlTransaction has completed; it is no longer usable.
   at Npgsql.NpgsqlTransaction.CheckCompleted()
   at Npgsql.NpgsqlTransaction.CheckReady()
   at Npgsql.NpgsqlTransaction.<Rollback>d__21.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
Npgsql.NpgsqlException (0x80004005): Exception while reading from stream ---> System.IO.IOException: Unable to read data from the transport connection: A connec
tion attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has fai
led to respond. ---> System.Net.Sockets.SocketException: A connection attempt failed because the connected party did not properly respond after a period of time
, or established connection failed because connected host has failed to respond
   at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   --- End of inner exception stack trace ---
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at Npgsql.NpgsqlReadBuffer.<>c__DisplayClass31_0.<<Ensure>g__EnsureLong|0>d.MoveNext()
   at Npgsql.NpgsqlReadBuffer.<>c__DisplayClass31_0.<<Ensure>g__EnsureLong|0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Npgsql.NpgsqlConnector.<>c__DisplayClass161_0.<<ReadMessage>g__ReadMessageLong|0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Threading.Tasks.ValueTask`1.get_Result()
   at Npgsql.NpgsqlDataReader.<NextResult>d__46.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Npgsql.NpgsqlDataReader.NextResult()
   at Npgsql.NpgsqlCommand.<ExecuteDbDataReader>d__100.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Npgsql.NpgsqlCommand.ExecuteDbDataReader(CommandBehavior behavior)
   at Npgsql.NpgsqlCommand.ExecuteReader()

We would expect that we are able to rollback the transaction from the application in this scenario and npgsql honours the behaviour of the IDisposable

Further technical details

Npgsql version: 4.0.4
PostgreSQL version: 9.5.15
Operating system: Server 2012R2
.NET Version: 4.5.2/4.6.1

@roji roji added the bug label Jan 11, 2019

@roji roji added this to the 4.0.5 milestone Jan 11, 2019

@roji

This comment has been minimized.

Copy link
Member

roji commented Jan 11, 2019

The physical connection should definitely not be left open in case of a timeout - I'll try to investigate that soon.

Regarding rolling back and ascertaining transaction state... When a socket timeout occurs, the physical connection is assumed to be in an unusable state, since protocol sync has been lost. The (current) expected behavior is therefore for the connection to be closed (broken) as soon as a timeout occurs. In other words, after a timeout occurs it no longer makes sense to send a rollback, because there's no longer a connection to roll back on. As a general rule, this also means that the transaction is automatically rolled back on the PostgreSQL side as the connection is closed without a commit. So assuming we're doing things right, when a timeout exception occurs one can assume that the transaction failed and was rolled back. One exception for this is if the timeout occurs during the commit itself, in which case there's no way of knowing whether the commit succeeded or failed.

Note that in addition to the default socket timeout, PostgreSQL also supports a statement_timeout parameter, which makes PostgreSQL cancel the query on its side after a certain amount of time. This manifests itself in a clean PostgreSQL error which is raised by Npgsql as a PostgresException, and does not break the connection.

@roji roji self-assigned this Jan 15, 2019

@roji

This comment has been minimized.

Copy link
Member

roji commented Jan 15, 2019

Duplicate of #1567

@roji roji marked this as a duplicate of #1567 Jan 15, 2019

@roji

This comment has been minimized.

Copy link
Member

roji commented Jan 15, 2019

After a 2nd look this is the current expected behavior, although we may implement #1567 to improve this. It's worth reading that issue and the issues it references to fully understand what's going on. Note that this is very different from PostgreSQL timeouts which can be set via statement_timeout, and which do trigger cancellation on the PostgreSQL side (and also don't cause the connection to be broken). You should probably look into using that mechanism.

Regarding the socket timeout issue, in a nutshell, when a socket timeout occurs, network connectivity is assumed to be severed and Npgsql closes the connection on its side. On the PostgreSQL side the current query continues to execute until it is completed - PostgreSQL doesn't check the state of the network connection with the client while it's executing a statement (unless it needs to write back results). #1567 would be about attempting to establish a new, separate connection in order to cancel the running statement - this obviously would work only if there's no actual network issue.

Note also #1606 which is about attempting cancellation of all connections on process exit.

If anything about this is unclear please feel free to write back.

@roji roji closed this Jan 15, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment