Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve command timeout handling #455

Closed
vaintroub opened this issue Mar 15, 2018 · 17 comments · Fixed by #887
Closed

Improve command timeout handling #455

vaintroub opened this issue Mar 15, 2018 · 17 comments · Fixed by #887
Assignees

Comments

@vaintroub
Copy link
Contributor

The query timeout handling in MySqlConnection can be simplified a lot, if it could use server support for query timeouts. E.g in MariaDB 10.1+ you can using per-statement max_query_time , described in https://mariadb.com/kb/en/library/aborting-statements/ . There is a max_statement_time session variable (double) for max time, measured in seconds

MySQL supports this functionality but chose to implement it slightly differently, and in a limited way (variable name is max_execution_time, and it does only work for SELECTs).

Traditional way to implement statement timeouts in other connectors (preceding server support) , was to start a timer task, right before execution of the query. If the task fires, i would create issue "KILL QUERY " from another connection. If query finishes before the timeout , the timer task would be canceled. For MySqlConnector, I feel that even this handling would simplify IO code, which currently relies on socket or stream timeouts( yet still has timer queues for async)

@bgrainger
Copy link
Member

I initially started implementing command timeouts that way, but changed my mind: #67 (comment)

The primary reason the command timeout exists is to stop the client hanging forever if there is a network failure; it is not about cancelling a long-running command. (Use MySqlCommand.Cancel or a CancellationToken for that purpose.) The current behaviour is by design.

@bgrainger
Copy link
Member

E.g in MariaDB 10.1+ you can using per-statement max_query_time

If I can unambiguously detect MariaDB, this seems like a great enhancement to add. (I looked into MySQL's max_execution_time but ruled it out because it's too limited.)

If the task fires, i would create issue "KILL QUERY " from another connection.

This is unreliable in the face of load-balanced servers, proxies, or network partitions.

However... we could try KILL QUERY first (to keep the connection open #453) then fall back to closing the socket (after some delay) if that doesn't work.

@vaintroub
Copy link
Contributor Author

vaintroub commented Mar 15, 2018

you can unambigously detect MariaDB. It has -MariaDB in server version string.

The "real" version is also prefixed by 5.5.5- prefix , due to https://jira.mariadb.org/browse/MDEV-4088 . MariaDB-aware Connectors strip off that prefix : https://jira.mariadb.org/browse/CONC-21 https://jira.mariadb.org/browse/CONJ-32

@vaintroub
Copy link
Contributor Author

MySQL-protocol aware Proxies are supposed to support KILL QUERY and KILL CONNECTION, because those are quite widely used among all connectors.

@vaintroub
Copy link
Contributor Author

vaintroub commented Mar 15, 2018

"The primary reason the command timeout exists is to stop the client hanging forever if there is a network failure; it is not about cancelling a long-running command"

This is usually worked around not with query timeout, but with KeepAlive (with keepalive parameters), e.g SIO_KEEPALIVE_VALS on Windows, https://msdn.microsoft.com/en-us/library/system.net.sockets.iocontrolcode(v=vs.110).aspx . See also https://github.com/dotnet/corefx/issues/25040

@bgrainger
Copy link
Member

Investigate what the Connector/J queryTimeoutKillsConnection option does, and if it's relevant to this issue.

@bgrainger
Copy link
Member

Similar issue being discussed for npgsql (which might provide an implementation approach to copy): npgsql/npgsql#1567.

@penguinawesome
Copy link

+999! This feature is badly needed.

Setting it on a per connection level something like
Connection.MaxExecutionTime = 8000; (in milliseconds) or something similar that will indicate the library to make either auto cancel (reflected in the server side) or kill the query when the timeout reaches, not only in the client side.

@bgrainger
Copy link
Member

bgrainger commented Oct 1, 2020

The current situation for MySqlConnector is very similar to what @roji describes for npgsql, with the same good and bad points: npgsql/npgsql#1567 (comment)

Based on recent work in npgsql and recent feedback here, I'm proposing the following changes:

  • MySqlCommand.Cancel() and MySqlCommand.ExecuteXAsync(CancellationToken) continue the same behaviour of performing a server-side cancellation (via KILL QUERY).
  • The handling of MySqlCommand.CommandTimeout (which is set from MySqlConnectionStringBuilder.DefaultCommandTimeout) is unified with that; i.e., when the CommandTimeout expires, MySqlConnector will attempt to cancel the query server-side.
    • If the server-side cancellation is successful, the Execute/Read/etc. method will throw a MySqlException with an ErrorCode set to MySqlErrorCode.QueryInterrupted. This is a change from current behaviour, which throws a MySqlException with MySqlErrorCode.CommandTimeoutExpired. set to MySqlErrorCode.CommandTimeoutExpired.
    • If the server-side cancellation is successful, the connection is still usable (and will be returned to the pool when it's closed). This is a change from current behaviour, which leaves the connection in a broken state.
  • A new connection string option, Cancellation Timeout, is introduced to handle failure in the case of a network partition:
    • The default is 2 seconds (cf. Cancelling queries on timeouts npgsql/npgsql#3130 (comment)).
    • After CancellationTimeout seconds without receiving an "ERR" packet from the server (which will have a message indicating that the query was interrupted), the client socket will be closed; this is what the implementation of CommandTimeout used to do.
    • In the case of a network partition, the maximum delay (until ExecuteX throws an exception) is raised from CommandTimeout seconds to CommandTimeout + CancellationTimeout seconds; with the default settings, this is an increase from 30 to 32 seconds.
    • The user can distinguish server-side cancellation from local socket closure by testing the thrown exception for MySqlErrorCode.QueryInterrupted vs MySqlErrorCode.CommandTimeoutExpired. for an InnerException with MySqlErrorCode.QueryInterrupted.
  • To opt out of this change and return to the current behaviour of closing the socket when CommandTimeout expires, set CancellationTimeout=-1 in the connection string. When CancellationTimeout is -1, the socket will be closed immediately after CommandTimeout seconds.

There's an open question of whether exceptions should be unified between sync and async cancellation (cf. npgsql/npgsql#1146 (comment) ff). Under this proposal MySqlCommand.Cancel() and MySqlCommand.CommandTimeout handling will cause the Execute/Read/etc. method to throw MySqlException with MySqlErrorCode.QueryInterrupted, but the ExecuteXAsync/ReadAsync overloads with a CancellationToken will throw OperationCanceledException if the CancellationToken is cancelled (and MySqlException if the CommandTimeout expires). This could be unified so that OperationCanceledException is thrown in all cases, no matter how the query is cancelled. This would (obviously) be a much more significant breaking change.

@vonzshik
Copy link

vonzshik commented Oct 2, 2020

  • There is no way to opt out of this change and return to the current behaviour of closing the socket when CommandTimeout expires. If this is necessary, a CancelQueryOnCommandTimeout=false connection string option could be added (defaulting to true).

Wouldn't it be better to just set CancellationTimeout to 0? And -1 to disable?

@roji
Copy link

roji commented Oct 2, 2020

Yeah, we also raised this at some point (but haven't implemented it). @bgrainger if this makes sense to you, we can implement the same on the Npgsql side.

@bgrainger
Copy link
Member

There is no way to opt out of this change and return to the current behaviour of closing the socket when CommandTimeout expires. If this is necessary, a CancelQueryOnCommandTimeout=false connection string option could be added (defaulting to true).

Wouldn't it be better to just set CancellationTimeout to 0? And -1 to disable?

It feels a little "weird" to me to say "set CancellationTimeout to -1 to make CommandTimeout close the socket instead of cancelling the query"; OTOH, the CancellationTimeout connection string option would have no effect if CancelQueryOnCommandTimeout=false were set, so why not overload it as a way to disable this behaviour?

@roji it looks like you were maybe discussing that here? npgsql/npgsql#3173 (comment)

Would you implement it the following way, or switch the meanings?

  • CancellationTimeout=0: socket is never closed after query is cancelled; similar to old behaviour of CommandTimeout=0
  • CancellationTimeout=-1: query is never cancelled; when CommandTimeout expires, the client immediately closes the socket

@roji
Copy link

roji commented Oct 3, 2020

I do agree with @vonzshik that an additional connection string flag for managing whether cancellation occurs is not needed, and that CancellationTimeout should probably be sufficient. I generally try to avoid connection string parameter explosion (there already are many and it's confusing...), and I suspect disabling cancellation is a very rare edge which doesn't deserve a parameter.

@bgrainger I do think it would be more logical/consistent for zero to mean "wait 0 seconds before closing the socket" (just like one means "wait 1 second before closing the socket"). But your proposal makes the meaning of zero is the same as what it is for CommandTimeout (don't timeout at all), which is consistent in another, probably more important way. So I vote for your proposal - @vonzshik any additional thoughts on this?

@vonzshik
Copy link

vonzshik commented Oct 3, 2020

But your proposal makes the meaning of zero is the same as what it is for CommandTimeout (don't timeout at all), which is consistent in another, probably more important way.

In a context of CommandTimeout, 0 doesn't make sense as anything but an infinite timeout. Most likely, that's one of the reasons why it's 0 and not -1.
I think, having to know only one rule (-1 to disable, 0 for an infinite) is much better, than keeping in mind multiple exceptions (even if they do make sense in the context of the parameter).

@roji
Copy link

roji commented Oct 3, 2020

OK, so I think we're all saying the same thing. @vonzshik would you like to open an issue to implement the above behavior in Npgsql too?

@vonzshik
Copy link

vonzshik commented Oct 3, 2020

OK, so I think we're all saying the same thing. @vonzshik would you like to open an issue to implement the above behavior in Npgsql too?

Done.

@bgrainger
Copy link
Member

Edited the MySqlConnector proposal above to document the expected behaviour of CancellationTimeout=-1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

5 participants