Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unhandled exception on timerqueue #820

Closed
ghost opened this issue May 16, 2020 · 11 comments
Closed

Unhandled exception on timerqueue #820

ghost opened this issue May 16, 2020 · 11 comments

Comments

@ghost
Copy link

ghost commented May 16, 2020

Hi,

I'm running a moderately-high traffic asp.net core 3.1 app that constantly goes down due to unhandled exceptions that appear to come from the TimerQueue.

All code is disposing properly (using and async using), is async and I am extensively trying to enforce timeouts: using CancellationTokens and sometimes using mysql's built-in set session max_execution_timoeut.

I'm trying to provide some sample stack traces.

at System.Threading.ThreadPoolWorkQueue.Dispatch()
at System.Threading.TimerQueueTimer.Fire(Boolean isThreadPool)
at System.Threading.TimerQueueTimer.CallCallback(Boolean isThreadPool)
at System.Threading.CancellationTokenSource.ExecuteCallbackHandlers(Boolean throwOnFirstException)
--- End of inner exception stack trace ---
at System.Threading.CancellationTokenSource.ExecuteCallbackHandlers(Boolean throwOnFirstException)
--- End of stack trace from previous location where exception was thrown ---
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
at MySql.Data.MySqlClient.MySqlConnection.Cancel(ICancellableCommand command) in /_/src/MySqlConnector/MySql.Data.MySqlClient/MySqlConnection.cs:line 521
at MySqlConnector.Core.ServerSession.get_IPAddress() in /_/src/MySqlConnector/Core/ServerSession.cs:line 63
at System.Net.Sockets.Socket.get_RemoteEndPoint()
Object name: 'System.Net.Sockets.Socket'.
---> System.ObjectDisposedException: Cannot access a disposed object.
Object name: 'System.Net.Sockets.Socket'.)
Unhandled exception.Unhandled exception.Unhandled exception.   System.AggregateException: One or more errors occurred. (Cannot access a disposed object.
at System.Threading.ThreadPoolWorkQueue.Dispatch()
at System.Threading.TimerQueueTimer.Fire(Boolean isThreadPool)
at System.Threading.TimerQueueTimer.CallCallback(Boolean isThreadPool)
at System.Threading.CancellationTokenSource.ExecuteCallbackHandlers(Boolean throwOnFirstException)
--- End of inner exception stack trace ---
at System.Threading.CancellationTokenSource.ExecuteCallbackHandlers(Boolean throwOnFirstException)
--- End of stack trace from previous location where exception was thrown ---
---> System.InvalidOperationException: Connection must be Open; current state is Closed
   at MySql.Data.MySqlClient.MySqlConnection.get_Session() in /_/src/MySqlConnector/MySql.Data.MySqlClient/MySqlConnection.cs:line 488
   at MySql.Data.MySqlClient.MySqlConnection.Cancel(ICancellableCommand command) in /_/src/MySqlConnector/MySql.Data.MySqlClient/MySqlConnection.cs:line 521
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
Unhandled exception. System.AggregateException: One or more errors occurred. (Connection must be Open; current state is Closed)

Reviewing the TimerQueue code, line 79 invokes the action without protecting against exceptions, which seems to be my case. iirc exceptions on threading times callbacks take down the process.

// process all timers that have expired or will expire in the granularity of a clock tick
while (m_timeoutActions.Count > 0 && unchecked(m_timeoutActions[0].Time - current) < 15)
{
	m_timeoutActions[0].Action();
	m_timeoutActions.RemoveAt(0);
}
@ghost
Copy link
Author

ghost commented May 16, 2020

Also, I think that ObjectDisposedException exception might be a sign of a double-dispose on the socket.
I see that the TCP client is safely disposed in the ServerSession class but its Client socket is passed to the SocketByteHandler which registers a Timer to call its Dispose method.

For this, maybe applying the safe dispose pattern to then queued timer action instead of directly calling dispose on the socket is the solution.

@bgrainger
Copy link
Member

I don't think the TimerQueue class in this project is involved; this appears to be an exception thrown from MySqlCommand.Cancel, which is registered with a CancellationTokenSource.

(I'm finding it rather difficult to read the exception call stacks, though; they seem to be mostly, but not entirely "upside down".)

@bgrainger
Copy link
Member

I'm suspecting there's a race condition between MySqlCommand.Cancel being called (via CancellationTokenSource.Register) and the ServerSesson being closed (not sure how, possibly via the connection being disposed, or perhaps via a different timeout?).

@ghost
Copy link
Author

ghost commented May 16, 2020

Sorry ab weird stacktraces, this is the best i could do.

I think TimerQueue should be more defensive, taking down the whole process is quite severe.

I use 2 mechanisms to enforce timeouts:

  1. CancellationTokenSource.
  2. mysql command that sets the session timeout.

For some command's I've taken care to have lower session timeout compared to CancellationTokenSource, but for others I know I've missed adding it. Maybe this is the race condition.
I'll make changes to make sure the session timeout is less than cts timeout; If it solves the issue i'll let you know.

@bgrainger
Copy link
Member

System.Threading.TimerQueueTimer is different than MySqlConnector.Utilities.TimerQueue, so hardening the latter won't help this particular crash (but may be a good idea in general).

There is definitely a problem if MySqlCommand.Cancel throws an exception in the cases when it's triggered by a CancellationToken. (If called directly from user code, it might be good to throw an exception to signal a failure or indicate that preconditions aren't satisfied. But when triggered by a timeout, it does take down the whole process if it fails, which isn't good.)

@bgrainger
Copy link
Member

All code is disposing properly (using and async using), is async and I am extensively trying to enforce timeouts: using CancellationTokens and sometimes using mysql's built-in set session max_execution_timoeut.

Are you able to provide some sample code of what "typical" database access in your code looks like?

I've been trying to repro a high-concurrency cancellation race condition but haven't had success so far.

@ghost
Copy link
Author

ghost commented May 18, 2020

Ok,

Pseudocode is like this:

public async Task<IEnumerable<T>> SearchForSomething(CancellationToken? token)
{
        var operationTimeout = TimeSpan.FromSeconds(15);
        using var cts = new CancellationTokenSource(operationTimeout);
        await using var _ = (token ?? CancellationToken.None).Register(cts.Cancel);

        using var conn = new MySqlConnection(DatabaseConnectionString);
        await conn.OpenAsync(cts.Token);

        await using var cmd = conn.CreateCommand();
        cmd.CommandText = @"
SET SESSION MAX_EXECUTION_TIME={operationTimeout.Subtract(TimeSpan.FromSeconds(1)).TotalMilliseconds};
(some sql query)";

        await using (var reader = await cmd.ExecuteReaderAsync(System.Data.CommandBehavior.SequentialAccess, cts.Token) as MySqlDataReader)
                {
                    while (await reader.ReadAsync(cts.Token))
                    {
                        //reder code.
                    }
                }
}

Often the command is reused within the same method to manage temporary tables creation/deletion; other times I use transactions.

I used to have the same server timeout as the client timeout untill yesterday

SET SESSION MAX_EXECUTION_TIME={operationTimeout.TotalMilliseconds};

now, I've subtracted 1s from the operation timeout, to get some determinisms on who enforces timeouts. In a few days, I'll be able to tell if this provided a solution.

This is .net core 3.1 against GCP Cloud Mysql, which is 5.7 with almost all default options set.
Meanwhile, I am getting tons of UnobservedTaskException with the following stack trace

System.AggregateException: A Task's exception(s) were not observed either by Waiting on the Task or accessing its Exception property. As a result, the unobserved exception was rethrown by the finalizer thread. (Cannot access a disposed object.
Object name: 'System.Net.Sockets.Socket'.)
 ---> System.ObjectDisposedException: Cannot access a disposed object.
Object name: 'System.Net.Sockets.Socket'.
   at System.Net.Sockets.Socket.EndConnect(IAsyncResult asyncResult)
   at System.Net.Sockets.Socket.<>c.<ConnectAsync>b__275_0(IAsyncResult iar)
   --- End of inner exception stack trace ---

I know this does not take the process down, but still, it should be avoided.

@bgrainger
Copy link
Member

Thanks; I'll try to test something similar later.

BTW is await using var _ = (token ?? CancellationToken.None).Register(cts.Cancel); a deliberate choice instead of using:

using var combinedSource = CancellationTokenSource.CreateLinkedTokenSource(cts.Token, token ?? CancellationToken.None);
// use combinedSource.Token below

They're probably pretty similar under-the-hood, but I feel that CreateLinkedTokenSource expresses the semantics better.

@ghost
Copy link
Author

ghost commented May 18, 2020

No, i was not aware of CreateLinkedTokenSource.
I am now using a syntax like

using var cts = CancellationTokenSource.CreateLinkedTokenSource(token);
cts.CancelAfter(TimeSpan.FromSeconds(55));

@ghost
Copy link
Author

ghost commented May 23, 2020

@bgrainger After changing the mysql session timeout to be less then the CancellationTokenSource timeout, i am no longer seeing the timer exception.

Though my logs are full of double dispose issues on sockets.

@bgrainger
Copy link
Member

Most other ADO.NET libraries ignore failure/invalid state in DbCommand.Cancel: https://mysql-net.github.io/AdoNetResults/#Cancel_disposed_Command_is_no_op

I'll make the same change in MySqlConnector, which will have the side-effect of fixing this bug/race.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

1 participant