SqlClient make managed connection more async #36667
{
    cancellationTokenSource.CancelAfter(timeout);
}
connectTask.Wait(cancellationTokenSource.Token);
Sync Wait in an async method? Can't you just await it?
Also, you could pick up one of the timeout patterns from the TaskTimeoutExtensions in tests:
public static async Task TimeoutAfter(this Task task, int millisecondsTimeout)
{
    var cts = new CancellationTokenSource();
    if (task == await Task.WhenAny(task, Task.Delay(millisecondsTimeout, cts.Token)).ConfigureAwait(false))
    {
        cts.Cancel();
        await task.ConfigureAwait(false);
    }
    else
    {
        throw new TimeoutException($"Task timed out after {millisecondsTimeout}ms");
    }
}
public static async Task<TResult> TimeoutAfter<TResult>(this Task<TResult> task, int millisecondsTimeout)
/cc @stephentoub
I can't test any of the parallel branch, and the code in the ParallelAsyncHelper is a bit beyond me. If there's a way to safely await it and preserve behaviour without risk, I'm open to suggestions.
Similarly if there's a good way to avoid the async wait in ResolveConnectionStatus it would be good but I don't see one without an async lock.
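For reference, one common way to get an "async lock" in .NET is a `SemaphoreSlim` with a count of one, since `await` is not allowed inside a `lock` block. The sketch below is only an illustration of that pattern, not code from this PR: `DeferredStatus`, `ResolveAsync`, and `ResolveCount` are hypothetical names, and a synchronous caller would still have to block (for example on `_gate.Wait()`), so it does not remove the sync wait discussed here.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a handle whose connect task must be resolved exactly once.
public sealed class DeferredStatus
{
    // SemaphoreSlim(1, 1) acts as an async-compatible mutex.
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private readonly Task<bool> _connectTask;
    private bool _resolved;
    private bool _connected;

    // Counts how many times the connect task was actually resolved.
    public int ResolveCount { get; private set; }

    public DeferredStatus(Task<bool> connectTask)
    {
        _connectTask = connectTask;
    }

    public async Task<bool> ResolveAsync()
    {
        // Only one caller at a time gets past this point.
        await _gate.WaitAsync().ConfigureAwait(false);
        try
        {
            if (!_resolved)
            {
                _connected = await _connectTask.ConfigureAwait(false);
                _resolved = true;
                ResolveCount++;
            }
            return _connected;
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

Callers that arrive while resolution is in progress wait asynchronously on the semaphore instead of holding a thread.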
@Wraith2 The change you are making is something SqlClient used to do earlier, and it had to be changed to sync network calls. With the change you have made, a connection.OpenAsync() will cause new threads to be launched for the network operations, which will ultimately lead to thread pool exhaustion. In ASP.NET Core apps we saw that, with the behavior you are recommending, it was hard to open more than 20 connections, whereas with the sync calls we could open more than 10k connections. Ref PR #26200
Since the direction of this approach can lead to scalability issues, I recommend closing this. If sync APIs need to be used for networking, then they should be plumbed all the way down from the connection pool without changing the guarantees of the connection pool behavior. From what I understand, the connection pool itself has some issues where it returns open connections slowly, and I haven't dived deeper into it. That is the area that should be attacked for connection open, rather than the managed SNI layer.
Same calls, different way of using them. The previous version was using async methods but not awaiting them, and was using .Result in the ctor, so despite using async methods everything was happening in sync and the caller thread was blocked. What it did was create a task, which probably ran on a threadpool thread, and then wait for it. This PR changes the ctor so the work of connecting is immediately farmed out to a detached task and left to run. The ctor caller gets returned to immediately and is not blocked. The connection task is left alone until something requires knowledge of the connection state, and at that point it must block to resolve the state. Hopefully by then the connection task has had enough time to do some of the work required to connect, and if we're lucky it'll have finished. Fundamentally, if a new connection is requested there is a set of actions that must take place for it to happen; there isn't a way to avoid the sequential nature of the DNS lookup and then the TCP connect. What this does is make it a little more concurrency friendly, so it won't block unless it needs to and more connections can be started. The only way to avoid the real problem is to change the connection pooling to be async only throughout. That's a big bit of work.
This is the problem. This shouldn't happen. This will need a new thread to complete the connection task, which is what we want to avoid.
And also the changes will affect the Sync Open as well.
It'll need a thread to run the task on. The difference is that the task doing the connection work will not block that thread indefinitely, because it isn't being synchronously awaited until state needs to be resolved. In the old model, the caller thread starts a task on the threadpool and then blocks both the caller and pool threads waiting for the answer in the ctor. This version starts a task, returns to the caller, and defers resolution of the task until later, allowing many awaiting tasks to be present without blocking.
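To make the two shapes being compared concrete, here is a minimal sketch. It is not the SNITcpHandle code: `BlockingHandle`, `DeferredHandle`, and the `connectAsync` delegate are hypothetical names used only for illustration, and `CheckConnection` here merely borrows the name from the discussion.

```csharp
using System;
using System.Threading.Tasks;

// Old shape as described above: the ctor starts the async work and blocks on it.
public sealed class BlockingHandle
{
    public bool Connected { get; }

    public BlockingHandle(Func<Task<bool>> connectAsync)
    {
        // .Result blocks the calling thread until the connect task finishes.
        Connected = connectAsync().Result;
    }
}

// Shape proposed in the PR: the ctor only starts the work; resolution is deferred.
public sealed class DeferredHandle
{
    private readonly Task<bool> _connectTask;

    public DeferredHandle(Func<Task<bool>> connectAsync)
    {
        // The task is started but not waited on, so the ctor returns immediately.
        _connectTask = connectAsync();
    }

    // Blocks only at the point where the connection state is actually needed.
    public bool CheckConnection()
    {
        return _connectTask.GetAwaiter().GetResult();
    }
}
```

In both shapes a blocking wait still happens somewhere; the deferred version only moves it from the constructor to the first status check.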
Yes. This is because the open action was being performed in the ctor and was always blocking; whether on the calling thread or a pool thread, the .Result call forced state resolution before exit. That has been moved to calling .Status or CheckConnection(), which should be later (though unlikely to be later by the ms required for a network DNS query), so there's some concurrency now instead of none. As I said, you can't be fully concurrent without making the connection pool async all the way from the top. Essentially I think this is better but can't be perfect. I'll close as requested, but I'd like to continue the discussion to get feedback on the problem and make sure I'm using the right mental model when working with async. Hopefully @benaadams or @stephentoub can comment on that.
The task which is doing the connection work is not blocked already. SqlConnection.OpenAsync is not blocking. This is because it requests the Connection Pool thread to queue its request. The connection pool thread tries to drain its queue in a serial manner, i.e. if it has 10 requests, then 1 followed by 2 followed by 3 will be fulfilled, and 1 2 3 ... 10 will not lead to 10 parallel TCP connection opens. This approach takes up more threads for the work it needs to do and reduces the scalability of opening connections. Even if the connection open was not in the constructor and was in a different method, with async connection open on the connection pool thread, more connections would have been opened. Right now a single thread from the connection pool is used to fulfill the connection open. What I am not able to understand is what problem you are actually facing for which this is the solution. Is the change purely based on reading the code and seeing that there is no async path in managed SNI for opening connections?
It was suggested it might be worth taking a look at. I can't see ConnectAsync going async. It returns a task but that isn't the same thing. It's hard to follow, so I must be missing the Task.Run or await somewhere.
The code for async connection open was built on top of the way sync Open was happening. This statement doesn't imply that Async over sync pattern is being used. When a connection.OpenAsync() is called, then
This queue is dequeued by the connection pool thread (which is launched in case there are pending open requests and there is no thread running), and it dequeues the Concurrent Queue at https://github.com/dotnet/corefx/blob/master/src/System.Data.SqlClient/src/System/Data/ProviderBase/DbConnectionPool.cs#L965
Then a connection request is made synchronously on the connection pool thread at https://github.com/dotnet/corefx/blob/master/src/System.Data.SqlClient/src/System/Data/ProviderBase/DbConnectionPool.cs#L991 At this line the
When the connection is created, the TaskCompletionSource for the SqlConnection.Open result is set at https://github.com/dotnet/corefx/blob/master/src/System.Data.SqlClient/src/System/Data/ProviderBase/DbConnectionPool.cs#L1009
The TaskCompletionSource (tcs) is allocated during SqlConnection.OpenAsync() at https://github.com/dotnet/corefx/blob/master/src/System.Data.SqlClient/src/System/Data/SqlClient/SqlConnection.cs#L1002
This explains how SqlConnection.OpenAsync is async. It doesn't block on any real TCP connection open. The actual TCP connection establishment happens on a ConnectionPool thread, of which only one is launched per connection pool, and all the requests for opening connections are fulfilled on that connection pool thread.
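The flow described above (the caller allocates a TaskCompletionSource, enqueues the request, and a single pool thread drains the queue and opens connections synchronously) can be sketched roughly as follows. This is a simplified illustration, not the DbConnectionPool implementation: `TinyPool`, `Drain`, and the `createConnection` delegate are hypothetical, and a string stands in for a real connection object.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public sealed class TinyPool
{
    private readonly ConcurrentQueue<TaskCompletionSource<string>> _pending =
        new ConcurrentQueue<TaskCompletionSource<string>>();
    private readonly Func<string> _createConnection; // synchronous "open", assumed for illustration
    private int _workerRunning;                      // 0 = no pool thread, 1 = pool thread active

    public TinyPool(Func<string> createConnection)
    {
        _createConnection = createConnection;
    }

    // Never blocks the caller: it queues the request and hands back the pending task.
    public Task<string> OpenAsync()
    {
        var tcs = new TaskCompletionSource<string>(TaskCreationOptions.RunContinuationsAsynchronously);
        _pending.Enqueue(tcs);

        // Launch the single pool thread only if none is currently running.
        if (Interlocked.CompareExchange(ref _workerRunning, 1, 0) == 0)
        {
            new Thread(Drain) { IsBackground = true }.Start();
        }
        return tcs.Task;
    }

    // Runs on the pool thread: requests are fulfilled one after another.
    private void Drain()
    {
        do
        {
            while (_pending.TryDequeue(out var tcs))
            {
                try { tcs.TrySetResult(_createConnection()); }
                catch (Exception ex) { tcs.TrySetException(ex); }
            }
            Volatile.Write(ref _workerRunning, 0);
            // Re-check in case a request arrived after the queue looked empty.
        }
        while (!_pending.IsEmpty && Interlocked.CompareExchange(ref _workerRunning, 1, 0) == 0);
    }
}
```

Because requests are served serially on one thread, ten queued requests produce ten sequential opens rather than ten parallel ones, which matches the behavior described in this thread.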
Very detailed, thank you, and I see the confusion now. I hadn't found the producer-consumer thread because I was working up from the TCP handle constructor and not going far enough up into the provider architecture. So the only outstanding question I have is whether it is desired for the TCP handle ctor to be blocking. As mentioned in the suggestion thread, the connection only needs to be finished opening in CheckConnection, and my change affects that path. That would allow multiple handles to be created more quickly and to block slightly later on when finding one that is open: a possible minor speed increase and less blocking on the connection pool thread. It'd be good to quantify this, but I couldn't work out how to set up your docker repro. Would simply benchmarking opening and closing of connections in a loop be sufficient?
Yes, it is desired for the TCPHandle ctor to be blocking. Let's do a deeper dive into this. When SqlConnection.OpenAsync was called, the request was sent to the Connection Pool thread, which ends up calling SqlConnection.OpenAsync provides either an open authenticated connection or an exception. Hence the job of the Ctor of
The TCP connection of step 1 is created using TdsParserStateObject.CreatePhysicalSNIHandle which calls the overridden TdsParserStateObjectManaged.CreatePhysicalSNIHandle which in turn, instantiates the SNITcpHandle. As of now, returning from This is accomplished via So that could be a bug in the approach in this PR. You could end up adding a check after SNITcpHandle ctor is created, to check if the underlying task has completed. There are 2 caveats.
Essentially, for a connection to open, we need a guaranteed TCP connection here so that the rest of the steps of connection open can be done successfully. Whether the ctor returns early or later wouldn't matter: the connection pool wouldn't be able to continue until the connection currently being processed is driven to completion.
@Wraith2 In case you need more information with respect to this PR, let's continue the conversation here. Otherwise, to brainstorm perf of the connection pool, we could discuss on #30430
Related to https://github.com/dotnet/corefx/issues/25742
SNITcpHandle currently uses synchronous DNS queries and socket open calls in the ctor, which causes blocking behaviour that can be long running. This PR changes the ctor to initiate an async task to establish the connection, and then makes sure that the task is correctly cleaned up and the status resolved when any sync APIs request that information. Async DNS and connect are used where possible. This should allow more connections to be started concurrently by changing wait characteristics to non-blocking.
Notes:
ResolveConnectionStatus
on the two paths that are always used to check the status, but this method has to intersect the sync and async worlds and must be serialized so that only one completion is possible. lock and async aren't compatible, so it uses a sync wait on an async task. I don't like this but I don't have a better way.
/cc owners @afsanehr, @tarikulsabbir, @Gary-Zh , @David-Engel and interested people @divega @saurabh500