Broken connection on unix systems when executing multiple concurrent statements against a single connection #1620

@Mike737377

Describe the bug

On Unix-based systems, the following exception is thrown when a single connection is shared by multiple tasks/threads. The number of threads does not seem to matter much, as long as at least two are involved.

Exception message: Microsoft.Data.SqlClient.SqlException (0x80131904): The connection is broken and recovery is not possible.  The connection is marked by the client driver as unrecoverable.  No attempt was made to restore the connection.
Stack trace:
at Task<DbDataReader> Microsoft.Data.SqlClient.SqlCommand.ExecuteDbDataReaderAsync(CommandBehavior behavior, CancellationToken cancellationToken)+(Task<SqlDataReader> result) => { } 
at void System.Threading.Tasks.ContinuationResultTaskFromResultTask<TAntecedentResult, TResult>.InnerInvoke()

Other threads throw:

  • System.InvalidOperationException: Invalid operation. The connection is closed.
  • System.InvalidOperationException: BeginExecuteReader requires an open and available Connection. The connection's current state is open.
  • System.InvalidOperationException: BeginExecuteReader requires an open and available Connection. The connection's current state is closed.

It typically takes somewhere between 10,000 and 10,000,000 SQL statements over the connection before the exception occurs. Raising the thread pool's minimum thread count into the hundreds via ThreadPool.SetMinThreads usually makes the exception appear earlier.
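
For anyone trying to provoke the failure sooner, here is a minimal sketch of raising the thread pool minimums without accidentally lowering them. The `ThreadPoolSetup` class and `RaiseMinThreads` method are hypothetical names used only for illustration; `ThreadPool.GetMinThreads` and `ThreadPool.SetMinThreads` are the standard .NET calls.

```csharp
using System;
using System.Threading;

// Hypothetical helper, named here for illustration only.
public static class ThreadPoolSetup
{
    // Raises the worker and I/O completion minimums to at least `target`.
    // Existing minimums that are already higher are left unchanged.
    // Returns whatever ThreadPool.SetMinThreads returns (false if the
    // request was rejected, e.g. because it exceeds the pool maximums).
    public static bool RaiseMinThreads(int target)
    {
        ThreadPool.GetMinThreads(out int worker, out int io);
        return ThreadPool.SetMinThreads(Math.Max(worker, target), Math.Max(io, target));
    }
}
```

Calling `ThreadPoolSetup.RaiseMinThreads(500)` at startup would correspond to the `SetMinThreads(500, 500)` call in the repro, with the return value available to confirm that the change was accepted.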

This exception is a problem for our use case because it occurs during a long-running transactional batch import; otherwise we would simply reconnect and continue.

The same version of the code runs successfully on Windows.

To reproduce

The following is a harsh simulation of what we are doing. However, I'm unable to reproduce the exact error with it, because I run into previously reported errors (#422, #826) before this one appears.

using System;
using System.Data;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Data.SqlClient;

System.Threading.ThreadPool.SetMinThreads(500, 500);

using var sqlConnection = new SqlConnection("Server=tcp:*******,1433;Database=****;User Id=****;Password=****;MultipleActiveResultSets=true;TrustServerCertificate=true;Encrypt=true;");
sqlConnection.Open();
using var tran = sqlConnection.BeginTransaction();

Task.WaitAll(Enumerable.Range(0, Environment.ProcessorCount * 10).Select(i =>
{
    return Task.Run(() =>
    {
        for (var j = 0; j < 100000; j++)
        {
            DoThing(sqlConnection, tran, i);
        }

        Console.WriteLine(i);
    });        
}).ToArray());

tran.Commit();

static void DoThing(SqlConnection conn, SqlTransaction tran, int i)
{
    var sql1 = "insert into aaa(gid, txt, num, dt) values(@0, @1, @2, @3); select @@IDENTITY";
    var sql2 = "select * from aaa where id=@0";

    using var command = conn.CreateCommand();
    command.CommandText = sql1;
    var param1 = command.CreateParameter();
    param1.ParameterName = "0";
    param1.Value = Guid.NewGuid();
    param1.DbType = DbType.Guid;
    var param2 = command.CreateParameter();
    param2.ParameterName = "1";
    param2.Value = "Some random text with: " + i;
    param2.DbType = DbType.String;
    param2.Size = 100;
    var param3 = command.CreateParameter();
    param3.ParameterName = "2";
    param3.Value = i;
    param3.DbType = DbType.Int64;
    var param4 = command.CreateParameter();
    param4.ParameterName = "3";
    param4.Value = DateTime.Now;
    param4.DbType = DbType.DateTime;
    command.Parameters.Add(param1);
    command.Parameters.Add(param2);
    command.Parameters.Add(param3);
    command.Parameters.Add(param4);
    command.Transaction = tran;
    var x = command.ExecuteScalar();

    using var command1 = conn.CreateCommand();
    command1.CommandText = sql2;

    var param5 = command1.CreateParameter();
    param5.ParameterName = "0";
    param5.DbType = DbType.Int32;
    param5.Value = x;

    command1.Transaction = tran;
    command1.Parameters.Add(param5);
    using var reader = command1.ExecuteReader();

    while (reader.Read()) { }
}

-- Table used by the repro above
create table aaa
(
	id int identity primary key,
	gid uniqueidentifier,
	txt nvarchar(max),
	num integer, 
	dt datetime
)

Expected behavior

SqlConnection stays open

Further technical details

.NET target: .NET 5 and .NET 6
SQL Server version: SQL Server 14.0.2037.2, Azure SQL instance (general purpose serverless gen 5)
Operating system: 5.4.0-110-generic #124-Ubuntu, Ubuntu Focal 20.04 inside a docker container
Reproducible against:

  • System.Data.SqlClient 4.700.21.41603
  • Microsoft.Data.SqlClient 4.0.1 & 5.0.0-preview2.22096.2

Additional context
I've grabbed a copy of the code (commit #dfa62a1746), added some extra tracing, and can see that SNIHandle.CheckConnection() in the following stack trace is returning 1.

   at Microsoft.Data.SqlClient.SNI.TdsParserStateObjectManaged.CheckConnection() in
   at Microsoft.Data.SqlClient.TdsParserStateObject.ValidateSNIConnection() in \src\Microsoft.Data.SqlClient\netcore\src\Microsoft\Data\SqlClient\TdsParserStateObject.cs:line 2663
   at Microsoft.Data.SqlClient.SqlConnection.ValidateAndReconnect(Action beforeDisconnect, Int32 timeout) in \src\Microsoft.Data.SqlClient\netcore\src\Microsoft\Data\SqlClient\SqlConnection.cs:line 1448
   at Microsoft.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean isAsync, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest) in \src\Microsoft.Data.SqlClient\netcore\src\Microsoft\Data\SqlClient\SqlCommand.cs:line 4678
   at Microsoft.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry, String method) in \src\Microsoft.Data.SqlClient\netcore\src\Microsoft\Data\SqlClient\SqlCommand.cs:line 4561
   at Microsoft.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, Boolean sendToPipe, Int32 timeout, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry, String methodName) in \src\Microsoft.Data.SqlClient\netcore\src\Microsoft\Data\SqlClient\SqlCommand.cs:line 1644
   at Microsoft.Data.SqlClient.SqlCommand.BeginExecuteNonQueryInternal(CommandBehavior behavior, AsyncCallback callback, Object stateObject, Int32 timeout, Boolean inRetry, Boolean asyncWrite) in \src\Microsoft.Data.SqlClient\netcore\src\Microsoft\Data\SqlClient\SqlCommand.cs:line 1247
   at Microsoft.Data.SqlClient.SqlCommand.BeginExecuteNonQueryAsync(AsyncCallback callback, Object stateObject) in \src\Microsoft.Data.SqlClient\netcore\src\Microsoft\Data\SqlClient\SqlCommand.cs:line 1216
   at System.Threading.Tasks.TaskFactory`1.FromAsyncImpl(Func`3 beginMethod, Func`2 endFunction, Action`1 endAction, Object state, TaskCreationOptions creationOptions)
   at Microsoft.Data.SqlClient.SqlCommand.InternalExecuteNonQueryAsync(CancellationToken cancellationToken)
   at Microsoft.Data.SqlClient.SqlCommand.ExecuteNonQueryAsync(CancellationToken cancellationToken) in \src\Microsoft.Data.SqlClient\netcore\src\Microsoft\Data\SqlClient\SqlCommand.cs:line 2516

Using dotnet-trace to collect the event source output, I can see the following:
[image: screenshot of the collected trace events, not included here]
Note that ValidateSNIConnection is one of the tracing points I added to determine that CheckConnection was returning something other than success.

I'm happy to try anything that will either help pinpoint the issue or serve as a workaround.
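
One diagnostic that might help narrow this down is serializing every command on the shared connection, to check whether the failure depends on commands actually overlapping. The sketch below is untested against this issue and is not a confirmed fix; `SerializedGate` is a hypothetical helper, not part of Microsoft.Data.SqlClient. It relies only on the documented fact that instance members of SqlConnection are not guaranteed to be thread safe.

```csharp
using System;

// Hypothetical helper, not part of Microsoft.Data.SqlClient.
// Funnels all work against one shared resource through a single lock,
// so that at most one caller touches the resource at any time.
public sealed class SerializedGate<TResource>
{
    private readonly TResource _resource;
    private readonly object _sync = new object();

    public SerializedGate(TResource resource) => _resource = resource;

    // Runs `work` while holding the lock and returns its result.
    // Exceptions thrown by `work` propagate after the lock is released.
    public TResult Run<TResult>(Func<TResource, TResult> work)
    {
        lock (_sync)
        {
            return work(_resource);
        }
    }
}
```

In the repro this would mean creating `new SerializedGate<SqlConnection>(sqlConnection)` once and replacing the direct call in the loop with `gate.Run(conn => { DoThing(conn, tran, i); return 0; })`. This gives up the concurrency that MultipleActiveResultSets allows, so it trades throughput for isolation and is meant as an experiment rather than a recommendation.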
