Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

With TransactionScope, after DbContext is disposed, DbConnection is still open. #14218

Closed
Dixin opened this issue Dec 20, 2018 · 15 comments · Fixed by #15152
Closed

With TransactionScope, after DbContext is disposed, DbConnection is still open. #14218

Dixin opened this issue Dec 20, 2018 · 15 comments · Fixed by #15152
Labels
breaking-change closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported type-bug
Milestone

Comments

@Dixin
Copy link

Dixin commented Dec 20, 2018

This issue happens with TransactionScope. After DbContext is disposed, DbConnection is still open.

Expected

In EF, after DbContext is disposed, its DbConnection is closed:

new ExecutionStrategy().Execute(() =>
{
    using (TransactionScope transactionScope = new TransactionScope())
    {
        DbConnection connection;
        using (AdventureWorks adventureWorks = new AdventureWorks())
        {
            connection = adventureWorks.Database.Connection;
            // Some operations.
            ProductCategory category = new ProductCategory() { Name = nameof(ProductCategory) };
            adventureWorks.ProductCategories.Add(category);
            adventureWorks.SaveChanges();
        }
        Trace.WriteLine(connection.State); // Closed, expected.
    }
});

The above ExecutionStrategy class is from here.

Not expected

In EF Core, the same workflow, after DbContext is disposed, its DbConnection is NOT closed:

dbCon.Database.CreateExecutionStrategy().Execute(() =>
{
    using (TransactionScope transactionScope = new TransactionScope())
    {
        DbConnection connection;
        using (AdventureWorks adventureWorks = new AdventureWorks())
        {
            connection = adventureWorks.Database.GetDbConnection();
            // Some operations.
            ProductCategory category = new ProductCategory() {Name = nameof(ProductCategory)};
            adventureWorks.ProductCategories.Add(category);
            adventureWorks.SaveChanges();
        }
        Trace.WriteLine(connection.State); // Open, NOT expected.
    }
});

Problem

This causes issues, for example:

dbContext.Database.CreateExecutionStrategy().Execute(() =>
{
    using (TransactionScope transactionScope = new TransactionScope())
    {
        using (AdventureWorks adventureWorks = new AdventureWorks())
        {
            // Some operations.
            adventureWorks.Products.ToArray();
        }

        // DbConnection is still open!!!

        using (DbConnection connection = new SqlConnection("ConnectionString"))
        using (DbCommand command = connection.CreateCommand())
        {
            connection.Open(); // PlatformNotSupportedException
            // Some operations.
        }
    }
});

The above workflow works in EF, and throws exception in EF Core:

System.PlatformNotSupportedException
HResult=0x80131539
Message=This platform does not support distributed transactions.
Source=System.Transactions.Local
StackTrace:
at System.Transactions.Distributed.DistributedTransactionManager.GetDistributedTransactionFromTransmitterPropagationToken(Byte[] propagationToken)
at System.Transactions.TransactionInterop.GetDistributedTransactionFromTransmitterPropagationToken(Byte[] propagationToken)
at System.Transactions.TransactionStatePSPEOperation.PSPEPromote(InternalTransaction tx)
at System.Transactions.TransactionStateDelegatedBase.EnterState(InternalTransaction tx)
at System.Transactions.EnlistableStates.Promote(InternalTransaction tx)
at System.Transactions.Transaction.Promote()
at System.Transactions.TransactionInterop.ConvertToDistributedTransaction(Transaction transaction)
at System.Transactions.TransactionInterop.GetExportCookie(Transaction transaction, Byte[] whereabouts)
at System.Data.SqlClient.SqlInternalConnection.GetTransactionCookie(Transaction transaction, Byte[] whereAbouts)
at System.Data.SqlClient.SqlInternalConnection.EnlistNonNull(Transaction tx)
at System.Data.ProviderBase.DbConnectionPool.PrepareConnection(DbConnection owningObject, DbConnectionInternal obj, Transaction transaction)
at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultipleObjectsTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)
at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal& connection)
at System.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection)
at System.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource`1 retry, DbConnectionOptions userOptions)
at System.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource`1 retry)
at System.Data.SqlClient.SqlConnection.Open()
at Tutorial.LinqToEntities.Transactions.<>c.b__6_0() in D:\User\GitHub\Tutorial\Tutorial.Shared\LinqToEntities\Transactions.cs:line 162
at Microsoft.EntityFrameworkCore.ExecutionStrategyExtensions.<>c.b__0_0(Action operationScoped)
at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteImplementation[TState,TResult](Func`3 operation, Func`3 verifySucceeded, TState state)

Workaround

The workaround apparently is to manually close the DbConnection:

adventureWorks22.Database.CreateExecutionStrategy().Execute(() =>
{
    using (TransactionScope transactionScope = new TransactionScope())
    {
        using (AdventureWorks adventureWorks = new AdventureWorks())
        {
            // Some operations.
            adventureWorks.Products.ToArray();
            adventureWorks.Database.CloseConnection(); // Manually close DbConnection.
        }

        // DbConnection is closed.

        using (DbConnection connection = new SqlConnection(ConnectionStrings.AdventureWorks))
        using (DbCommand command = connection.CreateCommand())
        {
            connection.Open(); // Works.
            // Some operations.
        }
    }
});

Further technical details

EF Core version: 2.2.0
Database Provider: Microsoft.EntityFrameworkCore.SqlServer
Operating system: Windows 10 1809 x64
IDE: Visual Studio 2017 15.9.4

@apacurariu
Copy link

Probably same problem but slightly different setup. Maybe it helps debug the problem.

Works

The scenario below, although pretty similar to @Dixin's example, works.

            using (var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
            {
                using (var ctx = new EfCoreTxContext())
                {
                    ctx.Rows.Add(new Row() { Value = 1000 });
                    await ctx.SaveChangesAsync(); 
                }


                using (var ctx2 = new EfCoreTxContext())
                {
                    ctx2.Rows.Add(new Row() { Value = 1000 });
                    await ctx2.SaveChangesAsync();
                }

                scope.Complete();
            }

Fails

Calling SaveChangesAsync() before opening a second transaction scope.

        private static async Task FirstStep()
        {
            using (var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
            {
                using (var ctx = new EfCoreTxContext())
                {
                    ctx.Rows.Add(new Row() { Value = 1000 });
                    await ctx.SaveChangesAsync();

                    await SecondStep();

                    ctx.Rows.Add(new Row() { Value = 1000 });
                    await ctx.SaveChangesAsync();
                }

                scope.Complete();
            }
        }

        private static async Task SecondStep()
        {
            using (var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
            {
                using (var ctx = new EfCoreTxContext())
                {
                    ctx.Rows.Add(new Row() { Value = 1000 });
                    await ctx.SaveChangesAsync();
                }

                scope.Complete();
            }
        }

Also Fails

Fails even when there is no second TransactionScope. Simply calling SaveChangesAsync() before using a new context instance triggers the failure.

        private static async Task FirstStep()
        {
            using (var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
            {
                using (var ctx = new EfCoreTxContext())
                {
                    ctx.Rows.Add(new Row() { Value = 1000 });
                    await ctx.SaveChangesAsync();

                    using (var ctx2 = new EfCoreTxContext())
                    {
                        ctx2.Rows.Add(new Row() { Value = 1000 });
                        await ctx2.SaveChangesAsync();
                    }

                    ctx.Rows.Add(new Row() { Value = 1000 });
                    await ctx.SaveChangesAsync(); //A single call to SaveChangesAsync
                }

                scope.Complete();
            }
        }

Works

Manually closing connection as indicated by @Dixin

        private static async Task FirstStep()
        {
            using (var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
            {
                using (var ctx = new EfCoreTxContext())
                {
                    ctx.Rows.Add(new Row() { Value = 1000 });
                    await ctx.SaveChangesAsync();
                    ctx.Database.CloseConnection(); //Manually closing connection

                    await SecondStep();

                    ctx.Rows.Add(new Row() { Value = 1000 });
                    await ctx.SaveChangesAsync();
                }

                scope.Complete();
            }
        }

        private static async Task SecondStep()
        {
            using (var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
            {
                using (var ctx = new EfCoreTxContext())
                {
                    ctx.Rows.Add(new Row() { Value = 1000 });
                    await ctx.SaveChangesAsync();
                }

                scope.Complete();
            }
        }

Also Works

Only calling SaveChangesAsync() once, after calling SecondStep().

        private static async Task FirstStep()
        {
            using (var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
            {
                using (var ctx = new EfCoreTxContext())
                {
                    ctx.Rows.Add(new Row() { Value = 1000 });

                    await SecondStep();

                    ctx.Rows.Add(new Row() { Value = 1000 });
                    await ctx.SaveChangesAsync(); //A single call to SaveChangesAsync
                }

                scope.Complete();
            }
        }

        private static async Task SecondStep()
        {
            using (var scope = new TransactionScope(TransactionScopeAsyncFlowOption.Enabled))
            {
                using (var ctx = new EfCoreTxContext())
                {
                    ctx.Rows.Add(new Row() { Value = 1000 });
                    await ctx.SaveChangesAsync();
                }

                scope.Complete();
            }
        }

Conclusions

All of the above work with EF6 without promoting to a distributed transaction. I think the expectation is that as long as we're connecting to the same DB all of these scenarios should also work with EF Core without having to manually close the connection after calling SaveChangesAsync().

@apacurariu
Copy link

Further to my previous comment, I did some more tests and the problem might be that EF6 closes the connection immediately after each query and SaveChanges, however EF Core keeps the connection open until the DBContext is disposed. Is there a specific reason why EF Core doesn't close the connection after every query (FirstOrDefault, Where, etc.) and after SaveChanges? The connection will not actually be closed but rather returned to the pool so EF not closing the connection wouldn't have any performance benefits. Is this correct?

@ajcvickers
Copy link
Member

@apacurariu When not using some form of explicit transaction EF will open and close the connection in the same way that EF6 did.

@apacurariu
Copy link

@ajcvickers Thank you for letting me know; I wasn't aware of that. My comments where in the context of using an explicit transaction. In that case, EF6 closes the connection after every query/command while EF Core keeps it open until the context is disposed. I'm frequently using more than one context within a transaction for various purposes. For example, I use one context instance for domain logic and another for tracking the status of the asynchronous command that triggered the domain logic. Another example is implementing the outbox pattern where after executing domain logic I want to store outbound messages to the DB in the same transaction so that I can reliable deliver them to, for example Azure Service Bus, with an at-least-once delivery guarantee. I need all of these operations to be performed in the context of the same transaction. As a workaround I opted to explicitly close the connection after every EF operation in my repository/data access layer built on top of EF Core. Therefore, I can now effectively use multiple different EF Core db context instances in the same transaction. I implemented this workaround once I figured out the behavour difference between EF6 and EF Core as this I didn't have this issue with EF6. The other option I had was explicitly reusing the DbConnection across contexts but that would have been a leaky abstraction and I don't prefer it.

I would be interested in learning whether you and the EF Core team consider it a good idea to implicitly close the connection after every query/command despite being in the context of a transaction or not, mimicking what EF6 does, or if I should continue to explicitly close the connection in my repo/data access layer?

@ajcvickers
Copy link
Member

@AndriySvyryd Do you see any issues with the application explicitly closing the issue after each operation to mimic EF6 behavior?

@AndriySvyryd
Copy link
Member

@ajcvickers I think current behavior was supposed to prevent promotion for providers that don't use connection pooling. We can change the default behavior to be the same as EF6 and allow the providers to flip it if needed.
@divega cc, in case you want to add something

@AndriySvyryd
Copy link
Member

As for a workaround closing the connection should be ok, but configuring the contexts with the same connection object would be best.

@divega
Copy link
Contributor

divega commented Feb 4, 2019

The behavior in EF Core was designed to avoid triggering transaction promotion to 2-phase commit with most databases.

SqlClient/SQL Server (2008 and up) happen to implement a specific optimization that avoids re-enlisting physical connections from the pool if they were previously enlisted in the same transaction. This optimization is described in https://blogs.msdn.microsoft.com/adonet/2008/03/25/extending-lightweight-transactions-in-sqlclient/ and I am only aware of SqlClient/SQL Server (2008 and up) implementing it (cc @roji in case he knows of others).

As the same blog post states:

There are still limitations with this approach. If you try to open a second outer connection BEFORE closing the first one, there won’t be a free connection in the pool, so a second inner connection will have to be enlisted (“A” is using “z” so “B” must again obtain “y”). Ditto if you open a connection with pooling turned off or a slightly different connection string, since neither case will find the original inner connection, even if it is sitting idle in the pool.

So unfortunately the behavior that was designed to work better with most databases in a common scenario (execute multiple operations on the same DbContext that would normally open, close and re-open the same connection object inside a transaction scope) is preventing SqlClient/SQL Server specific optimizations from kicking in and making another (probably less common) scenario (multiple, non-overlapping DbContext connecting to the same database inside a transaction scope) work.

All in all, I agree with @AndriySvyryd about the workarounds, but long term there is another option we can discuss: Making the behavior provider specific.

@roji
Copy link
Member

roji commented Feb 5, 2019

@divega Npgsql also implements the optimization where if you connect multiple types from the same TransactionScope, but not at the same time, the same connection will be used and no escalation will occur. I don't know about other providers do, though.

Note that opening/closing the ADO connection on every query could have some negative perf impact, as the pool gets a lot more pressure... For this reason I'm a bit more comfortable with the ADO connection having the same lifespan as the context. It seems reasonable to guide users to simply use the same context when doing different activities, rather than creating multiple contexts at the same time. This is admittedly more complicated if more than one context type is involved, although one can simply pass the ADO connection around.

@apacurariu
Copy link

@roji On the topic of the potential performance hit incurred for closing the ADO connection after every query, is that just because the pool would get more usage (e.g. open/close requests) which I assume require some sort of synchronization to ensure thread safety and thus the pool would be a bottleneck or is it more than that? If it's just that, and no new underlying connection has to actually be opened but it's rather just time spent moving that connection in and out of the pool then that is just in-memory processing and it shouldn't translate to a significant hit. Moreover, if the number of queries issued within an operation (e.g. web request) is reasonably small, the performance hit should be marginal. Are my assumptions correct? In any case, EF6 basically has that exact behavior so the performance hit wouldn't be more severe than using EF6 from this perspective? Is this correct or is it more to it than that?

@apacurariu
Copy link

apacurariu commented Feb 5, 2019

@roji The reason I ask is because to me passing the connection around seems like a leaky abstraction and is complicated further by things such as middleware which would like to do DB bound work in the context of the request, in a web app context. One would have to find a way to inject the ADO connection via the container which is even more complicated in a multi tenant context with a DB-per-tenant model where the actual connection string is resolved dynamically for every request based on some tenant-id header, for example. This could get quite complicated quite quickly so it makes me hesitant to expose and pass around the underlying ADO connection.

@roji
Copy link
Member

roji commented Feb 6, 2019

On the topic of the potential performance hit incurred for closing the ADO connection after every query, is that just because the pool would get more usage (e.g. open/close requests) which I assume require some sort of synchronization to ensure thread safety and thus the pool would be a bottleneck or is it more than that?

Yes, I'm referring to open/close requests in the pool, and to the fact that the pool is typically the single point of contention in the database driver. We've recently seen that in certain providers the pool is quite a significant operation; we indeed are working on improving pooling implementations and making it easier for providers to have better-performing pools, but the general issue does remain, and it doesn't seem like a great idea to add significantly more pool pressure unless necessary.

Moreover, if the number of queries issued within an operation (e.g. web request) is reasonably small, the performance hit should be marginal. Are my assumptions correct?

That's true - the less queries you have within the lifetime of a single context, the less it matters. However, I don't see any particular reason to assume that the number of queries is small... In some scenarios, a single context instance can have many queries in it, and adding a pooled open/close on each and everyone of those doesn't seem negligible.

In any case, EF6 basically has that exact behavior so the performance hit wouldn't be more severe than using EF6 from this perspective?

That's true once again, but I don't see why EF Core should be held to the performance standard of EF6. We should be providing better perf rather than the same.

The reason I ask is because to me passing the connection around seems like a leaky abstraction and is complicated further by things such as middleware which would like to do DB bound work in the context of the request, in a web app context. One would have to find a way to inject the ADO connection via the container which is even more complicated in a multi tenant context with a DB-per-tenant model where the actual connection string is resolved dynamically for every request based on some tenant-id header, for example. This could get quite complicated quite quickly so it makes me hesitant to expose and pass around the underlying ADO connection.

Those are very valid points. However, it's worth noting we're talking about applications which need multiple contexts, open at the same time, within the same transaction - this seems like quite a specific case. Doing this change needs to be measured against the perf degradation that everyone else will experience, especially keeping in mind that an alternative does exist: managing the connection used by contexts.

Regardless, I'm not sure exposing the ADO connection can be considered a leaky abstraction - relational DbContexts can work with ADO connections just as they can work with connection strings, I'm not sure why the former is more leaky than the latter. The same holds true for the multi-database (DB-per-tenant) model - applications already have to deal with multiple connection strings etc.

@apacurariu
Copy link

@roji Thank you very much for the comprehensive response. All your points are very valid and informative.

Your point on there being no significant difference between dealing with connection strings versus dealing with ADO connections is also true. Thank you for the suggestion! I see it has more benefits over explicitly closing connections.

Again, thanks a lot for your answers.

@divega
Copy link
Contributor

divega commented Mar 27, 2019

@ajcvickers are we keeping breaking change issues open until documentation is updated?

@ajcvickers
Copy link
Member

Yep.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking-change closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported type-bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants