Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disposing an EF enumerator takes long time on large/complex queries on SQL Server. #24857

Closed
dozer75 opened this issue May 8, 2021 · 14 comments

Comments

@dozer75
Copy link

dozer75 commented May 8, 2021

When you have a query that returns lots of records, the disposing of an enumerator takes long time if you aborts the enumeration before traversing through all rows.

I did a simple test project that has a simple table with 2 million rows and executed the following code. The metrics for the disposing was approx. 1 second on my environment when creating a data reader without filtering (other filtering is also done in the test project.

foreach (var t in context.Tests)
{
    break; //Just to stop the iteration directly
} // This takes approx one second when it disposes the enumerator foreach uses

After investigating the EF core code around this, I see that the Dispose method in RelationalDataReader.cs calls Close on the data reader. According to the Remarks section of the SqlDataReader.Close method it states:

The Close method fills in the values for output parameters, return values and RecordsAffected, increasing the time that it takes to close a SqlDataReader that was used to process a large or complex query.

And this is the problem; if you haven't iterated through the reader to the end, the close method will iterate through the rest of the reader and thus create a heavy performance impact against SQL Server on large/complex queries.

Further, the remarks section follows up with

When the return values and the number of records affected by a query are not significant, the time that it takes to close the SqlDataReader can be reduced by calling the Cancel method of the associated SqlCommand object before calling the Close method.

The number of rows affected isn't interesting in when iterating, so the disposing logic should call Cancel on the SqlDataReader. I see that this may be an issue to solve this easily in the code as it is, since the RelationalDataReader uses the DbDataReader which doesn't have an Cancel logic... But again, it is something that heavily impacts performance when certain operations are done, so it should be fixed some way.

This issue has been reproduced with the following environment:

.NET SDK (reflecting any global.json):
 Version:   5.0.201
 Commit:    a09bd5c86c

Runtime Environment:
 OS Name:     Windows
 OS Version:  10.0.19042
 OS Platform: Windows
 RID:         win10-x64
 Base Path:   C:\Program Files\dotnet\sdk\5.0.201\

Host (useful for support):
  Version: 5.0.4
  Commit:  f27d337295

.NET SDKs installed:
  3.1.301 [C:\Program Files\dotnet\sdk]
  3.1.407 [C:\Program Files\dotnet\sdk]
  5.0.104 [C:\Program Files\dotnet\sdk]
  5.0.201 [C:\Program Files\dotnet\sdk]

.NET runtimes installed:
  Microsoft.AspNetCore.All 2.1.26 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.1.26 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 3.1.5 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 3.1.13 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 5.0.4 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 2.1.26 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 3.1.5 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 3.1.13 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 5.0.4 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.WindowsDesktop.App 3.1.5 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 3.1.13 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 5.0.4 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]

And with Microsoft.EntityFrameworkCore.SqlServer Version 5.0.5

@roji
Copy link
Member

roji commented May 8, 2021

When a reader is disposed (e.g. because an EF Enumerator is disposed), any pending results must indeed be consumed - this is necessary in order for the database connection to be usable for another query. In your example app, you execute a query which fetches a huge number of rows, and then immediately dispose the reader; it's expected for this to be quite slow (the notes on the number of records affected as well as output/return parameters are very unlikely to be relevant for this case).

Systematically calling DbCommand.Cancel on unconsumed readers would most probably not be a good thing, as it would likely negatively impact queries where there isn't a huge number of pending rows. However, you should be able to trigger this behavior from user code by executing your query asynchronously, and then triggering the cancellation token before disposing the enumerator.

However, at the end of the day, it's usually bad practice to be selecting rows which you won't be consuming - it's recommended to use the Take operator (and paging in general) to fetch rows which you know you'll need. There are indeed some cases where the number of rows needed isn't known until you start enumerating the resultset, but these cases should be rare, and it may still be better to simply request more rows using paging (at the cost of additional roundtrips).

@dozer75
Copy link
Author

dozer75 commented May 9, 2021

I do not agree on this at all. This example is what it is, just an example to have an easy way to provoke the issue.

This can also happen on any kind of complex queries, it may not be seconds, but it will be a performance impact nevertheless, especially if you want to break the loop for any reason.

My real world estimations is that on an average complex query the dispose call differs quite a lot even on smaller result sets, so even with paging you will in certain scenarios get a performance impact which is not necessary.

Running the code asynchronously has the same impact, if you use a cancellation to stop the iteration, the DisposeAsync is still executed with the same result as the DisposeAsync of the RelationDataReader calls the same Close method as the synchronous Dispose and thus will suffer from the same issue.

To close a performance issue as by design is ridiculous I'm afraid. It should be investigated as the documentation of SqlDataReader.Close states that Cancel is the solution, so it is worth to look at the performance impact of doing this somehow at least.

@roji
Copy link
Member

roji commented May 9, 2021

@dozer75 at the end of the day, the question is why a real-world application would execute a query with a huge resultset, and then not consume any of it; I see the performance issue as being in the application, and not something that EF Core should mitigate. What is your real-world scenario for this?

In addition, as I wrote above, executing a DbCommand.Cancel every time a reader is disposed would probably introduce perf issues of its own.

Running the code asynchronously has the same impact, if you use a cancellation to stop the iteration, the DisposeAsync is still executed with the same result as the DisposeAsync of the RelationDataReader calls the same Close method as the synchronous Dispose and thus will suffer from the same issue.

I'm... confused; I'm suggesting that your own user code trigger cancellation before disposing EF's enumerator. Isn't that exactly the behavior you're asking for? For a long query with a huge resultset, the cancellation should stop the query at the server, thereby reducing the time and the amount of the data that needs to be transferred...?

It should be investigated as the documentation of SqlDataReader.Close states that Cancel is the solution, so it is worth to look at the performance impact of doing this somehow at least.

First, EF Core isn't just about SQL Server - there are other database providers out there. Second, the SqlClient docs do say that the time that it takes to close the SqlDataReader can be reduced by calling the Cancel method"; this doesn't mean it makes sense to always do that (e.g. in EF Core), for all queries.

@dozer75
Copy link
Author

dozer75 commented May 9, 2021

@dozer75 at the end of the day, the question is why a real-world application would execute a query with a huge resultset, and then not consume any of it; I see the performance issue as being in the application, and not something that EF Core should mitigate. What is your real-world scenario for this?

Again as I said, this is an EXAMPLE that makes it easy to reproduce. The real world example traverses using a complex query which causes the same behavior. The actual result set isn't this big, but again this is a reproducible example that gives the same behavior (actually it gives a lesser impact than the real world example where the dispose takes 27 seconds due the joins and size of data not the result it returns itself which isn't that big). I have worked around the issue so it is not a problem for me right now, but it is still a performance issue problem in the core library.

Running the code asynchronously has the same impact, if you use a cancellation to stop the iteration, the DisposeAsync is still executed with the same result as the DisposeAsync of the RelationDataReader calls the same Close method as the synchronous Dispose and thus will suffer from the same issue.

I'm... confused; I'm suggesting that your own user code trigger cancellation before disposing EF's enumerator. Isn't that exactly the behavior you're asking for? For a long query with a huge resultset, the cancellation should stop the query at the server, thereby reducing the time and the amount of the data that needs to be transferred...?

It should, but it doesn't... It still does exactly the same behind since DisposeAsync calls Close on the datareader synchronously, so cancelling the execution using a WithCancellation(cancellationToken) on the async enumerator, will still issue the same result. I don't know why, but it does... So your workaround has the same impact. This is a general problem async or not of what I see.

It should be investigated as the documentation of SqlDataReader.Close states that Cancel is the solution, so it is worth to look at the performance impact of doing this somehow at least.

First, EF Core isn't just about SQL Server - there are other database providers out there. Second, the SqlClient docs do say that the time that it takes to close the SqlDataReader can be reduced by calling the Cancel method"; this doesn't mean it makes sense to always do that (e.g. in EF Core), for all queries.

Of course I know, I have been with EF since it's beginning... But there are as you know different implementation of the providers of EF backends, so it could maybe be solved on a provider based level if possible. As I said in my initial post, I know it isn't an easy solution due to the implementation that exist in EF Core today but that doesn't say that it shouldn't be looked at.

Anyway; I have reported this issue, and my opinion is that this is something that has a general unknown performance impact for all applications in some degree if they uses iterators and breaks the iteration. Of course you can decide to do what you want with this issue it's the teams responsibility. But I have submitted a performance issue and I think it would deserve some kind of investigation at least and not just be closed "as designed" because I'm sure that you don't intend that to have code that knowingly creates performance issues in EF Core.

If you need help to investigate or have more questions about the performance issue itself (not politics) feel free to ask.

@roji
Copy link
Member

roji commented May 10, 2021

Again as I said, this is an EXAMPLE that makes it easy to reproduce. The real world example traverses using a complex query which causes the same behavior. The actual result set isn't this big, but again this is a reproducible example that gives the same behavior (actually it gives a lesser impact than the real world example where the dispose takes 27 seconds due the joins and size of data not the result it returns itself which isn't that big).

@dozer75 any chance you could actually provide the details on your real world example? At the moment all we have is the pretty contrived scenario you submitted above, which selects a huge number of rows and then closes the reader before reading a single one. I personally don't think that scenario justifies integrating automatic cancellation into EF Core, and should rather be treated as problematic user code. I'm not trying to be difficult; users sometimes write queries in this way since they don't understand the importance of including a Take operator in their queries, to limit resultset size at the server to what they need. We call this out in our docs.

So I'm asking again - what real-world scenarios do you see as requiring users to fetch more rows than they'll actually end up consuming, and why can't they (or shouldn't they) simply fetch what they need with Take?

[...] It still does exactly the same behind since DisposeAsync calls Close on the datareader synchronously [...]

That doesn't sound right... The enumerator returned when executing an async EF Core query calls DisposeAsync on its RelationalDataReader, which itself calls DisposeAsync on DbDataReader; so we should be async all the way down. If you find an async code path where a sync API is called, that would be an EF Core that needs to be fixed regardless of this issue.

However, regardless of the above, you're right that triggering the cancellation token might not properly cancel the query, as outlined here; if cancellation happens to be triggered before in between calls to DbDataReader.ReadAsync (race condition), IIRC SqlClient would simply throw an exception rather than cancelling the query at the server. I've opened dotnet/SqlClient#1065 to discuss this at the SqlClient level.

But I have submitted a performance issue and I think it would deserve some kind of investigation at least and not just be closed "as designed" because I'm sure that you don't intend that to have code that knowingly creates performance issues in EF Core.

The closed-by-design label is only an initial determination I put, as my own personal opinion; the issue hasn't been closed, and will be discussed as a team in EF triage for a final decision. Based on the discussion above, my feeling is still that this still describes a user coding issue - there are many ways to use EF in an inefficient manner; that doesn't mean there's an EF performance issue per se. But again, a detailed, real-world scenario would help understand your point better.

@roji
Copy link
Member

roji commented May 10, 2021

Notes from triage discussion:

  • In principle, if it's correct for EF Core to do DbCommand.Cancel when an unconsumed reader is disposed, then it should be correct for SqlClient to do the same; therefore, this wouldn't belong in EF Core. In other words, it's the ADO.NET provider's responsibility to complete DbDataReader.Dispose as efficiently as possible.
    • Server cancellation may interact with batching; cancelling an earlier batched statement would cause skipping of later batched statements in at least some providers (e.g. Npgsql). In that case, it would be unsafe (and a breaking change) for a provider to start doing this.
    • This may still be an optimization for when there's no batching (or when we're cancelling the last statement in a batch, or whatever). It could be done either at the ADO or at the EF level.
  • Adding a cancellation when disposing unconsumed readers may have perf impacts of its own (e.g. additional database roundtrip), and so it is unclear whether this is the right thing to do systematically. Needs to be analyzed on a per-provider basis.
  • Leaving this up to the user may be the right thing. EF Core hides the DbCommand, so DbCommand.Cancel isn't accessible, but the async cancellation token is. Unfortunately that isn't fully reliable because of Triggering a cancellation token may not cancel a running query based on timing SqlClient#1065.

@dozer75
Copy link
Author

dozer75 commented May 10, 2021

@dozer75 any chance you could actually provide the details on your real world example?

I think I misunderstood your question her initially, sorry about that. I can't give you the exact code as it is business critical, but I'll try to describe it as good as possible.

We have a rather large table (with a couple of millions records). I had initially against this table an EF query that without paging restrictions that returns approx 450k rows. The query itself looks like the following (pseudo code):

context.Table.Where(row => row.Column1 = value1 && row.JoinedTable1.Column2 = value2)
             .Select(row => new 
             {
                 row.Id,
                 row.Column1,
                 row.JoinedTable1.Column2,
                 row.JoinedTable2.Column3
             })

Initially (when I discovered the issue) this query was iterated through an async foreach but I also tried using synchronous foreach to decide if this situation is caused by async issues, but with the same result. Within this foreach, we're doing some calculations and external calls, and if the external call succeeds we add the item to a list and when the list has been filled with a predefined amount we break the iteration. We cannot limit the query using the predefined amount since we don't know how the external call reacts on the passed data, it may be 1000 it rejects or it doesn't reject anything.

var result = new List<Result>();
await foreach(var item in query.WithCancellation(cancellationToken)) // Initial iteration query execution takes approx 1.6 seconds
{
    // Do initial logic

    // Calls external dependency
    var response = _httpClient.PostAsync(<url>, <content>, cancellationToken).ConfigureAwait(false);
    if (response.IsSuccessStatusCode)
    {
        result.Add(item);
        
        if (result.Count == maxResult)
        {
            // Also tried to create a custom cancellationtoken and cancel the async iterator
            // with the same result.
            break;
        }
    }
} // Here DisposeAsync is called and takes 27 seconds in my example

I think that this way to work with an iterator is quite common no matter the data source (that is, breaking the iteration before all entities in the result are processed), and it is a bit pity that having sql server as the data source (I don't know how other datasources reacts to the dispose logic with EF the iterator) will suffer a performance hit in scenarios where you break the the iteration before it's done (the iteration break can be both handled break and unexpected like exceptions).

Of course, paging is an option. And then, when the result count matches the maxResult, it won't take 27 seconds to dispose the remaining result of the current paged query. But it will still have some performance impact on dispose when you don't iterate all records before the break. In addition; a another performance hit will come int to play when using paging since each paging query will average a 2 second query request which in turn when we have done 15 paged queries exceed the total execution time of the non-paged query and all above will give us an even more performance hit compared to the non-paged result.

There are also a lot of other non-performance aspects here as well like extra unnecessary load on the sql server, network impacts, costs in e.g. cloud for both network and resource usage both with non-paged query and paged query.

I don't argument that the code could be better and the data structure could be better. In fact I have found a workaround that gives me an acceptable throughput here (still with some performance issue), but this was the original "real-world" problem which my initial example shows in a more simpler manner.

[...] It still does exactly the same behind since DisposeAsync calls Close on the datareader synchronously [...]

That doesn't sound right... The enumerator returned when executing an async EF Core query calls DisposeAsync on its RelationalDataReader, which itself calls DisposeAsync on DbDataReader; so we should be async all the way down. If you find an async code path where a sync API is called, that would be an EF Core that needs to be fixed regardless of this issue.

True, but you are looking at the wrong place. This is fixed in EFCore 6 two months ago, but still synchronous in EFCore 5.

I hope that this clarifies a bit around the issue. And again, for my part I have found a workaround that will give me a performance boost compared to the original code, but it is still impacted but it is an acceptable hit compared to my original issue. But the fact still remains that there are performance issues when iterating an EF query that runs against sql server.

@dozer75
Copy link
Author

dozer75 commented May 10, 2021

  • Leaving this up to the user may be the right thing. EF Core hides the DbCommand, so DbCommand.Cancel isn't accessible, but the async cancellation token is. Unfortunately that isn't fully reliable because of dotnet/SqlClient#1065.

This is one thing I was thinking about. In the dispose logic there are some interceptor calls after close but before dispose. If there was a way to intercept the call before the Close/CloseAsync calls it would be possible for users of EF to intercept and call Cancel before the Close/CloseAsync.

@roji
Copy link
Member

roji commented May 10, 2021

Thanks @dozer75, that is the information I was looking for; we discussed similar scenarios internally and I agree in these cases it makes sense to have a non-paging query and (efficiently) cancel the query at some arbitrary point in the middle.

True, but you are looking at the wrong place. This is fixed in EFCore 6 two months ago, but still synchronous in EFCore 5.

Ah yes, you're right - I fixed this in 6.0 as part of https://github.com/dotnet/efcore/pull/24207/files#diff-84b8b9e1d0dc310acad83d528446ead8dca459070754debafefad0f29f4f24b7R166.

If there was a way to intercept the call before the Close/CloseAsync calls it would be possible for users of EF to intercept and call Cancel before the Close/CloseAsync.

That could be a way insert a cancellation, though you'd have to either do it for all queries, or identify the queries you're interested in. Unfortunately we don't currently support intercepting reader close/dispose - this is tracked by #24295.

I also recommend following dotnet/SqlClient#1065 for the SqlClient side of things.

@smitpatel
Copy link
Contributor

Chiming in from LINQ world
in LINQ, enumerators allow you to stop iterating it any point and it doesn't have to traverse rest of the enumerator to dispose it. Some basic operations like First/TakeWhile are implemented in that way only.
As above discussion points out that is not true for databases always. We try to match LINQ behavior when doing query when it is possible easily and doesn't cause any drawback in any case. But when database behavior is quite different from LINQ, we just give users database behavior. One basic example of that is string case sensitivity.

Specifically for aforementioned use case where the size of records needed is not known in advance but need to terminate early, one way is to do keyset based pagination and use continuation to get data in blocks. Block size depends on network round trip cost. That would avoid loading large dataset in single query and disposing without cancelling wouldn't cause significant delay because it would be just that one block iterating.

@dozer75
Copy link
Author

dozer75 commented May 11, 2021

That could be a way insert a cancellation, though you'd have to either do it for all queries, or identify the queries you're interested in. Unfortunately we don't currently support intercepting reader close/dispose - this is tracked by #24295.

@roji If it was possible to cancel a current execution when running async with a cancellation token passed to the async iterator using WithCancellation and the iterator stopped the read and executed the CancelAsync operation on the associated DbCommand it could work having a cancellation token. Looking at the SingleQueryingEnumerable the change shouldn't be that big, and it should not give a big impact in any way when it comes to the internal logic or the expected result for a user. Users of synchronous iteration would still suffer, but there would be an option at least.

But when database behavior is quite different from LINQ, we just give users database behavior. One basic example of that is string case sensitivity.

@smitpatel I see your point, but this isn't sql server behavior (nor any other database engine that I know of). Cursors are the closest on DBMS level that can be compared to iteration in code. When iterating on database level using cursors you can stop, close and deallocate the cursor at an point of time and it does not iterate the rest of the cursor result as the iteration in EF does and thus not give the performance issue. Even if you work with SqlDataReader directly you have the option to skip this "round up on close" logic by doing a cancel on the SqlCommand (with the impact that may have, but it is still possible) as documented. So, this is a choice done in EF Core that isn't imitating any sql server relational database behavior in the background.

@smitpatel
Copy link
Contributor

Even if you work with SqlDataReader directly you have the option to skip this "round up on close" logic by doing a cancel on the SqlCommand

Same goes for EF Core, if you are working with EF Core generated enumerable then you have option to call cancel on the enumerator before disposing it. EF Core is not doing anything special here, it just passes on what user asks to underlying reader. If you ask it to dispose, it will just call dispose.

@roji
Copy link
Member

roji commented May 16, 2021

Quick summary of the situation and possible next steps:

  • EF 5.0 does have a bug, where RelationalDataReader.DisposeAsync (async) calls DbDataReader.Close (sync). This was already fixed for 6.0 as part of Recycling relational and ADO.NET objects in query pipeline #24207. We can consider patching this.
  • If we do the above, EF will at least flow the cancellation token when the enumerator is disposed. This would help for e.g. Npgsql, but unfortunately SqlClient doesn't implement CloseAsync (so Close is called anyway) - this is tracjed by Implement async transaction commit/rollback methods SqlClient#113. So the token wouldn't be respected regardless of what we do.
  • In any case, we still lack a means for users to cancel sync queries, since we don't expose DbCommand (which is where Cancel lives). That doesn't seem very high-priority to me.
  • We could consider triggering a cancellation in EF itself when an unconsumed enumerator is disposed, but as written above, this doesn't seem to belong to our layer of the stack (but rather in the ADO.NET provider).

@AndriySvyryd
Copy link
Member

EF 5.0 does have a bug, where RelationalDataReader.DisposeAsync (async) calls DbDataReader.Close (sync). This was already fixed for 6.0 as part of #24207. We can consider patching this.

Unless we get more reports we don't think there's enough value in patching this in 5.0.x

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants