Exploring Scoped vs Transactional (IDbContextFactory) DbContexts #25653
Comments
@Mike-E-angelo DbContext is not thread-safe. You cannot safely perform multiple queries (even no-tracking) concurrently. See Avoiding DbContext threading issues. |
Thank you for that link, @ajcvickers. I was aware of that, and that you could turn off the concurrency check, but I wanted to verify/ensure the possibility here. I guess this is wishful thinking on my part, then. 😅😭 It would be nice to know what sort of issues arise from concurrent queries. From the outset it would seem that read operations would be OK, but considering there can be anywhere up to 23KB of allocations occurring to create an empty array, there could be complications. 😁 Wishful thinking aside, are there any plans to make queries/read-only behavior thread-safe? |
Turning off the concurrency check only turns off the safety check; it doesn't make it safe. It can provide a percent or two of perf benefit in extremely high perf scenarios, like those tested by TechEmpower. The issues are the same as in most threading cases--undefined behavior that will occasionally cause crashes depending on the timing of the threads. |
To add to @ajcvickers answer:
It's important to understand that a single DbContext uses a single database connection, and those can only execute one query at a time. There's also various state kept internally in DbContext itself which isn't thread-safe. However, assuming you're using DbContext pooling, there really should not be any reason to want to reuse the same context - regardless of what your query looks like and how many expressions it has. If you're seeing differently, can you please try to put together a minimal code sample that shows that, preferably as a simple console application? |
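A minimal sketch of the per-operation pattern described above, assuming a hypothetical BlogContext with a single Blogs set (these names and the CountConcurrentlyAsync helper are illustrative, not taken from the linked project). Each concurrent operation asks an IDbContextFactory for its own context instead of sharing one instance, so no DbContext is ever used from two operations at once:

```csharp
using System.Linq;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Infrastructure;

// Hypothetical entity and context, used for illustration only.
public class Blog
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public class BlogContext : DbContext
{
    public BlogContext(DbContextOptions<BlogContext> options) : base(options) { }
    public DbSet<Blog> Blogs { get; set; }
}

public static class PerOperationContexts
{
    // Starts `operations` queries concurrently. Every query creates (or rents,
    // when the factory is pooled) its own context and disposes it when done.
    public static async Task<int[]> CountConcurrentlyAsync(
        IDbContextFactory<BlogContext> factory, int operations)
    {
        var tasks = Enumerable.Range(0, operations).Select(async _ =>
        {
            await using var context = factory.CreateDbContext();
            return await context.Blogs.AsNoTracking().CountAsync();
        });
        return await Task.WhenAll(tasks);
    }
}
```

With a PooledDbContextFactory<BlogContext> passed in as the factory, disposing the context returns it to the pool rather than destroying it, which is why the per-operation cost is expected to stay small.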
OK, I am with you @ajcvickers. Thank you for that additional context, it is valuable to me and my understanding of this topic.
That was the gotcha I was looking for @roji. 😁 I knew there was a catch somewhere here. Thank you for letting me know that. So that means if I shared 1 Alright, this is better understood to me. I appreciate the time and insight provided here. 👍
Correct, and your patience is appreciated here as I articulate my findings. The only re-use of contexts that is occurring is how my application is designed now (and I am trying to move away from), and that current design consists of 1 scoped/shared Conceptually, it's tough for me to commit to ditching re-use vs That stated I think we're on the same page as far as the approach here. My only remaining issue now is those allocations. Is this a known issue that a request to materialize an empty |
It wouldn't crawl to a halt. If at any point the singleton DbContext is used concurrently, you'd get an exception at best and undefined behavior at worst. Technically, if you could be somehow sure that the singleton is never used concurrently, everything would work. Note that depending on exactly how things work in your application, it may be possible to trigger concurrent usage of the DbContext even without multiple users, e.g. if a single user clicks a button twice, with the 2nd time trying to use the DbContext before the 1st operation completed (you can disable the UI via a modal dialog to prevent this for as long as the operation is ongoing).
As I wrote above, that doesn't correspond to how things should be working. Whether you use a single DbContext instance or multiple pooled ones should make almost no difference; the overhead of getting and returning a pooled DbContext instance is negligible. I really recommend carefully reading this doc page, which goes over how Blazor Server and EF interact, and shows various strategies for managing your DbContext.
We worked a lot on improving the runtime perf of non-tracking queries for EF Core 6.0 (including reducing memory allocation), so give the latest preview a try; if you're using EF Core 5.0 there's a very good chance you'll see other results (would be good to hear about them too!). |
Right, but if I have 100 users all using that same context, and only 1 connection can be used at a time, and each operation takes 100ms to complete (rounding up 😁), it would take 10 seconds for all of those operations to complete, correct? This is what I mean by grinding to a halt. I'm actually thinking more than 100 users, more like 512 or 1,024, :P If they are all sharing that one connection and that connection was thread-protected, there would be a huge bottleneck because it only has one connection at a time.
So it's not the instance/activation/retrieval of the https://github.com/Mike-E-angelo/Stash/tree/master/EfCore.ScopedVsTransaction
I know it doesn't seem like it, but I actually spent the day reading it and other articles before writing in with this question. 😆
The project above uses So to summarize, calling Please let me know if there is any further information you require to help further diagnose this issue and I will assist the best that I can. 👍 |
Also, @roji in addition to the allocations, I want to be sure you understand that the other sticking point here is the time spent doing this. Calling When you add a few (3) expressions to the mix, a pooled DbContext |
DbContext doesn't serialize the connections for you. Once again, if you attempt using it concurrently, it would not take longer - it would throw (or have undefined behavior). Apart from that, yes - if you were to theoretically serialize all usage of a single DbContext instance (via some sort of lock or queue), then indeed things would be slow. But that wouldn't make much sense.
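To make the arithmetic in this exchange concrete, here is a rough sketch that uses only the base class library. It does not touch EF Core at all: Task.Delay stands in for a query, and a SemaphoreSlim stands in for "some sort of lock or queue" around a single shared context. SerializedAccess and MeasureAsync are hypothetical names introduced for this illustration.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class SerializedAccess
{
    // Runs `operations` simulated queries that each take `perOperation`,
    // allowing at most `maxConcurrency` of them to run at the same time,
    // and returns the total wall-clock time.
    public static async Task<TimeSpan> MeasureAsync(
        int operations, TimeSpan perOperation, int maxConcurrency)
    {
        using var gate = new SemaphoreSlim(maxConcurrency);
        var stopwatch = Stopwatch.StartNew();

        var tasks = Enumerable.Range(0, operations).Select(async _ =>
        {
            await gate.WaitAsync();               // wait for a free "connection"
            try { await Task.Delay(perOperation); } // simulated query
            finally { gate.Release(); }
        }).ToArray();

        await Task.WhenAll(tasks);
        return stopwatch.Elapsed;
    }
}
```

Calling MeasureAsync(100, TimeSpan.FromMilliseconds(100), 1) models 100 users queued behind one serialized context and should take roughly the 10 seconds estimated above, while raising maxConcurrency models each user getting a separate context and connection.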
There shouldn't be any difference here - a DbContext that you reuse yourself vs. a DbContext that you get from context pooling is the same DbContext - there's no difference in how it executes things. I do note that in the Scoped case, you're not executing Though all that is quite academic. Sharing a singleton DbContext in an inherently concurrent application (e.g. a webapp) simply isn't a viable option. One last (likely academic) point is that if you're using DbContext pooling, and care about every little bit of perf, then you're better off not using
I don't remember the absolute per-run allocation numbers I ended up with... 2KB does seem a little high for an empty non-tracking query, but not completely unreasonable. I'd have to look in a memory profiler again. If you're interested in the optimization work done for EF Core 6.0, here's some info. |
I'm glad we have shared agreement here. 😁
Correct, the difference is when is it executed. In a scoped session, it is executed once when it is stored in memory as the DbContext is also stored in memory as a scoped instance. In the other scenarios, it is created and executed each and every time a
Good catch. Thank you for pointing that out. I have committed an update that makes it all consistent. The results are similar:
As I attempted to describe above, this is the whole point. In a scoped user/circuit session, an Conversely, with transactional/pooled, each query must be created at the time the context is retrieved, as the new context is the dependency for the query. As such, there is more net overhead (in both time and allocations) in this scenario than one where it is scoped to memory.
This is so funny to me as I was doing exactly that until I read this section: Which says to use
Great, thank you for any time/consideration you can provide. 🙏 |
All in the details, isn't it? :) This made me realize that the benchmarks I provided were not calling
Note, too, that the times are affected adversely as well. |
You may need to understand better how EF actually handles queries. Composing LINQ operators over a DbSet doesn't compile them or optimize them in any way - it simply constructs an expression tree in memory; this isn't really EF yet, it's just what the LINQ operators do (e.g. Where). In any case, when that expression gets evaluated (via ToArray or similar), it checks its internal cache to see if the query has been compiled before. If so, it skips most of that and executes directly, otherwise it needs to go through the heavy process of query compilation. The crucial bit is that the query cache is not tied to a specific DbContext instance.
It's certainly possible that for very optimized queries using the InMemory provider (or even a real database), the process of composing a LINQ query itself starts to show up in benchmarks. If you want to avoid that, use EF Core's compiled query feature - this compiles a (fully composed) query once, and gives you back a function which you can invoke multiple times with different DbContext instances.
Finally, I haven't looked at your changes, but here's a quick benchmark of my own which shows different results (code at the bottom):
BenchmarkDotNet=v0.13.0, OS=ubuntu 21.04
The part about SqlServer vs. InMemory is quite crucial; if you're looking at pure percentages, then pooling may seem expensive compared to Same. The moment you throw a real database in there, things start to look a bit different. Note that this run used 6.0.0-preview7.
That advice is correct as long as DbContext pooling isn't being used. We could amend this, but we're really into micro-optimization here, which doesn't matter to all but the most high-perf applications.
Benchmark code
    using System.Linq;
    using System.Threading.Tasks;
    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Running;
    using Microsoft.EntityFrameworkCore;
    using Microsoft.EntityFrameworkCore.Infrastructure;

    BenchmarkRunner.Run<Program>();

    [MemoryDiagnoser]
    public class Program
    {
        private BlogContext _reusableContext;
        private PooledDbContextFactory<BlogContext> _factory;

        // Run every benchmark once per provider.
        [Params(Providers.InMemory, Providers.SqlServer)]
        public Providers Provider { get; set; }

        [GlobalSetup]
        public async Task Setup()
        {
            var options = Provider == Providers.InMemory
                ? new DbContextOptionsBuilder<BlogContext>().UseInMemoryDatabase("foo").Options
                : new DbContextOptionsBuilder<BlogContext>().UseSqlServer(@"Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0").Options;

            // Start from an empty database so the query returns no rows.
            _reusableContext = new BlogContext(options);
            await _reusableContext.Database.EnsureDeletedAsync();
            await _reusableContext.Database.EnsureCreatedAsync();

            _factory = new PooledDbContextFactory<BlogContext>(options);
        }

        // Reuses one long-lived context for every query.
        [Benchmark]
        public Blog[] Same()
        {
            return _reusableContext.Blogs.AsNoTracking().ToArray();
        }

        // Rents a context from the pool per query and returns it on dispose.
        [Benchmark]
        public Blog[] Pooled()
        {
            using var context = _factory.CreateDbContext();
            return context.Blogs.AsNoTracking().ToArray();
        }

        public class BlogContext : DbContext
        {
            public DbSet<Blog> Blogs { get; set; }

            public BlogContext(DbContextOptions options) : base(options) {}
        }

        public class Blog
        {
            public int Id { get; set; }
            public string Name { get; set; }
        }

        public enum Providers { SqlServer, InMemory }
    }
|
EXCELLENT. That is indeed the missing piece here in my world. Allow me to look into this in addition to your benchmarks, and I will get back to you here when I have a better understanding. Thank you for taking the time to provide the above valuable information and for the informative discussion, @roji. It is much appreciated. On a weekend no less. 😁 |
Sounds good, am happy I could help. Also always interested if you find odd perf tidbits that could be optimization opportunities. |
OK I really wish I could mark these posts as "answers" as I am so very happy to declare that we have one now. To start, @roji your benchmarks above are very much like how I started with the very first benchmark results posted in my original/first post here. When no expressions are applied, Scoped/Pooled are very similar. However, when expressions are applied, there's a huge deviation which was my concern BUT NOW I have an answer! Check out scoped vs. compiled query!
Peep those metrics. 👀 By using a compiled query, I can store that as a singleton (replacing my I am so happy I wrote in now. But honestly, I hope it's the last time I have to do so. 😆 You all are so amazing and I hate to be a burden. However, with this major upgrade in my world it was worth risking the time and I landed on exactly what I was looking for. So, thank you once again, this is perfect! |
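For readers who land here later, a minimal sketch of the approach this thread settled on: compile the query once, hold the resulting delegate in a static (singleton) field, and invoke it with whichever context the factory hands out. Blog, BlogContext, BlogQueries, ByNamePrefix and Find are hypothetical names for illustration; the actual queries in the linked project may look quite different.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Infrastructure;

// Hypothetical entity and context, used for illustration only.
public class Blog
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public class BlogContext : DbContext
{
    public BlogContext(DbContextOptions<BlogContext> options) : base(options) { }
    public DbSet<Blog> Blogs { get; set; }
}

public static class BlogQueries
{
    // Compiled once for the lifetime of the process. The delegate is not tied
    // to any particular DbContext instance, so it can be shared freely.
    public static readonly Func<BlogContext, string, IEnumerable<Blog>> ByNamePrefix =
        EF.CompileQuery((BlogContext context, string prefix) =>
            context.Blogs.AsNoTracking().Where(b => b.Name.StartsWith(prefix)));

    // Each call gets its own context from the factory, runs the shared
    // compiled query against it, and materializes before the context is disposed.
    public static Blog[] Find(IDbContextFactory<BlogContext> factory, string prefix)
    {
        using var context = factory.CreateDbContext();
        return ByNamePrefix(context, prefix).ToArray();
    }
}
```

The design choice here is that the expensive part (composing and compiling the expression tree) is paid once, while the context itself stays short-lived, so the query can be shared even though the context cannot.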
Ask a question
I am currently upgrading my framework to EfCore 6.0. In doing so, I have been taking the time to examine best practices to ensure that I am doing everything properly for my Blazor server-side application.
Currently, all my components and DbContext instances are scoped to the user. I am concerned about the memory utilization this may incur as more and more users adopt my application (🤞), but I have not been able to definitively ascertain this is an actual concern yet. I mention this as part of me is wrestling with the notion of avoiding premature optimization in my codebase, and solving a problem that does not actually exist yet.
OK, so with that tidbit aside, I started to do some performance analysis around scoped vs. transactional operations, which I share with you below. When I say "transactional," I primarily mean the use of IDbContextFactory (and pooled ones at that, as we'll see), but in my tests I use that to mean direct activation of a DbContext. Essentially, "transactional" means something that has to be created/disposed during an operation rather than pulled from (scoped) memory.
What's beneficial and elegant about my current design is that all IQueryable<T> instances are defined once per DbContext and then scoped to the user, along with the DbContext that created them. In effect, this caches the query but also adds the memory overhead which is the concern I shared earlier.
Switching everything to be transactional/IDbContextFactory would be very time consuming in my application, particularly all the stored IQueryable<T> queries that I have defined that are subsequently scoped to the user.
However, upon further inspection, I could take a whack at everything that is non-queryable, that is, writable, or anything involving a DbContext.SaveChangesAsync.
So then the thought struck me, and that leads me to my question (which I will share in a bit, I promise!):
How about using a singleton DbContext for all queries (reads), and using IDbContextFactory for everything else (writes)?
To do this, there are two identified issues I would need to address:
DbSet<T> queries as AsNoTracking.
InvalidOperationException that occurs when same-thread access occurs.
There may be others, but I wanted to throw the thought out there to see if there is anything else to consider.
The Question
So the question is: is it considered OK to use a singleton-scoped DbContext in a Blazor server-side application to handle all the queries (reads) of the application, while IDbContextFactory handles all the modifications/operations (writes)?
Follow up: Is there anything really stopping this from happening from a design perspective? That is, are there limits to the number of queries that one DbContext can process, assuming the thread-checking is disabled?
Include your code
The other aspect here that is driving me towards this compromise in my application design is that I did some benchmarking, and what I found was surprising with a very basic in-memory DbContext. You can find the code here: https://github.com/Mike-E-angelo/Stash/tree/master/EfCore.ScopedVsTransaction
Simply returning an empty DbSet<T> as an array returned the following metrics:
Here, Scoped refers to caching the IQueryable<T> in memory (ala what happens when scoping to user), Pooled is using a PooledDbContextFactory, and Transactional is straight-up activating a new DbContext.
Looks like Scoped and Pooled are pretty much even here, and I probably would have continued to move toward a pooled IDbContextFactory model for all of my codebase until I appended a few expressions and saw the following:
It would seem that the more expressions added, the more and more Scoped wins. And my codebase contains a lot of very complex queries containing a lot of expressions. All of which work amazing, btw. 😁 All of which is due to your amazing work over there.
So, seeing this is what got me going down this path and considering/contemplating a singleton DbContext to handle the reads of my application, while IDbContextFactory handles the writes.
Also, while I am at this. Keep in mind that my DbContext above is completely empty, and the simplest of operations generate 2KB-23KB of allocations. To me, this seems a tad excessive and wanted to ensure this is a known issue and/or if I am doing something fundamentally wrong in my tests. I am using the In-Memory provider, which I would expect to be pretty lean in such a scenario, but pointing this out just in case.
To close, I would really like to express my sincere gratitude for all your efforts out there. EfCore is really great, and the team there has been really helpful in attending to my questions/issues. I am a huge fan of this project and all your efforts. What you have made here definitely deserves a unicorn as a mascot, indeed. 🦄
Thank you for any assistance/insight you can provide. 👍
Include stack traces
NA
Include verbose output
NA
Include provider and version information
EF Core version: 6.0.0-rc.1.21416.1
Database provider: Microsoft.EntityFrameworkCore.InMemory
Target framework: (e.g. .NET 5.0) net6.0 rc1
Operating system: Windows 10
IDE: Visual Studio 2022 Preview 3.1