Allow nesting sync_to_async via asyncio.wait_for #367
These kinds of changes are incredibly hard to fully review, but it seems reasonable and also passes all the current tests (which are pretty good at catching bad behaviour!).
Hi @jthorniley
I'm still looking at this. It looks like there are two issues: even with the patch here, @brownan's example from #348 still hits a deadlock.
The test-case doesn't quite match the behaviour of the example running under Daphne. In the test the executors selected in SyncToAsync.call are:
Under Daphne we get the same
1650d42 to a4a2216
Hi @carltongibson, thanks for your comments, and sorry I haven't had time to look at this for a while. I've tried an alternative approach which I think makes more sense. Instead of changing how we look up the executor, set the reference to the current thread executor using the contextvars lib. Since Python 3.7 is the minimum supported version, this seems like it's probably an option when it wasn't before. The asgiref local has to be maintained across sync_to_async/async_to_sync conversions using the I've tried this implementation with the example ASGI code in issue #348, and it seems to work (the GET request completes and the page renders "Hello world"). I've tried daphne and uvicorn and they both seem OK with this. One problem is that one of the deadlock unit tests now fails, I think because there isn't a deadlock anymore! (It just actually works.) I think it's probably safe to just remove the test?
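As a rough illustration of the idea in the comment above (not the PR's actual code), the sketch below stores a value in a `contextvars.ContextVar` and reads it back from a coroutine run through `asyncio.wait_for`. The names `current_executor` and `read_executor` are hypothetical, and a plain string stands in for the executor object.

```python
import asyncio
import contextvars

# Hypothetical stand-in for "the executor that belongs to the current thread".
current_executor = contextvars.ContextVar("current_executor", default=None)


async def read_executor():
    # Runs under asyncio.wait_for, which may wrap the coroutine in a new task.
    return current_executor.get()


async def main():
    current_executor.set("executor-for-this-request")
    # A task created here gets a copy of the current context, so the value
    # set above is still visible inside read_executor().
    return await asyncio.wait_for(read_executor(), timeout=1)


result = asyncio.run(main())
```

The value survives the `wait_for` boundary because asyncio tasks copy the context that is current when they are created.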
I'd also comment that, while this "fixes" the current thread executor issue, it also "breaks" asgiref.Local, which will not be maintained across a stack of sync_to_async/async_to_sync if there's an asyncio.wait_for etc. somewhere in the middle. I'm not sure if that's really fixable except by either reimplementing Local with contextvars, or just deprecating it in favour of contextvars (Django seems to use it in a few fairly disparate places).
The intention was always to implement Local via contextvars once we hit the minimum version to support it, so I wouldn't be against doing that.
That makes sense. contextvars arrived in Python 3.7, which now seems to be the minimum supported version. I did actually try to do that first, and I have an implementation that seems to work, but I wasn't too happy with the way it handles deleting keys; see here: https://github.com/django/asgiref/compare/main...jthorniley:asgiref:local-contextvar-draft?expand=1 (Comments are in the diff, so I won't repeat them, but tl;dr: I'm worried I've made a memory leak.)
Hi @jthorniley — thanks for swinging back to this. For life reasons, it's not likely I'll have capacity to look at this closely soon, so let me just comment quickly not to block:
Sounds right 😅
This would be great IMO! I briefly glanced at this back last year, but didn't get to progress it. If you could push it through, super! 🎁 There's even a comment about this: Lines 29 to 30 in 0357158
Thanks both for the comments. I've updated the PR to use contextvars as an implementation for Local, and reverted the logic that keeps track of the CurrentThreadExecutor to use that new Local implementation rather than a plain contextvar, as I found in testing that plain contextvars are not thread safe when you pass them between threads, as is the case here. Anyway, I think this implementation is fairly solid from what testing I've managed to do.
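The comment above does not say exactly what went wrong with plain contextvars across threads, so the following is only a sketch of one relevant property. A context has to be carried to a worker thread explicitly (here with `contextvars.copy_context()`), and writes made on the worker land in the copy, not in the caller's context. The names `request_id` and `read_and_modify` are hypothetical.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical variable used only for this illustration.
request_id = contextvars.ContextVar("request_id", default="unset")


def read_and_modify():
    # Executed on the worker thread, inside the copied context.
    seen = request_id.get()
    request_id.set("changed-in-worker")
    return seen


request_id.set("req-1")
ctx = contextvars.copy_context()  # snapshot of the caller's context

with ThreadPoolExecutor(max_workers=1) as pool:
    # ctx.run makes the snapshot the active context on the worker thread.
    seen_in_worker = pool.submit(ctx.run, read_and_modify).result()

value_in_caller = request_id.get()  # unchanged by the worker's set()
value_in_copy = ctx[request_id]     # the worker's write lives in the copy
```

Any state that must be shared in both directions therefore needs something beyond a bare `ContextVar`, such as a mutable object held in the variable.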
You have no idea how delighted I am that this is net-deletion of code! From what I can see it looks good, and honestly I mostly place my faith in the tests for this sort of thing anyway.
If you think this is ready to merge, I will happily land it and we can get around to a new minor release of asgiref.
Talking of tests... those and the lint failures will need fixing first!
The test failure was a good catch: the state was not cleaned up properly, so it only occurred on the full test run rather than when running that test individually. The lint issue is also fixed. As for whether this is ready to merge... as you say, it's hard to assess this sort of thing beyond the tests. I'm tempted to wait until I've rolled out this branch at work on Monday, just to give some extra confidence from a "real-world" test (I can't see what would go wrong, and it's passed all our CI). I will update then.
Alright, sounds like a good plan!
@jthorniley Did you manage to get any testing of this in this week?
@andrewgodwin yes and no, I'm afraid... We rolled it out but had to roll back due to infrastructure issues (unrelated to this change) and haven't managed to redo it. So this still looks fine as far as I can see, but it didn't really get a proper test.
Ah, no problem - let's wait a bit longer to see if you can give it some more burn-in time. Hopefully the other issues are easy to fix!
Ok, we've tried rolling out again and actually I think there are some issues with this branch so unfortunately it wouldn't be good to merge now. Using this change (with a Django/ASGI app) we end up with too many database connections getting created, I think because they are held in a local by Django and the behaviour has changed in some unfortunate way (hard to track down exactly). Additionally (trying to track down something more concrete) I found that running the current Django tests with this asgiref branch results in a failure:
I still think this approach (changing to
Hmm, well that's annoying, but not entirely unexpected. I can only presume that the behaviour around synchronous contexts is different enough to make it break - contextvars are inherited in a different way from how Local currently works.
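For reference, here is a small stdlib-only sketch of the copy-on-inherit behaviour of contextvars that the comment above alludes to. It says nothing about how asgiref's Local behaves; the names `setting` and `child` are hypothetical.

```python
import asyncio
import contextvars

# Hypothetical variable used only for this illustration.
setting = contextvars.ContextVar("setting", default="initial")


async def child():
    # The child task starts with a copy of the parent's context...
    setting.set("set-by-child")
    return setting.get()


async def main():
    setting.set("set-by-parent")
    in_child = await asyncio.create_task(child())
    # ...so the child's write is not visible back in the parent.
    in_parent = setting.get()
    return in_child, in_parent


in_child, in_parent = asyncio.run(main())
```

Values flow down into new tasks but never back up, which is one way code that previously relied on shared storage could start to see different state.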
ff83f49 to b71b833
I think this is now a viable solution. I'll just note that there's a separate issue, #399, which is also directly affected by this (i.e. this essentially does that). This went stale as I got a bit stuck, but I've now had time to work on it, so I've rebased and addressed a few issues:
Local(thread_critical=True)
The behaviour of Local(thread_critical=True) has a critical subtlety which I've added a test for in Essentially, when In a Django ASGIHandler server, there will be one sync thread per request, and it will get a DB connection initialised local to that thread (typically) somewhere inside the async view code, wrapped in a It's easy to confirm that
ApplicationCommunicator
The change in behaviour of When the Django test code runs an However that means when using The solution was to create a new
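To make the "one connection per sync thread" point above more concrete, here is a stdlib-only sketch that uses `threading.local` as a stand-in for thread-critical storage. It is not asgiref's Local and not Django's connection handling; `get_connection`, `handle_request`, and the string "connections" are all hypothetical.

```python
import threading

# Stand-in for thread-critical storage; NOT asgiref's Local.
storage = threading.local()
connections_opened = []
lock = threading.Lock()


def get_connection():
    # Reuse this thread's connection if it has one, otherwise "open" one.
    conn = getattr(storage, "connection", None)
    if conn is None:
        conn = f"conn-for-{threading.current_thread().name}"
        storage.connection = conn
        with lock:
            connections_opened.append(conn)
    return conn


def handle_request():
    get_connection()
    get_connection()  # second lookup on the same thread reuses the first


threads = [
    threading.Thread(target=handle_request, name=f"worker-{i}") for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each thread opens exactly one connection here. If lookups for the same request landed in different storage (a different thread or context), each miss would open another connection, which is consistent with the "too many database connections" symptom reported earlier in the thread.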
I really appreciate your work on this! Looks like the tests are failing in CI, at least. Do you think it's an environment difference from your local setup, or something else?
@andrewgodwin thanks for running it. Taking a quick look: I think CI is running mypy, which I didn't run locally, but I'm sure I can tidy that up. Additionally, I used Python 3.11 locally, and the tests do pass there in CI, so I'm guessing there are some API changes between 3.10 and 3.11 that I didn't account for. I'll investigate that today.
Thanks - looks great now. I'll merge it in, and hopefully when we make a release with this in we won't get any immediate screaming! (The test suite looks good and pretty decently covers this, but there's always the chance of uncaught side effects.)
@andrewgodwin thanks!
Change the order of fallbacks used by SyncToAsync to find the appropriate executor for sync code, so it prefers to use AsyncToSync.loop_thread_executors rather than thread_sensitive_context. Add test case to demonstrate problem.
Resolves #348