Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#9719] Support contextvars in coroutines #1192

Merged
merged 27 commits into from
May 1, 2020
Merged

[#9719] Support contextvars in coroutines #1192

merged 27 commits into from
May 1, 2020

Conversation

hawkowl
Copy link
Member

@hawkowl hawkowl commented Oct 20, 2019

@glyph
Copy link
Member

glyph commented Feb 25, 2020

This looks substantially complete to me, modulo maybe some minor test issues. Should it be out of draft / in review?

@jonnyyu
Copy link

jonnyyu commented Apr 21, 2020

Hi, I'm looking for such capability! Is there any plan for merging this PR?

@hawkowl hawkowl marked this pull request as ready for review April 26, 2020 17:13
glyph
glyph previously requested changes Apr 29, 2020
Copy link
Member

@glyph glyph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: I'm an idiot, I read the test backwards, this was all wrong

I think these semantics will be extremely surprising to most users.

In asyncio (and abstractly, in PEP 567) the context follows the Task, not the Future.

This is what I think most people want when they want context. For example, consider the following

async def do_work(request_id):
    with some_context_for_logging(request_id): # manipulate a context var
        do_something_that_logs_stuff()
        api_value_1 = await some_remote_api()
        api_value_2 = await other_remote_api()
        do_other_thing_that_logs_stuff()

# elsewhere, in framework land
def other_remote_api():
    with some_context_for_logging(not_my_request_id):
        get_set_up()
        deferred = make_actual_api_request()
    return deferred

I think we can agree that if do_other_thing_that_logs_stuff would really want to get request_id and not not_my_request_id associated with its logs, even though the coroutine has unsuspended due to other_remote_api waking it up.

More importantly, if we were to api_value_1, api_value_2 = await gatherResults([some_remote_api(), other_remote_api()]) instead, small differences in both the implementation of gatherResults and the order of results being returned might affect the context values.

I hate to send this back to the drawing board entirely, but reading test_contextvarsWithAsyncAwait really confirmed my suspicions that this is, unfortunately, not a useful way to propagate async context between coroutines.

FWIW: this is the reason that this feature never really made it in stock Twisted: you really want context to follow the abstract notion of the "task" that you're performing, but without a coroutine, it's hard to tell where that really is, and there are lots of these edge cases which very clearly make it not the callback chain of an individual Deferred object.

My inclination would be to do a more limited version of coroutine support which works exclusively with ensureDeferred — then, maybe, accidentally have it work with @inlineCallbacks too, if that's easy.

Copy link
Member

@glyph glyph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay way smaller changes this time.

Top line feedback:

I think we should use an active-nothing pattern and "emulate" the Context object when it's not present (mostly by doing nothing), especially given that it's so easy to do, and that it can't possibly be much of a perf win to avoid it since we'd only be avoiding it on older versions anyway. This would also simplify the implementation a little, which is nice, given that the code here is already pretty gnarly and subtle.

If you feel strongly for the current implementation strategy, then we should be carefully is None-ing everywhere so we don't pay for extra method-dispatch overhead on every single call (__bool__ / __nonzero__) when we absolutely never want to be doing that anyway.

src/twisted/internet/defer.py Outdated Show resolved Hide resolved
src/twisted/internet/defer.py Outdated Show resolved Hide resolved
src/twisted/internet/defer.py Outdated Show resolved Hide resolved
src/twisted/internet/defer.py Outdated Show resolved Hide resolved
src/twisted/internet/defer.py Outdated Show resolved Hide resolved
src/twisted/internet/defer.py Outdated Show resolved Hide resolved
@glyph
Copy link
Member

glyph commented Apr 29, 2020

Some IRC discussion that might be interesting for posterity:

[23:01:05]  <glyph> hawkowl: So, here's a question - is the Deferred-callbacks support for contexts necessary / related to the `inlineCallbacks` support for contexts?
[23:02:21]  <hawkowl> glyph: yes but because inlinecallbacks has multiple trampolines.
[23:02:35]  <hawkowl> so you can have multiple without actually hitting a deferred
[23:02:40]  <hawkowl> i think
[23:04:14]  <hawkowl> oh yeah that's why
[23:05:10]  <hawkowl> okay so when you first call inlineCallbacks, it doesn't make a Deferred yet
[23:05:33]  <hawkowl> so you need to run it in the context? I think?
[23:05:15]  <glyph> isn't there a base context?
[23:05:34]  <glyph> >>> contextvars.copy_context()
[23:05:34]  <glyph> <Context object at 0x7fc0d826a0d8>
[23:05:39]  <glyph> seems like
[23:07:51]  <hawkowl> right so
[23:08:11]  <hawkowl> inlinecallbacks doesn't run callbacks
[23:08:16]  <hawkowl> it pulls the result out of the deferred
[23:08:41]  <hawkowl> so we need to also keep a context per inlinecallbacks object
[23:09:10]  <hawkowl> otherwise when we resume the coroutine, it will have not-the-expected-context
[23:09:45]  <glyph> right
[23:09:52]  <glyph> so you could do these as 2 separate features
[23:10:00]  <glyph> one for inlineCallbacks which is pretty useful
[23:10:04]  <glyph> (i.e. one for coroutines)
[23:10:25]  <glyph> and one for Deferred which… I'm not convinced is that useful, but is at least separable and could be landed / discussed separately since the coroutine one is indisputably valuable
[23:11:21]  <hawkowl> you need them both, don't you?
[23:10:54]  <glyph> I don't think you do
[23:11:45]  <hawkowl> I mean it means that you have to use coroutines to use contextvars
[23:11:13]  <glyph> it does mean that
[23:11:57]  <hawkowl> which otherwise works in a deferred-style situation
[23:12:10]  <hawkowl> it also means that deferreds aren't isolated
[23:12:26]  <hawkowl> they'll modify whatever context was last, I think?
[23:12:08]  <glyph> yeah
[23:12:13]  <glyph> well, here's my concern
[23:12:59]  <hawkowl> also: lmao inlineCallbacks DOES have different behaviour
[23:13:02]  <hawkowl> :D because lol
[23:12:47]  <glyph> I think that *maybe* there is a stronger case for "callbacks should capture the context *where they are added*"
[23:12:57]  <glyph> as opposed to "where the Deferred is defined"
[23:13:07]  <glyph> I am not sure I could cohesively argue for that very convincingly right now
[23:13:41]  <glyph> but I'm concerned that if we nail down the behavior of Deferred callbacks with respect to contextvars right now, we might be in a situation we can't easily back out
[23:14:53]  <hawkowl> I'm not sure what "callbacks should capture the context where they are added" means
[23:14:14]  <glyph> whereas you already have the "not isolated" problem, and the changing semantics; just use a Deferred anywhere in an asyncio program and you're leaking context from the task where the callback is running
[23:15:27]  <hawkowl> because if you call some code, it should... be that?
[23:15:06]  <glyph> hawkowl: okay give me a sec for an example.
[23:16:10]  <hawkowl> and in that case, then you can make a new deferred, and chain it from the old one or something
[23:17:36]  <glyph> hawkowl: so, somebody types this in: https://gist.github.com/glyph/51115322ae7acd15ceee0e168e84e967
[23:17:39]  <glyph> what do they expect to happen
[23:18:15]  <glyph> lexically, it appears as though setup_log_context would apply to those callbacks
[23:18:24]  <glyph> like I said - not sure if this is right
[23:18:28]  <glyph> there may be good reasons it's not
[23:18:33]  <glyph> but I am really not sure it's wrong
[23:19:17]  <hawkowl> however, logically, you would likely never want to _not_ wrap make_api_request in that
[23:19:23]  <hawkowl> as well
[23:19:30]  <hawkowl> at least the log context
[23:20:04]  <hawkowl> also, in the deferred method, you would chain `setup_error_context` _in_ the callbacks
[23:19:30]  <glyph> would you?
[23:20:10]  <hawkowl> in the callback chain, rather
[23:19:32]  <glyph> what about this version: https://gist.github.com/glyph/821f13fea74156706b752adec603312d
[23:20:45]  <hawkowl> with makes no sense for other things for deferreds
[23:20:57]  <hawkowl> so it would be like everything else in a deferred: it needs to go in the chain
[23:21:35]  <hawkowl> that code makes no sense for every other use of with: one could do for deferreds, it's just the nature of them
[23:21:52]  <hawkowl> for example, with lock_file: deferred.addCallback(use_the_file) makes no sense
[23:22:23]  <glyph> context is inherently something you want to propagate across time on logical groups of async operations
[23:22:37]  <glyph> i.e. it's the way that you *fix* the fact that `deferred.addCallback(use_the_file)` doesn't work
[23:22:48]  <glyph> you put the file to use in the context, rather than having to pass it as a parameter
[23:23:18]  <glyph> I understand your description of the way existing things work, I'm just saying, if someone expected this to work and it didn't, how would you explain it to *them*'
[23:25:16]  <glyph> to put it a different way - closed-over variables typically propagate lexically. context typically propagates lexically. so these seem like similar things that would propagate in similar ways
[23:26:00]  <glyph> this really only makes sense if you're *defining* the callbacks inline, which is also weird: someone might expect that the callback definitions would capture them, and much like expecting `for` loop variables to stay constant you just have to explain that one as "that's not how python works"
[23:26:05]  <glyph> anyway
[23:26:12]  <glyph> this is why I want to merge the coroutine version first
[23:26:33]  <glyph> I don't have the energy for this full conversation and I don't want to stall out the obviously-useful obviously-correct-semantics part (the coroutines part) trying to figure it out
[23:28:06]  <hawkowl> glyph: i'm not sure how that fixes the use_the_file
[23:28:13]  <hawkowl> because adding the callback is... instant?
[23:28:28]  <hawkowl> like if you do with file.lock() then it'll go away before the callback fires
[23:29:10]  <hawkowl> so I think that the context not applying is not _more_ surprising than Deferred semantics
[23:28:34]  <glyph> hawkowl: okay, sure, 'with' is a weird construct here, you'd need like `with context_on_callbacks(d):`
[23:28:53]  <glyph> anyway I don't have any more energy for this
[23:29:38]  <glyph> if you want to discuss it, we'll discuss it when I have the energy for the re-review or if somebody else gets to it first :).  if you want to split out the coroutine bits I can approve in like 30 seconds
[23:30:11]  <glyph> or rather, you might dodge the discussion if somebody else gets to it first and doesn't agree with me
[23:31:36]  <glyph> (if you think the semantics are interrelated and you don't want to merge one without the other I also get that)
[23:32:30]  <hawkowl> glyph: sure, so we just say "this is for async only code" right now?
[23:32:09]  <glyph> probably like "coroutines only" or "async def" just to be clear
[23:32:49]  <hawkowl> and that it won't do entirely the right thing unless you're async/inline-callbacks all the way down
[23:32:13]  <glyph> right
[23:33:24]  <hawkowl> this hampers its usefulness for library code but that's probably not a big deal anyway
[23:33:03]  <glyph> I mean, we might get them in the same release anyway, depending on timing
[23:33:16]  <glyph> since the code is already written ;)
[23:33:51]  <glyph> I'm going to splat this whole conversation right into the ticket for posterity in case somebody else wants to think about it?
[23:34:34]  <hawkowl> glyph: so your issue with the deferred impl is that it is presently surprising?
[23:34:25]  <glyph> hawkowl: more like it *might* be surprising and I want to play with the semantics to make sure
[23:35:13]  <hawkowl> in the sense that it might not be immediately clear where the context is from
[23:34:40]  <glyph> yeah
[23:34:48]  <glyph> I think either strategy is going to be surprising to _some_ audience
[23:34:51]  <glyph> in some edge cases
[23:35:01]  <glyph> the question is, how deep does the surprise go, how useful are the alternate ways it might go down
[23:35:47]  <hawkowl> just so I can think on it: is a more explicit "setContext()" option possibly viable?
[23:35:20]  <glyph> on the deferred object itself?
[23:36:02]  <hawkowl> yes
[23:36:06]  <hawkowl> maybe
[23:35:29]  <glyph> hmm
[23:35:45]  <glyph> so like… callbacks get added with the deferred's original context… then you call setContext, and later Deferreds get the different context?
[23:35:54]  <glyph> erm
[23:35:56]  <glyph> later callbacks
[23:36:02]  <glyph> (exactly what I mean by not having the energy rn)
[23:36:52]  <hawkowl> yeah I mean I have been awake for a Long Time
[23:36:52]  <glyph> yeah that might be a good idea too. maybe?
[23:37:04]  <glyph> maybe it's too fiddly though
[23:37:49]  <hawkowl> alternative, recaptureContext as an arg to addCallback, but then we have to keep track of/possibly merge contexts

@hawkowl hawkowl changed the title [#9719] Support contextvars [#9719] Support contextvars in coroutines Apr 30, 2020
@@ -0,0 +1 @@
twisted.internet.defer.inlineCallbacks and ensureDeferred will now capture the current contextvars context when ran, and restore the context when resuming the coroutine between yields. This functionality requires Python 3.7+, or the contextvars PyPI backport to be installed for Python 3.5-3.6.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grammar / agreement nitpick: "when ran" ➡️ "when running"

(also: pretty sure contextvars only supports 3.6)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the backport I mean

@glyph glyph self-requested a review May 1, 2020 07:54
@glyph
Copy link
Member

glyph commented May 1, 2020

L G T M. Will merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants