Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch locals to be based on ContextVars #1778

Merged
merged 1 commit into from Feb 2, 2021
Merged

Conversation

pgjones
Copy link
Member

@pgjones pgjones commented Mar 28, 2020

ContextVar was introduced in Python 3.7 and is effectively the same as Local in Werkzeug (although a different API). They should work in Greenlet and threading contexts as before but also now in asyncio or other async/await contexts.

The __storage__ attribute has been kept for backwards compatibility (even though it is a dunder it seems it is used, e.g. in tests).

The ident_func though must be deprecated as this is now handled by the ContextVar. This may cause some backwards incompatibility.

@pgjones pgjones force-pushed the local branch 2 times, most recently from 02284ef to 9254935 Compare March 28, 2020 19:05
@pgjones pgjones changed the title WIP: Switch locals to be based on ContextVars Switch locals to be based on ContextVars May 10, 2020
@pgjones
Copy link
Member Author

pgjones commented May 10, 2020

For context there is a discussion on discord about keeping __ident_func__ such that it warns. My view is that it shouldn't warn, but rather error (as it is missing) to prevent any surprises from code that uses it. I don't think any other libraries use it though...

@davidism
Copy link
Member

Need to figure out what to pass to Flask-SQLAlchemy's scoped session, since it's using _app_ctx_stack.__ident_func__ right now. Shouldn't be a backwards compatibility issue, as I plan to bump Flask-SQLAlchemy 3.0's dependency to Flask 2.0, which will depend on Werkzeug 2.0.

@davidism
Copy link
Member

davidism commented Jun 5, 2020

Looks like Gevent now patches contextvars: gevent/gevent#1512 Can't find exactly what release this was, it's not in the changelog, but there is a mention of contextvars in 20.04, so at least since April.

@davidism
Copy link
Member

davidism commented Jun 5, 2020

@untitaker I know you've been working on contextvars in Sentry recently, any chance you could review this?

@untitaker
Copy link
Contributor

I would definetly try to detect gevent and eventlet monkeypatching because if you run Flask under Python 3.7 with gunicorn you're gonna get a broken out of the box experience with this PR.

Here's the impl in the Sentry SDK: https://github.com/getsentry/sentry-python/blob/36ed64eb0f65a0abae83fd5eacf1a524e2d17a37/sentry_sdk/utils.py#L754

Here's the documentation: https://docs.sentry.io/platforms/python/contextvars/

Django also has its own implementation under asgiref.local which does not use contextvars at all but still behaves in a way that seems to work out fine for the most part. It seems like a more useful approach to context locality for codebases that need to support Python 3.6 codebases than the contextvars themselves, but there are some issues such as django/asgiref#170 with it (nothing that can't be fixed though, I think. Still investigating).

@untitaker
Copy link
Contributor

I'm still figuring out if asgiref.local or aiocontextvars from PyPI is a better approach for having context locals in Python 3.6 that work with asyncio. I think it's nice that aiocontextvars can never go out of sync with contextvars (as seems to be the case with asgiref.local), at the same time aiocontextvars monkeypatches contextvars and asgiref.local doesn't.

@davidism
Copy link
Member

davidism commented Jun 5, 2020

Given that 3.6 is on its way out, and async support is very optional right now, maybe it would be ok to have an optional dependency on aiocontextvars, or just keep the other local implementation for 3.6 and say that async support is 3.7+ only.

@davidism
Copy link
Member

davidism commented Jun 5, 2020

Actually, it looks like Sentry is doing something similar to what this PR is doing, falling back to threading.local, so maybe we just need to bring this PR in line with that? Could you elaborate on what would be broken in this implementation?

@untitaker
Copy link
Contributor

untitaker commented Jun 5, 2020

Sentry has two things that differentiate it from this PR:

  • It crashes hard (taking down the server) under circumstances where it determines that the contextvars implementation is unsafe for use (such as when gevent/eventlet have monkeypatched threading but not the contextvars) but an integration for an asyncio/ASGI web framework has been enabled. See HAS_REAL_CONTEXTVARS and how it is being read in the rest of the SDK.

    This is maybe controversial for the Sentry SDK but I think absolutely required for Flask, considering you're using context locals for session data and misbehaving context locals can leak user sessions across request boundaries. While it's funny when it happens to Steam it may be less funny in other circumstances.

  • It detects gevent/eventlet monkeypatching and, based, on that, avoids contextvars if it has not also been patched (_is_contextvars_broken). This is IMO important to get right as gunicorn+gevent is a very popular WSGI server choice even on Python 3.7. gevent has fixed this in a recent version but you'd have to check for that version, and eventlet is not seeing much maintenance so don't expect anything there.

@davidism
Copy link
Member

davidism commented Jun 5, 2020

Thanks! @pgjones looks like there's some work to do before merging this.

@davidism
Copy link
Member

@pgjones can you take another look at the feedback here?

@untitaker
Copy link
Contributor

untitaker commented Oct 29, 2020

We have an official summary of this now in Sentry docs: https://docs.sentry.io/platforms/python/troubleshooting/

@pgjones if you are no longer planning on pursuing this lmk I can probably continue your work, if necessary

@untitaker
Copy link
Contributor

Also note that asgiref works fine now, so perhaps it makes sense to just hard-require asgiref in Flask (and ahve it be an extra in werkzeug)

@pgjones
Copy link
Member Author

pgjones commented Oct 29, 2020

How about this version? It follows the sentry checks to see if ContextVars are usable, if not it will fall back to the current implementation.

@untitaker
Copy link
Contributor

untitaker commented Oct 29, 2020 via email

@pgjones
Copy link
Member Author

pgjones commented Oct 29, 2020

Note: I've asked on the eventlet issue tracker about ContextVars.

Some extra thoughts:

  • I'm not convinced about using aiocontextvars as, as I understand it, they are not shared with child tasks like ContextVars are - better I think to document that Werkzeug only supports asyncio locals with Python 3.7+.
  • I think we should avoid asgiref if possible, as it isn't even clear it is the correct choice for Flask (where the async motivation is stronger).

src/werkzeug/local.py Show resolved Hide resolved
src/werkzeug/local.py Outdated Show resolved Hide resolved
@davidism
Copy link
Member

Is it worth supporting gevent<20.5? If we leave out the check and just use a ContextVar when gevent/eventlet doesn't patch it, what happens? If we're going to keep this fake class that emulates thread local storage, it should probably show a warning when it's used because it could cause unexpected interactions.

There should probably be a test that use async to demonstrate that these work in coroutines, although the current test that uses threads is prone to failure in CI.

@pgjones
Copy link
Member Author

pgjones commented Oct 30, 2020

If we leave out the check and just use a ContextVar when gevent/eventlet doesn't patch it, what happens?

If you mean a stdlib ContextVar it would result in locals not being local to green threads, and hence things working very weirdly (leaking between requests).

If we're going to keep this fake class that emulates thread local storage, it should probably show a warning when it's used because it could cause unexpected interactions.

I don't think we can, as at the moment it would be used for all eventlet implementations, and hence warn the user when they have nothing to worry about. The unexpected interactions would only be if the user is using asyncio/trio/curio and eventlet/gevent which I think is unlikely and in addition detecting to warn is too difficult and not worth the effort.

pgjones added a commit to pallets/quart that referenced this pull request Oct 30, 2020
Whilst not used this way with Quart, if a Local is created and
assigned a value before being accessed across tasks (requests) it
would erroneously share the values across the tasks. This is because
the dictionary would be mutated, rather than replaced - hence the copy
fix.

See also pallets/werkzeug#1778 for the future
replacement of Quart's locals.
@pgjones pgjones force-pushed the local branch 2 times, most recently from 10b8d27 to 6bd6618 Compare October 30, 2020 10:55
@davidism
Copy link
Member

davidism commented Oct 30, 2020

I always mix up that the import is from Greenlet, not Gevent, it's shared by both Gevent and Eventlet. So if either library doesn't patch contextvars, the fallback keys off of greenlet.getcurrent().

We should plan on removing the fallback once we drop Python 3.6 and Eventlet supports context vars.

@pgjones
Copy link
Member Author

pgjones commented Oct 30, 2020

We should plan on removing the fallback once we drop Python 3.6 and Eventlet supports context vars.

Yep, fully agree.

Copy link
Contributor

@untitaker untitaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. For future work, I would forbid use of ASGI if the real contextvar is not used. Due to security reasons.

@davidism
Copy link
Member

davidism commented Oct 30, 2020

I'm reading the contextvars docs and looking at how Werkzeug's Local is implemented and used. One interesting quote linked from PEP 567 (https://www.python.org/dev/peps/pep-0550/#replication-of-threading-local-interface) is:

Choosing the threading.local()-like interface for context variables was considered and rejected for the following reasons:

  • A survery of the standard library and Django has shown that the vast majority of threading.local() uses involve a single attribute, which indicates that the namespace approach is not as helpful in the field.
  • ...
  • Single-value ContextVar is easier to reason about in terms of visibility. ...

I'm not sure of the timeline, but Werkzeug probably originally had to support Python 2.3, which didn't have threading.local. If it was started a few years later, it probably wouldn't use its own implementation like this. Maybe we should consider moving away completely from the Local interface and have LocalStack and LocalProxy use ContextVar directly when we drop Python 3.6.

@pgjones
Copy link
Member Author

pgjones commented Nov 29, 2020

Maybe we should consider moving away completely from the Local interface

I'm not sure that things would be easier to reason about in the Werkzeug context if this was done.

@davidism
Copy link
Member

The way locals are used in Werkzeug and Flask is to hold one value (or one list). Werkzeug would still have LocalStack and LocalProxy, but Local itself doesn't seem necessary for Werkzeug's own operation, it's just an extra layer. Other code that does need a namespace can be adapted to use ContextVar("v", default=Namespace()); v.get().x = 1.

ContextVars have been introduced into Python 3.7 and are effectively
the same as Locals in Werkzeug (although a different API). They should
work in a Greenlet, and threading, context as before but also now in
an asyncio, or other async/await context.

The __storage__ attribute has been kept for backwards compatibility
(even though it is a dunder it seems it is used, e.g. in tests).

The ident_function though must be deprecated as this is now handled by
the ContextVar. This may cause some backwards incompatibility.

This is based on the Quart local implementation.

If ContextVars aren't available (Python 3.6) or there is an issue
using them i.e. Gevent/Eventlet have no monkey patched them then the
local will fall back to the previous (green) thread identity based
version.
@untitaker
Copy link
Contributor

This PR is IMO missing a facility to check if the real contextvars is being used for locals. I am not sure what the plans are, but I just want to reiterate my suggestion to forbid any usage of ASGI without real contextvars being available. Even Flask supports only 3.7, older gevent/eventlet can still mess locals up.

@pgjones pgjones deleted the local branch February 2, 2021 09:17
@pgjones
Copy link
Member Author

pgjones commented Feb 2, 2021

So Quart already forbids this (3.7+), but for Flask maybe we should add a Py version check in the ensure wrapper?

@untitaker
Copy link
Contributor

@pgjones right but since the werkzeug impl checks for broken gevent/eventlet it can happen that werkzeug ends up chosing not to use contextvars even if on Python 3.7. So a version check would be insufficient. The check would have to be closer to ContextVars is FakeContextVars

@pgjones
Copy link
Member Author

pgjones commented Feb 10, 2021

@untitaker do you think it is now sufficient to check if from greenlet import GREENLET_USE_CONTEXT_VARS; GREENLET_USE_CONTEXT_VARS is True? Rather than the monkeypatching checks? (See contextvar support in Greenlet).

@untitaker
Copy link
Contributor

untitaker commented Feb 10, 2021 via email

@davidism
Copy link
Member

I think I'm ok with "if greenlet isn't new enough, fall back to fake implementation" now, and making it "raise RuntimeError" once we drop 3.6. Since gevent still patches on older greenlets, we could still have that as an extra check too.

@untitaker
Copy link
Contributor

untitaker commented Feb 10, 2021 via email

@untitaker
Copy link
Contributor

untitaker commented Feb 10, 2021 via email

@pgjones
Copy link
Member Author

pgjones commented Feb 10, 2021

I don't think there are any plans for Flask to work as an ASGI application. I will add a hard check to the async (ensure_sync) proposal though.

See #2039 for a proposed greenlet update.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 25, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants