Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use contextvars when available, to support async frameworks #2072

Closed
wants to merge 4 commits into from
Closed

Use contextvars when available, to support async frameworks #2072

wants to merge 4 commits into from

Conversation

tiangolo
Copy link

@tiangolo tiangolo commented Dec 6, 2019

Summary

This is a backwards-compatible change to use contextvars instead of threading.local when available.

contextvars is only available in Python 3.7 and above, so the implementation falls back to threading.local when not available.

This fixes errors for async frameworks (like FastAPI) when used with a load that requires more than one single request concurrently.

Description

threading.local creates a value exclusive to the current thread, but an async framework would run all the "tasks" in the same thread, and possibly not in order. On top of that, an async framework could run some sync code in a threadpool (run_in_executor), but belonging to the same "task".

This means that, with the current implementation, multiple tasks could be using the same threading.local variable, and at the same time, if they execute sync IO-blocking code in a threadpool (run_in_executor), that code won't have access to the variable, even while it's part of the same "task" and it should be able to get access to that.

Reproducing the error

To demonstrate the error, take this example FastAPI application:

from datetime import date
import time

from fastapi import FastAPI, Depends
import peewee

db = peewee.SqliteDatabase("my_app.db", check_same_thread=False)


class Person(peewee.Model):
    name = peewee.CharField()
    birthday = peewee.DateField()

    class Meta:
        database = db


db.connect()
db.create_tables([Person])


if not Person.select().first():
    uncle_bob = Person(name="Bob", birthday=date(1960, 1, 15))
    uncle_bob.save()


db.close()

app = FastAPI()


def get_db():
    db.connect()
    try:
        yield db
    finally:
        if not db.is_closed():
            db.close()


@app.get("/", dependencies=[Depends(get_db)])
def read_root():
    # Simulate a potentially long task to make it easy to
    # demonstrate the error
    time.sleep(15)
    return Person.select().get()

Run it with:

uvicorn main:app

Open the browser at http://127.0.0.1:8000/docs in 5 tabs at the same time.

Use the "Try it out" button and the "Execute" button for the single / endpoint, in each one of the tabs. With a small time difference between each one (1 sec).

After 15 seconds, one (or more) of the tabs will show the response, while the rest will show a 500 error.

And the console will show an error like:

peewee.OperationalError: Connection already opened

What happens is that:

  • The first tab starts a request
  • The application opens the database connection to handle the first request
  • Meanwhile, another tab starts another request
  • The application tries to open the database connection to handle the second request
  • It currently expects to handle that second request in a different thread, but as it is on the same thread, it's accessing the same object, with the connection already opened by the previous request. And more importantly, the same connection object.

Note

An equivalent error would occur when using a middleware, with an example comparable to the one in the docs.

Although the current example in the docs starts a connection at application startup and closes it right before terminating the application. It doesn't really create a connection per-request, that would be achieved with a middleware (or better, with a dependency, as in the example above).

Proposed fix

The proposed fix uses tries to import contextvars (Python 3.7+) and updates the _ConnectionLocal class to keep behaving as normal, but use contextvars underneath when available. When not, fall back to threading.local. It keeps the interface of _ConnectionLocal the same to simplify any external code using it.

contextvars is backward-compatible. In Python 3.7+, with synchronous code, it will provide the same behavior that threading.local would have.

https://www.python.org/dev/peps/pep-0567/#backwards-compatibility

This proposal preserves 100% backwards compatibility.

Libraries that use threading.local() to store context-related values, currently work correctly only for synchronous code. Switching them to use the proposed API will keep their behavior for synchronous code unmodified, but will automatically enable support for asynchronous code.

Running the same example from above, using this branch, should return the responses without errors.

Updated docs

I also updated the docs for FastAPI, to use normal def functions and a dependency, as that would probably be the best way to structure it in FastAPI.

@coleifer
Copy link
Owner

coleifer commented Dec 8, 2019

I'm going to pass on this implementation. I think a more correct approach would be to create a standalone connection-state manager for asyncio usage. This could be provided by a mixin that overrides the Database.__init__() method and assigns compatible values to _state and _lock (also note that this patch does not handle _lock).

@coleifer coleifer closed this Dec 8, 2019
@tiangolo
Copy link
Author

tiangolo commented Dec 9, 2019

I see. Would you be willing to accept a mixin like that as part of Peewee?

About _lock, its job is to bock all the other threads that want to use the same lock. In an async framework, Peewee interactions would/should run in a threadpool with run_in_executor, and threading.lock would still be blocking any other thread that tries to use the same lock. So it would behave the same and achieve the same.

In contrast, contextvars are needed because the same connection data would be used in more than one thread, so it needs to be shareable. And the same thread could end up handling different connections, so they need to be dependent on their async context, not the thread.

@coleifer
Copy link
Owner

coleifer commented Dec 9, 2019

Would you be willing to accept a mixin like that as part of Peewee?

I'm not sure yet. Peewee isn't async internally; doesn't use async idioms, doesn't use async drivers, etc. At the same this is a legit issue for people trying to use Peewee with asyncio. That said, the solution is 3.7+ only, which I guess is a very small number of users.

However I think most people who want to use Peewee with asyncio just use peewee-async. So I'm not sure if there's an actual need for this change.

At the end of the day I'm always in favor of playing to pythons dynamism and using gevent, which just works.

@tiangolo
Copy link
Author

Get it. Thanks for the feedback. I'll document it accordingly in FastAPI.

@kkinder
Copy link

kkinder commented Dec 18, 2019

Just chiming in: I have a FastAPI+Peewee project going, mostly because I enjoy each project's simplicity. I didn't choose FastAPI because it's async, I choose it because it was a quick and easy way to build a project.

I think some better support between the two projects would be worthwhile, even if Peewee will never be async-ready. FastAPI is asyncio-based, but is so much more than just for async, so I'd urge you to consider the mixin.

@tiangolo tiangolo deleted the optional-contextvars branch April 10, 2020 10:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants