Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed #33646 -- Added async-compatible interface to QuerySet. #14843

Merged
merged 1 commit into from
Apr 27, 2022

Conversation

andrewgodwin
Copy link
Member

@andrewgodwin andrewgodwin commented Sep 8, 2021

This adds an async-compatible set of methods and special methods to the QuerySet class (and, via the generic pass-through, Managers as well).

Included are:

  • Async versions of all methods that do not return a QuerySet themselves, with an a prefix: aiterator, aaggregate, acount, aget, acreate, abulk_create, abulk_update, aget_or_create, aupdate_or_create, aearliest, alatest, afirst, alast, ain_bulk, aupdate, adelete, aexists, acontains, aexplain. Most are just wrappers around the underlying sync version, though some have a few performance shortcuts added.
  • Async iterator ability on the BaseIterable that propagates through to all QuerySets as well as the results of values(), values_list() etc. It's not terribly efficient, but we can improve this progressively once we teach compiler.results_iter the wonders of async.
  • A new alist() utility function for turning async iterables into lists asynchronously, because we do use list() quite a bit.

As a nice example, this means you can now write this kind of view:

async def myview(request):
    results = []
    async for row in TestModel.objects.filter(good=True):
        results.append(row)

    user = await TestModel.objects.aget(name=Andrew)

    return render(request, "index.html", {"results": results, "user": user})

Remaining work:

  • Ensure transactions error properly (we'll get these in another pass)
  • Tests
  • Documentation

django/db/models/query.py Outdated Show resolved Hide resolved
@dvarrazzo
Copy link
Contributor

Does this feature use native async drivers?

Dropping these references, in case you are not aware: psycopg 3 async support and psycopg 3 django driver.

@andrewgodwin
Copy link
Member Author

Does this feature use native async drivers?

No - getting to that point will be a lot of work. This is the first phase, where we outline the asynchronous API and mostly make it work via threadpools. The plan is to progressively work down the stack towards native drivers, but that'll probably take quite a while.

django/db/models/query.py Outdated Show resolved Hide resolved
django/utils/asyncio.py Outdated Show resolved Hide resolved
Copy link

@pwtail pwtail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

async for?. Maybe just objects = await queryset? I don't think making fetching a row an async operation is a good idea (of course, they will actually be fetched in chunks). We rarely need iteration over objects really

docs/topics/db/queries.txt Outdated Show resolved Hide resolved
@andrewgodwin
Copy link
Member Author

async for?. Maybe just objects = await queryset? I don't think making fetching a row an async operation is a good idea (of course, they will actually be fetched in chunks). We rarely need iteration over objects really

If you want to fetch the whole queryset you can just use alist(qs) or an async list comprehension - but the reason to offer an alternative here is the same as in normal Django, which is that you might have a million rows returned and they won't all fit into memory.

@johnthagen
Copy link

For what is it's worth, I find the a prefix makes some of these methods hard to read, for example aaggregate. With editor auto complete so great these days, I personally think async_aggregate or at least a_aggregate would make Django code not readable, especially for those new to Django at little cost to devs writing the code.

@felixxm
Copy link
Member

felixxm commented Oct 20, 2021

For what is it's worth, I find the a prefix makes some of these methods hard to read, for example aaggregate. With editor auto complete so great these days, I personally think async_aggregate or at least a_aggregate would make Django code not readable, especially for those new to Django at little cost to devs writing the code.

This discussion is already over, see also Django 4.0 release notes.

@dralley
Copy link

dralley commented Nov 29, 2021

Apart from questions around naming, where is this PR blocked? Is the design sound and simply requires tests, or is it still needing design changes?

@carltongibson
Copy link
Member

@dralley It's just waiting for an opportunity to review. The PR came in too-hot for the 4.0 feature freeze so we've been dealing with 4.0 things, and will look to target 4.1 for these changes. (It's next on my list of bigger items as it happens.)

@carltongibson carltongibson force-pushed the async_queryset branch 2 times, most recently from e64b050 to 31c067d Compare January 27, 2022 11:51
@carltongibson
Copy link
Member

OK — big thanks to @andrewgodwin for laying this down, and @fcurella for
blocking out the tests yesterday — I'm going to mark this Ready for Review.

There's more to be done, but it would be good to get a rough set of TODOs
together, so that we can look to land it in Django 4.1. Particularly @charettes
and @felixxm, can I ask you to give it a pass with that medium to high-level
overview in mind? (What do you want to see? Are there any blockers?)

A few points:

At this stage it's essentially just the API, but are we happy with that?
(We're more or less committed once it's in, so niggles now… 🙂)

There's a couple of failures on the Windows/SQLite build. Non-matching PKs. I
assume I did something wrong in 31c067d, but I
haven't quite got the energy to dig into it this afternoon, and I (more than)
half-expect you'll spot the issue instantly.

Then, the aexplain tests needs integrating with the tests in
queries/test_explain. It's a bit more complex than just asserting against the
output.…

Given what's there, we'd want to do something like this:

# Doesn't work. 
with self.subTest(format=format, queryset=idx):
    with self.assertNumQueries(1), CaptureQueriesContext(connection) as captured_queries:
        result = queryset.aexplain(format=format)  # Note: aexplain here. 
        self.assertTrue(captured_queries[0]['sql'].startswith(connection.ops.explain_prefix))

But the context managers aren't async compatible (since they call
ensure_connection) currently, so it needs a little thought.

Q: Do we need an aexplain at all? (Discuss)

Overall, I like how minimal it is: it gives a nicer layer to work down to fully
async DB adapters behind — but I'd appreciate your consideration at this point.
Thanks 🙏

@bigfootjon
Copy link
Contributor

Q: Do we need an aexplain at all?

My Django site is fully async (there are no sync views due to a custom ACL framework I’ve written that is async), so I would be required to ‘async_to_sync(qs.explain)()’ which feels odd considering the rest of the API has async equivalents.

Copy link
Member

@felixxm felixxm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewgodwin Many thanks 👍 I really like it, IMO it's the right first step to move async-ORM forward 🎊 🥳 I can check Windows failures, and do a detailed review later.

Do we need an aexplain at all?

Yes we do 🛠️

django/db/models/query.py Outdated Show resolved Hide resolved
django/db/models/query.py Outdated Show resolved Hide resolved
Copy link
Member

@charettes charettes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments but nothing major beyond the BaseIterable.__aiter__ and alist work.

I'd add a mention that this is not true async database I/O per se in the announcement for now but mainly syntactic sugar to avoid having to do sync_to_async all over the place when interacting with the user-facing part of the ORM.

It's not to diminish the efforts made in this direction in the announcement but just to make it clear that until we land async compiler and backend support we'll still be heavily relying on threading which might not be what users expect when interacting with an async compatible interface to a database.

django/db/models/query.py Outdated Show resolved Hide resolved
django/db/models/query.py Outdated Show resolved Hide resolved
``get()``
~~~~~~~~~

.. method:: get(*args, **kwargs)

*Asynchronous version*: ``aget``
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be great to find a way to have referenceable versions of these definitions instead of raw text references.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so a quick play and this works:

 .. method:: get(*args, **kwargs)
+.. method:: aget(*args, **kwargs)

That generates this with the dev stylesheet:

Screenshot 2022-02-23 at 08 31 55

Both of those anchors are linkable with:

:meth:`get`
:meth:`aget`

… as usual.

I couldn't put the *Asynchronous version* on the same line; likely with a bit of thought we could do something more sophisticated.

django/utils/asyncio.py Outdated Show resolved Hide resolved
@carltongibson carltongibson force-pushed the async_queryset branch 3 times, most recently from 040d5a2 to 9559c9c Compare April 12, 2022 11:27
@felixxm felixxm force-pushed the async_queryset branch 2 times, most recently from e5924ed to a544d97 Compare April 14, 2022 09:27
Copy link
Member

@felixxm felixxm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewgodwin Thanks 👍 I simplified tests and left comments.

django/db/models/query.py Outdated Show resolved Hide resolved
django/db/models/query.py Show resolved Hide resolved
django/db/models/query.py Show resolved Hide resolved
docs/topics/async.txt Show resolved Hide resolved
docs/topics/db/queries.txt Show resolved Hide resolved
@felixxm felixxm force-pushed the async_queryset branch 3 times, most recently from 62bcfb9 to 13646fe Compare April 15, 2022 11:04
@felixxm
Copy link
Member

felixxm commented Apr 15, 2022

I pushed "small" 😉 edits ✏️ . I'll continue next week.

@felixxm felixxm changed the title Add async-compatible interface to QuerySet Fixed #33646 -- Added async-compatible interface to QuerySet. Apr 15, 2022
@felixxm felixxm force-pushed the async_queryset branch 3 times, most recently from 0618370 to 41c1101 Compare April 19, 2022 06:33
@felixxm
Copy link
Member

felixxm commented Apr 21, 2022

buildbot, test on oracle.

django/db/models/query.py Outdated Show resolved Hide resolved
django/db/models/query.py Outdated Show resolved Hide resolved
@felixxm
Copy link
Member

felixxm commented Apr 26, 2022

I rebased, squashed commits, and tried to implement fetching in chunks to reduce the number of sync_to_async() calls.

@charettes What do you think?

Copy link
Member

@charettes charettes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adjustments @felixxm, they look great. This should make a noticeable difference in performance.

django/db/models/query.py Show resolved Hide resolved
Thanks Simon Charette for reviews.

Co-authored-by: Carlton Gibson <carlton.gibson@noumenal.es>
Co-authored-by: Mariusz Felisiak <felisiak.mariusz@gmail.com>
@felixxm
Copy link
Member

felixxm commented Apr 26, 2022

@charettes Thanks for reviews ⭐

I'm going to prepare another PR to make RawQuerySet async-compatible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet