Skip to content

Conversation

felixxm
Copy link
Member

@felixxm felixxm commented Apr 20, 2022

This separates data fetching from data preparation, see #14843 (comment).

@carltongibson
Copy link
Member

Oh, nice @felixxm! 👍 I would ask @charettes to have a look. 👀

@felixxm
Copy link
Member Author

felixxm commented Apr 20, 2022

The following changes are required on the top of #14843 with this patch:

diff --git a/django/db/models/query.py b/django/db/models/query.py
index d6db88dab2..b6e8ef3729 100644
--- a/django/db/models/query.py
+++ b/django/db/models/query.py
@@ -534,9 +534,10 @@ class QuerySet:
         use_chunked_fetch = not connections[self.db].settings_dict.get(
             "DISABLE_SERVER_SIDE_CURSORS"
         )
-        async for item in self._iterable_class(
+        iterable = await sync_to_async(self._iterable_class)(
             self, chunked_fetch=use_chunked_fetch, chunk_size=chunk_size
-        ):
+        )
+        async for item in iterable:
             yield item
 
     def aggregate(self, *args, **kwargs):
@@ -1821,7 +1822,8 @@ class QuerySet:
 
     async def _async_fetch_all(self):
         if self._result_cache is None:
-            self._result_cache = [result async for result in self._iterable_class(self)]
+            iterable = await sync_to_async(self._iterable_class)(self)
+            self._result_cache = [result async for result in iterable]
         if self._prefetch_related_lookups and not self._prefetch_done:
             sync_to_async(self._prefetch_related_objects)()
 

Copy link
Member

@charettes charettes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch @felixxm, I'm not sure I understand the current approach though as the point of #14843 (comment) was to prevent having BaseIterable.__aiter__ cause the creation of tons of sync_to_async jump (one instance created for each retrieved row and a resulting colour jump). I don't see harm in merge these changes though as they seems like a valuable separation of concern.

I'd be curious to see what effect this has on performance as it seems like it will quickly add up when dealing with large data set. Have we considered buffering results in custom_next somehow to reduce jumps between sync and async contexts? Does asgiref offers any better way with dealing the conversion of sync I/O generator to async ones?

@felixxm
Copy link
Member Author

felixxm commented Apr 25, 2022

Thanks for the patch @felixxm, I'm not sure I understand the current approach though as the point of #14843 (comment) was to prevent having BaseIterable.__aiter__ cause the creation of tons of sync_to_async jump (one instance created for each retrieved row and a resulting colour jump). I don't see harm in merge these changes though as they seems like a valuable separation of concern.

Ahh, sorry 🤦‍♂️ . It seems that I completely misunderstood your concerns about the BaseIterable.__aiter__() implementation. We can then keep it as a separate cleanup or close it 🤷 .

@felixxm
Copy link
Member Author

felixxm commented Apr 26, 2022

Closing for now.

@felixxm felixxm closed this Apr 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants