Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add prefetch_related support to ParentalManyToManyField #122

Closed
wants to merge 2 commits into from

Conversation

chosak
Copy link
Member

@chosak chosak commented May 5, 2020

This commit adds support for prefetch_related on ParentalManyToManyFields. Consider for example these models:

class Author(models.Model):
    ...

class Article(ClusterableModel):
    authors = ParentalManyToManyField(Author)

With this change, you can create a query like this:

articles = Article.objects.prefetch_related('authors')

and then subsequent references to article authors won't invoke additional database queries.

This implementation is modeled after the current implementation of Django's ManyRelatedManager.get_prefetch_queryset.

Closes #101.

I've done limited testing on this with Wagtail 2.3 (see wagtail/wagtail#4784), but it'd be great to get others testing against more recent versions.

Comment on lines +361 to +363
def _apply_rel_filters(self, queryset):
# Required for get_prefetch_queryset.
return queryset._next_is_sticky()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chosak This method doesn't appear to be needed any longer

Copy link
Member Author

@chosak chosak Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ababic thanks for taking a look at this PR.

The Django 1.10 release notes say that

If you defined a custom manager class available through prefetch_related() you must make sure it defines a _apply_rel_filters() method.

Prior to Django 2.0 this wasn't technically required yet, but 2.0 removed a shim that allowed that:

The shim for supporting custom related manager classes without a _apply_rel_filters() method is removed.

_apply_rel_filters gets called by Django code here. Note that the DeferringRelatedManager used for ParentalKey also defines this method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, okay. That makes sense! In that case, ignore me! This is super impressive BTW... I've had a crack at this before and it's not the nicest part of Django to delve into.

@ababic
Copy link
Contributor

ababic commented Jun 24, 2020

I'll try to give this a whirl today in a 2.8 project. It'll make a huge difference to performance on at least one of my active projects!

@ababic
Copy link
Contributor

ababic commented Jun 24, 2020

@chosak Working nicely for me in a Wagtail 2.8.2 project with Django 3.0.4. @gasman, can you think of any edge-cases that might not be covered here? e.g. when used in the context of a FakeQuerySet?

@gasman
Copy link
Contributor

gasman commented Jun 24, 2020

@ababic Haven't dug into the logic, but assuming Django's internal mechanisms call get_prefetch_queryset with an instances list obtained from the queryset, it looks like it should be fine. That's definitely something worth testing, though. I guess there are two cases to consider, although the second may be out of scope for this fix:

  • the top-level data is a fake in-memory queryset, but the next level being prefetched is real data from the database. To test this, we can add this to the existing ParentalManyToManyPrefetchTests setup:

    def test_prefetch_from_fake_queryset(self):
        article = Article(title="Article with related articles")
        article.related_articles = [Article.objects.first()]
        with self.assertNumQueries(1):
            prefetched_names = self.get_author_names(
                article.related_articles.prefetch_related('authors')
            )
    

    (In Wagtail terms, this would be the equivalent of previewing an Article page with a set of related articles, and the template outputting title + author list for each one.)

  • Both levels of fetch are using fake in-memory data. A test for this would look like the one above, but with article.related_articles set to an unsaved in-memory article populated with authors. I don't think this case ever comes up in Wagtail, because the instances at the other end of a ParentalManyToMany relation are always real database objects, but one case that could come up is prefetching a ParentalManyToMany relation across a ParentalKey relation. In terms of the test app's models, that would correspond to NewsPaper being a Wagtail page, with Article as an inline panel with an author multi-select field in it:

    def test_prefetch_fake_m2m_across_fake_children(self):
        newspaper = NewsPaper(title="Daily Mail")
        article = Article(title="an inline article")
        article.authors = list(Author.objects.all())
        newspaper.articles = [article]  # NB this requires adding related_name='articles' on Article's parental key
        with self.assertNumQueries(0):
            prefetched_names = self.get_author_names(
                newspaper.articles.prefetch_related('authors')
            )
    

    Again, not sure if this counts as in scope for the current PR (since in this case the ParentalManyToMany relation is the one being prefetched, not the one we're calling prefetch_related on).

This commit adds support for prefetch_related on
ParentalManyToManyFields. Consider for example these models:

class Author(models.Model):
    ...

class Article(ClusterableModel):
    authors = ParentalManyToManyField(Author)

With this change, you can create a query like this:

articles = Article.objects.prefetch_related('authors')

and then subsequent references to article authors won't invoke
additional database queries.

This implementation is modeled after the current implementation of
Django's ManyRelatedManager.get_prefetch_queryset.

Closes issue 101.
@chosak chosak force-pushed the feature/m2m-prefetch-related branch from 787bead to 1699718 Compare June 25, 2020 18:03
@chosak
Copy link
Member Author

chosak commented Jun 25, 2020

@ababic thanks very much for the review.

@gasman, thanks for the thoughtful suggestions. I've pushed a commit that adds the first test you mention, to confirm that prefetching on a fake queryset works as expected.

Handling your second case is a little more tricky. The test you propose fails right now, because right now FakeQuerySet.prefetch_related (probably sensibly) just defers to Django's prefetch_related_objects, and that path provides limited opportunity to customize the prefetch further to avoid doing the unnecessary lookup.

One possible fix that does seem to work for your proposed test would be for this project to use _prefetched_objects_cache as the name for the object cache instead of _cluster_related_objects (which this PR duplicated). I didn't test this thoroughly and so it might prove to be too neat of a solution, not to mention relying on undocumented internal Django behavior.

Regardless, I tend to agree that this is out of scope for this more immediate fix to get prefetch_related directly on ParentalManyToManyFields.

@gasman
Copy link
Contributor

gasman commented Sep 2, 2020

Thanks @chosak! Merged in 21d4960.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Prefetch related not working for parental many to many relations
3 participants