
Avoid N+1 in users Index view #10477

Merged
merged 1 commit on Jul 17, 2023

Conversation

Tijani-Dia
Collaborator

@Tijani-Dia Tijani-Dia commented May 24, 2023

I found an N+1 coming from the wagtail_userprofile field in the users' index view.
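For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of what an N+1 looks like. FakeDB and the field names are hypothetical stand-ins for the ORM; in the real view, the fix is to fetch the related wagtail_userprofile together with the user list (Django's select_related does this via a JOIN) so the per-row queries disappear:

```python
# Hypothetical, self-contained illustration of the N+1 pattern (no
# Django involved): one query fetches the user list, then each row
# triggers an extra query for its wagtail_userprofile.
class FakeDB:
    def __init__(self):
        self.query_count = 0

    def fetch_users(self):
        self.query_count += 1  # one query for the whole user list
        return [{"id": i} for i in range(5)]

    def fetch_profile(self, user_id):
        self.query_count += 1  # one query per user: the "+1" part
        return {"user_id": user_id}

db = FakeDB()
users = db.fetch_users()
profiles = [db.fetch_profile(u["id"]) for u in users]
assert db.query_count == 1 + len(users)  # 6 queries for 5 users
```

With an eager-loading fix, the profile data arrives in the initial query and the count stays at one regardless of the number of rows.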

@squash-labs

squash-labs bot commented May 24, 2023

Manage this branch in Squash

Test this branch here: https://tijani-diafeatureusers-index-n-4d5ig.squash.io

@@ -82,12 +82,13 @@ class Index(IndexView):
is_searchable = True
page_title = gettext_lazy("Users")

model_fields = set(f.name for f in User._meta.get_fields())
Member


This is a breaking change: if someone customises this with self.model_fields + [...], it will now raise a TypeError.

I'd suggest keeping this as a list, as it won't impact the performance of the view, nor the original N+1 fix.
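Concretely, the override pattern this comment has in mind would break as follows (a minimal sketch; the field names are made up):

```python
# A list-based model_fields supports extension via concatenation:
model_fields = ["username", "email"]
extended = model_fields + ["custom_field"]  # works: list + list
assert extended == ["username", "email", "custom_field"]

# If model_fields becomes a set, the same override raises TypeError:
model_fields = {"username", "email"}
try:
    model_fields + ["custom_field"]
    raised = False
except TypeError:  # unsupported operand type(s) for +: 'set' and 'list'
    raised = True
assert raised
```

A set-based attribute would force existing overrides to switch to set union (model_fields | {"custom_field"}), which is exactly the kind of silent API break being flagged.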

Collaborator Author

I couldn't think of a use case for modifying self.model_fields. Could you please give an example of when someone would need to do that?

it won't impact the performance of the view

Well, I did a quick check on a project I'm working on, and model_fields contains 37 entries:

>>> [f.name for f in get_user_model()._meta.get_fields()]
['field_1', 'field_2', 'customdocument', 'customimage', 'field_3', 'field_4', 'field_5', 'wagtail_userprofile', 'document', 'uploadeddocument', 'image', 'uploadedimage', 'locked_pages', 'owned_pages', 'wagtail_revisions', 'requested_workflows', 'finished_task_states', 'wagtail_admin_comments', 'comments_resolved', 'comment_replies', 'page_subscriptions', 'logentry', 'id', 'password', 'last_login', 'is_superuser', 'field_b', 'uuid', 'first_name', 'last_name', 'email', 'is_staff', 'is_active', 'date_joined', 'groups', 'user_permissions', 'index_entries']

In get_users_filter_query, we query the list 4 times, and we do the same 3 times in the get_queryset method (counting the if 'wagtail_userprofile' in self.model_fields line I've added). Most of the fields we look for are located near the end of the list or aren't in it at all (username, for example), so I think using a set is more efficient.

I'd agree that this is a negligible part of the work done in the view but why do work we don't need to do?

Member

For a list of ~30 items, the performance difference isn't going to be huge; in fact, I honestly doubt it's detectable (hashing is often slower than iteration for small lists). It's at significantly larger sizes that a set can make a difference: the in operator on a list is already pretty fast. This feels firmly in the realm of premature optimisation and extra noise for a PR.

Collaborator Author

I honestly doubt it's detectable (hashing is often slower than iteration for small length lists)

Python's strings cache their hash value. Moreover, all the strings in model_fields are interned at runtime, meaning their hash values will already have been computed by the time we reach this function.

When we iterate over the list, we still compare strings one by one, and that comparison first checks the strings' hash values. So we do at least as much work when using a list as when using a set.

I wrote a quick script to benchmark finding a list of strings in a larger list.

import random
import string
import timeit
import sys


def build_random_list_of_strings(n):
    return [
        "".join(random.choices(string.ascii_lowercase, k=random.randint(5, 25)))
        for _ in range(n)
    ]


def find_strings(strings, to_find):
    """Find a list of strings in a given collection."""
    return list(filter(strings.__contains__, to_find))


if __name__ == "__main__":
    # Obtain parameters from command line arguments
    if len(sys.argv) > 3:
        num_strings = int(sys.argv[1])
        num_to_find = int(sys.argv[2])
        num_to_not_find = int(sys.argv[3])
    else:
        num_strings = 37
        num_to_find = 6
        num_to_not_find = 1

    # Build a list and set of random strings
    strings_list = build_random_list_of_strings(num_strings)
    strings_set = set(strings_list)
    assert len(strings_list) == len(strings_set)

    # Build a list of strings to find
    to_find = random.choices(strings_list, k=num_to_find)
    to_find += build_random_list_of_strings(num_to_not_find)

    assert find_strings(strings_list, to_find) == find_strings(strings_set, to_find)

    print("List:")
    list_time = timeit.timeit(
        "find_strings(strings_list, to_find)",
        number=10000,
        globals=globals(),
    )
    print(f"Time taken: {list_time}\n")

    print("Set:")
    set_time = timeit.timeit(
        "find_strings(strings_set, to_find)",
        number=10000,
        globals=globals(),
    )
    print(f"Time taken: {set_time}\n")

    faster_structure, time_difference = (
        ("Set", list_time / set_time)
        if list_time > set_time
        else ("List", set_time / list_time)
    )
    print(f"{faster_structure} is {time_difference:.2f} times faster.")

When I run this script with the default values (which correspond to the setup we have for model_fields), using a set is around 4 times faster than using a list.

This feels firmly in the realm of premature optimisation and extra noise for a PR.

I'll be happy to remove it if others share that opinion. To me, however, this is more a case of using the right tool for the job.

Member

@laymonage laymonage Jun 5, 2023

Just so we don't argue over the less important bits of the PR, and in the spirit of making fewer changes that could potentially break user code, how about we just do model_fields = set(self.model_fields) at the beginning of get_queryset(), and use it for the rest of the method (including the get_users_filter_query call)?

That way, we take advantage of set's faster search performance (provided that this still holds if the set is freshly constructed every time), while still retaining the customisability of self.model_fields 🙂
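A sketch of that compromise (the class and field names here are simplified stand-ins, not the actual Wagtail code):

```python
class Index:
    # Stays a list on the class, so subclasses can still extend it
    # with model_fields = Index.model_fields + ["extra_field"].
    model_fields = ["username", "email", "wagtail_userprofile"]

    def get_queryset(self):
        # Convert once per call; every membership check below is
        # then O(1) instead of scanning the list repeatedly.
        model_fields = set(self.model_fields)
        select_profile = "wagtail_userprofile" in model_fields
        # ... build and return the queryset, passing model_fields to
        # helpers such as get_users_filter_query ...
        return select_profile

assert Index().get_queryset() is True
```

The conversion itself is O(n), so this only pays off because the method performs several lookups against the same collection; a single lookup would not justify it.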

Member

@laymonage laymonage left a comment

I'm going to apply my suggestions to settle the set vs. list debate and approve this, as I think this is a very nice improvement even without the set conversion.

@laymonage laymonage force-pushed the feature/users-index-n-plus-one branch from 313ccce to a6c9409 on July 17, 2023 16:23
@laymonage laymonage merged commit a6c9409 into wagtail:main Jul 17, 2023
0 of 3 checks passed
@Tijani-Dia
Collaborator Author

Thanks @laymonage. FWIW, I had some free time and thought it'd be nice to have a series of small PRs addressing the various N+1s and performance optimisations pointed out by dj-tracker when running the test suite. The debate raised here discouraged me from finishing the series, but there is still room for improvement.
