Fixed #21241 -- Avoid extraneous JOINs in admin changelist search. #881

Closed
wants to merge 2 commits into
from

Conversation

Projects
None yet
3 participants
@acdha
Contributor

acdha commented Mar 5, 2013

Using the database this way is inherently slow but the previous implementation
would call QuerySet.filter() once per whitespace-separated term in the query
text, causing an extra set of JOINs for each term. With the MySQL backend,
this actually causes database errors once you hit the server limit of 61.

This commit makes two changes: the first is simply to filter the queryset only
once; the second is to only process unique terms in the search field.

@charettes

This comment has been minimized.

Show comment
Hide comment
@charettes

charettes Mar 5, 2013

Member

I agree that the multiple JOINs are not desirable but I don't think that's the right approach.

Here you changed the query constraints from a set of len(bits) ANDed groups of len(orm_lookups) ORs to a set (len(bits) * len(orm_lookups)) ORs.

I think the correct fix would be to collect the OR expressions created in the loop and reduce them using operator.and_ to issue a single filter call.

or_queries = []
for bit in set(self.query.split()):
    or_queries.extend(reduce(operator.or_,
        (models.Q(**{lookup: bit}) for lookup in orm_lookups)
    ))
qs = qs.filter(reduce(operator.and_, or_queries))

Since this will require testing and thus is not a trivial patch could you open a trac ticket and link to this PR.

Thanks for your report.

Member

charettes commented Mar 5, 2013

I agree that the multiple JOINs are not desirable but I don't think that's the right approach.

Here you changed the query constraints from a set of len(bits) ANDed groups of len(orm_lookups) ORs to a set (len(bits) * len(orm_lookups)) ORs.

I think the correct fix would be to collect the OR expressions created in the loop and reduce them using operator.and_ to issue a single filter call.

or_queries = []
for bit in set(self.query.split()):
    or_queries.extend(reduce(operator.or_,
        (models.Q(**{lookup: bit}) for lookup in orm_lookups)
    ))
qs = qs.filter(reduce(operator.and_, or_queries))

Since this will require testing and thus is not a trivial patch could you open a trac ticket and link to this PR.

Thanks for your report.

@acdha

This comment has been minimized.

Show comment
Hide comment
@acdha

acdha Mar 5, 2013

Contributor

I had forgotten that the search implementation is explicitly documented as an AND search. I'll update my diff. The tests actually passed with the old implementation so it would make sense to have a test to confirm that it works as documented.

Contributor

acdha commented Mar 5, 2013

I had forgotten that the search implementation is explicitly documented as an AND search. I'll update my diff. The tests actually passed with the old implementation so it would make sense to have a test to confirm that it works as documented.

@charettes

This comment has been minimized.

Show comment
Hide comment
@charettes

charettes Mar 5, 2013

Member

This looks like the correct place to added tests.

Member

charettes commented Mar 5, 2013

This looks like the correct place to added tests.

@timgraham

This comment has been minimized.

Show comment
Hide comment
@timgraham

timgraham May 31, 2013

Member

Is there a trac ticket for this?

Member

timgraham commented May 31, 2013

Is there a trac ticket for this?

@timgraham

This comment has been minimized.

Show comment
Hide comment
@timgraham

timgraham Oct 7, 2013

Member

@acdha - are you interested in adding a test so we can commit this?

Member

timgraham commented Oct 7, 2013

@acdha - are you interested in adding a test so we can commit this?

@acdha

This comment has been minimized.

Show comment
Hide comment
@acdha

acdha Oct 7, 2013

Contributor

@timgraham Sure: I'll update my branch and update this ticket.

Contributor

acdha commented Oct 7, 2013

@timgraham Sure: I'll update my branch and update this ticket.

@acdha

This comment has been minimized.

Show comment
Hide comment
@acdha

acdha Oct 7, 2013

Contributor

Just on the off chance that this is useful for posterity, the change against 1.5 is here:

https://github.com/acdha/django/tree/admin-search-optimization-1.5

I'm adding tests after updating this branch against 1.6.

Contributor

acdha commented Oct 7, 2013

Just on the off chance that this is useful for posterity, the change against 1.5 is here:

https://github.com/acdha/django/tree/admin-search-optimization-1.5

I'm adding tests after updating this branch against 1.6.

@charettes

This comment has been minimized.

Show comment
Hide comment
@charettes

charettes Oct 7, 2013

Member

@acdha you should rebase this branch against master instead. This won't be merged in 1.6.

Member

charettes commented Oct 7, 2013

@acdha you should rebase this branch against master instead. This won't be merged in 1.6.

@acdha

This comment has been minimized.

Show comment
Hide comment
@acdha

acdha Oct 7, 2013

Contributor

@charettes My mistake - I was actually working off of master but wrote 1.6 above

Contributor

acdha commented Oct 7, 2013

@charettes My mistake - I was actually working off of master but wrote 1.6 above

acdha added some commits Oct 7, 2013

Admin: optimize search filter construction
Using the database this way is inherently slow but the previous implementation would call QuerySet.filter() once per whitespace-separated term in the query text, causing an extra set of JOINs for each term. With the MySQL backend, this actually causes database errors once you hit the server limit of 61.

This commit logically ANDs each of the query components together so
`queryset.filter()` can be called a single time.
Tests: confirm that admin changelist search terms are ANDed
Performance work on #881 inadvertently demonstrated that there wasn't a
test for the documented behaviour of admin changelist terms being
evaluated as logical ANDs
@acdha

This comment has been minimized.

Show comment
Hide comment
@acdha

acdha Oct 7, 2013

Contributor

@charettes With the updated code, I've confirmed that the behaviour without the patch was still to generate a full JOIN for each term. With the patch, it generates one JOIN.

Contributor

acdha commented Oct 7, 2013

@charettes With the updated code, I've confirmed that the behaviour without the patch was still to generate a full JOIN for each term. With the patch, it generates one JOIN.

@charettes

This comment has been minimized.

Show comment
Hide comment
@charettes

charettes Oct 7, 2013

Member

@acdha tests are looking good! Could you please create a ticket in Trac and referece this PR from there?

Member

charettes commented Oct 7, 2013

@acdha tests are looking good! Could you please create a ticket in Trac and referece this PR from there?

- queryset = queryset.filter(reduce(operator.or_, or_queries))
+ or_queries = (models.Q(**{orm_lookup: bit})
+ for orm_lookup in orm_lookups)
+ query_parts.append(reduce(operator.or_, or_queries))

This comment has been minimized.

@charettes

charettes Oct 7, 2013

Member

Import reduce from functools for py3 support.

@charettes

charettes Oct 7, 2013

Member

Import reduce from functools for py3 support.

This comment has been minimized.

This comment has been minimized.

@charettes

charettes Oct 7, 2013

Member

Sorry, you're completely right!

@charettes

charettes Oct 7, 2013

Member

Sorry, you're completely right!

@acdha

This comment has been minimized.

Show comment
Hide comment
Contributor

acdha commented Oct 7, 2013

@charettes

This comment has been minimized.

Show comment
Hide comment
@charettes

charettes Oct 7, 2013

Member

Merged in 698dd82, thanks!

Member

charettes commented Oct 7, 2013

Merged in 698dd82, thanks!

@charettes charettes closed this Oct 7, 2013

@timgraham timgraham changed the title from Admin: optimize search filter construction to Fixed #21241 -- Avoid extraneous JOINs in admin changelist search. Sep 22, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment