DDC-2381: Pagination query can be simplified when simple joins are applied #3091

Open
doctrinebot opened this Issue Mar 31, 2013 · 15 comments

@doctrinebot

Jira issue originally created by user sergic:

@doctrinebot

Comment created by @ocramius:

Not a blocker

@doctrinebot

Comment created by @ocramius:

What's the result of EXPLAIN on a query without the subquery?

@doctrinebot

Comment created by sergic:

explain without the subquery

@doctrinebot

Comment created by @ocramius:

[~sergic] that's not the same query.

@doctrinebot

Comment created by @ocramius:

[~sergic] this is still using

Using index; Using temporary; Using filesort

Check your indexes

@doctrinebot

Comment created by sergic:

The problem is not in the indexes.

SELECT DISTINCT id0 FROM (SELECT m0_.id AS id0, m0_.title AS title1, m0_.text AS text2, m0_.price AS price3, m0_.originalPrice AS originalPrice4, m0_.condition_type AS condition_type5, m0_.image_1 AS image_16, m0_.image_2 AS image_27, m0_.image_3 AS image_38, m0_.image_4 AS image_49, m0_.image_5 AS image_510, m0_.video AS video11, m0_.contact_email AS contact_email12, m0_.contact_name AS contact_name13, m0_.contact_phone AS contact_phone14, m0_.contact_type AS contact_type15, m0_.published AS published16, m0_.type AS type17, m0_.status AS status18, m0_.highlight AS highlight19, m0_.urgent AS urgent20, m0_.topads AS topads21, m0_.period AS period22, m0_.hits AS hits23, m0_.ip AS ip24, m0_.created_at AS created_at25, m0_.updated_at AS updated_at26 FROM milla_message m0_ WHERE m0_.status = 1 ORDER BY m0_.published DESC) dctrn_result LIMIT 20 OFFSET 0

Time: 104.614s explain 3

SELECT DISTINCT m0_.id AS id0 FROM milla_message m0_ WHERE m0_.status = 1 ORDER BY m0_.published DESC LIMIT 20 OFFSET 0;

Time: 0.001s explain 4

@doctrinebot

Comment created by @ocramius:

[~sergic] the ORM cannot simplify a complex query that way. There may be a condition on one of the joined results, or some other use of one of the joined results.

Things that could be optimized here are:

  • Removal of the ORDER BY clause when grouping (check ORM master, I think somebody already did that)
  • Trying to simplify the query by doing some serious hacking on the AST.

The problem I see here is that the chance of introducing random bugs through this optimization is very high, and you'd have to rewrite walkSelectStatement.

@doctrinebot

Comment created by @ocramius:

Marking as improvement

@doctrinebot

Comment created by sergic:

Minor? :D
This query takes 100 seconds for me.
200k items are selected into a temporary table. wtf?

OK. Maybe the programmers made a mistake in the parser.
Expected: ORDER BY m0_.published DESC LIMIT 20 OFFSET 0) dctrn_result
Time: 0.001s

Reality: ORDER BY m0_.published DESC) dctrn_result LIMIT 20 OFFSET 0

@doctrinebot

Comment created by @ocramius:

[~sergic] this problem does not introduce security issues, and you can work around it by using your own pagination logic. It does not stop you from doing anything, which is why it's minor.

@doctrinebot

Comment created by sergic:

OK :)
I have already created my own paginator.
Lastly, please see how this problem can be fixed:
Sergic@2733c81

@doctrinebot

Comment created by tom.pryor:

I've also run into this problem, which makes Doctrine's Paginator useless for large datasets. The actual query takes 0.002ms, but the SELECT DISTINCT query Doctrine executes takes over 30s because MySQL creates a temporary table with 200k+ records.

You don't need to remove the joined tables from the paginator query (I have conditions on the joined tables anyway); they have a negligible impact on performance. Rather, the slowdown is caused by the combination of SELECT DISTINCT and ORDER BY, which no index configuration can solve. Perhaps a flag could be added to the paginator to indicate that my query does not fetch-join any has-many collections (i.e. each row returned will be unique), negating the need for the SELECT DISTINCT. The pagination would then only need to perform the original query with LIMIT and OFFSET applied, along with a separate COUNT query on the primary key, both of which are very fast as they'd use the indexes set up for the original query.
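
The query shape proposed above can be sketched roughly as follows. This is only an illustration, not Doctrine's implementation: it assumes Python's built-in sqlite3 module as a stand-in for MySQL, a milla_message table reduced to three columns, and made-up demo data. The names fetch_page and build_demo_db are hypothetical, and the extra id tie-breaker in the ORDER BY is added only to make the demo deterministic. It shows the structure of the two queries, not the timings reported in this thread.

```python
import sqlite3


def fetch_page(conn, limit, offset):
    """Return (ids_on_page, total_count) without a SELECT DISTINCT wrapper."""
    # Page query: the original query with LIMIT/OFFSET applied directly.
    # This is only valid when every returned row is already unique,
    # i.e. no has-many collection is fetch-joined.
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT m0_.id FROM milla_message m0_ "
            "WHERE m0_.status = 1 "
            "ORDER BY m0_.published DESC, m0_.id DESC "
            "LIMIT ? OFFSET ?",
            (limit, offset),
        )
    ]
    # Separate COUNT on the primary key for the total number of results.
    (total,) = conn.execute(
        "SELECT COUNT(m0_.id) FROM milla_message m0_ WHERE m0_.status = 1"
    ).fetchone()
    return ids, total


def build_demo_db(rows=50):
    """Create an in-memory table with hypothetical demo data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE milla_message ("
        "id INTEGER PRIMARY KEY, status INTEGER, published INTEGER)"
    )
    # Index covering the WHERE and ORDER BY columns of the page query.
    conn.execute(
        "CREATE INDEX idx_status_published "
        "ON milla_message (status, published)"
    )
    # Every fifth row gets status 0 so the WHERE clause filters something.
    conn.executemany(
        "INSERT INTO milla_message (id, status, published) VALUES (?, ?, ?)",
        [(i, 0 if i % 5 == 0 else 1, 1000 + i) for i in range(1, rows + 1)],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    ids, total = fetch_page(conn, limit=20, offset=0)
    print(len(ids), "ids on the first page out of", total, "matching rows")
```

Because neither query needs DISTINCT or a derived table, the database is free to satisfy both from an index on the filter and sort columns, which is the point being made in this comment.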

@doctrinebot

Comment created by stof:

This flag already exists for the select query. See the second argument of the Paginator constructor.

For the count query, you should call $paginator->setUseOutputWalkers(false) to make it use a DQL AST walker instead of the SQL output walker. (The AST walker does not support counting on queries using HAVING, which is why it is not selected by default.)

@doctrinebot

Comment created by rjmunro:

I think this is more important than "minor", as I've experienced this when upgrading from 2.2. My site became unusably slow.

I can't easily work around it because I am not using the paginator directly; I am using Symfony bundles that use it. None of the workarounds mentioned so far seem to have helped.

@Ocramius was assigned by doctrinebot on Dec 6, 2015