DDC-176: Query::iterate is broken when using JOIN #2410

doctrinebot opened this Issue Nov 26, 2009 · 9 comments

2 participants


Jira issue originally created by user nicokaiser:

These two queries return different result counts:

$q = $em->createQuery('SELECT c, t FROM Entity\General\Country c LEFT JOIN c.Translations t ON t.id = c.id');
$results = $q->execute();
echo count($results) . " Results" . PHP_EOL;

$q = $em->createQuery('SELECT c, t FROM Entity\General\Country c LEFT JOIN c.Translations t ON t.id = c.id');
$results = $q->iterate();
$count = 0;
while ($results->next()) { $count<ins></ins>; }
echo $count . " Results" . PHP_EOL;

After investigating a bit into it, I found out that $results->next() delivers "null" results when the joined table has more than one value (i.e. in this case a Country has more then one Translation).

When there is more than 1 Translation, the iterator adds one Translation to the Country and then delivers "null" for the remaining Translations (which is why the while loop stops after the first Country with more than one Translation)...


Comment created by romanb:

This is actually known. The bug here rather is that an exception should be thrown if iterate() is used on a query with fetch-joined collections.

What is a fetch-joined collections? "select a, b from A a join a.bs b" where "bs" is a collection and it is put in the select clause, so appears in the result.

iterate() just cant work with those queries. The SQL result set of such queries can be in any order with elements of a collection spread across the result set.
iterate() can only see 1 row at a time and this is not enough to properly parse results where you have collections in the result set.

It only works with a) no fetch-joins or b) only single-valued fetch-joins


Comment created by romanb:

I linked this to 2 other issues you might be interested in also.


Comment created by nicokaiser:

Hm. Can't iterate() "look ahead" at the next row and see if it's relevant to the current row (and if, load all joined entities into the current one)?

I would assume that execute() and iterate() both work the same with any query...


Comment created by romanb:

As soon as we "look ahead" we need to look ahead until the end of the result set, because like I said the rows can quickly be out of order. Noone guaruantees you that all the rows that contain collection values are in order.

As a result, we need to process the whole result as soon as we start with the first row, so we're back to normal hydration.

iterate() is a way to keep memory consumption better under control when querying large result sets (thousands of rows). But it just does not work with any query. When it comes to raw performance, it is slower. It just helps to keep memory usage low but only if you detach every object after it has been hydrated.

So this will keep the peek memory usage very low, even if the result set is very large:

$q = $em->createQuery('select u from MyProject\Model\User u');
$iterableResult = $q->iterate();
while ($iterableResult as $row) {
   $user = $row[0];
   // do stuff with $user
  $em->detach($user); // detach so that it can be GC'ed

Comment created by romanb:

See also this blog post: http://www.doctrine-project.org/blog/doctrine2-batch-processing


Comment created by romanb:

Here is an example from our hydration tests:

     * Tests that the hydrator does not rely on a particular order of the rows
     * in the result set.
     * DQL:
     * select c, b from Doctrine\Tests\Models\Forum\ForumCategory c inner join c.boards b
     * order by c.position asc, b.position asc
     * Checks whether the boards are correctly assigned to the categories.
     * The 'evil' result set that confuses the object population is displayed below.
     * c.id  | c.position | c.name   | boardPos | b.id | b.category_id (just for clarity)
     *  1    | 0          | First    | 0        |   1  | 1
     *  2    | 0          | Second   | 0        |   2  | 2   <--
     *  1    | 0          | First    | 1        |   3  | 1
     *  1    | 0          | First    | 2        |   4  | 1

The first category (1) has boards: (1, 3, 4)
The second category (2) has only one board: (2)

You see in that due to the ordering by positions a row of the second category appears in the middle of a series of rows of the first category and its boards.
And thats just one possible example. Now if we would look ahead during iterate() we dont know where to stop. If we would stop when the ID of the root object changes, you would get the first category with only 1 board in the collection, which is wrong.


Comment created by @beberlei:

An exception with detailed information is now thrown on iterating fetch join collections.


Issue was closed with resolution "Fixed"

@beberlei beberlei was assigned by doctrinebot Dec 6, 2015
@doctrinebot doctrinebot added this to the 2.0-ALPHA4 milestone Dec 6, 2015
@doctrinebot doctrinebot closed this Dec 6, 2015
@doctrinebot doctrinebot added the Bug label Dec 7, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment