Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GH-1569] Optimize eager fetch for collections to batch query #8391

Merged
merged 19 commits into from
Nov 14, 2023

Conversation

beberlei
Copy link
Member

@beberlei beberlei commented Dec 13, 2020

This supersedes and adjusts #1569 to current Doctrine 2.17 branch.

When you configure a one-to-many collection to be fetch=EAGER, then instead of immediately loading all collections it batches the fetch operation at the end by selecting all entities for all collections. Collections are loaded in batches of 100, configurable by Configuration::setEagerFetchBatchSize.

Changing the collection fetch mode can be done on a per query basis with $query->setFetchMode('rootEntity', 'association', ClassMetadata::FETCH_EAGER);. This can be done as an alternative to fetch join collection valued associations.

Open points:

  • Naming: Its not really using subselect, alternative name could be "Batch". It supersede EAGER.
  • requires removal of WITH keyword or prevention of WITH keyword when association is fetch=eager
  • The batch loading o collections requires handling of maximum number of items in WHERE IN clause and potentially chunk it into multiple findBy operations.
  • Should work across multiple layers, i.e. being recursive
  • For fetch=eager the loading is delegated to persisters, here its going through the whole repository stack. Wondering if that must be improved code wise to go directly to persisters.
  • There is no validation in ClassMetadataInfo if the selected fetch mode makes no sense for the entity.
  • Does not work with ManyToMany, because they don't have a mapped field. Logic needs to be deferred to CollectionPersister.

Fixes #4762

@beberlei beberlei added this to the 2.9.0 milestone Dec 13, 2020
@beberlei beberlei modified the milestones: 2.9.0, 3.0.0 Feb 28, 2021
@danydev

This comment was marked as resolved.

@tuxoff

This comment was marked as resolved.

@derrabus derrabus changed the base branch from 2.9.x to 2.10.x September 21, 2021 15:35
@derrabus derrabus changed the base branch from 2.10.x to 2.12.x January 12, 2022 13:36
@kgasienica

This comment was marked as abuse.

@derrabus derrabus removed this from the 3.0.0 milestone Apr 24, 2022
@derrabus derrabus changed the base branch from 2.12.x to 2.13.x April 24, 2022 19:40
@beberlei beberlei mentioned this pull request May 27, 2022
@xepozz

This comment was marked as spam.

@mpdude
Copy link
Contributor

mpdude commented Dec 14, 2022

To summarize the suggested changes with my own words, this hooks into the place where entitites are being created during hydration. Thus, it sees all "new" entities, i. e. at the are first re-created from database values.

It tracks all those entities and their one-to-many (also many-to-many?) collections that are configured with fetch=SUBSELECT.

After the entities have been loaded, for every such collection a new query is performed through the target (referred-to) entity class' EntityRepository to find all of those entities by means the foreign key implementing the to-many association.

Then, loop over all those entities and add them to their respective collections to initialize those.

Not sure if indexBy/orderBy need to be taken into account.

@mpdude
Copy link
Contributor

mpdude commented Dec 18, 2022

@beberlei Given your experience writing tricky solutions using subselects and doing AST manipulations like the Paginator component does, do you think it would be possible to write a helper component that could provide this feature and be used to do the "context sensitive optimization"? I also have the "multi-step hydration" explained at https://ocramius.github.io/blog/doctrine-orm-optimization-hydration/ in mind.

Idea along the lines of:

// Assume User has to-many association to Address in User::$addresses
$query = $entityManager->createQuery('SELECT u FROM ' . User::class . ' u WHERE ...');
$result = $query->getResult();

InitializeCollection::for($query, 'u.addresses');

What this would be supposed to do:

  • Figure out the class that u refers to
  • Clone and modify $query to SELECT u.id ... only
  • Run a SELECT _tmp1, _tmp2 FROM {class of u} _tmp1 JOIN _tmp1.addresses _tmp2 WHERE _tmp1.id IN ({modified query})

Might be more tricky to get right with composite primary keys?

@tripany

This comment was marked as spam.

@belmeopmenieuwesim

This comment was marked as resolved.

…ions.

Co-authored-by: Wouter M. van Vliet <wouter@interpotential.com>
@beberlei beberlei changed the base branch from 2.13.x to 2.17.x October 10, 2023 05:27
@JanTvrdik
Copy link
Contributor

Disallow use of fetch=SUBSELECT on to-one associations.

Is there a technical reason to disallow this? Would it not still be benefical to fetch even has one associations in it's own query when the cardinality of the target entity is much lower to avoid fetching duplicated data?

@beberlei
Copy link
Member Author

@JanTvrdik yes, but that is what fetch=EAGER does for to one already, either by doing a join, or by triggering all laods at once.

@greg0ire greg0ire merged commit 1267f48 into doctrine:2.17.x Nov 14, 2023
59 checks passed
@greg0ire
Copy link
Member

Thanks @beberlei !

@@ -1047,7 +1047,9 @@ public function walkJoinAssociationDeclaration($joinAssociationDeclaration, $joi
}
}

$targetTableJoin = null;
if ($relation['fetch'] === ClassMetadata::FETCH_EAGER && $condExpr !== null) {
throw QueryException::eagerFetchJoinWithNotAllowed($assoc['sourceEntity'], $assoc['fieldName']);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I probably have a poor implementation or use of eager mode on my relationships but this exception creates a BC break on my side.
Maybe it should be replaced by a deprecation pointing to this ticket : #10978 to explain why this is a bad pratice ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed this is a BC in v2... How was it not been cached before merging? :(

@gharlan gharlan mentioned this pull request Feb 13, 2024
tomasz-ryba pushed a commit to tomasz-ryba/orm that referenced this pull request Apr 23, 2024
Fetch EAGER mode ignores orderBy as of changes introduced with doctrine#8391
Fixes duplicated doctrine#11381
tomasz-ryba pushed a commit to tomasz-ryba/orm that referenced this pull request Apr 23, 2024
Fetch EAGER mode ignores orderBy as of changes introduced with doctrine#8391
Fixes duplicated doctrine#11381
tomasz-ryba pushed a commit to tomasz-ryba/orm that referenced this pull request Apr 24, 2024
Fetch EAGER mode ignores orderBy as of changes introduced with doctrine#8391
Fixes duplicated doctrine#11381
tomasz-ryba pushed a commit to tomasz-ryba/orm that referenced this pull request Apr 24, 2024
Fetch EAGER mode ignores orderBy as of changes introduced with doctrine#8391
Fixes duplicated doctrine#11381
greg0ire pushed a commit to tomasz-ryba/orm that referenced this pull request Apr 24, 2024
EAGER fetch mode ignores orderBy as of changes introduced with doctrine#8391

Fixes doctrine#11163
Fixes doctrine#11381
@@ -1264,7 +1264,7 @@ protected function getSelectColumnsSQL()
}

$isAssocToOneInverseSide = $assoc['type'] & ClassMetadata::TO_ONE && ! $assoc['isOwningSide'];
$isAssocFromOneEager = $assoc['type'] !== ClassMetadata::MANY_TO_MANY && $assoc['fetch'] === ClassMetadata::FETCH_EAGER;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for this change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.