-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge release 2.16.0 into 3.0.x #10861
Closed
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Run workflows also for the `entity-level-commit-order` branch
This is the second chunk to break #10547 into smaller PRs suitable for reviewing. It prepares the `UnitOfWork` to work with a commit order computed on the entity level, as opposed to a class-based ordering as before. #### Background #10531 and #10532 show that it is not always possible to run `UnitOfWork::commit()` with a commit order given in terms of entity _classes_. Instead it is necessary to process entities in an order computed on the _object_ level. That may include * a particular order for multiple entities of the _same_ class * a particular order for entities of _different_ classes, possibly even going back and forth (entity `$a1` of class `A`, then `$b` of class `B`, then `$a2` of class `A` – revisiting that class). This PR is about preparing the `UnitOfWork` so that its methods will be able to perform inserts and deletions on that level of granularity. Subsequent PRs will deal with implementing that particular order computation. #### Suggested change Change the private `executeInserts` and `executeDeletions` methods so that they do not take a `ClassMetadata` identifying the class of entities that shall be processed, but pass them the list of entities directly. The lists of entities are built in two dedicated methods. That happens basically as previously, by iterating over `$this->entityInsertions` or `$this->entityDeletions` and filtering those by class. #### Potential (BC breaking?) changes that need review scrutiny * `\Doctrine\ORM\Persisters\Entity\EntityPersister::addInsert()` was previously called multiple times, before the insert would be performed by a call to `\Doctrine\ORM\Persisters\Entity\EntityPersister::executeInserts()`. With the changes here, this batching effectively no longer takes place. `executeInserts()` will always see one entity/insert only, and there may be multiple `executeInserts()` calls during a single `UoW::commit()` phase. * The caching in `\Doctrine\ORM\Cache\Persister\Entity\AbstractEntityPersister` previously would cache entities from the last executed insert batch only. Now it will collect entities across multiple batches. I don't know if that poses a problem. * Overhead in `\Doctrine\ORM\Persisters\Entity\BasicEntityPersister::executeInserts` is incurred multiple times; that may, however, only be about SQL statement preparation and might be salvageable. * The `postPersist` event previously was not dispatched before _all_ instances of a particular entity class had been written to the database. Now, it will be dispatched immediately after every single entity that has been inserted.
…tities Prepare entity-level commit order computation in the `UnitOfWork`
This is the first chunk to break #10547 into smaller PRs suitable for reviewing. It adds a new topological sort implementation. #### Background Topological sort is an algorithm that sorts the vertices of a directed acyclic graph (DAG) in a linear order such that for every directed edge from vertex A to vertex B, vertex A comes before vertex B in the ordering. This ordering is called a topological order. Ultimately (beyond the scope of this PR), in the ORM we'll need this to find an order in which we can insert new entities into the database. When one entity needs to refer to another one by means of a foreign key, the referred-to entity must be inserted before the referring entity. Deleting entities is similar. A topological sorting can be obtained by running a depth first search (DFS) on the graph. The order in which the DFS finishes on the vertices is a topological order. The DFS is possible iif there are no cycles in the graph. When there are cycles, the DFS will find them. For more information about topological sorting, as well as a description of an DFS-based topological sorting algorithm, see https://en.wikipedia.org/wiki/Topological_sorting. #### Current situation There is a DFS-based topological sorting implemented in the `CommitOrderCalculator`. This implementation has two kinks: 1. It does not detect cycles When there is a cycle in the DAG that cannot be resolved, we need to know about it. Ultimately, this means we will not be able to insert entities into the database in any order that allows all foreign keys constraints to be satisfied. If you look at `CommitOrderCalculator`, you'll see that there is no code dealing with this situation. 2. It has an obscure concept of edge "weights" To me, it is not clear how those are supposed to work. The weights are related to whether a foreign key is nullable or not, but can (could) be arbitrary integers. An edge will be ignored if it has a higher (lower) weight than another, already processed edge... 🤷🏻? #### Suggested change In fact, when inserting entities into the database, we have two kinds of foreign key relationships: Those that are `nullable`, and those that are not. Non-nullable foreign keys are hard requirements: Referred-to entities must be inserted first, no matter what. These are "non-optional" edges in the dependency graph. Nullable foreign keys can be set to `NULL` when first inserting an entity, and then coming back and updating the foreign key value after the referred-to (related) entity has been inserted into the database. This is already implemented in `\Doctrine\ORM\UnitOfWork::scheduleExtraUpdate`, at the expense of performing one extra `UPDATE` query after all the `INSERT`s have been processed. These edges are "optional". When finding a cycle that consists of non-optional edges only, treat it as a failure. We won't be able to insert entities with a circular dependency when all foreign keys are non-NULLable. When a cycle contains at least one optional edge, we can use it to break the cycle: Use backtracking to go back to the point before traversing the last _optional_ edge. This omits the edge from the topological sort order, but will cost one extra UPDATE later on. To make the transition easier, the new implementation is provided as a separate class, which is marked as `@internal`. #### Outlook Backtracking to the last optional edge is the simplest solution for now. In general, it might be better to find _another_ (optional) edge that would also break the cycle, if that edge is also part of _other_ cycles. Remember, every optional edge skipped costs an extra UPDATE query later on. The underlying problem is known as the "minimum feedback arc set" problem, and is not easily/efficiently solvable. Thus, for the time being, picking the nearest optional edge seems to be reasonable.
This reverts commit 6662195.
…10566) When computing the commit order for entity removals, we have to look out for `@ORM\JoinColumn(onDelete="SET NULL")` to find places where cyclic associations can be broken. #### Background The UoW computes a "commit order" to find the sequence in which tables shall be processed when inserting entities into the database or performing delete operations. For the insert case, the ORM is able to schedule _extra updates_ that will be performed after all entities have been inserted. Associations which are configured as `@ORM\JoinColumn(nullable=true, ...)` can be left as `NULL` in the database when performing the initial `INSERT` statements, and will be updated once all new entities have been written to the database. This can be used to break cyclic associations between entity instances. For removals, the ORM does not currently implement up-front `UPDATE` statements to `NULL` out associations before `DELETE` statements are executed. That means when associations form a cycle, users have to configure `@ORM\JoinColumn(onDelete="SET NULL", ...)` on one of the associations involved. This transfers responsibility to the DBMS to break the cycle at that place. _But_, we still have to perform the delete statements in an order that makes this happen early enough. This may be a _different_ order than the one required for the insert case. We can find it _only_ by looking at the `onDelete` behaviour. We must ignore the `nullable` property, which is irrelevant, since we do not even try to `NULL` anything. #### Example Assume three entity classes `A`, `B`, `C`. There are unidirectional one-to-one associations `A -> B`, `B -> C`, `C -> A`. All those associations are `nullable= true`. Three entities `$a`, `$b`, `$c` are created from these respective classes and associations are set up. All operations `cascade` at the ORM level. So we can test what happens when we start the operations at the three individual entities, but in the end, they will always involve all three of them. _Any_ insert order will work, so the improvements necessary to solve #10531 or #10532 are not needed here. Since all associations are between different tables, the current table-level computation is good enough. For the removal case, only the `A -> B` association has `onDelete="SET NULL"`. So, the only possible execution order is `$b`, `$c`, `$a`. We have to find that regardless of where we start the cascade operation. The DBMS will set the `A -> B` association on `$a` to `NULL` when we remove `$b`. We can then remove `$c` since it is no longer being referred to, then `$a`. #### Related cases These cases ask for the ORM to perform the extra update before the delete by itself, without DBMS-level support: * #5665 * #10548
This is the third step to break #10547 into smaller PRs suitable for reviewing. It uses the new topological sort implementation from #10592 and the refactoring from #10651 to compute the UoW's commit order for entity insertions not on the entity class level, but for single entities and their actual dependencies instead. #### Current situation `UnitOfWork::getCommitOrder()` would compute the entity sequence on the class level with the following code: https://github.com/doctrine/orm/blob/70477d81e96c0044ad6fd8c13c37b2270d082792/lib/Doctrine/ORM/UnitOfWork.php#L1310-L1325 #### Suggested change * Instead of considering the classes of all entities that need to be inserted, updated or deleted, consider the new (inserted) entities only. We only need to find a sequence in situations where there are foreign key relationships between two _new_ entities. * In the dependency graph, add edges for all to-one association target entities. * Make edges "optional" when the association is nullable. #### Test changes I have not tried to fully understand the few changes necessary to fix the tests. My guess is that those are edge cases where the insert order changed and we need to consider this during clean-up. Keep in mind that many of the functional tests we have assume that entities have IDs assigned in the order that they were added to the EntityManager. That does not change – so the order of entities is generally stable, equal to the previous implementation.
Add a test case to show #10348 has been fixed
Co-authored-by: Grégoire Paris <postmaster@greg0ire.fr>
Add test to show #7006 has been fixed
Tests suggested in #7180 (comment) and #7180 (comment) by @arnaud-lb. Co-authored-by: Arnaud Le Blanc <arnaud.lb@gmail.com>
Add tests to show #6499 has been fixed
Add tests for entity insertion and deletion that require writes to different tables in an interleaved fashion, and that have to re-visit a particular table. #### Background In #10531, I've given an example where it is necessary to compute the commit order on the entity (instead of table) level. Taking a closer look at the UoW to see how this could be achieved, I noticed that the current, table-level commit order manifests itself also in the API between the UoW and `EntityPersister`s. #### Current situation The UoW computes the commit order on the table level. All entity insertions for a particular table are passed through `EntityPersister::addInsert()` and finally written through `EntityPersister::executeInserts()`. #### Suggested change The test in this PR contains a carefully constructed set of four entities. Two of them are of the same class (are written to the same table), but require other entities to be processed first. In order to be able to insert this set of entities, the ORM must be able to perform inserts for a given table repeatedly, interleaved with writing other entities to their respective tables.
Add test to show #9192 has been fixed
Add test to show #7180 has been fixed
This refactoring does two things: * We can avoid collecting the post insert IDs in a cumbersome array structure that will be returned by the EntityPersisters and processed by the UoW right after. Instead, use a more expressive API: Make the EntityPersisters tell the UoW about the IDs immediately. * IDs will be available in inserted entities a tad sooner. That may help to resolve #10735, where we can use the IDs to skip extra updates.
Add tests for entity insertion and deletion that require the commit order calculation to happen on the entity level. This demonstrates the necessity for the changes in #10547. This PR contains two tests with carefully constructed entity relationships, where we have a non-nullable `parent` foreign key relationships between entities stored in the same table. Class diagram: ```mermaid classDiagram direction LR class A class B A --> B : parent B --|> A ``` Object graph: ```mermaid graph LR; b1 --> b2; b2 --> a; b3 --> b2; ``` #### Situation before #10547 The commit order is computed by looking at the associations at the _table_ (= _class_) level. Once the ordering of _tables_ has been found, entities being mapped to the same table will be processed in the order they were given to `persist()` or `remove()`. That means only a particular ordering of `persist()` or `remove()` calls (see comment in the test) works: For inserts, the order must be `$a, $b2, $b1, $b3` (or `... $b3, $b1`), for deletions `$b1, $b3, $b2, $a`. #### Situation with entity-level commit order computation (as in #10547) The ORM computes the commit order by considering associations at the _entity_ level. It will be able to find a working order by itself.
…ing merge() This PR tries to improve the situation/problem explained in #3037: Under certain conditions – there may be multiple and not all are known/well-understood – we may get inconsistencies between the `\Doctrine\ORM\UnitOfWork::$entityIdentifiers` and `\Doctrine\ORM\UnitOfWork::$identityMap` arrays. Since the `::$identityMap` is a plain array holding object references, objects contained in it cannot be garbage-collected. `::$entityIdentifiers`, however, is indexed by `spl_object_id` values. When those objects are destructed and/or garbage-collected, the OID may be reused and reassigned to other objects later on. When the OID re-assignment happens to be for another entity, the UoW may assume incorrect entity states and, for example, miss INSERT or UPDATE operations. One cause for such inconsistencies is _replacing_ identity map entries with other object instances: This makes it possible that the old object becomes GC'd, while its OID is not cleaned up. Since that is not a use case we need to support (IMHO), #10785 is about adding a safeguard against it. In this test shown here, the `merge()` operation is currently too eager in creating a proxy object for another referred-to entity. This proxy represents an entity already present in the identity map at that time, potentially leading to this problem later on.
Entity level commit order
Co-authored-by: Grégoire Paris <postmaster@greg0ire.fr>
Resort on Query::HINT_FORCE_PARTIAL_LOAD less
…tance Support not Insertable/Updateable columns for entities with `JOINED` inheritance type
This was never meant to be under version control. Did not spot it in the diff. Closes #10836
* 2.15.x: PHPStan 1.10.25, Psalm 5.13.1 (#10842)
We already have the sidebar for this.
It confuses the guides, and is ugly.
Fix/Self deprecation with getQueryCacheImpl
* Introduce FilterCollection#restore method * Add suspend method instead * Add more tests
* Introduce FilterCollection#restore method * Add suspend method instead * Add more tests
This reverts commit 8e20e15.
* 2.15.x: Fix static analysis Other solution Avoid self deprecation fix: use platform options instead of deprecated custom options (#10855)
…cation-provided IDs This excludes such associations from the commit order computation, since the foreign key constraint will be satisfied when inserting the row. See #10735 for more details about this edge case.
Compute the commit order (inserts/deletes) on the entity level
Revert "Allow symfony/console 7"
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Release Notes for 2.16.0
2.16.0
Improvement
10848: Use properties instead of getters to read property/class names via reflection thanks to @nicolas-grekas
10832: Decouple public API from Doctrine\Persistence\Proxy thanks to @nicolas-grekas
10785: Add a safeguard against multiple objects competing for the same identity map entry thanks to @mpdude
10735: Make it possible to have non-NULLable self-referencing associations when using application-provided IDs thanks to @mpdude
10727: Revert "Allow symfony/console 7" thanks to @derrabus
10724: Temporarily allow
symfony/console
7 thanks to @derrabus10704: Remove unreachable piece of code thanks to @greg0ire
10547: Compute the commit order (inserts/deletes) on the entity level thanks to @mpdude
10833: Cleanup psalm-type AutogenerateMode thanks to @nicolas-grekas
10598: Support not Insertable/Updateable columns for entities with
JOINED
inheritance type thanks to @e-vil-devTest Suite
Deprecation
Static Analysis
AbstractQuery::getSingleScalarResult()
thanks to @phansysNew Feature
Deprecation,Improvement