Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add test: Entity insertions must not happen table-wise #10532

Conversation

mpdude
Copy link
Contributor

@mpdude mpdude commented Feb 21, 2023

Add tests for entity insertion and deletion that require writes to different tables in an interleaved fashion, and that have to re-visit a particular table. This is part of the larger story in #10547.

Background

Before #10547, the UoW computes the commit order on the table level. All entity insertions for a particular table are passed through EntityPersister::addInsert() and finally written through EntityPersister::executeInserts(). This is also noted here:

* As can be seen from the above list, insertions are batched and executed all at once
* for increased efficiency.

Suggested change

The test in this PR contains a carefully constructed set of four entities. Two of them are of the same class (are written to the same table), but require other entities to be processed first.

In order to be able to insert this set of entities, the ORM must be able to perform inserts for a given table repeatedly, interleaved with writing other entities to their respective tables. The UnitOfWork can no longer batch inserts per table, but will have to call EntityPersister::addInsert() and EntityPersister::executeInserts() multiple times in succession – this is implemented in #10547.

@mpdude mpdude force-pushed the entity-persister-must-not-batch branch from 734b898 to 7a9c95c Compare February 21, 2023 22:41
@mpdude mpdude marked this pull request as ready for review February 22, 2023 06:06
mpdude added a commit to mpdude/doctrine2 that referenced this pull request Apr 24, 2023
This is the second chunk to break doctrine#10547 into smaller PRs suitable for reviewing. It prepares the `UnitOfWork` to work with a commit order computed on the entity level, as opposed to a class-based ordering as before.

#### Background

doctrine#10531 and doctrine#10532 show that it is not always possible to run `UnitOfWork::commit()` with a commit order given in terms of  entity _classes_. Instead it is necessary to process entities in an order computed on the _object_ level. That may include

* a particular order for multiple entities of the _same_ class
* a particular order for entities of _different_ classes, possibly even going back and forth (entity `$a1` of class `A`, then `$b` of class `B`, then `$a2` of class `A` – revisiting that class).

This PR is about preparing the `UnitOfWork` so that its methods will be able to perform inserts and deletions on that level of granularity. Subsequent PRs will deal with implementing that particular order computation.

#### Suggested change

Change the private `executeInserts` and `executeDeletions` methods so that they do not take a `ClassMetadata` identifying the class of entities that shall be processed, but pass them the list of entities directly.

The lists of entities are built in two dedicated methods. That happens basically as previously, by iterating over `$this->entityInsertions` or `$this->entityDeletions` and filtering those by class.

#### Potential (BC breaking?) changes that need review scrutiny

* `\Doctrine\ORM\Persisters\Entity\EntityPersister::addInsert()` was previously called multiple times, before the insert would be performed by a call to `\Doctrine\ORM\Persisters\Entity\EntityPersister::executeInserts()`. With the changes here, this batching effectively no longer takes place. `executeInserts()` will always see one entity/insert only, and there may be multiple `executeInserts()` calls during a single `UoW::commit()` phase.
* The caching in `\Doctrine\ORM\Cache\Persister\Entity\AbstractEntityPersister` previously would cache entities from the last executed insert batch only. Now it will collect entities across multiple batches. I don't know if that poses a problem.
* Overhead in `\Doctrine\ORM\Persisters\Entity\BasicEntityPersister::executeInserts` is incurred multiple times; that may, however, only be about SQL statement preparation and might be salvageable.
* The `postPersist` event previously was not dispatched before _all_ instances of a particular entity class had been written to the database. Now, it will be dispatched immediately after every single entity that has been inserted.
mpdude added a commit to mpdude/doctrine2 that referenced this pull request May 8, 2023
When computing the commit order for entity removals, we have to look out for `@ORM\JoinColumn(onDelete="SET NULL")` to find places where cyclic associations can be broken.

#### Background

The UoW computes a "commit order" to find the sequence in which tables shall be processed when inserting entities into the database or performing delete operations.

For the insert case, the ORM is able to schedule _extra updates_ that will be performed after all entities have been inserted. Associations which are configured as `@ORM\JoinColumn(nullable=true, ...)` can be left as `NULL` in the database when performing the initial `INSERT` statements, and will be updated once all new entities have been written to the database. This can be used to break cyclic associations between entity instances.

For removals, the ORM does not currently implement up-front `UPDATE` statements to `NULL` out associations before `DELETE` statements are executed. That means when associations form a cycle, users have to configure `@ORM\JoinColumn(onDelete="SET NULL", ...)` on one of the associations involved. This transfers responsibility to the DBMS to break the cycle at that place. 

_But_, we still have to perform the delete statements in an order that makes this happen early enough. This may be a _different_ order than the one required for the insert case. We can find it _only_ by looking at the `onDelete` behaviour. We must ignore the `nullable` property, which is irrelevant, since we do not even try to `NULL` anything.

#### Example

Assume three entity classes `A`, `B`, `C`. There are unidirectional one-to-one associations `A -> B`, `B -> C`, `C -> A`. All those associations are `nullable= true`. 

Three entities `$a`, `$b`, `$c` are created from these respective classes and associations are set up.

All operations `cascade` at the ORM level. So we can test what happens when we start the operations at the three individual entities, but in the end, they will always involve all three of them.

_Any_ insert order will work, so the improvements necessary to solve doctrine#10531 or doctrine#10532 are not needed here. Since all associations are between different tables, the current table-level computation is good enough.

For the removal case, only the `A -> B` association has `onDelete="SET NULL"`. So, the only possible execution order is `$b`, `$c`, `$a`. We have to find that regardless of where we start the cascade operation.

The DBMS will set the `A -> B` association on `$a` to `NULL` when we remove `$b`. We can then remove `$c` since it is no longer being referred to, then `$a`.

#### Related cases

These cases ask for the ORM to perform the extra update before the delete by itself, without DBMS-level support: 
* doctrine#5665
* doctrine#10548
mpdude added a commit to mpdude/doctrine2 that referenced this pull request May 8, 2023
When computing the commit order for entity removals, we have to look out for `@ORM\JoinColumn(onDelete="SET NULL")` to find places where cyclic associations can be broken.

#### Background

The UoW computes a "commit order" to find the sequence in which tables shall be processed when inserting entities into the database or performing delete operations.

For the insert case, the ORM is able to schedule _extra updates_ that will be performed after all entities have been inserted. Associations which are configured as `@ORM\JoinColumn(nullable=true, ...)` can be left as `NULL` in the database when performing the initial `INSERT` statements, and will be updated once all new entities have been written to the database. This can be used to break cyclic associations between entity instances.

For removals, the ORM does not currently implement up-front `UPDATE` statements to `NULL` out associations before `DELETE` statements are executed. That means when associations form a cycle, users have to configure `@ORM\JoinColumn(onDelete="SET NULL", ...)` on one of the associations involved. This transfers responsibility to the DBMS to break the cycle at that place. 

_But_, we still have to perform the delete statements in an order that makes this happen early enough. This may be a _different_ order than the one required for the insert case. We can find it _only_ by looking at the `onDelete` behaviour. We must ignore the `nullable` property, which is irrelevant, since we do not even try to `NULL` anything.

#### Example

Assume three entity classes `A`, `B`, `C`. There are unidirectional one-to-one associations `A -> B`, `B -> C`, `C -> A`. All those associations are `nullable= true`. 

Three entities `$a`, `$b`, `$c` are created from these respective classes and associations are set up.

All operations `cascade` at the ORM level. So we can test what happens when we start the operations at the three individual entities, but in the end, they will always involve all three of them.

_Any_ insert order will work, so the improvements necessary to solve doctrine#10531 or doctrine#10532 are not needed here. Since all associations are between different tables, the current table-level computation is good enough.

For the removal case, only the `A -> B` association has `onDelete="SET NULL"`. So, the only possible execution order is `$b`, `$c`, `$a`. We have to find that regardless of where we start the cascade operation.

The DBMS will set the `A -> B` association on `$a` to `NULL` when we remove `$b`. We can then remove `$c` since it is no longer being referred to, then `$a`.

#### Related cases

These cases ask for the ORM to perform the extra update before the delete by itself, without DBMS-level support: 
* doctrine#5665
* doctrine#10548
mpdude added a commit to mpdude/doctrine2 that referenced this pull request May 8, 2023
When computing the commit order for entity removals, we have to look out for `@ORM\JoinColumn(onDelete="SET NULL")` to find places where cyclic associations can be broken.

#### Background

The UoW computes a "commit order" to find the sequence in which tables shall be processed when inserting entities into the database or performing delete operations.

For the insert case, the ORM is able to schedule _extra updates_ that will be performed after all entities have been inserted. Associations which are configured as `@ORM\JoinColumn(nullable=true, ...)` can be left as `NULL` in the database when performing the initial `INSERT` statements, and will be updated once all new entities have been written to the database. This can be used to break cyclic associations between entity instances.

For removals, the ORM does not currently implement up-front `UPDATE` statements to `NULL` out associations before `DELETE` statements are executed. That means when associations form a cycle, users have to configure `@ORM\JoinColumn(onDelete="SET NULL", ...)` on one of the associations involved. This transfers responsibility to the DBMS to break the cycle at that place. 

_But_, we still have to perform the delete statements in an order that makes this happen early enough. This may be a _different_ order than the one required for the insert case. We can find it _only_ by looking at the `onDelete` behaviour. We must ignore the `nullable` property, which is irrelevant, since we do not even try to `NULL` anything.

#### Example

Assume three entity classes `A`, `B`, `C`. There are unidirectional one-to-one associations `A -> B`, `B -> C`, `C -> A`. All those associations are `nullable= true`. 

Three entities `$a`, `$b`, `$c` are created from these respective classes and associations are set up.

All operations `cascade` at the ORM level. So we can test what happens when we start the operations at the three individual entities, but in the end, they will always involve all three of them.

_Any_ insert order will work, so the improvements necessary to solve doctrine#10531 or doctrine#10532 are not needed here. Since all associations are between different tables, the current table-level computation is good enough.

For the removal case, only the `A -> B` association has `onDelete="SET NULL"`. So, the only possible execution order is `$b`, `$c`, `$a`. We have to find that regardless of where we start the cascade operation.

The DBMS will set the `A -> B` association on `$a` to `NULL` when we remove `$b`. We can then remove `$c` since it is no longer being referred to, then `$a`.

#### Related cases

These cases ask for the ORM to perform the extra update before the delete by itself, without DBMS-level support: 
* doctrine#5665
* doctrine#10548
mpdude added a commit to mpdude/doctrine2 that referenced this pull request May 8, 2023
When computing the commit order for entity removals, we have to look out for `@ORM\JoinColumn(onDelete="SET NULL")` to find places where cyclic associations can be broken.

#### Background

The UoW computes a "commit order" to find the sequence in which tables shall be processed when inserting entities into the database or performing delete operations.

For the insert case, the ORM is able to schedule _extra updates_ that will be performed after all entities have been inserted. Associations which are configured as `@ORM\JoinColumn(nullable=true, ...)` can be left as `NULL` in the database when performing the initial `INSERT` statements, and will be updated once all new entities have been written to the database. This can be used to break cyclic associations between entity instances.

For removals, the ORM does not currently implement up-front `UPDATE` statements to `NULL` out associations before `DELETE` statements are executed. That means when associations form a cycle, users have to configure `@ORM\JoinColumn(onDelete="SET NULL", ...)` on one of the associations involved. This transfers responsibility to the DBMS to break the cycle at that place. 

_But_, we still have to perform the delete statements in an order that makes this happen early enough. This may be a _different_ order than the one required for the insert case. We can find it _only_ by looking at the `onDelete` behaviour. We must ignore the `nullable` property, which is irrelevant, since we do not even try to `NULL` anything.

#### Example

Assume three entity classes `A`, `B`, `C`. There are unidirectional one-to-one associations `A -> B`, `B -> C`, `C -> A`. All those associations are `nullable= true`. 

Three entities `$a`, `$b`, `$c` are created from these respective classes and associations are set up.

All operations `cascade` at the ORM level. So we can test what happens when we start the operations at the three individual entities, but in the end, they will always involve all three of them.

_Any_ insert order will work, so the improvements necessary to solve doctrine#10531 or doctrine#10532 are not needed here. Since all associations are between different tables, the current table-level computation is good enough.

For the removal case, only the `A -> B` association has `onDelete="SET NULL"`. So, the only possible execution order is `$b`, `$c`, `$a`. We have to find that regardless of where we start the cascade operation.

The DBMS will set the `A -> B` association on `$a` to `NULL` when we remove `$b`. We can then remove `$c` since it is no longer being referred to, then `$a`.

#### Related cases

These cases ask for the ORM to perform the extra update before the delete by itself, without DBMS-level support: 
* doctrine#5665
* doctrine#10548
mpdude added a commit to mpdude/doctrine2 that referenced this pull request May 23, 2023
When computing the commit order for entity removals, we have to look out for `@ORM\JoinColumn(onDelete="SET NULL")` to find places where cyclic associations can be broken.

 #### Background

The UoW computes a "commit order" to find the sequence in which tables shall be processed when inserting entities into the database or performing delete operations.

For the insert case, the ORM is able to schedule _extra updates_ that will be performed after all entities have been inserted. Associations which are configured as `@ORM\JoinColumn(nullable=true, ...)` can be left as `NULL` in the database when performing the initial `INSERT` statements, and will be updated once all new entities have been written to the database. This can be used to break cyclic associations between entity instances.

For removals, the ORM does not currently implement up-front `UPDATE` statements to `NULL` out associations before `DELETE` statements are executed. That means when associations form a cycle, users have to configure `@ORM\JoinColumn(onDelete="SET NULL", ...)` on one of the associations involved. This transfers responsibility to the DBMS to break the cycle at that place.

_But_, we still have to perform the delete statements in an order that makes this happen early enough. This may be a _different_ order than the one required for the insert case. We can find it _only_ by looking at the `onDelete` behaviour. We must ignore the `nullable` property, which is irrelevant, since we do not even try to `NULL` anything.

 #### Example

Assume three entity classes `A`, `B`, `C`. There are unidirectional one-to-one associations `A -> B`, `B -> C`, `C -> A`. All those associations are `nullable= true`.

Three entities `$a`, `$b`, `$c` are created from these respective classes and associations are set up.

All operations `cascade` at the ORM level. So we can test what happens when we start the operations at the three individual entities, but in the end, they will always involve all three of them.

_Any_ insert order will work, so the improvements necessary to solve doctrine#10531 or doctrine#10532 are not needed here. Since all associations are between different tables, the current table-level computation is good enough.

For the removal case, only the `A -> B` association has `onDelete="SET NULL"`. So, the only possible execution order is `$b`, `$c`, `$a`. We have to find that regardless of where we start the cascade operation.

The DBMS will set the `A -> B` association on `$a` to `NULL` when we remove `$b`. We can then remove `$c` since it is no longer being referred to, then `$a`.

 #### Related cases

These cases ask for the ORM to perform the extra update before the delete by itself, without DBMS-level support:
* doctrine#5665
* doctrine#10548
SenseException pushed a commit that referenced this pull request May 30, 2023
…10566)

When computing the commit order for entity removals, we have to look out for `@ORM\JoinColumn(onDelete="SET NULL")` to find places where cyclic associations can be broken.

 #### Background

The UoW computes a "commit order" to find the sequence in which tables shall be processed when inserting entities into the database or performing delete operations.

For the insert case, the ORM is able to schedule _extra updates_ that will be performed after all entities have been inserted. Associations which are configured as `@ORM\JoinColumn(nullable=true, ...)` can be left as `NULL` in the database when performing the initial `INSERT` statements, and will be updated once all new entities have been written to the database. This can be used to break cyclic associations between entity instances.

For removals, the ORM does not currently implement up-front `UPDATE` statements to `NULL` out associations before `DELETE` statements are executed. That means when associations form a cycle, users have to configure `@ORM\JoinColumn(onDelete="SET NULL", ...)` on one of the associations involved. This transfers responsibility to the DBMS to break the cycle at that place.

_But_, we still have to perform the delete statements in an order that makes this happen early enough. This may be a _different_ order than the one required for the insert case. We can find it _only_ by looking at the `onDelete` behaviour. We must ignore the `nullable` property, which is irrelevant, since we do not even try to `NULL` anything.

 #### Example

Assume three entity classes `A`, `B`, `C`. There are unidirectional one-to-one associations `A -> B`, `B -> C`, `C -> A`. All those associations are `nullable= true`.

Three entities `$a`, `$b`, `$c` are created from these respective classes and associations are set up.

All operations `cascade` at the ORM level. So we can test what happens when we start the operations at the three individual entities, but in the end, they will always involve all three of them.

_Any_ insert order will work, so the improvements necessary to solve #10531 or #10532 are not needed here. Since all associations are between different tables, the current table-level computation is good enough.

For the removal case, only the `A -> B` association has `onDelete="SET NULL"`. So, the only possible execution order is `$b`, `$c`, `$a`. We have to find that regardless of where we start the cascade operation.

The DBMS will set the `A -> B` association on `$a` to `NULL` when we remove `$b`. We can then remove `$c` since it is no longer being referred to, then `$a`.

 #### Related cases

These cases ask for the ORM to perform the extra update before the delete by itself, without DBMS-level support:
* #5665
* #10548
@mpdude mpdude force-pushed the entity-persister-must-not-batch branch from 7a9c95c to 51f20a9 Compare June 2, 2023 18:24
@mpdude mpdude changed the base branch from 2.14.x to entity-level-commit-order June 2, 2023 18:24
@mpdude mpdude changed the title Failing test: Entity insertions must not happen table-wise Add test: Entity insertions must not happen table-wise Jun 2, 2023
@mpdude mpdude force-pushed the entity-persister-must-not-batch branch 3 times, most recently from fe8e1cb to 5289d44 Compare June 2, 2023 21:25
@mpdude mpdude force-pushed the entity-persister-must-not-batch branch from ba0967c to f76495c Compare June 3, 2023 10:34
@mpdude
Copy link
Contributor Author

mpdude commented Jun 3, 2023

@greg0ire do you know how to avoid the not-NULL problem in PostgreSQL when inserting the row that needs an auto-increment id?

Add tests for entity insertion and deletion that require writes to different tables in an interleaved fashion, and that have to re-visit a particular table.

 #### Background

In doctrine#10531, I've given an example where it is necessary to compute the commit order on the entity (instead of table) level.

Taking a closer look at the UoW to see how this could be achieved, I noticed that the current, table-level commit order manifests itself also in the API between the UoW and `EntityPersister`s.

 #### Current situation

The UoW computes the commit order on the table level. All entity insertions for a particular table are passed through `EntityPersister::addInsert()` and finally written through `EntityPersister::executeInserts()`.

 #### Suggested change

The test in this PR contains a carefully constructed set of four entities. Two of them are of the same class (are written to the same table), but require other entities to be processed first.

In order to be able to insert this set of entities, the ORM must be able to perform inserts for a given table repeatedly, interleaved with writing other entities to their respective tables.
@mpdude mpdude force-pushed the entity-persister-must-not-batch branch from f76495c to bf2937e Compare June 3, 2023 13:07
@greg0ire
Copy link
Member

greg0ire commented Jun 3, 2023

Not sure what you are talking about. When inserting without an ID and using postgres, there's an issue, is that it? Is that why you reverted to the previous situation?

@mpdude
Copy link
Contributor Author

mpdude commented Jun 3, 2023

{
/**
* @ORM\Id
* @ORM\GeneratedValue(strategy="AUTO")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I should resume work on #8931

See the value of LEGACY_DEFAULTS_FOR_ID_GENERATION vs RECOMMENDED_STRATEGY. The only RDBMS that shouldn't be using IDENTITY is Oracle, and we are not running the test suite with it. I'm really not sure if that will fix the issue, but if you could try with IDENTITY for science, that would be great.

@mpdude
Copy link
Contributor Author

mpdude commented Jun 21, 2023

Asking @SenseException for 👀 as well 🙏🏻

@mpdude
Copy link
Contributor Author

mpdude commented Jun 27, 2023

Maybe the last PR pending for the #10547 feature 🎉 ?

@greg0ire
Copy link
Member

ping @derrabus @SenseException 🙏

@greg0ire greg0ire merged commit 44d2a83 into doctrine:entity-level-commit-order Jun 28, 2023
@greg0ire
Copy link
Member

@mpdude time to merge entity-level-commit-order?

@mpdude
Copy link
Contributor Author

mpdude commented Jun 28, 2023

I'll try to prepare that via #10547 – there are merge conflicts with 2.16 that I need to resolve.

@mpdude mpdude deleted the entity-persister-must-not-batch branch June 28, 2023 09:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants