Spl_object_hash conflict on Merge #1465

Closed
wants to merge 8 commits into
from

Projects

None yet

5 participants

@moroine
Contributor
moroine commented Jul 16, 2015

This is proper PR which was originally #1461

Hi,

First of all, this is just a proposal and I'm sure there is a better solution for the problem described here. As this is my first hack into doctrine source code I'm not sure consequences that might be cause by my modification (even all unit tests passes).

I have encounter a problem due to spl_object_hash. I have written a functional test in order to reveal my issue.

The problem is when I merge an entity, here $user, UOW keep data on the original entity identified by it's spl_object_hash. Then if I unset this $user the spl_object_hash is now available for new object. So I experimented in my case reuse of previous hash which cause a managed+dirty entity error.

So I see two solutions

UOW keep reference to the entity given as even the given variable is unset there is remaining reference in UOW so the spl_hash will not be released.
Do not store data about the given entity, as merge operation isn't supposed to modified given entity.
I tried to implement the second solution as the first may consume a much more memory.

I don't why the merge operation need to do this, so I encapsulated it to prevent unwanted bug :)

@moroine moroine Fix Spl_object_hash conflict on merge
54bebab
@doctrinebot

Hello,

thank you for creating this pull request. I have automatically opened an issue
on our Jira Bug Tracker for you. See the issue link:

http://www.doctrine-project.org/jira/browse/DDC-3834

We use Jira to track the state of pull requests and the versions they got
included in.

@moroine
Contributor
moroine commented Jul 16, 2015

@DHager I provided a unit test that fail on version 2.5.0 but succeed with my PR. It seems that the merge operation stores the spl_object_hash of input object in originalEntityData array.

In order to be more understandable, I will explain my use case were the bug originally appears.

I realize a bulk operations for update data from external source. To do that without consuming all my memory, I use a lot of clear / merge operations. In a special case I encounter the DoctrineException "a managed+dirty entity ...", so I cannot flush my changes. After investigating it appears that I have an entity $a detached after a clear operation. In order to re-attach it, I merge it and directly affects the result to $a = $em->merge($a);. In this case the spl_object_hash of $a change from $spl1 to $spl2. So the object of value $spl1 doesn't exist anymore and it is garbage collected. Then I create a new entity which it's spl_object_hash is $spl1 (it doesn't appear always, but it's possible according to phpdocumentation). This entity cannot be persist as the $originalData[$spl1] isn't empty.

@DHager
Contributor
DHager commented Jul 16, 2015

OK, so if someone goes:

$foo = new Thing();
$bar = $entityManager->merge($foo);
unset($foo);

It looks like the "unreliable hash key" from $foo makes its way into UOW's $originalEntityData array here:

$this->originalEntityData[spl_object_hash($entity)][$name] = $managedCol;

What I don't know is whether this is intended/desirable behavior. If $originalEntityData is supposed to be keyed only by the official single managed copy (???) then it might be a simple fix of:

$this->originalEntityData[spl_object_hash($managedCopy)][$name] = $managedCol;

However, if $this->originalEntityData is supposed to be keyed by the originally-given-object(s), then I agree that storing a reference to the original input (to prevent garbage collection and reuse of the memory-address) is a possible solution.

@Ocramius I'm not very familiar with merge/detach. Does any of this ring a bell for you, or is there some other contributor we should ask?

@moroine
Contributor
moroine commented Jul 28, 2015

Hi,

I would like to know if anyone could resolve this issue ? I think this is a bug of merge operation and it could be unsefull to integrate this pull request or resolve the issue with another solution maybe as described by @DHager ?

Thank you,

@moroine
Contributor
moroine commented Nov 27, 2015

I refactor a little my PR.

Anyone could check it ?

Thanks,

moroine added some commits Nov 27, 2015
@moroine moroine Fix Spl_object_hash conflict on merge (reverted from commit 54bebab) …
…(reverted from commit 0408ffd793471863de6c3f8c88f3a1cbbb0750db)
8ef5207
@moroine moroine Improve test case 3c1d2d3
@moroine moroine Merge remote-tracking branch 'doctrine/master'
# Conflicts:
#	lib/Doctrine/ORM/UnitOfWork.php
6568f22
@Ocramius Ocramius commented on the diff Nov 28, 2015
...ine/Tests/ORM/Functional/MergeSplHashConflictTest.php
+ protected function setUp()
+ {
+ $this->useModelSet('cms');
+ parent::setUp();
+ }
+
+ /**
+ * @test
+ */
+ public function testMergeSplHashConflict()
+ {
+ // Create
+ $user = new CmsUser;
+ $hash = spl_object_hash($user);
+
+ $mergedUser = $this->_em->merge($user);
@Ocramius
Ocramius Nov 28, 2015 Member

Shouldn't self::assertSame($user, $mergedUser) here?

@moroine
moroine Nov 29, 2015 Contributor

I think we should assert that $user and $mergedUser are two different object (by there reference) as merge operation should return a managed copy of input object but the input object has to remain unmanaged ?

@moroine
moroine Dec 22, 2015 Contributor

We couldn't assertSame as for object assertSame check if the two variables reference the same object. In the case of merge, it will return a copy so it's another object

@Ocramius Ocramius commented on the diff Nov 28, 2015
...ine/Tests/ORM/Functional/MergeSplHashConflictTest.php
+ {
+ // Create
+ $user = new CmsUser;
+ $hash = spl_object_hash($user);
+
+ $mergedUser = $this->_em->merge($user);
+ unset($user);
+
+ $uow = $this->_em->getUnitOfWork();
+ $reflexion = new \ReflectionClass(get_class($uow));
+ $property = $reflexion->getProperty('originalEntityData');
+ $property->setAccessible(true);
+ $originalEntityData = $property->getValue($uow);
+
+ // As $user is unset, $hash could be reused by spl_object_hash
+ $this->assertArrayNotHasKey($hash, $originalEntityData);
@Ocramius
Ocramius Nov 28, 2015 Member

The object still exists in memory here, since it is merged into the UoW.

@moroine
moroine Nov 29, 2015 Contributor

Of what I understood, the $user reference is never kept by the merge operation. As the bug that I uncounter is the fact my $user doesn't exists anymore and then I try to merge a new object but the uow kept old hash in their originalEntityData array.

@Ocramius Ocramius and 1 other commented on an outdated diff Nov 28, 2015
lib/Doctrine/ORM/UnitOfWork.php
@@ -1901,10 +1902,60 @@ private function doMerge($entity, array &$visited, $prevManagedCopy = null, $ass
$this->cascadeMerge($entity, $managedCopy, $visited);
+ // Restore previous state (where the given entity was previously managed)
+ if (!$isKnown) {
@Ocramius
Ocramius Nov 28, 2015 Member

Can be replaced by a simple check that verifies that $entity !== $managedCopy or such

@moroine
moroine Nov 29, 2015 Contributor

This may should work as if the return managed entity is different from the input one we need to ensure the spl hash of the input entity is not used anymore as it could be garbage collected !

moroine added some commits Dec 14, 2015
@moroine moroine Merge remote-tracking branch 'doctrine/master'
# Conflicts:
#	lib/Doctrine/ORM/UnitOfWork.php
f4215ce
@moroine moroine Replace is known by a simple reference test
07b4a24
@moroine
Contributor
moroine commented Dec 22, 2015

@Ocramius I made up the merge operation in order to resolve conflict and include changes requested.

Don't take care about all commits, I removed them (I forgot this PR use my master branch)

@moroine
Contributor
moroine commented Dec 29, 2015

@Ocramius It seems that Travis is missconfigured as the composer.json requirements is not compatible with versions 5.4 and lower of php

@Ocramius
Member

@moroine fixed in master: a rebase should give you a green build

@Ocramius Ocramius added the Bug label Dec 29, 2015
@moroine moroine Merge remote-tracking branch 'doctrine/master'
ce8bb13
@moroine
Contributor
moroine commented Dec 30, 2015

Thanks ! It works fine !

Did you seen my other related PR ? #5570

@moroine
Contributor
moroine commented Feb 2, 2016

@Ocramius Is this PR schedule to be merged or I need to do something else ?

@rdevaissiere

FWIW, this fix is also working for me, my scenario is a typical data ingestion service with a lot of merging going on and it took me a while to track it down to spl_object_hash collisions. I was hitting many random ORMInvalidArgumentException exceptions:

A managed+dirty entity [...] can not be scheduled for insertion.

Hopefully we can merge this one in. Thanks for the diligent work @moroine!

@Ocramius Ocramius added this to the 2.5.5 milestone Sep 10, 2016
@Ocramius Ocramius self-assigned this Sep 10, 2016
@Ocramius Ocramius added the Duplicate label Sep 10, 2016
@Ocramius
Member

Manually applied these tests, then checked against #5689. The issue is resolved there, therefore closing this PR as duplicate, since the approach in #5689 seems to solve the actual cause of the issue (rather than the symptom, fixed here via UnitOfWork#forgetEntity().

This is considered "fixed" in 2.5.5

@Ocramius Ocramius closed this Sep 10, 2016
@moroine moroine deleted the AdactiveSAS:master branch Dec 15, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment