New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spl_object_hash conflict on Merge #1465

Closed
wants to merge 8 commits into
from

Conversation

Projects
None yet
5 participants
@moroine
Contributor

moroine commented Jul 16, 2015

This is proper PR which was originally #1461

Hi,

First of all, this is just a proposal and I'm sure there is a better solution for the problem described here. As this is my first hack into doctrine source code I'm not sure consequences that might be cause by my modification (even all unit tests passes).

I have encounter a problem due to spl_object_hash. I have written a functional test in order to reveal my issue.

The problem is when I merge an entity, here $user, UOW keep data on the original entity identified by it's spl_object_hash. Then if I unset this $user the spl_object_hash is now available for new object. So I experimented in my case reuse of previous hash which cause a managed+dirty entity error.

So I see two solutions

UOW keep reference to the entity given as even the given variable is unset there is remaining reference in UOW so the spl_hash will not be released.
Do not store data about the given entity, as merge operation isn't supposed to modified given entity.
I tried to implement the second solution as the first may consume a much more memory.

I don't why the merge operation need to do this, so I encapsulated it to prevent unwanted bug :)

@doctrinebot

This comment has been minimized.

Show comment
Hide comment
@doctrinebot

doctrinebot Jul 16, 2015

Hello,

thank you for creating this pull request. I have automatically opened an issue
on our Jira Bug Tracker for you. See the issue link:

http://www.doctrine-project.org/jira/browse/DDC-3834

We use Jira to track the state of pull requests and the versions they got
included in.

Hello,

thank you for creating this pull request. I have automatically opened an issue
on our Jira Bug Tracker for you. See the issue link:

http://www.doctrine-project.org/jira/browse/DDC-3834

We use Jira to track the state of pull requests and the versions they got
included in.

@moroine

This comment has been minimized.

Show comment
Hide comment
@moroine

moroine Jul 16, 2015

Contributor

@DHager I provided a unit test that fail on version 2.5.0 but succeed with my PR. It seems that the merge operation stores the spl_object_hash of input object in originalEntityData array.

In order to be more understandable, I will explain my use case were the bug originally appears.

I realize a bulk operations for update data from external source. To do that without consuming all my memory, I use a lot of clear / merge operations. In a special case I encounter the DoctrineException "a managed+dirty entity ...", so I cannot flush my changes. After investigating it appears that I have an entity $a detached after a clear operation. In order to re-attach it, I merge it and directly affects the result to $a = $em->merge($a);. In this case the spl_object_hash of $a change from $spl1 to $spl2. So the object of value $spl1 doesn't exist anymore and it is garbage collected. Then I create a new entity which it's spl_object_hash is $spl1 (it doesn't appear always, but it's possible according to phpdocumentation). This entity cannot be persist as the $originalData[$spl1] isn't empty.

Contributor

moroine commented Jul 16, 2015

@DHager I provided a unit test that fail on version 2.5.0 but succeed with my PR. It seems that the merge operation stores the spl_object_hash of input object in originalEntityData array.

In order to be more understandable, I will explain my use case were the bug originally appears.

I realize a bulk operations for update data from external source. To do that without consuming all my memory, I use a lot of clear / merge operations. In a special case I encounter the DoctrineException "a managed+dirty entity ...", so I cannot flush my changes. After investigating it appears that I have an entity $a detached after a clear operation. In order to re-attach it, I merge it and directly affects the result to $a = $em->merge($a);. In this case the spl_object_hash of $a change from $spl1 to $spl2. So the object of value $spl1 doesn't exist anymore and it is garbage collected. Then I create a new entity which it's spl_object_hash is $spl1 (it doesn't appear always, but it's possible according to phpdocumentation). This entity cannot be persist as the $originalData[$spl1] isn't empty.

@DHager

This comment has been minimized.

Show comment
Hide comment
@DHager

DHager Jul 16, 2015

Contributor

OK, so if someone goes:

$foo = new Thing();
$bar = $entityManager->merge($foo);
unset($foo);

It looks like the "unreliable hash key" from $foo makes its way into UOW's $originalEntityData array here:

$this->originalEntityData[spl_object_hash($entity)][$name] = $managedCol;

What I don't know is whether this is intended/desirable behavior. If $originalEntityData is supposed to be keyed only by the official single managed copy (???) then it might be a simple fix of:

$this->originalEntityData[spl_object_hash($managedCopy)][$name] = $managedCol;

However, if $this->originalEntityData is supposed to be keyed by the originally-given-object(s), then I agree that storing a reference to the original input (to prevent garbage collection and reuse of the memory-address) is a possible solution.

@Ocramius I'm not very familiar with merge/detach. Does any of this ring a bell for you, or is there some other contributor we should ask?

Contributor

DHager commented Jul 16, 2015

OK, so if someone goes:

$foo = new Thing();
$bar = $entityManager->merge($foo);
unset($foo);

It looks like the "unreliable hash key" from $foo makes its way into UOW's $originalEntityData array here:

$this->originalEntityData[spl_object_hash($entity)][$name] = $managedCol;

What I don't know is whether this is intended/desirable behavior. If $originalEntityData is supposed to be keyed only by the official single managed copy (???) then it might be a simple fix of:

$this->originalEntityData[spl_object_hash($managedCopy)][$name] = $managedCol;

However, if $this->originalEntityData is supposed to be keyed by the originally-given-object(s), then I agree that storing a reference to the original input (to prevent garbage collection and reuse of the memory-address) is a possible solution.

@Ocramius I'm not very familiar with merge/detach. Does any of this ring a bell for you, or is there some other contributor we should ask?

@moroine

This comment has been minimized.

Show comment
Hide comment
@moroine

moroine Jul 28, 2015

Contributor

Hi,

I would like to know if anyone could resolve this issue ? I think this is a bug of merge operation and it could be unsefull to integrate this pull request or resolve the issue with another solution maybe as described by @DHager ?

Thank you,

Contributor

moroine commented Jul 28, 2015

Hi,

I would like to know if anyone could resolve this issue ? I think this is a bug of merge operation and it could be unsefull to integrate this pull request or resolve the issue with another solution maybe as described by @DHager ?

Thank you,

@moroine

This comment has been minimized.

Show comment
Hide comment
@moroine

moroine Nov 27, 2015

Contributor

I refactor a little my PR.

Anyone could check it ?

Thanks,

Contributor

moroine commented Nov 27, 2015

I refactor a little my PR.

Anyone could check it ?

Thanks,

moroine added some commits Nov 27, 2015

Fix Spl_object_hash conflict on merge (reverted from commit 54bebab) …
…(reverted from commit 0408ffd793471863de6c3f8c88f3a1cbbb0750db)
Merge remote-tracking branch 'doctrine/master'
# Conflicts:
#	lib/Doctrine/ORM/UnitOfWork.php
$user = new CmsUser;
$hash = spl_object_hash($user);
$mergedUser = $this->_em->merge($user);

This comment has been minimized.

@Ocramius

Ocramius Nov 28, 2015

Member

Shouldn't self::assertSame($user, $mergedUser) here?

@Ocramius

Ocramius Nov 28, 2015

Member

Shouldn't self::assertSame($user, $mergedUser) here?

This comment has been minimized.

@moroine

moroine Nov 29, 2015

Contributor

I think we should assert that $user and $mergedUser are two different object (by there reference) as merge operation should return a managed copy of input object but the input object has to remain unmanaged ?

@moroine

moroine Nov 29, 2015

Contributor

I think we should assert that $user and $mergedUser are two different object (by there reference) as merge operation should return a managed copy of input object but the input object has to remain unmanaged ?

This comment has been minimized.

@moroine

moroine Dec 22, 2015

Contributor

We couldn't assertSame as for object assertSame check if the two variables reference the same object. In the case of merge, it will return a copy so it's another object

@moroine

moroine Dec 22, 2015

Contributor

We couldn't assertSame as for object assertSame check if the two variables reference the same object. In the case of merge, it will return a copy so it's another object

$originalEntityData = $property->getValue($uow);
// As $user is unset, $hash could be reused by spl_object_hash
$this->assertArrayNotHasKey($hash, $originalEntityData);

This comment has been minimized.

@Ocramius

Ocramius Nov 28, 2015

Member

The object still exists in memory here, since it is merged into the UoW.

@Ocramius

Ocramius Nov 28, 2015

Member

The object still exists in memory here, since it is merged into the UoW.

This comment has been minimized.

@moroine

moroine Nov 29, 2015

Contributor

Of what I understood, the $user reference is never kept by the merge operation. As the bug that I uncounter is the fact my $user doesn't exists anymore and then I try to merge a new object but the uow kept old hash in their originalEntityData array.

@moroine

moroine Nov 29, 2015

Contributor

Of what I understood, the $user reference is never kept by the merge operation. As the bug that I uncounter is the fact my $user doesn't exists anymore and then I try to merge a new object but the uow kept old hash in their originalEntityData array.

Show outdated Hide outdated lib/Doctrine/ORM/UnitOfWork.php
@@ -1901,10 +1902,60 @@ private function doMerge($entity, array &$visited, $prevManagedCopy = null, $ass
$this->cascadeMerge($entity, $managedCopy, $visited);
// Restore previous state (where the given entity was previously managed)
if (!$isKnown) {

This comment has been minimized.

@Ocramius

Ocramius Nov 28, 2015

Member

Can be replaced by a simple check that verifies that $entity !== $managedCopy or such

@Ocramius

Ocramius Nov 28, 2015

Member

Can be replaced by a simple check that verifies that $entity !== $managedCopy or such

This comment has been minimized.

@moroine

moroine Nov 29, 2015

Contributor

This may should work as if the return managed entity is different from the input one we need to ensure the spl hash of the input entity is not used anymore as it could be garbage collected !

@moroine

moroine Nov 29, 2015

Contributor

This may should work as if the return managed entity is different from the input one we need to ensure the spl hash of the input entity is not used anymore as it could be garbage collected !

moroine added some commits Dec 14, 2015

Merge remote-tracking branch 'doctrine/master'
# Conflicts:
#	lib/Doctrine/ORM/UnitOfWork.php
@moroine

This comment has been minimized.

Show comment
Hide comment
@moroine

moroine Dec 22, 2015

Contributor

@Ocramius I made up the merge operation in order to resolve conflict and include changes requested.

Don't take care about all commits, I removed them (I forgot this PR use my master branch)

Contributor

moroine commented Dec 22, 2015

@Ocramius I made up the merge operation in order to resolve conflict and include changes requested.

Don't take care about all commits, I removed them (I forgot this PR use my master branch)

@moroine

This comment has been minimized.

Show comment
Hide comment
@moroine

moroine Dec 29, 2015

Contributor

@Ocramius It seems that Travis is missconfigured as the composer.json requirements is not compatible with versions 5.4 and lower of php

Contributor

moroine commented Dec 29, 2015

@Ocramius It seems that Travis is missconfigured as the composer.json requirements is not compatible with versions 5.4 and lower of php

@Ocramius

This comment has been minimized.

Show comment
Hide comment
@Ocramius

Ocramius Dec 29, 2015

Member

@moroine fixed in master: a rebase should give you a green build

Member

Ocramius commented Dec 29, 2015

@moroine fixed in master: a rebase should give you a green build

@Ocramius Ocramius added the Bug label Dec 29, 2015

@moroine

This comment has been minimized.

Show comment
Hide comment
@moroine

moroine Dec 30, 2015

Contributor

Thanks ! It works fine !

Did you seen my other related PR ? #5570

Contributor

moroine commented Dec 30, 2015

Thanks ! It works fine !

Did you seen my other related PR ? #5570

@moroine

This comment has been minimized.

Show comment
Hide comment
@moroine

moroine Feb 2, 2016

Contributor

@Ocramius Is this PR schedule to be merged or I need to do something else ?

Contributor

moroine commented Feb 2, 2016

@Ocramius Is this PR schedule to be merged or I need to do something else ?

@rdevaissiere

This comment has been minimized.

Show comment
Hide comment
@rdevaissiere

rdevaissiere Feb 20, 2016

FWIW, this fix is also working for me, my scenario is a typical data ingestion service with a lot of merging going on and it took me a while to track it down to spl_object_hash collisions. I was hitting many random ORMInvalidArgumentException exceptions:

A managed+dirty entity [...] can not be scheduled for insertion.

Hopefully we can merge this one in. Thanks for the diligent work @moroine!

FWIW, this fix is also working for me, my scenario is a typical data ingestion service with a lot of merging going on and it took me a while to track it down to spl_object_hash collisions. I was hitting many random ORMInvalidArgumentException exceptions:

A managed+dirty entity [...] can not be scheduled for insertion.

Hopefully we can merge this one in. Thanks for the diligent work @moroine!

@Ocramius Ocramius added this to the 2.5.5 milestone Sep 10, 2016

@Ocramius Ocramius self-assigned this Sep 10, 2016

@Ocramius Ocramius added the Duplicate label Sep 10, 2016

@Ocramius

This comment has been minimized.

Show comment
Hide comment
@Ocramius

Ocramius Sep 10, 2016

Member

Manually applied these tests, then checked against #5689. The issue is resolved there, therefore closing this PR as duplicate, since the approach in #5689 seems to solve the actual cause of the issue (rather than the symptom, fixed here via UnitOfWork#forgetEntity().

This is considered "fixed" in 2.5.5

Member

Ocramius commented Sep 10, 2016

Manually applied these tests, then checked against #5689. The issue is resolved there, therefore closing this PR as duplicate, since the approach in #5689 seems to solve the actual cause of the issue (rather than the symptom, fixed here via UnitOfWork#forgetEntity().

This is considered "fixed" in 2.5.5

@Ocramius Ocramius closed this Sep 10, 2016

@moroine moroine deleted the AdactiveSAS:master branch Dec 15, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment