Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduced and fixed CachedDoubler behaviour in order to reduce memor… #420

Merged
merged 2 commits into from
Dec 17, 2019

Conversation

DonCallisto
Copy link
Contributor

@DonCallisto DonCallisto commented Nov 3, 2018

…y usage

With this PR we resolve #96 and phpspec/phpspec#884.

Our empirical tests:

~ 6000 PHPSpec examples

Before patch:

  • 1GB 900MB memory consumption
  • 31676 declared classes
  • 105 seconds per suite execution

After patch:

  • 146MB (-1.301 %)
  • 3148 declared classes (-1.006 %)
  • 9 seconds per suite execution (- 1.166 %)

Only thing that should be considered here is the static-ness nature of CachedDoubler.
Is this fine or is it improvable?

@DonCallisto
Copy link
Contributor Author

DonCallisto commented Nov 3, 2018

🎉 tests are now green.
WDYT?

@DonCallisto
Copy link
Contributor Author

@everzet @stof @ciaranmcnulty have you had the change to check this?
This is not only a performace improvement but also resolve a bug where we check objects with == as, at the moment, two different classes will always return false (as the double will be different) and so anyone would be able to write a use case for this in test.

Copy link
Member

@stof stof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would need some functional tests proving that 2 different doubles don't get considered as equal when matching arguments and so on. I'm not confident at all that things keep working currently.


parent::registerClassPatch($patch);
}
private static $classes = array();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it really be static ? Registering different class patches in a different instance must not reusable the old class definition (and that's why registerClassPatch was resetting it)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say that the PHP engine already has a set of registered classes: if we can do a lookup via hash name (class_exists($hash, false)) that should be enough

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ocramius I can modify this but this PR seems to be "sleepy" and no one seems to care to get rid of that memory leak.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attention has been drawn here from Roave/no-leaks#14 ;-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I read the thread.

I mean, if neither @stof nor @ciaranmcnulty provide their feeback, following with a possible merge, this PR will stuck here forever.

I know that everybody is busy and this is OSS, but having those kind of feedback (like yours that is only an enhancement because I feel it doesn't really change the sense of this PR) is crucial to move to merging direction.

I can try in the following hours to make your fix but @stof was worried about other kind of issues (read below) not related to staticness of class hashes.

@DonCallisto
Copy link
Contributor Author

This would need some functional tests proving that 2 different doubles don't get considered as equal when matching arguments and so on

@stof instances are different whereas class definitios are the same. I've tested this patch against our 6000 test suite and all runs perfectly. I can - for sure - try those kind of tests when I have free time in these upcoming weeks.

should it really be static ? Registering different class patches in a different instance must not reusable the old class definition (and that's why registerClassPatch was resetting it)

That's a doubt I had. So you can confirm that, in some situation, same class could be "patched" in a different manner?
One solution could be to include applyed patches (don't know if this breaks incapsulation, I need to take a depper look) in class hash.
Other solution is to try avoid the staticness nature of the CachedDoubler.

WDYT?

@stof
Copy link
Member

stof commented Nov 13, 2018

@DonCallisto that fact that we allow registering patchers in the Doubler instance means that 2 instances might have different patchers.
Including a hash of the registered patchers might indeed be a solution.

@DonCallisto
Copy link
Contributor Author

DonCallisto commented Nov 13, 2018

@stof just some basic considerations here and there before work on this solution.

To keep a simple design and in order to prevent iteraring two times on patches, creating two nodes etc, I would like to return frorm createDoubleClass something called DoubleClassCreatedDTO(string $fqcn, array $appliedPatchesFQCNs) that I can reuse in the CachedDoubler. This way the Doubler does not require some tricky and dirty hacks (like keep in memory something that helps to retrieve applied patches; this seems more a job for the CachedDoubler, isn't it?).
This is somehow necessary, at least from my POV, in order to avoid re-creating the node two times in the same process and keep all actors well separated.
I haven't any other valid idea at the moment, do you?

One final note: how would you test this in a functional fashion? I've broke my head thinking about this but I'm currently to a dead end.
IIUC you would test, basically, ExactValueToken and IdenticalToken in a functional way (other arguments seems not to be "impacted" from this patch).
What I miss here is how can I actually test them.
Maybe we should use PHPUnit (that is already there) and use “real” objects, no mocks and test Arguments results? Is this what you mean?

If you give me a feedback on the first consideration and a pointer for the latter, I will be happy to work on this.

@DonCallisto
Copy link
Contributor Author

Moreover, thinking about a functional implementation, I’m still not sure we need that kind of test.

I mean, this patch will change only the way class definitions are created, not the instances.
As arguments works on instances, I’m pretty sure those tests have no value at all.

Am I missing something?

@DonCallisto
Copy link
Contributor Author

DonCallisto commented Nov 14, 2018

Maybe another option is to create a CachedClassMirror based on ClassMirror and construct CachedDoubler with this particular type.
In this way it could be safe to "demand" for "supported patches" from CachedDoubler without re-creating the node two times.
Only "issue" I see here is that this would require an override of createDoubleClass in CachedDoubler or to return a "cloned" Node (but that is what I'm trying to avoid) as patches in Doubler will be re-applied.
In both case I don't see any advantage and I would stay with my first idea (reported below from here)

I would like to return frorm createDoubleClass something called DoubleClassCreatedDTO(string $fqcn, array $appliedPatchesFQCNs) that I can reuse in the CachedDoubler.

WDYT?

@stof
Copy link
Member

stof commented Nov 14, 2018

@DonCallisto what is important is not the list of applied patches (which could not be determined to build the cache key, as you would still have to create the class and then discard it), but the list of registered patchers.
It is possible that 2 instances with different patchers would generate the same code for a double (because the extra patcher was not doing anything for that class), but I think that's fine. It is better to have cache misses sometimes for such case (which is not the common case anyway, as most projects use the same patchers for their whole testsuite) rather than having wrong cache hits (as that's a bug).

@DonCallisto
Copy link
Contributor Author

@stof whereas the idea behind applied/registered patch is not clear to me

as you would still have to create the class and then discard it

(that's not true as we create the class after it is patched)

and

It is possible that 2 instances with different patchers would generate the same code for a double (because the extra patcher was not doing anything for that class)

here there's still not instances, only definitions

I think that's fine to build with registered patches anyway because same class, with same interface and same registered patchers in the doubler will produce the same code. That means that there's no way for the same class to have applied one patch in a situation and not to be applied in other (unless the patch isn't registered, of course).

I've implemented this fix in a separate commit. I'm still writing some spec tests (as stated before, I don't think we need functional tests for this as we producing class definitions, not instances) to cover all the possibilities (permutation of same/different class, same/different/no interfaces, same/different/no patches).

Do it LGTY?

@stof
Copy link
Member

stof commented Nov 16, 2018

(that's not true as we create the class after it is patched)

Having to call the parent logic to create the class was what I was talking about.
The updated versions of the PR is precisely doing what I was suggesting: talking all registered patchers into account for the hash.

It is possible that 2 instances with different patchers would generate the same code for a double (because the extra patcher was not doing anything for that class)
here there's still not instances, only definitions

this sentence is talking about instances of the doubler, not instance of the double.

I've implemented this fix in a separate commit. I'm still writing some spec tests (as stated before, I don't think we need functional tests for this as we producing class definitions, not instances) to cover all the possibilities (permutation of same/different class, same/different/no interfaces, same/different/no patches).

Last time I tried enabling the CachedDoubler in my tests, the argument matching (when using a double as argument in a method of another double) broke for some cases, because 2 doubles of the same class were now considered the same for argument matching (without the cached doubler, they were instances of different classes, which is why it worked).
My concern here is not that this PR produces a bad double class. My concern is that it breaks compatibility with some other parts of Prophecy.

@DonCallisto
Copy link
Contributor Author

@stof

because 2 doubles of the same class were now considered the same for argument matching

Yes, I understand your point but I don't think that the following statement is true

because 2 doubles of the same class were now considered the same for argument matching

because instances are different so from my POV.

Yes, it could break some tests because == was never evaluated to true but that was a bug and will prevent users to write tests for such use case.

Could you try it against the code you broken when tried out CachedDoubler?
Moreover I don't think that internally we break anything from Prophecy POV. If you perceive we did, please point me out parts of Prophecy code so I can think more deeply about this possible issue and eventually try to find a solution.

Thanks in advance for your review and your time.

@DonCallisto
Copy link
Contributor Author

Any update? I would be happy to try and see what Stof claims but I really can't understand how to make (and on what) the functional test :\

@DonCallisto
Copy link
Contributor Author

Any news? I'm willing to continue my work but without any clue I just can't write litterally any line of those tests that stof mentioned above

@DonCallisto
Copy link
Contributor Author

DonCallisto commented Jan 4, 2019

@stof maybe the tests you’re waiting for are something like this repeated for all interesting matchers on doubles?

All the credits goes to my colleague @bsod85

Please let us know if this is acceptable (there is something that need to be fixed, we know, but the idea behind it should be clear) or if is not what you expect to be tested.

Thanks!

@DonCallisto
Copy link
Contributor Author

@stof Sorry to disturb you, could you give me any update on this. I know that probably you are busy but I would like only to have a feedback, even if this is "I can't take a look at this for a while".
Moreover I would like other contributors like @everzet or @ciaranmcnulty to take a look at this and the whole conversation and give any opinion if possibile.

Thank you in advance guys.

@romm
Copy link

romm commented Feb 7, 2019

Hi @DonCallisto I just tested your patch and could see a great performance improvement indeed, for a suite of 221 tests with 591 assertions.

Without the patch:

Time: 3.59 seconds, Memory: 32.00MB

With the patch on:

Time: 1.17 seconds, Memory: 22.00MB

However, a test of mine is broken with the patch applied:

/** @test */
public function fetch_service_returns_service_with_highest_priority()
{
    /* *** Initialization *** */
    $source = [];
    $targetType = $this->prophesize(Type::class)->reveal();

    $serviceA = $this->prophesize(Service::class);
    $serviceA->priority()
        ->shouldBeCalled()
        ->willReturn(100);

    $serviceB = $this->prophesize(Service::class);
    $serviceB->priority()
        ->shouldBeCalled()
        ->willReturn(200);

    /* *** Process *** */
    $this->registry->register(get_class($serviceA->reveal()));
    $this->registry->register(get_class($serviceB->reveal()));

    $service = $this->registry->fetchFor($source, $targetType);

    /* *** Assertion *** */
    $this->assertInstanceOf(get_class($serviceB->reveal()), $service);
}

Maybe I shouldn't have done the assertion based on the class of the double, but as far as I can see your patch doesn't allow testing cases like this one, maybe am I missing something?

@DonCallisto
Copy link
Contributor Author

@romm What does get_class returns in this example? Please provide a var dump that helps me to understand better.
Thanks 🙂

@romm
Copy link

romm commented Feb 8, 2019

@DonCallisto adding this dumps:

var_dump(get_class($serviceA->reveal()));
var_dump(get_class($serviceB->reveal()));

I get the following values:

Without your patch

string(17) "Double\Service\P4"
string(17) "Double\Service\P5"

With your patch

string(17) "Double\Service\P2"
string(17) "Double\Service\P2"

After some more analysis, the services are instantiated with the given class and that's probably not the best way to do it: an instance of the object should be given instead.

So I think I'm going to change the implementation anyway, but you better know what was going on here.

@DonCallisto
Copy link
Contributor Author

@romm Yes, it is was I suspected without seeing you implementation. That's a case where Prophecy didn't catch a possible problem with your implementation due to different class names. This patch will correct this scenario also.

@DonCallisto
Copy link
Contributor Author

@ciaranmcnulty could you please take a look at this? We're basically stuck but I feel like this PR could be merged with right PHPUnit tests done (see #420 (comment))

@DonCallisto
Copy link
Contributor Author

@everzet could you take a look at this? It's four months since this PR is open and I feel like we're there or at least very close to mergeable code.

@romm
Copy link

romm commented May 3, 2019

Hello, any news concerning this PR? Could we help to get it merged somehow? :)

@DonCallisto
Copy link
Contributor Author

@romm it seems not

@DonCallisto
Copy link
Contributor Author

@ciaranmcnulty as I see you're quite active these days, could you try to take a look a this and thread?
It would be great.
Thanks.

@ciaranmcnulty
Copy link
Member

@stof do you have some time to continue the conversation here? I am not clear what the issues you observed were

@DonCallisto
Copy link
Contributor Author

DonCallisto commented Oct 3, 2019

Just to provide more context (as one year has passed), an interesting question never answered was this.
From what I understood, @stof was worried about such things.
Don't know if what we're testing there is "the right thing" to clear any doubt and, if so, if those kind of tests should be included in this PR.
Still willing to work on this if we clear out what points needed to be tested and what kind of tests should I provide.

PS.: I'm using since a year a fork of mine with modification I've provided in this PR. Old examples (as pointed out in first comment) ~ 6000 didn't broke and new examples (can't quantify them) worked as expected.

@ciaranmcnulty
Copy link
Member

OK I'm going to merge this, we can always back it out if users find issues

We aren't maintaining actively enough to get into long 'what if' conversations if they're not resolved

@ciaranmcnulty ciaranmcnulty merged commit 6315aad into phpspec:master Dec 17, 2019
@DonCallisto
Copy link
Contributor Author

Thanks @ciaranmcnulty!
Could you please release a tagged version with this merge?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

High memory usage in phpspec because of repeated double definitions
5 participants