Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generated Hydrators can avoid reflection completely #62

Closed
Ocramius opened this issue Jun 27, 2013 · 44 comments
Closed

Generated Hydrators can avoid reflection completely #62

Ocramius opened this issue Jun 27, 2013 · 44 comments
Milestone

Comments

@Ocramius
Copy link
Owner

It is possible to avoid reflection completely in PHP 5.4 - credits to @asm89 for the initial idea:

class Bar
{
    private $baz = 'tab';
}

class Foo extends Bar
{
    private $baz = 'baz';

    public function __construct()
    {
        $this->accessor = function () { return $this->baz; };
        $this->writer = function ($bar) { $this->baz = $bar; };
    }
}

$foo = new Foo();
$bar = new Bar();

$accessor = $foo->accessor;

var_dump($accessor());

$accessor = Closure::bind($foo->accessor, $bar, 'Bar');

var_dump($accessor());

$writer = Closure::bind($foo->writer, $bar, 'Bar');

$writer('taz');

var_dump($bar);

This trick can also be used to support hydration of objects that are instances of a final class

A working example can be found at http://3v4l.org/L6PhM

Relevant problems in this implementation are:

  • Keep current implementation working with PHP 5.3 (reflection)
  • Skip this entire logic when no private properties exist
  • Create a blueprint instance of the object to be hydrated in order to correcly bind the closure
  • Make sure the blueprint is instantiated in a safe way - should use unserialize and do it safely, should fallback to reflection if __wakeup or Serializable are implemented
  • Handle multiple levels of inheritance - each accessor is should work for one class' private methods.
  • There should only be one accessor per level of inheritance and array_merge could then be used to aggregate results. Eventually consider if multiple accessors is better than array_merge with a benchmark. The accessor may also return a list to avoid the call to array_merge
@mnapoli
Copy link

mnapoli commented Jun 27, 2013

Why not go even further:

$this->accessor = function ($name) { return $this->$name; };

One accessor for all properties? (like a rootkit :p)

@Slamdunk
Copy link
Contributor

This is a design bug in PHP 5.4, and so should not be used as official method to access privare properties.

Private must be private, and if you want to read it, just keep using dedicated inspection code (aka Reflection).

@mnapoli
Copy link

mnapoli commented Jun 27, 2013

How is awesomeness a bug? :trollface: (and I don't believe it's a "bug")

More seriously, why not simply use this:

class Bar
{
    private $baz = 'tab';
}

$accessor = function () { return $this->baz; };
$writer = function ($bar) { $this->baz = $bar; };

$bar = new Bar();

$accessor = Closure::bind($accessor, $bar, 'Bar');
$writer = Closure::bind($writer, $bar, 'Bar');

var_dump($accessor());

$writer('taz');

var_dump($bar);

Working example: http://3v4l.org/P5GL7

@mnapoli
Copy link

mnapoli commented Jun 27, 2013

Even better, a generic accessor that you bind to any class/instance you want:

class Bar {
    private $baz = 'tab';
}

$accessor = function ($name) { return $this->$name; };
$writer = function ($name, $value) { $this->$name = $value; };

$bar = new Bar();

$accessor = Closure::bind($accessor, $bar, $bar);
$writer = Closure::bind($writer, $bar, $bar);

var_dump($accessor('baz'));

$writer('baz', 'taz');

var_dump($bar);

Working example: http://3v4l.org/W8JT1

@mnapoli
Copy link

mnapoli commented Jun 27, 2013

I wonder if it is faster than Reflection though.

A performance test would be appropriate, but I don't have time right now :/

Still, if it is faster, it would be a good project to create on github, and also provide abstraction for 5.3 compatibility (fallback on reflection).

That would also allow to call private/protected methods.

@Ocramius
Copy link
Owner Author

@Slamdunk that's not really a design bug... Closure#bindTo() and Closure::bind() are specifically designed for that, even if the design is leaky

@Ocramius
Copy link
Owner Author

@mnapoli it's still faster than reflection I thin, but I'll be sure to check it - after all I have a performance test suite :)

@Ocramius
Copy link
Owner Author

@mnapoli re: project - you probably missed it, but this project already provides all that abstraction: https://github.com/Ocramius/ProxyManager/blob/master/docs/generated-hydrator.md

@mnapoli
Copy link

mnapoli commented Jun 27, 2013

@Ocramius Ah yes, didn't see that before.

However I don't get this: why generating a Proxy for that? And why is it in this library? It doesn't seem to me like it's related to a proxy library (or I am probably missing something).

Why not create a separate project specifically filling that need?

@Ocramius
Copy link
Owner Author

@mnapoli it's part of the project because:

  • I needed that logic for ghost objects
  • It uses the same generation logic (and internal components of ProxyManager)
  • It currently extends the real subject because of how protected properties work, so it's not a proxy but it's a wanna-be

Moving it to another lib is on my todo list, I just couldn't sync them continuously until the API isn't stable, so for now it lives in here.

@Ocramius
Copy link
Owner Author

I wrote a more simplified example that is a bit more near to what the implementation will look like:

<?php

class Bar
{
    private $baz = 'tab';
}

class Foo extends Bar
{
    public function __construct()
    {
        $this->writer = function (Bar $obj, $bar) { $obj->baz = $bar; };

        $this->writer = Closure::bind($this->writer, $this, 'Bar');
    }
}

$foo = new Foo();
$bar = new Bar();

$foo->writer->__invoke($bar, 'taz');

var_dump($bar);

@lisachenko
Copy link

I used a closure binding to implement privileged advices in PHP: https://gist.github.com/lisachenko/5650454 ) Also I created an ultra-fast hydrator: look at https://gist.github.com/lisachenko/5877138

UPD1: Working example: http://3v4l.org/iArcE

@mnapoli
Copy link

mnapoli commented Jun 27, 2013

@lisachenko Really amazing.

@Ocramius
Copy link
Owner Author

@lisachenko this one is actually a bit faster than that, the real problem is only writes, which I'll fix in the next days by implementing what described in this issue =D

Interesting approach on the advice!

@lisachenko
Copy link

@mnapoli actually, my first try was to do the following:

$hydrator = (new ReflectionFunction('get_object_vars'))->getClosure();
$data = $hydrator->bindTo($obj, get_class($ob))->__invoke($obj);

But PHP doesn't work as expected (

@mnapoli
Copy link

mnapoli commented Jun 27, 2013

@lisachenko Why don't you do:

$hydrator = function() {
    return get_object_vars($this);
}

?

You don't really need it to be an object method, and that would avoid to use reflection ($hydrator = (new ReflectionMethod(__CLASS__, 'getContext'))->getClosure($this);)

@Ocramius
Copy link
Owner Author

@lisachenko this is much better than using reflection IMO:

<?php

// example class
class Foo{
    private $bar = 'baz';
    public function setBar($bar) { $this->bar = $bar; }
}

// hydrator (actually just an extractor)
$hydrator = Closure::bind(function ($obj) { return get_object_vars($obj); }, null, 'Foo');

// usage
var_dump($hydrator(new Foo()));
$foo2 = new Foo();
$foo2->setBar('TAB!');
var_dump($hydrator($foo2));

@lisachenko
Copy link

@mnapoli of course, this will work, but not so cool as binding get_object_vars directly (better performance). I think, there is a bug in PHP that prevents me from binding $this to system functions.

@lisachenko
Copy link

@Ocramius I think that OO solution will be better to reuse and will use less memory )

@Ocramius
Copy link
Owner Author

@lisachenko anyway, if we can get back on topic, we already have something that (I think) is the fastest possible extraction logic. Memory usage is not the problem, since these use cases are about 1 hydrator loading a huge amount of records (1000+), so the memory impact here is negligible.

The problem now is writing values into an inheritance:

class Foo {
    private $foo;
}

class Bar extends Foo {
    private $bar;
}

class Baz extends Bar {
    private $baz;
}

The problem here is hydrating Baz (filling it with ['foo' => 1, 'bar' => 2, 'baz' => 3])

@Ocramius
Copy link
Owner Author

So far I have 1 method call per private property in my mind, with following pseudo-code:

class BazHydrator
{
    public function __construct()
    {
        $this->fooWriter = // instantiate closure here
        $this->barWriter = // instantiate closure here
        $this->bazWriter = // instantiate closure here
    }

    public function hydrate($object, $data)
    {
        $this->fooWriter->__invoke($object, $data['foo']);
        $this->barWriter->__invoke($object, $data['bar']);
        $this->bazWriter->__invoke($object, $data['baz']);
    }
}

Can we reduce the number of method calls here?

@mnapoli
Copy link

mnapoli commented Jun 27, 2013

@Ocramius You will need to bind it to every parent class successively, i.e. in your example to Foo, Bar and Baz is that what the problem is?

I don't know if you can optimize that :/

( nevermind you posted in-between )

@Ocramius
Copy link
Owner Author

@mnapoli that's exactly where I'm stuck at :)

Other suggestions such as using list are also very welcome if anyone can think of ways of doing this

@lisachenko
Copy link

@Ocramius more cool:

class Foo {
    private $foo;
}

class Bar extends Foo {
    private $foo;
}

class Baz extends Bar {
    private $foo;
} 

(same property name)

@Ocramius
Copy link
Owner Author

@lisachenko yep, that's another nasty one :(

@macnibblet
Copy link

@Ocramius, just create a proxy for each part of the inheritance and extend that one ?

@Ocramius
Copy link
Owner Author

@lisachenko one way to treat that is keeping the internal data structure that PHP has too:

var_export((array) new Baz());

http://3v4l.org/cJcTE

This also means that in tools like the ORM, private properties on different levels of the inheritance have to keep their original name (\0Foo\0foo for example), and won't be overrideable. That's more like a design decision of course

@Ocramius
Copy link
Owner Author

@macnibblet and then merge results? that's super slow

@lisachenko
Copy link

@Ocramius I know about the internal structure, but this way is too ugly for me, however, parsing this information can give the fastest solution )

@lisachenko
Copy link

@Ocramius There is an old trick with serialize/unserialize )

UPD1 proof-link http://3v4l.org/gSD9K
UPD2 real-life usage in Mockery: Container.php#L376

@Ocramius
Copy link
Owner Author

@lisachenko that works for simple instantiations (assuming it's not implementing __wakeup or Serializable), but not for more complex cases where we need to smash together a string of various types of objects first :\

@Ocramius
Copy link
Owner Author

I'm going to try to implement this this evening =D

@Ocramius
Copy link
Owner Author

I just tested this extensively. Turns out that $reflectionProperty->setValue($object, $value); is ALWAYS faster than $closure($object, $value);

I tried different alternatives:

  • Reflection:

    $r = new ReflectionProperty('Foo', 'bar');
    $r->setAccessible(true);
    // benchmark on following operations:
    $r->setValue($object, $data['bar']);
  • Simple closure:

    $c = Closure::bind(
        function ($object, $value) { $object->bar = $value; },
        null,
        'Foo'
    );
    // benchmark on following operations:
    $c($object, $data['bar']);
  • Avoiding passing parameters to the closure

    $this->object = null;
    $this->data = null;
    $object = & $this->object;
    $data = & $this->data;
    $c = Closure::bind(
        function () use ($object, $data) { $object->bar = $data['bar']; },
        null,
        'Foo'
    );
    // benchmark on following operations:
    $this->object = $object;
    $this->data = $data;
    $c();

Reflection is the fastest, so I think I will just stick with it and eventually make its instantiation optional if no private properties are detected. Thoughts?

@lisachenko
Copy link

@Ocramius interesting results... Thank you for the research and testing )

@mnapoli
Copy link

mnapoli commented Jun 28, 2013

Yes interesting! You did not count the "set up" in the tests, but it would be interesting to know if the setup time changes the results?

For example if you end up (in some situation) instantiating ReflectionProperty and using it once, is that still faster than creating the closure and using it once?

That would mean running the same tests, but including all the setup inside the benchmark

After all, the most intensive operation is the creation of the ReflectionProperty instance (or Closure instance), and if your code only uses it once, then your benchmark results may be different.

@Ocramius
Copy link
Owner Author

@mnapoli setup time here is irrelevant, since I'm also generating classes at runtime (and autoloading them in case they're cached). See https://github.com/Ocramius/ProxyManager/tree/feature/issue-%2362-generated-proxy-without-reflection for the branch with my quick 'n dirty experiments

I already excluded reflection where there's no private properties (please review #63). Consider that these hydrators are ALWAYS meant to be used for big data chunks. I'd be interested in seeing how opcache makes this faster since there's no paths at all in the code.

Also consider that instantiating a ReflectionProperty takes a lot of time only for the first instantiation. Subsequent calls to new ReflectionProperty(...) are negligible (because of how the engine deals with it).

Again, instantiation time here is not a problem at all, we're talking about a very slow object at instantiation time, but very fast and optimized afterwards. Worst case is when you instantiate a hydrator for an object that has private properties - in that case you get a ReflectionClass instance created

@mnapoli
Copy link

mnapoli commented Jun 28, 2013

Ok

Also consider that instantiating a ReflectionProperty takes a lot of time only for the first instantiation. Subsequent calls to new ReflectionProperty(...) are negligible (because of how the engine deals with it).

Good to know! Thanks

@Ocramius
Copy link
Owner Author

Closing this since all hackers gave their feedback :)

@Ocramius Ocramius reopened this Jul 10, 2013
@Ocramius
Copy link
Owner Author

I actually made a huge mistake while running the test suite: I was using XDebug.

Everyone please throw eggs at me now.

Anyway, will work on it again so that we get some better performance ;)

I also updated the article at http://ocramius.github.io/blog/accessing-private-php-class-members-without-reflection/ to reflect this performance test error of mine

/cc @leedavis81

@mnapoli
Copy link

mnapoli commented Jul 11, 2013

:)

The best would be a repeatable test suite, that anyone can reproduce. I'm thinking of athletic, it seems very appropriate for that.

@Ocramius
Copy link
Owner Author

@mnapoli spawned jeremyFreeAgent/Clutch#2 and polyfractal/athletic#5 from this one: that would be MUCH better than trying to rewrite the wheel with our own implementation of a simple (possibly buggy/flawed, like what happened here) performance test case

@leedavis81
Copy link
Contributor

Whoops :)

Those libraries do look useful. Running a comparison test (rather than a "must be done in x seconds" one) with a number of other possible implementations is definitely the right move.

{your current library implementation} vs {n other possible implementations}

Error/Exception/Notice if anything beats your implementation with execution time and/or memory consumption. Be tricky to get an exact comparison (certainly on memory usage) unless each scenario is run in isolation (one after the other). Would also need to be on the same hardware for a valid time comparison.

I guess this wouldn't be suitable in the cases where readability is paramount.

Definitely some unexplored territory on performance tools for units of PHP.

@Ocramius
Copy link
Owner Author

@leedavis81 I just want to avoid further stupid mistakes ;) Any library that prevents me from rewriting rudimentary stuff like https://github.com/Ocramius/ProxyManager/blob/0.4.0/tests/ProxyManagerTest/Functional/BasePerformanceTest.php is good :)

@Ocramius
Copy link
Owner Author

Ocramius commented Aug 2, 2013

@Ocramius Ocramius closed this as completed Aug 2, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants