Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hydrators should not be proxy, now works gracefully with final classes #59

Merged
merged 6 commits into from
Jan 13, 2017

Conversation

pounard
Copy link
Contributor

@pounard pounard commented Jan 11, 2017

By not extending the original class, we can hydrate final classes. It makes the generate code simpler, potentially faster for complex classes to hydrate, and generation code a lot simpler.

It uses the Closure::bind() mecanism for all properties, no matter their visibility, but it creates a single closure for each parent class instead of a closure per method, which drastically reduces the number of potential function calls when hydrating objects with a complex hierarchy and a lot of private properties.

If also fixes a bug where when attempting to hydrate objects partially (without all properties within the array) notices were thrown, now they are not anymore!

It also fixes #35

And potentially #51 because Closures are ordered following the hierarchy: if a parent private property is hidden by a child class, the child will always get the private property, and not the parent.

It removes a few code for manipulating the AST, since it's not necessary anymore. Actually, we could probably remove ALL the AST manipulation dependency and just write a PHP template with the added Closures inside, which would make this code even faster and simpler.

@pounard
Copy link
Contributor Author

pounard commented Jan 11, 2017

It also adds tests for class hierarchy and values, it was untested.

@flip111
Copy link

flip111 commented Jan 11, 2017

can you do some benchmarks on php 5 & 7 for this PR ?

@pounard
Copy link
Contributor Author

pounard commented Jan 11, 2017

I did with PHP7, with no private properties, I have a 40% penalty, with 2 private properties across a parent and a child class, I get 20%, so as an educated guess I would say that reaching 4 or 6 private properties, I'm on-par in term of performances, but I'll do some more tomorrow and give you the methodology and results.

Please check issue #60 it explains it a bit.

If I'm not mistaken, this API doesn't work with PHP5 due to strict types usage.

@flip111
Copy link

flip111 commented Jan 11, 2017

i was using this library with php5 before and it was working then

not sure what you mean with 40% penalty and on-par performance. i mean to compare this pr with master. i will look at issue 60 later

@pounard
Copy link
Contributor Author

pounard commented Jan 11, 2017

40% penalty means that in ideal conditions (which never happens in real life environments) I am 40% slower, but I'm quite sure that with real life objects (much more complex data structures) there would be no significant difference.

$ast = $builder->fromReflection($originalClass);
$ast = array($class);
if ($namespace = $originalClass->getNamespaceName()) {
$ast = array(new Namespace_(new Name(explode('\\', $namespace)), $ast));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

short array syntax

Copy link
Contributor Author

@pounard pounard Jan 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same note as above applies, don't nitpick, please, saying it once would have been enough for me to search and fix all of them :)


$ast = $builder->fromReflection($originalClass);
$ast = array($class);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

short array syntax

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of @Ocramius' code do not use the short array syntax, I did this on purpose to avoid frustrating him. It's actually never wrote with short array syntax in the whole library. I'm just keeping consistency with the existing codebase.

* @var PropertyAccessor[]
*/
private $propertyWriters = array();
private $classPropertyMap = array();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

short array syntax

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, dont nitpick.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a comment is not relevant, simply don't worry about it and defer to merger. I'll probably apply CS changes on merge anyway.

@@ -35,6 +37,7 @@
* {@inheritDoc}
*
* @author Marco Pivetta <ocramius@gmail.com>
* @author Pierre Rineau <pierre.rineau@makina-corpus.com>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be removed as it was only changes on existent files

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, wrong file.

@@ -34,6 +36,7 @@
* Replaces methods `__construct`, `hydrate` and `extract` in the classes of the given AST
*
* @author Marco Pivetta <ocramius@gmail.com>
* @author Pierre Rineau <pierre.rineau@processus.org>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

Copy link
Contributor Author

@pounard pounard Jan 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually almost rewrote fully the file, so my name is relevant here. But if the original author don't want my name, I'll remove it, but I'll answer only to him.

return null;
}

$node->stmts[] = new Property(Class_::MODIFIER_PRIVATE, array(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

short array syntax

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once again, just keeping consistency with the rest of the code. I am on your side with this, I'd be please to use the short array syntax, but once again, I prefer to have @Ocramius opinion before changing any code convention this library uses.

/**
* @param string $property0
*/
public function setProperty0($property0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's missing parameter type or it's not needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was written from a copy past of another test class, which doesn't carry any return type. It's definitely not needed. I'm OK to write them, but for consistency all test classes should probably reviewed and fixed too. Not sure this is the scope of the issue, once again I'd like to have @Ocramius' opinion about this.

@pounard
Copy link
Contributor Author

pounard commented Jan 11, 2017

@malukenho I mostly agree with your requested changes, but some do change the codestyle of the library, so I really want to wait for @Ocramius opinion, it's his call to make.

@malukenho malukenho added this to the 2.0.1 milestone Jan 11, 2017
@pounard pounard mentioned this pull request Jan 12, 2017
@Ocramius
Copy link
Owner

with real life objects (much more complex data structures) there would be no significant difference.

The library was designed for inclusion in ORM v3, so any bit of performance we can squeeze out of it should be considered. If the hydrator codegen without inheritance is something that fixes handling of final and classes already implementing the hydrator interface, then this should probably be considered a fallback, rather than a replacement solution.

@Ocramius
Copy link
Owner

can you do some benchmarks on php 5 & 7 for this PR ?

Indeed, we need to add benchmarks (phpbench-based) to the library.
PHP 5 is no longer the target audience though.

@pounard
Copy link
Contributor Author

pounard commented Jan 12, 2017

The library was designed for inclusion in ORM v3, so any bit of performance we can squeeze out of it should be considered

It surely needs more benchmarks to tell, but I'd probably say that the performance impact is mostly due to isset() calls, and probably not by the fact there's closures. I'll test that.

@Ocramius
Copy link
Owner

I'll test that.

The amount of opcodes being generated usually gives you a good idea of where the slowness is.

@pounard
Copy link
Contributor Author

pounard commented Jan 12, 2017

The library was designed for inclusion in ORM v3

What are the requirements for ORM v3 regarding this library ?

If the hydrator codegen without inheritance is something that fixes handling of final and classes already implementing the hydrator interface, then this should probably be considered a fallback, rather than a replacement solution.

Not even sure why inheritance here is even wanted to start with, I think that it should get rid of, there is no need at all, and it replaces your objects by proxies and it actually makes no sense to me, except if it is required by ORM v3.

I think it make the overall code of this library much harder to read and maintain since you have to parse the class AST, replace its methods, add some others, change its name, add the extension, etc etc etc... Generating much more simple hydrators could also make the AST completly useless in the end.

And I often read your blog posts, and watch from time to time your sessions, isn't composition and single responsibility supposed to be better than inheritance ? :) My guess here, especially in this case, inheritance doesn't serve any purpose except making the whole much more complex.

Could you elaborate why this should be a fallback instead of being the common case?

Could you also elaborate about why it's a good thing to keep a very complex generation mechanism while it could be very much simpler and serve the exact same features?

I guess you have a roadmap I don't know anything about, but right now I see plenty of code that could be overly simplified and still be on-par in term of features and performance.

Thank you for your time and answers, looking forward to read your explanations!

@pounard
Copy link
Contributor Author

pounard commented Jan 12, 2017

The amount of opcodes being generated usually gives you a good idea of where the slowness is.

I'm doing it sometime, but nothing worth real figures, especially that opcode is itself an high-level IR and all operations do not cost the same, so less doesn't really mean faster. Except if you know pretty well how the zend internals are working, I guess it dangerous to analyse code performance only through this.

@Ocramius
Copy link
Owner

What are the requirements for ORM v3 regarding this library ?

Hydration in ORM has a massive performance impact, specifically method call time (reflection). If we can reduce that time in any way by providing an abstraction that does it for us, then the abstraction should always strive for best performance there.

Not even sure why inheritance here is even wanted to start with, I think that it should get rid of, there is no need at all, and it replaces your objects by proxies and it actually makes no sense to me, except if it is required by ORM v3.

The reason why inheritance was that any scope change would need a closure method call (expensive). Another possible solution is to return a closure that is already bound to the correct scope, and getting rid of the hydrator interface overall.

I think it make the overall code of this library much harder to read and maintain since you have to parse the class AST, replace its methods, add some others, change its name, add the extension, etc etc etc... Generating much more simple hydrators could also make the AST completly useless in the end.

Agree - if we can get rid of it, yet retain the same performance, then that's a very good thing.

And I often read your blog posts, and watch from time to time your sessions, isn't composition and single responsibility supposed to be better than inheritance ? :) My guess here, especially in this case, inheritance doesn't serve any purpose except making the whole much more complex.

As mentioned above, inheritance is used here as a hack to get rid of a single method call, yet retaining compat with the hydrator interfaces.

Could you elaborate why this should be a fallback instead of being the common case?

If the new approach cannot provide the same performance as the previous one, then it should only be used when a final class is involved, or when a class implementing hydrate or extract is involved, or a class with a final constructor.

@Ocramius
Copy link
Owner

Could you also elaborate about why it's a good thing to keep a very complex generation mechanism while it could be very much simpler and serve the exact same features?

The only feature provided by this library is speed. If that is not the absolute focus of the consumer of the library, then a simplistic, manually-written hydrator is sufficient in most scenarios.

I guess you have a roadmap I don't know anything about, but right now I see plenty of code that could be overly simplified and still be on-par in term of features and performance.

No, sorry, I don't have a roadmap for most OSS stuff, since I really work on it when I got time.
The on-par needs to be validated by a proper test suite there, which we should add separately, before eventually merging this patch.

@Ocramius Ocramius modified the milestones: 2.1.0, 2.0.1 Jan 12, 2017
@Ocramius
Copy link
Owner

I'm doing it sometime, but nothing worth real figures, especially that opcode is itself an high-level IR and all operations do not cost the same, so less doesn't really mean faster. Except if you know pretty well how the zend internals are working, I guess it dangerous to analyse code performance only through this.

Yes, but at a quick glance you see what is going on :-)

@pounard
Copy link
Contributor Author

pounard commented Jan 12, 2017

Hydration in ORM has a massive performance impact, specifically method call time (reflection). If we can reduce that time in any way by providing an abstraction that does it for us, then the abstraction should always strive for best performance there.

Definitely agree.

The reason why inheritance was that any scope change would need a closure method call (expensive).

Is the closure method call that expensive ? Looking at your code anyway, it does it as soon as you have private properties, you manually bind the object scope onto the closure because the current inherited scope cannot deal with parent privates, so in the end the negative impact of the closure is still here.

Most tutorials for Doctrine (so I guess would be the same for the future ORM v3) will tell you to write your entities properties as private, so I guess this call is somehow no-avoidable.

Another possible solution is to return a closure that is already bound to the correct scope, and getting rid of the hydrator interface overall.

True, but in that case this library wouldn't have any purpose anymore right ?

Out of the scope of ORM v3, I'm using it to dynamically hydrate objects from arbitrary SQL queries in a custom DBAL, and making it a bit more flexible is what I do need at this point. I guess that in the future, people that will use ORM v3 will also need that flexibility: partial fetches (hence the isset()) entities inheritance and making them final are all very valid use-cases!

As mentioned above, inheritance is used here as a hack to get rid of a single method call, yet retaining compat with the hydrator interfaces.

I'm quite sure a single method call won't hurt, especially when you consider the fact that as soon as you have private properties, you'll have that call in the end! It's already so much faster than other hydrators, and I my guess is as soon as you will want to add features or fix bugs, you'll make it slower, I think that having the flexibility right now might prevent tons of future issues.

If the new approach cannot provide the same performance as the previous one, then it should only be used when a final class is involved, or when a class implementing hydrate or extract is involved, or a class with a final constructor.

I'm quite sure that what costs the most are the isset() calls, they do hashmap lookups in the data array, need to be tested, but I'm quite sure my approach might be faster than the original one as soon as hydrated objects have private properties. In my code, the number of function calls is 1, in your, it is 0 to n where n is the number of private properties!

The only feature provided by this library is speed

Agree, and I like it, but I need to it to not yell when I have missing properties in the array and I need it to be able to use final classes :) Even with my patch, still more than 10 times faster than Zend's faster one!

No, sorry, I don't have a roadmap for most OSS stuff, since I really work on it when I got time

No worries, I understand, we all live in the same world I guess!

@pounard
Copy link
Contributor Author

pounard commented Jan 13, 2017

God I hate github, there is no way to re-attach the PR to the right commit.

@pounard pounard reopened this Jan 13, 2017
@pounard
Copy link
Contributor Author

pounard commented Jan 13, 2017

Sehr gut!

@Ocramius
Copy link
Owner

I was going to clean up some details but then realised that I have to first:

  • move away from hydrator classes
  • extract the code generator into two classes:
    • extractorFactoryGenerator
    • hydratorFactoryGenerator
      Both would return a reference to a callable that will instantiate the hydrator or extractor. Still need to check how to rewrite caching for that.
  • remove any array access that is not needed from inside the hydrator

Therefore, merging this, then moving this to milestone 3.0

@Ocramius Ocramius merged commit 725e765 into Ocramius:master Jan 13, 2017
@pounard
Copy link
Contributor Author

pounard commented Jan 13, 2017

By doing that you would also not need the AST dependency anymore. Hum the actual code itself could be rewritten without.

Thank you very much for merging this!

@flip111
Copy link

flip111 commented Jan 13, 2017

@pounard congratulations on this PR ! :bowtie:

@flip111
Copy link

flip111 commented Jan 13, 2017

maybe update readme performance numbers ?

Ocramius added a commit that referenced this pull request Jan 14, 2017
Ocramius added a commit that referenced this pull request Jan 14, 2017
Ocramius added a commit that referenced this pull request Jan 14, 2017
Ocramius added a commit that referenced this pull request Jan 14, 2017
Ocramius added a commit that referenced this pull request Jan 14, 2017
Ocramius added a commit that referenced this pull request Jan 14, 2017
Ocramius added a commit that referenced this pull request Jan 14, 2017
Ocramius added a commit that referenced this pull request Jan 14, 2017
@Ocramius Ocramius modified the milestones: 3.0.0, 2.1.0 Jul 22, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Hydrators fails on objects that include hydrate / extract methods
4 participants