Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Usefulness of attributes #312

Closed
felixfbecker opened this issue Oct 11, 2016 · 16 comments
Closed

Usefulness of attributes #312

felixfbecker opened this issue Oct 11, 2016 · 16 comments

Comments

@felixfbecker
Copy link
Contributor

felixfbecker commented Oct 11, 2016

I was wondering what was the design decision behind adding an associative array of "attributes" to each node instead of simply patching attributes on the node directly. Calling getAttribute('myAttribute') everytime is a lot of boilerplate and has the performance overhead of a function call every time.
Just like JavaScript, PHP has excellent support for decorating objects with additional attributes without any warning or performance impact - it's even used by native json_decode or PDO::FETCH_OBJECT.
If I see remember correctly, $namespacedNode actually is also patched this way by the NodeResolver and not a defined property. I have to work with custom attributes a lot and always calling getAttribute() with the string name feels like working with just a raw array instead of an object.

Besides less code and better performance, it would have benefits for static tooling:

$node->getAttribute('startFilePos'); // was it startFilePos or fileStartPos? No help from IDE :(

With attributes, you could give static tooling a hint:

/** 
 * @property Node $parentNode
 */
interface LinkedNode extends Node {}

/**
 * @param LinkedNode $node
 */
public function enterNode($node) {
  $node->parentNode; // IDE knows $node has parentNode property
}
@nikic
Copy link
Owner

nikic commented Oct 19, 2016

Attributes date back from sometime around the start of this project. At that time nodes extended ArrayObject in property mode and everything on that ArrayObject was considered a subnode. This changed at some point, switching to explicitly declared properties and a getSubNodeNames() method specifying the structure of the node. At this point, I don't think there is anything preventing us from using overloaded properties for attributes. I agree that this is both syntactically more convenient and would allow to at least declare all the standard attributes.

Do you know whether PhpStorm has some kind of mechanism to extend the properties of a class in a separate declaration? The override method seems to target factories exclusively.

@felixfbecker
Copy link
Contributor Author

Awesome, this will make code so much less verbose. The best thing is that this can totally happen in a backwards-compatible manner, getAttribute() and setAttribute() can just get/set properties on the object.

Do you know whether PhpStorm has some kind of mechanism to extend the properties of a class in a separate declaration? The override method seems to target factories exclusively.

Sorry, what do you mean by that? I'm not working with PhpStorm but I imagine something like the example I gave in the OP should work

@nikic
Copy link
Owner

nikic commented Oct 19, 2016

Draft implementation: https://github.com/nikic/PHP-Parser/compare/attributes

Note that this keeps the backing $attributes property and uses __get/__set to handle access, for two reasons: Firstly, we operate on the full array in some places, in particular the initial construction is always from an array (we'd have to create a dynamic property table and copy otherwise). Secondly, this declares intent for IDEs, e.g. PhpStorm downgrades the warning level for access to undefined properties. Of course, it has the disadvantage that there's no performance benefit...

Sorry, what do you mean by that? I'm not working with PhpStorm but I imagine something like the example I gave in the OP should work

Right, of course you aren't ^^ I was hoping for some magic to make this work without having to explicitly declare the node to be of a different type.

@felixfbecker
Copy link
Contributor Author

felixfbecker commented Oct 19, 2016

__get()/__set() are about 3.4x slower than properties, about twice as slow as a normal function call.
Could you link where the implementation with native properties would be problematic ("initial construction")?

An IDE should be able to read @property with or without __get(). If not that is a problem of the IDE and we certainly should not base performance decisions on this. We also only need @property for interfaces - for classes we can just declare the property in the class body (just document that it will be null if you don't decide to include it in the options).

@nikic
Copy link
Owner

nikic commented Oct 19, 2016

Could you link where the implementation with native properties would be problematic ("initial construction")?

The initial construction is https://github.com/nikic/PHP-Parser/blob/master/lib/PhpParser/NodeAbstract.php#L15. This would require copying all the attributes into the dynamic object properties. If you're considering this from a performance angle, this is expensive for the common case where attributes are not actually used much.

From an implementation perspective another problematic part is https://github.com/nikic/PHP-Parser/blob/master/lib/PhpParser/NodeAbstract.php#L105. This would need to gather the object properties, while excluding subnodes and declared properties that are not subnodes -- this comes down to using reflection, which I'm not keen on.

An IDE should be able to read @Property with or without __get(). If not that is a problem of the IDE and we certainly should not base performance decisions on this.

The IDE can read @property, but will (correctly) treat any custom attributes as undefined properties.

Of course you are right that magic get/set is slow. This certainly does not make sense if the goal is to improve performance -- is it? Is attribute fetching a bottleneck for you?

@felixfbecker
Copy link
Contributor Author

Hm, valid points. Cannot confirm that it is a performance bottleneck for me, but it is verbose for sure. I mostly see the benefit in cleaner code. But if the implementation is too hard of course we can leave it.
Is it needed to pass $attributes to the ctor? Can't the callee set them directly?

@felixfbecker
Copy link
Contributor Author

@nikic I'm a bit lost in the source code, could you give me a link where the nodes are instantiated with attributes?

@nikic
Copy link
Owner

nikic commented Oct 21, 2016

@felixfbecker This happens in generated code. Something like Node\Const_[$1, $3] in the grammar expands to new Node\Const_(..., ..., $this->startAttributesStack[...] + $this->endAttributes). The attribute arrays come from the lexer. The generating code is in https://github.com/nikic/PHP-Parser/blob/master/grammar/rebuildParsers.php#L86.

@nikic
Copy link
Owner

nikic commented Oct 21, 2016

Btw, another potential complication with using dynamic properties is that you can no longer have a subnode and attribute with the same name, which is currently supported. (Personally not particularly concerned about that.)

@felixfbecker
Copy link
Contributor Author

felixfbecker commented Oct 21, 2016

Ah, that explains why I couldn't fin it. Well the easy solution would be to just do

foreach ($attributes as $attr => $val) {
  $this->$attr = $val;
}

in AbstractNode. We should do that anyway for BC.
For better performance though, would it be possible to generate an assignment for every attribute? Like

$node->attribute1 = value1;
$node->attribute2 = value2;

For me, this makes a lot of sense. They become actual attributes of the node.

setAttribute / getAttribute(s) would get deprecated and provided for BC.
For getAttributes, would this be enough?

return array_diff_key(get_object_vars($this), array_flip($this->getSubNodeNames()));

Even if getAttributes returned more attributes in some cases than before it shouldn't be a problem because that shouldn't break anything, and it's only for BC.

As it is targeted for a major release people will have to do some changes to their code anyway and converting to the new style is a simple regexp search/replace.

Btw, another potential complication with using dynamic properties is that you can no longer have a subnode and attribute with the same name, which is currently supported. (Personally not particularly concerned about that.)

Thought about that too, which is why it's safer to do in a major version. But can you really imagine having a function node with an additional property stmts? That does not contain the statements? 😄

@felixfbecker
Copy link
Contributor Author

felixfbecker commented Oct 28, 2016

@nikic I profiled my language server because I need to improve performance and it turns out getAttribute() actually is a performance hit. As I iterate all ASTs and index FQNs, getAttribute() is called many many times for each Node for each AST for each file. For just parsing 100 files it was called 40,000 times and is actually the most called function in the whole project. Just being able to access the attributes directly instead would save quite some CPU (PHP function invocations have more overhead than in any other language).

@nikic
Copy link
Owner

nikic commented Oct 29, 2016

What percentage of the time is it though?

@felixfbecker
Copy link
Contributor Author

Don't have the profile open atm but it was about 4%

@TomasVotruba
Copy link
Contributor

What is this issue actually about?

I've noticed one note about autocomplete of getAttribute(...). To fix that, you can use constants defined in single class: https://github.com/RectorPHP/Rector/blob/master/src/Node/Attribute.php

And use like this:

$node->getAttribute(Attributes::TYPES);

Having constants makes it very hard to make a typo, compared to string naming.

Also I prefer current state using explicit methods than hiding it behind magic. I have very bad experience with that due to assumptions that other packages don't have to make. It was good and fast idea on first site though, so I understand your need.

I suggest either closing this or resolving it. Depending on your real current needs for this.

@TomasVotruba
Copy link
Contributor

TomasVotruba commented May 13, 2022

I think this one can be closed too, as:

  • get line start/end, token start/end are part of NodeAbstract methods
  • existing properties and attributes are now clearly separated in php-parser 4 architecture

@nikic
Copy link
Owner

nikic commented May 21, 2023

Given that dynamic properties are deprecated now, I think it's safe to close this.

@nikic nikic closed this as not planned Won't fix, can't repro, duplicate, stale May 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants