-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce named children #12
Comments
While we internally have an AST pretty printer, it's not so easy to expose it, because it works on our raw internal representation. So we'd have to import the userland AST representation back into the internal one. And that's rather problematic, because we'd have to ensure that AST nodes are only nested correctly (there are certain assumptions about what can be used where, otherwise you'll get segfaults). |
If I understand correctly, at some lower layer, PHP has an internal raw representation of an AST, as well as a pretty printer that works on this representation. At some higher layer (the php-ast extension), this internal AST is converted to a "userland" AST and exposed via the ast\* API. Thus, if a userland AST created/manipulated in userland is to be pretty printed, the php-ast extension cannot use the internal pretty printer on the lower layer, because this pretty printer does not know anything about this userland AST. Is that correct? Please correct me if I'm wrong. :-) Then, the following questions come to my mind:
Forgive me if these questions are not too naive. I'm fairly new to PHP, and only seek to learn. :) |
|
I see. Thank you for your detailed answers. I hope that (2) will be implemented someday, that would be awesome. In the meantime, (3) is probably not too hard to do. If I find the time, I may try myself at it. ;) On a related side question, where would I find more information about the structure of the ASTs as represented by the php-ast extension?
So if I wanted, say, to find out the meaning of the fourth child of an |
There is currently no good way to find out what things mean apart from trial and error, or alternatively looking at http://lxr.php.net/xref/PHP_TRUNK/Zend/zend_language_parser.y. For the future we should either document the meaning of the offsets or generate string keys for child names (for non-list nodes). The latter would probably be preferable... The last child of class nodes is always NULL, it's just an artifact of decl-nodes being fixed-size. I've suppressed it in e52afee. |
Here's some docs to start: $funcNames = ['params', 'uses', 'stmts', 'returnType'];
$names = [
/* special nodes */
'AST_NAME' => ['name'],
'AST_CLOSURE_VAR' => ['name'],
/* declaration nodes */
'ZEND_AST_FUNC_DECL' => $funcNames,
'ZEND_AST_CLOSURE' => $funcNames,
'ZEND_AST_METHOD' => $funcNames,
'ZEND_AST_CLASS' => ['extends', 'implements', 'stmts'],
/* 0 child nodes */
'ZEND_AST_MAGIC_CONST' => [],
'ZEND_AST_TYPE' => [],
/* 1 child node */
'ZEND_AST_VAR' => ['name'],
'ZEND_AST_CONST' => ['name'],
'ZEND_AST_UNPACK' => ['expr'],
'ZEND_AST_UNARY_PLUS' => ['expr'],
'ZEND_AST_UNARY_MINUS' => ['expr'],
'ZEND_AST_CAST' => ['expr'],
'ZEND_AST_EMPTY' => ['expr'],
'ZEND_AST_ISSET' => ['var'],
'ZEND_AST_SILENCE' => ['expr'],
'ZEND_AST_SHELL_EXEC' => ['expr'],
'ZEND_AST_CLONE' => ['expr'],
'ZEND_AST_EXIT' => ['expr'],
'ZEND_AST_PRINT' => ['expr'],
'ZEND_AST_INCLUDE_OR_EVAL' => ['expr'],
'ZEND_AST_UNARY_OP' => ['expr'],
'ZEND_AST_PRE_INC' => ['var'],
'ZEND_AST_PRE_DEC' => ['var'],
'ZEND_AST_POST_INC' => ['var'],
'ZEND_AST_POST_DEC' => ['var'],
'ZEND_AST_YIELD_FROM' => ['expr'],
'ZEND_AST_GLOBAL' => ['var'],
'ZEND_AST_UNSET' => ['var'],
'ZEND_AST_RETURN' => ['expr'],
'ZEND_AST_LABEL' => ['name'],
'ZEND_AST_REF' => ['var'],
'ZEND_AST_HALT_COMPILER' => ['offset'],
'ZEND_AST_ECHO' => ['expr'],
'ZEND_AST_THROW' => ['expr'],
'ZEND_AST_GOTO' => ['label'],
'ZEND_AST_BREAK' => ['depth'],
'ZEND_AST_CONTINUE' => ['depth'],
/* 2 child nodes */
'ZEND_AST_DIM' => ['expr', 'dim'],
'ZEND_AST_PROP' => ['expr', 'prop'],
'ZEND_AST_STATIC_PROP' => ['class', 'prop'],
'ZEND_AST_CALL' => ['expr', 'args'],
'ZEND_AST_CLASS_CONST' => ['class', 'const'],
'ZEND_AST_ASSIGN' => ['var', 'expr'],
'ZEND_AST_ASSIGN_REF' => ['var', 'expr'],
'ZEND_AST_ASSIGN_OP' => ['var', 'expr'],
'ZEND_AST_BINARY_OP' => ['left', 'right'],
'ZEND_AST_GREATER' => ['left', 'right'],
'ZEND_AST_GREATER_EQUAL' => ['left', 'right'],
'ZEND_AST_AND' => ['left', 'right'],
'ZEND_AST_OR' => ['left', 'right'],
'ZEND_AST_ARRAY_ELEM' => ['value', 'key'],
'ZEND_AST_NEW' => ['class', 'args'],
'ZEND_AST_INSTANCEOF' => ['expr', 'class'],
'ZEND_AST_YIELD' => ['value', 'key'],
'ZEND_AST_COALESCE' => ['left', 'right'],
'ZEND_AST_STATIC' => ['varName', 'default'],
'ZEND_AST_WHILE' => ['cond', 'stmts'],
'ZEND_AST_DO_WHILE' => ['stmts', 'cond'],
'ZEND_AST_IF_ELEM' => ['cond', 'stmts'],
'ZEND_AST_SWITCH' => ['cond', 'stmts'],
'ZEND_AST_SWITCH_CASE' => ['cond', 'stmts'],
'ZEND_AST_DECLARE' => ['declares', 'stmts'],
'ZEND_AST_PROP_ELEM' => ['name', 'default'],
'ZEND_AST_CONST_ELEM' => ['name', 'value'],
'ZEND_AST_USE_TRAIT' => ['traits', 'adaptations'],
'ZEND_AST_TRAIT_PRECEDENCE' => ['method', 'insteadof'],
'ZEND_AST_METHOD_REFERENCE' => ['class', 'method'],
'ZEND_AST_NAMESPACE' => ['name', 'stmts'],
'ZEND_AST_USE_ELEM' => ['name', 'alias'],
'ZEND_AST_TRAIT_ALIAS' => ['method', 'alias'],
'ZEND_AST_GROUP_USE' => ['prefix', 'uses'],
/* 3 child nodes */
'ZEND_AST_METHOD_CALL' => ['expr', 'method', 'args'],
'ZEND_AST_STATIC_CALL' => ['class', 'method', 'args'],
'ZEND_AST_CONDITIONAL' => ['cond', 'trueExpr', 'falseExpr'],
'ZEND_AST_TRY' => ['tryStmts', 'catches', 'finallyStmts'],
'ZEND_AST_CATCH' => ['exception', 'varName', 'stmts'],
'ZEND_AST_PARAM' => ['type', 'name', 'default'],
/* 4 child nodes */
'ZEND_AST_FOR' => ['init', 'cond', 'loop', 'stmts'],
'ZEND_AST_FOREACH' => ['expr', 'valueVar', 'keyVar', 'stmts'],
]; |
Thank you, that is exactly what I needed! Here's two more very small questions in the same vein:
|
|
Concerning (2), I agree that the inline doccomment looks really weird, but I (wrongly) assumed that it had to be right before the closure declaration instead of before the assignment for the parser to make the connection. Thanks :) So, in general, a pretty printer should probably not render the doc comment right before the closure declaration, but instead go back to the closest parent that is a child of an For (1): What I would personally find much more beautiful was if the order of the children of all "function-ish" types were switched, with the $funcNames = ['params', 'stmts', 'returnType', 'uses'];
...
'ZEND_AST_CLOSURE' => $funcNames,
'ZEND_AST_FUNC_DECL' => $funcNames,
'ZEND_AST_METHOD' => $funcNames, Then, the last child could simply be suppressed for functions and methods, just as you did for class declarations: 'ZEND_AST_CLOSURE' => ['params', 'stmts', 'returnType', 'uses'],
'ZEND_AST_FUNC_DECL' => ['params', 'stmts', 'returnType'],
'ZEND_AST_METHOD' => ['params', 'stmts', 'returnType'], Thus, the children match up perfectly, and code for handling closures, functions and methods at once is straightforward to write, without having to use any conditionals concerning the type of the node. Of course, such code would still have to check whether the last child is set, but this is no different from the current paradigm, since such code currently also needs to check whether the Of course, this may be a matter of taste, and switching the order would probably require some changes to the underlying Zend engine, and it might possibly even conflict with some other design conventions that are in place... I'd very much like to hear your opinion. :) |
I agree that moving |
Cool, that sounds great! :) Actually I didn't worry too much about downstream code... your README does say very clearly that
I, for one, am prepared to cope with changes to the AST representation in my downstream code. ;) |
Btw, contrarily to what I said before, I would make 'ZEND_AST_CLOSURE' => ['params', 'returnType', 'stmts', 'uses'],
'ZEND_AST_FUNC_DECL' => ['params', 'returnType', 'stmts'],
'ZEND_AST_METHOD' => ['params', 'returnType', 'stmts'],
'ZEND_AST_CLASS' => ['extends', 'implements', 'stmts'], This way, for all decl nodes, |
I've now added versioning support and did some first tweaks. Next up is switching to named children, that will require some more work. |
This will also require some more design work to choose good names. I don't think the list posted above is ideal. There's two things to look out for: Predictable names (so you can make a reasonable guess about how something is called without looking at a dump) and consistent names (similar nodes should have similar names). I suspect that the fewer unique names the better and it's preferable to use a common generic name than a rare specific name. Current list:
|
The implementation for string child names is up in a separate branch: https://github.com/nikic/php-ast/tree/stringChildren |
Looks like @tpunt has implemented a pretty printer: https://github.com/tpunt/php-ast-reverter |
Sorry for not answering in a long time, I was quite busy. I've had a look at the named children version. I do agree that names should be predictable and consistent, though I think your above list is, if not ideal, at least a good start. :) As far as I can see, your Here's a few thoughts / comments / questions:
To conclude, it is great news that you introduced named children! I cannot use this in my own code so far since I do not want to depend on a version of Lastly, I'm happy someone did write a pretty printer. I had too many other things in my own oven... does this close this thread? Though I have to admit, I do appreciate this place for discussing some internals concerning the AST representation. ;) Maybe the issue should be renamed into "Introduce named children." ;) |
Answering selectively: To 1: The master/stringChildren branches weren't in sync. I've now merged master into stringChildren, so the behavior of version 10 and 20 should match. I've also switched to using version 30 for the named children, as it's unrelated to the other version 20 changes. I'll mark 20 as stable soon and I'll also make the version argument non-optional (the default version can't really be bumped without breaking BC...) To 2: For As to the rest, not sure... Using To 5: |
Switched name nodes to being the current version (30) and also added a node list to the README: https://github.com/nikic/php-ast#ast-node-kinds I think with this I'll consider this FR as "implemented" :) |
PHP-Parser had a pretty printer that allowed to compile an AST (as generated by PHP-Parser) back to PHP code:
https://github.com/nikic/PHP-Parser/blob/1.x/doc/2_Usage_of_basic_components.markdown#pretty-printer
It would be great to have something like this in the php-ast extension, allowing for example to manipulate ASTs exposed by this extension and converting them back to PHP code.
The text was updated successfully, but these errors were encountered: