Skip to content

Conversation

@TysonAndre
Copy link
Contributor

@TysonAndre TysonAndre commented Mar 4, 2020

Currently, it's possible to do,
but would require a lot of code duplication and pinning to an exact
version of tolerant-php-parser, which discourages overriding the lexer.

I can think of the following use cases for this.
My main reason is to reuse token_get_all() elsewhere,
but being able to parse T_FN in php < 7.4 (for short arrow functions) is also convenient.

  • Multiple applications needing to use the result of token_get_all
    for the same file. If none of them modify the array,
    it's much faster to reuse the same array than to create this. (benchmarking, the time savings aren't as large as I thought, but there might be some memory savings for reusing the strings in the array)

    For example, Phan's language server mode will potentially use
    tolerant-php-parser. In addition to that, it also uses token_get_all
    in InlineHTMLPlugin (to check for misuse of inline HTML) and sometimes
    in BuiltinSuppressionPlugin
    (to list T_COMMENT/T_DOC_COMMENT containing @phan-suppress-*)

    Aside: https://wiki.php.net/rfc/token_as_object would be faster and
    more memory efficient than token_get_all() in php 8.0, if it gets approved

  • Needing to call tolerant-php-parser on the same token stream,
    multiple times.
    (e.g. an application that modifies the Microsoft\PhpParser\Node
    instances (but not tokens), or which creates data structures from the
    original Node but usually discards them to save memory)

Currently, it's possible to do,
but would require a lot of code duplication and pinning to an exact
version of tolerant-php-parser, which discourages that.

I can think of the following use cases for this.
My main reason is to reuse token_get_all() elsewhere,
but being able to parse `T_FN` in php < 7.4 is also convenient.

- Multiple applications needing to use the result of token_get_all
  for the same file. If none of them modify the array,
  it's much faster to reuse the same array than to create this.

  For example, Phan's language server mode will potentially use
  tolerant-php-parser. In addition to that, it also uses token_get_all
  in InlineHTMLPlugin (to check for misuse of inline HTML) and sometimes
  in BuiltinSuppressionPlugin
  (to list T_COMMENT/T_DOC_COMMENT containing `@phan-suppress-*`)

  Aside: https://wiki.php.net/rfc/token_as_object would be faster and
  more memory efficient than token_get_all() in php 8.0, if it gets approved
- Needing to call tolerant-php-parser on the same token stream,
  multiple times.
  (e.g. an application that modifies the Microsoft\PhpParser\Node
  instances (but not tokens), or which creates data structures from the
  original Node but usually discards them to save memory)
@TysonAndre
Copy link
Contributor Author

This would also be useful for T_MATCH in #326 . @roblourens - thoughts?

Copy link
Member

@roblourens roblourens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, this seems valid to me, thanks!

@roblourens roblourens merged commit b4f5f2c into microsoft:master Jul 12, 2020
@TysonAndre TysonAndre deleted the token_get_all-override branch March 27, 2021 16:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants