Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom Lexer Adapters #528

Open
3 of 6 tasks
bd82 opened this issue Jul 4, 2017 · 7 comments · Fixed by #531
Open
3 of 6 tasks

Custom Lexer Adapters #528

bd82 opened this issue Jul 4, 2017 · 7 comments · Fixed by #531

Comments

@bd82
Copy link
Member

bd82 commented Jul 4, 2017

  • Interface.
  • Implementation for default TokenVector based Adapter.
  • JS Docs.
  • Performance check --> Regressions.
  • Examples:
    • Just In Time Lexing (without creating a Token Vector first - saving memory...).
    • Async Lexing (using Atomics.wait?)
  • Investigate alternative using only method overrides.
@bd82
Copy link
Member Author

bd82 commented Jul 4, 2017

Performance seems virtually the same (±1%). So alles gute. 😀

bd82 pushed a commit that referenced this issue Jul 4, 2017
@bd82
Copy link
Member Author

bd82 commented Jul 8, 2017

Farther testing has shown a small performance regression of 3-4% for the full flow scenario (lexing + parsing).

Need to update the benchmark to support a parser only scenario.
To measure the actual impact on parsing speed.

@bd82
Copy link
Member Author

bd82 commented Jul 8, 2017

benchmarking (on Chrome 59) only the parsing part showed a 12-13% regression for the JSON
grammar, although the CSS grammar only showed only 3% regression.

It may be that being extra generic is not worth it considering the performance regression.
While there may be a way to mitigate the performance issue while retaining the customizability.
This may lead to too much complexity.

Currently there is only one productive relevant scenario that requires this adapter.
This is the ECMAScript grammar example #521, in which the parser's performance is very important.
Perhaps the custom behavior for the ECMAScript grammar can be achieved using plain method overrides without official APIs and dependency injection mechanisms.

bd82 pushed a commit that referenced this issue Jul 9, 2017
bd82 pushed a commit that referenced this issue Jul 10, 2017
bd82 pushed a commit that referenced this issue Jul 12, 2017
bd82 pushed a commit that referenced this issue Jul 18, 2017
bd82 pushed a commit that referenced this issue Jul 18, 2017
bd82 pushed a commit that referenced this issue Jul 18, 2017
bd82 pushed a commit that referenced this issue Jul 20, 2017
bd82 pushed a commit that referenced this issue Jul 23, 2017
bd82 pushed a commit that referenced this issue Jul 23, 2017
bd82 added a commit that referenced this issue Jul 24, 2017
by using method overrides.

A more proper solution with depedency injection proved too damaging to performance...

Part of #521
Fixes #528
@bd82
Copy link
Member Author

bd82 commented Jul 24, 2017

Closing this.

See above comments for details.
194c782
Contains a small POC to deal with limited LexerLess mode which should suffice for ECMAScript parsing. (even though it will not be an "official" API).

@bd82 bd82 closed this as completed Jul 24, 2017
bd82 added a commit that referenced this issue Jul 24, 2017
by using method overrides.

A more proper solution with depedency injection proved too damaging to performance...

Part of #521
Fixes #528
bd82 pushed a commit that referenced this issue Jul 24, 2017
bd82 added a commit that referenced this issue Jul 24, 2017
by using method overrides.

A more proper solution with depedency injection proved too damaging to performance...

Part of #521
Fixes #528
bd82 added a commit that referenced this issue Jul 24, 2017
by using method overrides.

A more proper solution with depedency injection proved too damaging to performance...

Part of #521
Fixes #528
bd82 added a commit that referenced this issue Jul 24, 2017
by using method overrides.

A more proper solution with depedency injection proved too damaging to performance...

Part of #521
Fixes #528
bd82 added a commit that referenced this issue Jul 24, 2017
by using method overrides.

A more proper solution with depedency injection proved too damaging to performance...

Part of #521
Fixes #528
bd82 added a commit that referenced this issue Jul 24, 2017
by using method overrides.

A more proper solution with depedency injection proved too damaging to performance...

Part of #521
Fixes #528
bd82 added a commit that referenced this issue Jul 25, 2017
by using method overrides.

A more proper solution with depedency injection proved too damaging to performance...

Part of #521
Fixes #528
bd82 pushed a commit that referenced this issue Jul 25, 2017
* WIP - ScannerLess Parser example.

fixes #528

* Refactor the Parser to allow implementing a simple LexerLess parser
by using method overrides.

A more proper solution with depedency injection proved too damaging to performance...

Part of #521
Fixes #528
@bd82 bd82 reopened this Jul 20, 2018
@bd82
Copy link
Member Author

bd82 commented Jul 21, 2018

Need to investigate this again as such an adapter may also be used to conserve memory
and implement token channels.

@bd82
Copy link
Member Author

bd82 commented Sep 11, 2018

Additional testing has showed a large strange regression in the CSS benchmark (x3/x4 slower).
This is probably due to some de-optimization happening in V8 internals.

The other performance scenarios (JSON/ECMA5) showed a slight (3%) regression which is expected.

I will once again close this, as at the end of the day this abstraction is just a convenience to help override
certain parser internals, its not mandatory to accomplish those unique flows, rather is it only mandatory
to make those unique flows supports as official apis.

Perhaps such unique flows (e.g token channels) should be provided as examples for extending the parser
rather than official apis.

@bd82 bd82 closed this as completed Sep 11, 2018
@bd82 bd82 mentioned this issue Sep 11, 2018
@bd82
Copy link
Member Author

bd82 commented Oct 15, 2018

Refactoring the parser to multiple traits may make it easier to extract the Lexer Adapter API
and provide a custom implementation without suffering a performance penalty.

This should be evaluated again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
1 participant