Custom Token Patterns to Support Lexer Context. #360
Comments
An initial implementation is available here.

Performance on V8: invoking `.exec` with the additional context arguments causes a performance degradation. Creating wrapper functions dynamically, which invoke `.exec` either with or without these arguments, causes an even worse 20-30% performance degradation. Currently the only solution I can think of means additional code duplication. Another option is trying to create the wrapper matcher functions dynamically, but outside the scope …
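A minimal sketch of the dynamic-wrapper idea described above (the names `makeMatcher` and `needsContext` are hypothetical, not from the Chevrotain source): each pattern's `.exec` is wrapped once at lexer-build time so the hot loop always calls the wrapper with a uniform signature. The extra indirection per match is what the 20-30% degradation refers to.

```javascript
// Hypothetical sketch: wrap each pattern's exec so the lexer's hot loop
// can call every matcher the same way, regardless of whether the pattern
// needs the extra context arguments.
function makeMatcher(pattern, needsContext) {
  return needsContext
    ? (text, matchedTokens, groups) => pattern.exec(text, matchedTokens, groups)
    : (text) => pattern.exec(text)
}
```

The wrapper keeps the call sites simple, but on V8 the added closure call in the innermost loop is costly, which is why the issue leans toward duplicated code paths instead.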
The performance issues can be mitigated by using a runtime if-else:

```js
if (hasCustomTokens) {
  match = currModePatterns[i].exec(text, matchedTokens, groups)
} else {
  match = currModePatterns[i].exec(text)
}
```

This still has some overhead (~1% on the JSON benchmark), but it is a small one. Perhaps some of the lexer source code should be created dynamically using templates.
The merged version, using ugly duplicated "if-else", reduces the performance penalty to 1-2%, which is acceptable for now. These 1-2% can be gained back (and more?) if and when the Lexer runtime code is refactored to be auto-generated. There are simply too many different combinations of lexing features, combined with the fact that for maximum performance each combination of these features must have its own tokenizeInternal method.
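One way such auto-generation could look (a hypothetical sketch, not the Chevrotain implementation): decide the `hasCustomTokens` branch once at lexer-build time and compile a specialized match step with the `Function` constructor, so the generated code contains no runtime branch at all.

```javascript
// Hypothetical sketch: generate a specialized match step per feature
// combination, baking the "custom tokens?" decision into the source text.
function buildMatchStep(hasCustomTokens) {
  const callExpr = hasCustomTokens
    ? "patterns[i].exec(text, matchedTokens, groups)"
    : "patterns[i].exec(text)"
  // new Function compiles once; the returned function has no if-else inside.
  return new Function(
    "patterns", "i", "text", "matchedTokens", "groups",
    "return " + callExpr
  )
}
```

This is the template-based direction hinted at above: each combination of lexing features gets its own generated `tokenizeInternal`-style body, trading source-size and build-time complexity for a branch-free inner loop.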
Information about the previously matched tokens should be provided to the custom token matcher.
This will allow implementing lexing patterns that require information about the previously successfully matched tokens, for example:
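A hypothetical illustration of such a context-dependent pattern (the matcher name, the `exprEnders` set, and the simplified string `tokenType` are all invented for this sketch): deciding whether `/` starts a division operator or a regex literal requires looking at the last token matched.

```javascript
// Hypothetical custom pattern: match "/" as a Division token only when the
// previous token could end an expression (identifier, number, or ")").
// tokenType is simplified to a plain string here for illustration.
const exprEnders = new Set(["Identifier", "NumberLiteral", "RParen"])

function matchDivision(text, matchedTokens) {
  const prev = matchedTokens[matchedTokens.length - 1]
  if (prev !== undefined && exprEnders.has(prev.tokenType) && text.startsWith("/")) {
    // Return an exec-like result: an array whose [0] is the matched text.
    return ["/"]
  }
  return null // no previous token, or "/" begins something else (e.g. a regex)
}
```

Without access to `matchedTokens`, a matcher like this cannot be expressed at all, which is the motivation for threading the context arguments through `.exec` in the first place.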