Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support pluggable lookahead strategy #1852

Merged
merged 3 commits into from
Oct 20, 2022
Merged

Conversation

msujew
Copy link
Collaborator

@msujew msujew commented Sep 12, 2022

Supersedes and closes #1793

Allows users to supplement the parser with their own lookahead strategy. Mostly introduces new interfaces to override the existing lookahead builder functions and exposes some internals (such as getLookaheadPaths). The latter is necessary, as most lookahead strategy will still try to perform LL(k) style lookahead in one way or another. For example LL(*) needs it to identify LL(1) decisions.

The PR contains some minor updates to the documentation on this. Once this is merged and TypeFox/Langium releases their LL(*) lookahead strategy, I would update it with a reference.

Since this is basically just restructuring existing code, it wasn't necessary to write tests for it. Instead the ecma_quirks.ts tests were adapted to make use of the new API, providing an integration test for this feature.

The changes seem to have absolutely no impact on performance:

Library Ops/sec Relative Speed
JSON 9474.71 101.66%
CSS 3140.82 99.75%
ECMA5 538.65 100.02%

@bd82
Copy link
Member

bd82 commented Sep 13, 2022

Thanks for looking into the performance numbers as part of this PR.
I will try to review it on the weekend.
Worse case it should be reviewed on the upcoming holidays (next weekend)

Copy link
Member

@bd82 bd82 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the comments.

I think once we do an iteration or two of minor changes we will merge
this and I may add this API under an experimental namespace.
So it would be easy to break and modify it if needed.
e.g

import {experimental} from "chevrotain"
const { LLkLookaheadStrategy } = experimental

@@ -2119,6 +2127,147 @@ export interface IParserErrorMessageProvider {
}): string
}

export declare class LLkLookaheadStrategy implements ILookaheadStrategy {
validateNoLeftRecursion(rules: Rule[]): ILookaheadValidationError[]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be quite a bit of duplication here in the methods declarations.

Is this due to TypeScript syntax limitations with declare class?
Could a constructor interface resolve the duplicated signatures?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was confused at first as well, but TypeScript requires declare class to explicitly declare all implemented methods of an interface:

image

Could a constructor interface resolve the duplicated signatures?

I don't think so. I believe we would need a factory to solve this, but it'd be great if Chevrotain would explicitly export its own lookahead strategy, so that other strategies can build on top of that. This is also why the validation methods are declared in the api.d.ts, since the LL-Star package would make use of that (they have a lot of overlapping, LL-specific validations).

@@ -30,3 +30,20 @@ longRule:
Chevrotain will throw a an error during the parser initialization in this case.
This is because there is no fixed number of tokens we can use to choose between the alternatives
that is due to a potentially **infinite** number of "A" tokens that can appear before the "B" - "C" tokens.

## Flexible Lookahead Strategy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feels like this should be a separate feature item (md file)
Particularly as in the future it will contain a link to the LL-Star implementation plugin.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

although this could be moved out later to a separate LL-Star item...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will remove this for now. It probably doesn't make a lot of sense to document this as long as 1. it's experimental API and 2. there's no LL-Star implementation yet.

@@ -2032,6 +2032,14 @@ export interface IParserConfig {
* - For example: via a conditional that checks an env variable.
*/
skipValidations?: boolean
/**
* A custom lookahead strategy.
* Can be used to override the normal LL(*k*) lookahead behavior.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

normal --> default

*/
validate?(options: {
rules: Rule[]
maxLookahead: number
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the maxLookahead be part of the validate() method
or should it be instead be part of the lookahead strategy state?

Does LL-Star strategy use the maxLookahead parameter?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does LL-Star strategy use the maxLookahead parameter?

No, it doesn't make use of it.

should it be instead be part of the lookahead strategy state?

We could make it that way, but then we'd have either some level of redundancy in the parser constructor (one maxLookahead in the parser options, one in the lookahead strategy) and need to deal with consolidating that somehow, or have to remove the maxLookahead in the parser options, which would be a relatively large breaking change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i will take a deeper look at this once we merge the PR

packages/types/api.d.ts Show resolved Hide resolved
packages/chevrotain/src/parse/parser/traits/looksahead.ts Outdated Show resolved Hide resolved
@@ -0,0 +1,160 @@
import {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _public naming convention is old irelevant legacy
you can give this file a proper name...

Maybe even a folder with one file for the class and another file for the utility function

packages/chevrotain/src/parse/grammar/lookahead_public.ts Outdated Show resolved Hide resolved
packages/chevrotain/src/parse/grammar/lookahead_public.ts Outdated Show resolved Hide resolved
@msujew
Copy link
Collaborator Author

msujew commented Sep 25, 2022

@bd82 Thanks for your review. I've incorporated your comments (and left a comment where I had something to add to your comment). I agree, having an experimental namespace sounds good to me for faster iterations on the feature 👍

@bd82
Copy link
Member

bd82 commented Sep 26, 2022

Hi @msujew

I think some of the code that was removed actually does have a purpose (see comments above).
Once you do so I will:

  1. re-review and hopefully merge.

And in a different PR I will:
3. investigate possibility for a small refactor with the maxLookahead param.
4. Move the APIs to an experimental namespace.

@msujew
Copy link
Collaborator Author

msujew commented Oct 8, 2022

@bd82 I came back from vacation and integrated your comment. Good to know that the code does serve a purpose :D

Anyway, hopefully this should be good to go now.

@bd82 bd82 merged commit f4e393f into master Oct 20, 2022
@bd82 bd82 deleted the msujew/pluggable-lookahead branch October 20, 2022 23:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants