Left factor parser for function types #606

Merged: Gabriella439 merged 6 commits into master from gabriel/left_factor on Sep 28, 2018

@Gabriella439 (Collaborator) commented Sep 27, 2018

Fixes #108

This gives a massive (~30x) parsing performance improvement for the benchmark
code from the above issue:

Before:

benchmarking Issue #108/Text
time                 169.9 ms   (167.6 ms .. 172.8 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 174.8 ms   (172.6 ms .. 177.3 ms)
std dev              3.525 ms   (2.008 ms .. 5.131 ms)
variance introduced by outliers: 12% (moderately inflated)

After:

time                 5.860 ms   (5.826 ms .. 5.904 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 5.921 ms   (5.902 ms .. 5.939 ms)
std dev              56.42 μs   (46.69 μs .. 67.52 μs)

The root cause was that the parser for function types was introducing
excessive backtracking. This led to parsing performance being exponential
in the number of atomic `operatorExpression`s.

This change left-factors the `expression` parser by gutting
`annotatedExpression`. Specifically, this moves the logic for parsing unannotated
`List`/`Optional`/`merge` into `expression` and then consolidates the
remaining logic for parsing an ordinary `Annot` into `expression`, applying
a fixup for unannotated `List`/`Optional`/`merge` expressions to tag them with
their annotation.
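
To illustrate the idea, here is a minimal sketch of left-factoring with an annotation fixup. It is not the Dhall parser: the `Parser` type, the toy `Expr` AST, and the names `naive`, `factored`, `atom`, `token`, and `annotate` are all hypothetical, and the grammar is reduced to variables, `[]`, `->`, and `:`. In `naive`, each alternative reparses the same leading operand after a failure, so work multiplies at every nesting level. In `factored`, the operand is parsed once, the parser branches on the next token, and an unannotated `[]` is tagged with the annotation that follows it.

```haskell
import Control.Applicative (Alternative (..))
import Data.Char (isAlpha, isSpace)
import Data.List (stripPrefix)

-- Toy AST (hypothetical; far smaller than Dhall's real expression type).
data Expr
    = Var String
    | EmptyList (Maybe Expr)   -- `[]`, optionally tagged with its annotation
    | Pi Expr Expr             -- a -> b
    | Annot Expr Expr          -- a : b
    deriving (Eq, Show)

-- Minimal backtracking parser: Nothing on failure, else result and rest.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
    fmap f (Parser p) = Parser $ \s -> fmap (\(a, r) -> (f a, r)) (p s)

instance Applicative Parser where
    pure a = Parser $ \s -> Just (a, s)
    pf <*> pa = do { f <- pf; a <- pa; pure (f a) }

instance Monad Parser where
    Parser p >>= f = Parser $ \s -> case p s of
        Nothing     -> Nothing
        Just (a, r) -> runParser (f a) r

instance Alternative Parser where
    empty = Parser (const Nothing)
    Parser p <|> Parser q = Parser $ \s -> case p s of
        Nothing -> q s   -- backtrack: rerun q on the same input
        result  -> result

-- Match a literal token after skipping leading whitespace.
token :: String -> Parser ()
token t = Parser $ \s -> fmap (\r -> ((), r)) (stripPrefix t (dropWhile isSpace s))

-- An operand: parenthesised expression, `[]`, or an alphabetic variable.
atom :: Parser Expr -> Parser Expr
atom expr = parens <|> emptyList <|> var
  where
    parens    = token "(" *> expr <* token ")"
    emptyList = EmptyList Nothing <$ token "[]"
    var = Parser $ \s -> case span isAlpha (dropWhile isSpace s) of
        ("", _)      -> Nothing
        (name, rest) -> Just (Var name, rest)

-- Before: the alternatives share a leading operand, so a failed
-- alternative discards the parsed operand and the next one reparses it.
naive :: Parser Expr
naive =
        (EmptyList . Just <$> (token "[]" *> token ":" *> naive))
    <|> (Pi    <$> atom naive <* token "->" <*> naive)
    <|> (Annot <$> atom naive <* token ":"  <*> naive)
    <|> atom naive

-- After: parse the shared prefix once, then branch on the next token.
factored :: Parser Expr
factored = do
    a <- atom factored
    (Pi a <$> (token "->" *> factored))
        <|> (annotate a <$> (token ":" *> factored))
        <|> pure a
  where
    -- Fixup: an unannotated empty list absorbs the annotation that follows.
    annotate (EmptyList Nothing) t = EmptyList (Just t)
    annotate e                   t = Annot e t

-- Run a parser and require that only whitespace remains.
parse :: Parser Expr -> String -> Maybe Expr
parse p s = case runParser p s of
    Just (e, rest) | all isSpace rest -> Just e
    _                                 -> Nothing

main :: IO ()
main = do
    print (parse factored "[] : T")
    print (parse factored "a -> b")
    let nested = replicate 10 '(' ++ "a" ++ replicate 10 ')' ++ " -> b"
    print (parse naive nested == parse factored nested)
```

The two parsers agree on the inputs in `main`, but `naive` parses each parenthesised operand up to three times per nesting level, while `factored` parses it once. Note that the blow-up in this sketch comes from nesting depth, which is only an analogy for the behaviour described above.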


@phadej mentioned this pull request Sep 27, 2018

@Gabriella439 (Collaborator, Author) commented

Note that this still needs a little cleanup before merging because it discards the `Noted` constructors from type annotations. I will polish this a bit more later tonight.

@f-f (Member) commented Sep 27, 2018

I can confirm that this also fixes #580 🎉

Running time with this branch is ~1s, which I'd consider good enough.

Thanks for the good work @Gabriella439 and @phadej 👏

`alternative4` now subsumes `alternative5`
@Gabriella439 Gabriella439 merged commit 218e90a into master Sep 28, 2018
@Gabriella439 Gabriella439 deleted the gabriel/left_factor branch September 28, 2018 13:51
Successfully merging this pull request may close these issues.

Improve parser performance