Introduce lexer #99

Closed
zbraniecki opened this issue Apr 2, 2019 · 8 comments
Labels
crate:fluent-syntax Issues related to fluent-syntax crate enhancement
Milestone

Comments

@zbraniecki
Collaborator

In my early experiments, a lexer seems to have a very nice perf impact on parsing.

I'll investigate more, but if someone gets to it first, feel free to take it!

A good background read - https://medium.com/@retep007/javascript-lexing-for-high-performance-f9a800ec930d

@zbraniecki zbraniecki added this to the 0.9 milestone Apr 2, 2019
@zbraniecki
Collaborator Author

@zbraniecki zbraniecki added enhancement crate:fluent-syntax Issues related to fluent-syntax crate labels Apr 17, 2019
@zbraniecki
Collaborator Author

The new rustc lexer is very similar to what I've been exploring - https://github.com/rust-lang/rust/blob/e2b4165a6c2fbab4c1bde97d0c2e47b4602f7bc0/src/librustc_lexer/src/lib.rs
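
For readers unfamiliar with that style, here is a minimal sketch of a lexer whose tokens carry only a kind and a byte length, driven by a small cursor over the input. It is an illustration only: the `TokenKind` variants, `Cursor`, and `tokenize` are hypothetical names for a tiny FTL-like subset, not the actual rustc_lexer or fluent-syntax API, and the lexing is deliberately context-free.

```rust
use std::str::Chars;

/// Token kinds for a tiny FTL-like subset (hypothetical, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TokenKind {
    Identifier,
    Whitespace,
    Newline,
    EqSign,
    Text,
}

/// A token stores only its kind and byte length; the caller tracks offsets.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Token {
    pub kind: TokenKind,
    pub len: usize,
}

struct Cursor<'a> {
    chars: Chars<'a>,
    /// Bytes that were left in the input when the current token started.
    remaining: usize,
}

impl<'a> Cursor<'a> {
    fn new(input: &'a str) -> Self {
        Cursor { chars: input.chars(), remaining: input.len() }
    }

    /// Look at the next char without consuming it.
    fn first(&self) -> Option<char> {
        self.chars.clone().next()
    }

    /// Consume and return the next char.
    fn bump(&mut self) -> Option<char> {
        self.chars.next()
    }

    /// Consume chars for as long as the predicate holds.
    fn eat_while(&mut self, pred: impl Fn(char) -> bool) {
        while let Some(c) = self.first() {
            if !pred(c) {
                break;
            }
            self.bump();
        }
    }

    fn next_token(&mut self) -> Option<Token> {
        let c = self.bump()?;
        let kind = match c {
            '\n' => TokenKind::Newline,
            '=' => TokenKind::EqSign,
            ' ' => {
                self.eat_while(|c| c == ' ');
                TokenKind::Whitespace
            }
            c if c.is_ascii_alphabetic() => {
                self.eat_while(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_');
                TokenKind::Identifier
            }
            _ => {
                // Anything else runs to the end of the line.
                self.eat_while(|c| c != '\n');
                TokenKind::Text
            }
        };
        // Token length is the number of bytes consumed since the last token.
        let rest = self.chars.as_str().len();
        let len = self.remaining - rest;
        self.remaining = rest;
        Some(Token { kind, len })
    }
}

/// Lazily yields tokens; the lengths always add up to `input.len()`.
pub fn tokenize(input: &str) -> impl Iterator<Item = Token> + '_ {
    let mut cursor = Cursor::new(input);
    std::iter::from_fn(move || cursor.next_token())
}
```

Because tokens hold no references into the source, a consumer can recover each slice by keeping a running byte offset while iterating over `tokenize(input)`.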

@zbraniecki
Collaborator Author

@zbraniecki zbraniecki modified the milestones: 0.12, 0.11 Feb 12, 2020
@zbraniecki
Collaborator Author

I got to the point where I have a lexer branch (lexer5) which passes all but one of the fixtures and benchmark fixtures.

The performance is promising:

parse/"simple"          time:   [8.2312 us 8.2357 us 8.2410 us]                            
                        change: [-45.982% -45.885% -45.795%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
parse/"preferences"     time:   [171.52 us 171.65 us 171.78 us]                                
                        change: [-33.596% -33.473% -33.353%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) low severe
  3 (3.00%) high mild
  3 (3.00%) high severe
parse/"menubar"         time:   [40.406 us 40.440 us 40.482 us]                             
                        change: [-28.479% -28.049% -27.562%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  6 (6.00%) high mild
  3 (3.00%) high severe

parse_ctx/"browser"     time:   [185.77 us 185.99 us 186.22 us]                                
                        change: [-27.016% -26.814% -26.633%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe
parse_ctx/"preferences" time:   [440.48 us 441.21 us 442.02 us]                                    
                        change: [-30.603% -30.229% -29.775%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low mild
  1 (1.00%) high severe

Gnuplot not found, disabling plotting

This is what I'd call a lower bound of what I think is possible. This is a pretty naive, quick-and-dirty lexer: it doesn't have any peeking, keeps a stack, and does many other things that I think we can avoid.
But even with that, we get a 25-30% perf win and a separate lexer, which helps keep the code cleaner.
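
To make the "separated lexer" idea concrete, here is a minimal sketch of a parser that consumes tokens from a standalone lexer using one-token lookahead. It is not the lexer5 code: `Token`, `Lexer`, and `parse_entries` are hypothetical names, and the grammar is reduced to single-line `key = value` entries as an assumption for illustration.

```rust
use std::ops::Range;

/// Tokens carry byte spans into the source instead of owned strings.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Token {
    Identifier(Range<usize>),
    EqSign,
    Text(Range<usize>),
    Newline,
}

/// A standalone lexer (hypothetical); it knows nothing about the parser.
pub struct Lexer<'s> {
    src: &'s [u8],
    pos: usize,
    /// True after '=' until the end of the line.
    in_value: bool,
}

impl<'s> Lexer<'s> {
    pub fn new(src: &'s str) -> Self {
        Lexer { src: src.as_bytes(), pos: 0, in_value: false }
    }
}

impl<'s> Iterator for Lexer<'s> {
    type Item = Token;

    fn next(&mut self) -> Option<Token> {
        // Skip inline spaces between tokens.
        while self.src.get(self.pos) == Some(&b' ') {
            self.pos += 1;
        }
        let start = self.pos;
        let b = *self.src.get(self.pos)?;
        match b {
            b'\n' => {
                self.pos += 1;
                self.in_value = false;
                Some(Token::Newline)
            }
            b'=' if !self.in_value => {
                self.pos += 1;
                self.in_value = true;
                Some(Token::EqSign)
            }
            _ if self.in_value => {
                // A value runs to the end of the line.
                while self.pos < self.src.len() && self.src[self.pos] != b'\n' {
                    self.pos += 1;
                }
                Some(Token::Text(start..self.pos))
            }
            _ => {
                // An identifier runs until a space, '=' or newline.
                while self.pos < self.src.len()
                    && !matches!(self.src[self.pos], b' ' | b'=' | b'\n')
                {
                    self.pos += 1;
                }
                Some(Token::Identifier(start..self.pos))
            }
        }
    }
}

/// The parser only sees tokens; `peekable()` gives it one-token lookahead.
pub fn parse_entries(src: &str) -> Vec<(&str, &str)> {
    let mut tokens = Lexer::new(src).peekable();
    let mut entries = Vec::new();
    while let Some(tok) = tokens.next() {
        if let Token::Identifier(key) = tok {
            // Consume '=' only if it is actually the next token.
            if tokens.next_if_eq(&Token::EqSign).is_some() {
                let value = match tokens.next_if(|t| matches!(t, Token::Text(_))) {
                    Some(Token::Text(span)) => &src[span],
                    _ => "",
                };
                entries.push((&src[key], value));
            }
        }
    }
    entries
}
```

Keeping spans rather than strings in the tokens avoids allocation in the lexer, and the parser stays free of byte-level scanning, which is the maintainability benefit mentioned above.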

I'm going to continue toying with a lexer for the foreseeable future as it's not a blocker for any work, but I like tinkering with it in my spare time.

My hope is that the next iteration, lexer6, will be a cleaner lexer built on top of lexer5.

@Stupremee

I will try to do some experiments in the next few days and try to achieve a bigger perf win.
If I succeed, I will open a PR and share the results here.

@zbraniecki
Collaborator Author

great! here's my last attempt - https://github.com/zbraniecki/fluent-rs/tree/lexer5

@zbraniecki zbraniecki modified the milestones: 0.12, 0.13, 0.14 Sep 18, 2020
@zbraniecki
Collaborator Author

At this point, I don't believe we should be trying to add a lexer. The Parser is Really Fast, and if someone can come up with a significant performance improvement, that should be a separate issue filed by the person who's able to make that PR :)

@Stupremee - please, open an issue/PR if you get to it!

@zbraniecki
Collaborator Author

In case we revisit - https://github.com/maciejhirsz/logos


2 participants