first-mate not taking advantage of caching in Oniguruma #93

Closed
50Wliu opened this Issue Apr 18, 2017 · 2 comments

50Wliu commented Apr 18, 2017

I've been investigating recently why first-mate takes so long to tokenize files with very long lines. For reference, here's the current performance (in milliseconds):

Tokenizing jQuery v2.0.3
1341

Tokenizing jQuery v2.0.3 minified
1403255

Tokenizing Bootstrap CSS v3.1.1
523

Tokenizing Bootstrap CSS v3.1.1 minified
20760

As you can see, it takes around 23 minutes to fully tokenize jquery.min.js, which is absolutely unacceptable.

It turns out the reason for this is that we haven't been utilizing the caching that Oniguruma offers. Here's a breakdown of the history:

In order to enable caching, it appears that we need to send Oniguruma an OnigString of the line we want to tokenize, rather than a JavaScript String. Unfortunately, I have thus far been unable to make this work, as I get differing results depending on whether I pass in an OnigString or a String.
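To illustrate why a stable string object matters for caching, here is a minimal sketch. It is not Oniguruma's or node-oniguruma's actual implementation: `CachedLine`, `findNext`, and `regexRuns` are hypothetical names, and JavaScript's built-in `RegExp` stands in for an Oniguruma pattern. The idea is that a wrapper object gives the line an identity that earlier search results can be attached to, so repeated searches from increasing positions on the same line can reuse a previous result instead of rescanning.

```typescript
// Hypothetical stand-in for a line wrapper such as OnigString: it gives the
// line a stable identity so search results can be cached against it.
class CachedLine {
  readonly text: string;
  // Pattern source -> result of the last real search for that pattern.
  // `from` is where that search started; `index` is the match start or -1.
  readonly cache = new Map<string, { from: number; index: number }>();

  constructor(text: string) {
    this.text = text;
  }
}

// Counts real regex executions, purely for illustration.
let regexRuns = 0;

// Returns the index of the first match of `pattern` at or after `start`,
// or -1 if there is none.
function findNext(line: CachedLine, pattern: string, start: number): number {
  const hit = line.cache.get(pattern);
  if (hit !== undefined && hit.from <= start) {
    // A search from an earlier (or equal) position is reusable when it found
    // nothing at all, or when its match still lies at or after `start`.
    if (hit.index === -1 || hit.index >= start) {
      return hit.index;
    }
  }

  // Cache miss: run the regex from `start` and remember the result.
  const re = new RegExp(pattern, "g");
  re.lastIndex = start;
  regexRuns++;
  const match = re.exec(line.text);
  const index = match === null ? -1 : match.index;
  line.cache.set(pattern, { from: start, index });
  return index;
}
```

With a very long line whose only match sits near the end, calling `findNext` from every position up to that match runs the regex once in this sketch. Without a per-line cache, each call would rescan the rest of the line, which is the kind of cost that grows quickly on minified files.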

/cc: @nathansobo

nathansobo commented Apr 18, 2017

50Wliu commented Apr 19, 2017

❤️ Thanks @maxbrunsfeld!
