Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multiple alternate matching performance #184

Open
olbrich opened this issue Jan 1, 2018 · 2 comments
Open

multiple alternate matching performance #184

olbrich opened this issue Jan 1, 2018 · 2 comments
Assignees

Comments

@olbrich
Copy link
Contributor

olbrich commented Jan 1, 2018

I have a use case where I end up creating a custom rule like..

def words
  %w[longestword longword word w].map {|w| str(w)}.reduce(:|)
end

Where the list of words can be quite long. The performance of this is pretty bad, as it creates and checks a regular expression for each of the words in turn until it finds one that works.

There are a couple of optimizations that can be done to speed this up.
First, would be to fail the str match if the remaining string is shorter than the word that is trying to be matched.
Using a single regex that looks like /(longestword|longword|word|w)/ would probably help match in one try instead of several.

Any advice on how to approach making this performant given that the list of words is somewhat dynamic?

@kschiess
Copy link
Owner

Is this solved by PR #185?

Parslet performance is an issue, I know. I tried a lot of things; in the end, I find it hard to keep the reflective nature of the code intact and improve performance at the same time. Since there are a lot of parsers that don't reflect on themselves and are plenty fast, I won't duplicate that effort. I hope that makes sense to you.

@olbrich
Copy link
Contributor Author

olbrich commented Feb 13, 2018

If you look at the parser I linked to in that PR you can see the trick I use to work around this... Basically I figure out how many characters are left and immediately reject any of those patterns since there is no way they can match. It seems to help a fair amount. I'll take another look at this now that #185 is merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants