Fix slow execution time when using CharacterSet or CharacterSetComplement to express delimiters with String findTokens: #11991
Thanks. I'm surprised to see that `includes:` is faster than `=`. Maybe I did not understand it :)
I didn't time it exactly but paste this into a playground in an image without my change (P10 is good) and you will find it hangs your image.
After my change it runs instantaneously. The reason is that the current code iterates over the delimiters collection to search for a matching character. When you have a CharacterSetComplement based on just the alpha characters, that is a HUGE number of characters to walk through when all you really want to do is test membership.
Old Version: `skipDelimiters: delimiters startingAt: start`
New Version:
So `(delimiters anySatisfy: [ :delim | delim = (self at: i) ])` is going to call `delimiters do:` with something like `[:d | d = c ifTrue: [^true]]`, which makes a huge number of comparisons when all you really want to know is whether delimiters includes character c. This is highly inefficient when using a CharacterSet, especially if CharacterSet is cleverly implemented as a bit vector or hashed collection and you just need to test membership. In other words, you are ignoring the available O(1) lookup and doing an O(n) iteration for no good reason. I hope that helps.
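To make the difference concrete, here is a minimal playground sketch of the two styles. It is an illustration only, not the code from this change: the variable names are made up, and a plain Set of 50000 characters stands in for a large CharacterSetComplement.

```smalltalk
"Hypothetical sketch; names and sizes are illustrative assumptions."
| delimiters c viaIteration viaMembership |

"A large delimiter collection, standing in for a CharacterSetComplement."
delimiters := ((1 to: 50000) collect: [ :code | Character value: code ]) asSet.
c := $a.

"Iteration style: walks the elements, comparing each one until a match is found."
viaIteration := delimiters anySatisfy: [ :delim | delim = c ].

"Membership style: asks the collection directly, so it can use its own lookup."
viaMembership := delimiters includes: c.

"Both styles give the same answer; only the amount of work differs."
viaIteration = viaMembership
```

Both expressions answer the same Boolean, so switching from the first style to the second should not change behaviour, only the cost of each test.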
Thanks a lot for the explanation!!!! Could you add some of this logic to the method comments? I would like us to document such design points. I think this is important to educate readers.
…erSet when number of delimiters is large.
Sure, I have added this comment to findTokens: (where I think people will be most likely to find it) and the two methods I changed.
Tx!
Checking why the build is failing.
Changes look good to me.
Broken tests are unrelated.