-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Rabin-Karp algorithm for String#index #3873
Implement Rabin-Karp algorithm for String#index #3873
Conversation
Wow, this is amazing! Good job @makenowjust 👍 |
One thing that would be interesting to see is how it compares for smaller strings. The setup cost for rabin-karp looks to be a lot higher - perhaps it would be a good idea to do approach it similar to sorting, ie if the size is less than a threshold, use index_old. But only benchmarking can show that. |
I've updated the benchmark to include a short source with same target https://gist.github.com/sdogruyol/968802c00924528a0b37a73f4f3b4022
It shows that |
@sdogruyol Thanks! |
Thanks @makenowjust ⚡️ ! |
@bcardiff Thanks too for fast response ⚡️ |
This seems to be a relatively good improvement to the algorithm. I didn't verify that the specs pass, so maybe something is wrong it. I'm suprised by how much of an improvement this shows, considering that ascii text is the case that should be gaining the less benefit. I guess the I'm basically just reducing the number of bigloop iterations needed by doing multibytes jumps, the same way it is done when finding the offset. https://gist.github.com/MaxLap/0ab65d0e1466a7ba1107f66d0983be52
|
@MaxLap Your CPU is...? I think your CPU's no zero comparison is slower than mine. (My CPU is 1.6 GHz Intel Core i5 on MacBook Air, below is my result)
Of course, it is efficient for multi-byte text. And I rewrite your code by using macros to reduce code duplication (here). |
I'm on Windows. So the benchmark was through a VM of Ubuntu 12.04 32bit with 2 cores. I7-3517U at 2.4GHz I tried on another VM, Ubuntu 16.04 64bit. The improvement is less (but still higher than yours) A bit sad that it's less impressive... Oh well, it's still an improvement! |
btw, @makenowjust, are you plan to merge @MaxLap hint? |
Benchmark is here.
My result is:
(I recommend you run this benchmark on your environment ;)