Improvements to string comparison efficiency. #38
This pull request changes the data structures used within the
The table below shows the results of an informal performance experiment run on my 2012 MacBook Pro (Intel(R) Core(TM) i7-3615QM CPU @ 2.30GHz; 16GB RAM). Each comparison algorithm was applied to a set of ~200,000 candidate links with both the original and the modified algorithms. Each trial was repeated five times, with very little variance between replications. Results show significant performance increases in all string comparison methods that were modified (note that
This table shows the mean time in seconds (i.e. the average observed across five trials) to compare 200,000 candidate links with the original and modified algorithms:
Note that due to technical problems, I was unable to run formal benchmarks using
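For anyone who wants to repeat the informal timing described above (five repetitions per algorithm, mean time reported), a minimal sketch could look like the following. It uses only the standard library. `compare_exact` and the synthetic `pairs` list are hypothetical stand-ins, not the comparison methods or the data used in this PR, so the numbers it prints are not comparable to the results reported here.

```python
import statistics
import time


def mean_runtime(func, *args, repeats=5):
    """Return the mean wall-clock time in seconds over `repeats` calls."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


def compare_exact(pairs):
    # Hypothetical stand-in for a string comparison method:
    # scores 1.0 for identical strings and 0.0 otherwise.
    return [1.0 if a == b else 0.0 for a, b in pairs]


# Synthetic placeholder data: 200,000 candidate links as (left, right) pairs.
pairs = [("name%d" % i, "name%d" % (i % 7)) for i in range(200_000)]

elapsed = mean_runtime(compare_exact, pairs)
print(f"exact: {elapsed:.3f} s (mean of 5 runs)")
```

To compare two implementations, call `mean_runtime` once per implementation on the same `pairs` and report both means side by side.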
@@            Coverage Diff             @@
##           master      #38      +/-   ##
==========================================
- Coverage   86.68%   86.66%   -0.02%
==========================================
  Files          20       20
  Lines        1449     1447       -2
  Branches      265      265
==========================================
- Hits         1256     1254       -2
  Misses        127      127
  Partials       66       66
You are going to make a lot of users very happy with this PR!!! Thanks for finding this bottleneck.
By the end of this week (I am on holiday without a laptop and with poor internet), I will do some performance testing myself and then merge the PR.
(Good catch that the