Performance of `hamming()` can be improved #4

IBUzPE9 · 2016-08-02T10:09:38Z

`....chars().count()` iterates over the whole string. It can be avoided.

tests::new_same_len ... bench:       4,756 ns/iter (+/- 74)
tests::old_same_len ... bench:       7,089 ns/iter (+/- 400)

tests::new_diff_len ... bench:       4,867 ns/iter (+/- 136)
tests::old_diff_len ... bench:       3,024 ns/iter (+/- 90)
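A minimal sketch of the idea, not the code from this PR: the two `chars()` iterators are advanced in lockstep, so a length mismatch shows up when one runs out before the other and no separate `.chars().count()` pass is needed. The names `hamming_single_pass` and `LengthMismatch` are hypothetical and chosen only for illustration.

```rust
/// Hypothetical error type for this sketch; the crate's real error type may differ.
#[derive(Debug, PartialEq)]
pub enum LengthMismatch {
    DifferentLengths,
}

/// Counts differing characters while walking both strings once.
/// No up-front `.chars().count()` is performed on either input.
pub fn hamming_single_pass(a: &str, b: &str) -> Result<usize, LengthMismatch> {
    let (mut it_a, mut it_b) = (a.chars(), b.chars());
    let mut distance = 0;
    loop {
        match (it_a.next(), it_b.next()) {
            // Both strings still have characters: compare them.
            (Some(x), Some(y)) => {
                if x != y {
                    distance += 1;
                }
            }
            // Both ended at the same time: the lengths were equal.
            (None, None) => return Ok(distance),
            // Exactly one ended early: the lengths differ.
            _ => return Err(LengthMismatch::DifferentLengths),
        }
    }
}
```

With this shape a length mismatch is only detected once the shorter string is exhausted, which may be why the `diff_len` benchmark above is slower for the new version while the `same_len` case is faster.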


dguo · 2016-08-12T01:21:32Z

Thank you!

`usize` implements `Copy`, so we should prefer copying over cloning, as recommended by the std documentation.
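As a small illustration of that point (the function name `last_cost` is hypothetical and not taken from the commit), a `usize` can be read out of a reference with `*` or `.copied()` instead of `.clone()`:

```rust
/// Returns the last entry of a cost row, if any.
/// `.copied()` turns `Option<&usize>` into `Option<usize>`; it only
/// compiles for `Copy` types, so it documents that the read is cheap.
pub fn last_cost(costs: &[usize]) -> Option<usize> {
    costs.last().copied()
}

/// Sums a row by dereferencing each element rather than cloning it.
pub fn row_total(costs: &[usize]) -> usize {
    let mut total = 0;
    for cost in costs {
        total += *cost; // plain copy of a `usize`, no `.clone()` call
    }
    total
}
```

For `usize` the result is the same as with `.clone()`; the difference is that the code states the intent directly.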

Taking the Jaro-Winkler optimisations further, this makes use of the prefix-splitting helper within the inner Jaro algorithm. Instead of taking a char count of the common prefix already removed from the pair of strings, the function now optionally takes a pointer through which to return that count, and obtains the count itself by calling the helper internally. This avoids running a `.chars().count()` iteration over the prefix twice (once over `a` and once over `b`), and it allows the main part of the algorithm to skip the common prefix portion of the strings entirely.
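A rough sketch of what such a helper could look like, under the assumption that it returns the prefix length in chars together with the two remainders. Both function names and signatures below are hypothetical and are not the ones used in the referenced commit.

```rust
/// Hypothetical helper: strips the common prefix of `a` and `b` in one pass.
/// Returns (prefix length in chars, rest of `a`, rest of `b`).
pub fn split_common_prefix<'a, 'b>(a: &'a str, b: &'b str) -> (usize, &'a str, &'b str) {
    let mut prefix_chars = 0;
    let mut prefix_bytes = 0;
    for (ca, cb) in a.chars().zip(b.chars()) {
        if ca != cb {
            break;
        }
        prefix_chars += 1;
        // Equal chars have equal UTF-8 width, so one byte offset is valid in both strings.
        prefix_bytes += ca.len_utf8();
    }
    (prefix_chars, &a[prefix_bytes..], &b[prefix_bytes..])
}

/// Hypothetical wrapper showing an optional out-parameter for the count,
/// in the spirit of "optionally takes a pointer to return the count".
pub fn strip_prefix_reporting<'a, 'b>(
    a: &'a str,
    b: &'b str,
    prefix_len_out: Option<&mut usize>,
) -> (&'a str, &'b str) {
    let (prefix_chars, rest_a, rest_b) = split_common_prefix(a, b);
    if let Some(out) = prefix_len_out {
        *out = prefix_chars; // written only when the caller asked for it
    }
    (rest_a, rest_b)
}
```

The prefix is walked once for both strings together, and any later processing can work on the two remainders only.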

Improved hamming performance

ff56d69

`....chars().count()` iterates over whole string. It can be avoided.

dguo merged commit 1c37418 into rapidfuzz:master Aug 12, 2016

jnqnfe added a commit to jnqnfe/strsim-rs that referenced this pull request Nov 4, 2018

osa optimisation rapidfuzz#4

9ce8df6


jnqnfe added a commit to jnqnfe/strsim-rs that referenced this pull request Nov 4, 2018

d-l optimisation rapidfuzz#4

53a54c6
