Rework Levenshtein distance #483
Also, I think there's a duplicate between:
Not sure which one should take precedence, but both implement the Levenshtein distance.
Please merge the edit distance file into this one and delete the original. As for your changes, could you remove the comments with examples from the code? Examples don't need to be included if they can be looked up on Wikipedia.
* Explain differences between edit_distance_se and edit_distance from previous implementation
* Add complexity
Thanks for the feedback.
As for the rest:
I can remove it if you want; let me know. As for the comments, I removed them, but I truly think this is a shame. Sure, the algorithm can be looked up on Wikipedia, but the comments walked through it in a shorter yet understandable way.
Thanks, looks good! The code is supposed to be as close to a real-life implementation as possible, so that people can get familiar not only with the algorithm but also with the language and its coding conventions. So it's different from a book and isn't meant to be a complete, self-explanatory text. We don't include details that are verbose and can be googled. It's like using design patterns: saying that we use the "visitor" pattern here is enough, and an explanation of what that is can easily be googled. Hope that makes sense.
Makes total sense. I probably need to rewrite the count-min-sketch one (#485) and a few others I was preparing (bloom filters, hyperloglog) as well, then.
Rework Levenshtein distance
Description
Rework the Levenshtein distance implementation a bit to make it more idiomatic Rust (using iterators).
Add detailed documentation on how the algorithm works.
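For readers who have not seen the diff, here is a minimal sketch of what an iterator-based Levenshtein distance can look like. It is not the code from this PR: the function name levenshtein_sketch is hypothetical, and keeping a single row of the dynamic-programming table is an assumption about what a space-efficient variant would do.

```rust
/// Levenshtein distance between `a` and `b`, computed over Unicode scalar values.
/// Hypothetical illustration only; not the function added by this PR.
/// Time: O(|a| * |b|). Extra space: O(|b|), since only one DP row is kept.
pub fn levenshtein_sketch(a: &str, b: &str) -> usize {
    let b_chars: Vec<char> = b.chars().collect();

    // row[j] is the distance between the prefix of `a` processed so far
    // and the first j characters of `b`. With an empty prefix it is just j.
    let mut row: Vec<usize> = (0..=b_chars.len()).collect();

    for (i, ca) in a.chars().enumerate() {
        // `diag` is the previous row's value at column j (up and to the left).
        let mut diag = row[0];
        row[0] = i + 1;

        for (j, &cb) in b_chars.iter().enumerate() {
            let substitution = diag + usize::from(ca != cb);
            let deletion = row[j + 1] + 1; // previous row, same column
            let insertion = row[j] + 1; // current row, previous column

            // Save the old value before overwriting it; it becomes the next diagonal.
            diag = row[j + 1];
            row[j + 1] = substitution.min(deletion).min(insertion);
        }
    }

    row[b_chars.len()]
}
```

Calling levenshtein_sketch("kitten", "sitting") returns 3 (two substitutions and one insertion). A full-matrix version would give the same result but use O(|a| * |b|) memory.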
Type of change
Please delete options that are not relevant.
Checklist:
- I ran cargo clippy --all -- -D warnings just before my last commit and fixed any issue that was found.
- I ran cargo fmt just before my last commit.
- I ran cargo test just before my last commit and all tests passed.
- I checked CONTRIBUTING.md and my code follows its guidelines.

Please make sure that if there is a test that takes too long to run (> 300ms), you #[ignore] it, or try to optimize your code or make the test easier to run. We have this rule because we have hundreds of tests to run; if each one of them took 300ms, we would have to wait for a long time.
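As an illustration of the slow-test rule, the sketch below marks one test with #[ignore] so that a plain cargo test skips it. The function sum_to and both test names are hypothetical stand-ins, not code from this repository.

```rust
/// Hypothetical stand-in for a computation that gets slow on large inputs.
pub fn sum_to(n: u64) -> u64 {
    (1..=n).sum()
}

#[cfg(test)]
mod tests {
    use super::*;

    // Fast test: always runs with `cargo test`.
    #[test]
    fn small_input() {
        assert_eq!(sum_to(10), 55);
    }

    // Potentially slow test: skipped by a plain `cargo test`.
    // It can still be run on demand with `cargo test -- --ignored`.
    #[test]
    #[ignore]
    fn large_input() {
        assert_eq!(sum_to(100_000_000), 5_000_000_050_000_000);
    }
}
```

Ignored tests still compile, so they stay in sync with the code even though they are not part of the default run.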