SequenceMatcher finds suboptimal sequences #37551
Comments
The algorithm used for approximate string matching

Example:

>>> from difflib import SequenceMatcher
>>> sm = SequenceMatcher()
>>> sm.set_seqs('axfot', 'aoftax')
>>> sm.ratio()
0.36363636363636365
>>> sm.get_matching_blocks()
[(0, 4, 2), (5, 6, 0)]
>>> sm.get_opcodes()
[('insert', 0, 0, 0, 4), ('equal', 0, 2, 4, 6),
 ('delete', 2, 5, 6, 6)]

What's wrong?
Levenshtein distance with weight 2 for item replacement
And really, the maximal matching blocks are:
The impact of this ``feature'' on diff-like
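The list of maximal matching blocks is cut off in this copy of the report, so the following is only a sketch of one way to check the claim. It assumes that "maximal" means a longest common subsequence, which is what a Levenshtein distance with replacement weight 2 effectively optimizes. The helper names `lcs_length` and `matched_by_sequencematcher` are made up for this illustration and are not part of difflib.

```python
from difflib import SequenceMatcher


def lcs_length(a, b):
    """Length of a longest common subsequence of a and b (textbook DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the best alignment of the two shorter prefixes.
                cur.append(prev[j - 1] + 1)
            else:
                # Drop the last item of one sequence or the other.
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def matched_by_sequencematcher(a, b):
    """Total number of items covered by SequenceMatcher's matching blocks."""
    sm = SequenceMatcher(None, a, b)
    return sum(size for _, _, size in sm.get_matching_blocks())


a, b = 'axfot', 'aoftax'
print(matched_by_sequencematcher(a, b))  # items matched by difflib
print(lcs_length(a, b))                  # items in a longest common subsequence
```

For this pair the two numbers differ: SequenceMatcher keeps only the contiguous block `'ax'`, while a longest common subsequence (for example `'aot'`) matches three items.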
Sorry, I've changed my mind.
This definitely should be sm.set_seqs('Observation:
What seems as a small glitch at
Unfortunately this probably means complete rewrite, I can't
Please read the docs first. This isn't the Levenshtein
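For context, the difflib documentation describes the approach as finding the longest contiguous matching subsequence and then applying the same idea recursively to the pieces to its left and right. The sketch below is a simplified illustration of that idea, not difflib's actual implementation: it ignores junk handling and the other heuristics, and the function names `longest_common_block` and `matching_blocks` are hypothetical.

```python
def longest_common_block(a, b, alo, ahi, blo, bhi):
    """Return (i, j, size) of a longest run with a[i:i+size] == b[j:j+size]
    inside a[alo:ahi] and b[blo:bhi]; size is 0 if nothing matches."""
    best_i, best_j, best_size = alo, blo, 0
    prev = {}  # prev[j] = length of the common run ending at a[i-1], b[j]
    for i in range(alo, ahi):
        cur = {}
        for j in range(blo, bhi):
            if a[i] == b[j]:
                k = prev.get(j - 1, 0) + 1
                cur[j] = k
                if k > best_size:
                    best_i, best_j, best_size = i - k + 1, j - k + 1, k
        prev = cur
    return best_i, best_j, best_size


def matching_blocks(a, b):
    """Greedy recursion: take the longest block, then recurse on both sides."""
    blocks = []

    def recurse(alo, ahi, blo, bhi):
        i, j, size = longest_common_block(a, b, alo, ahi, blo, bhi)
        if size == 0:
            return
        recurse(alo, i, blo, j)                  # pieces left of the block
        blocks.append((i, j, size))
        recurse(i + size, ahi, j + size, bhi)    # pieces right of the block

    recurse(0, len(a), 0, len(b))
    return blocks


print(matching_blocks('axfot', 'aoftax'))
```

Once the block `'ax'` is chosen for the reported input, the left piece of the first string and the right piece of the second are both empty, so no further matches can be found. This is the greedy behaviour the report is about.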
OK, I know it's not Levenshtein because I've read the
However, the docs say `This does not yield minimal edit sequences, but does tend
This is not true -- see my last posting. Or perhaps you
Thank you.
I looked at possible wording changes but prefer the docs as
Marking this one as won't fix and closing.
We do appreciate the report. Since this is an area of interest