Better grouping algorithm #51
Closed
Comments
(from [116]) [#51 state:fixed] Improved the grouping algorithm to reduce the number of discarded matches in a scan.
As soon as dupeGuru looks for duplicates that are not exactly the same, the issue of discarded matches comes up. Some discarded matches are unavoidable, because one side of the match is already part of a group that the other side of the match can't be in.
But after a quick glance at the grouping code, it seems possible that a match is discarded on the basis that one side is an unconfirmed part of a group. If that file is never confirmed, it means that some discarded matches could be used to safely make new groups without conflicting with any other group.
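To make the idea concrete, here is a minimal sketch of what such a retry pass could look like. It is not dupeGuru's actual grouping code: the `build_groups` function, the `(first, second, score)` tuple shape for a match, and the rule that a file is only confirmed once it has matched every current member of a group are all assumptions made for illustration.

```python
from collections import defaultdict


def build_groups(matches):
    """Group (first, second, score) matches into sets of files.

    Hypothetical model: a file joins a group as a confirmed member only
    once it has matched every current member. Until then it is merely a
    candidate "claimed" by that group.
    """
    groups = []     # each group: {"members": set, "seen": {candidate: members matched}}
    claimed = {}    # file -> group holding it, as a member or as a candidate
    discarded = []

    for a, b, score in sorted(matches, key=lambda m: m[2], reverse=True):
        ga, gb = claimed.get(a), claimed.get(b)
        if ga is None and gb is None:
            # Neither side is held by a group: start a new one.
            group = {"members": {a, b}, "seen": defaultdict(set)}
            groups.append(group)
            claimed[a] = claimed[b] = group
            continue
        if ga is not None and gb is not None and ga is not gb:
            # The two sides are held by different groups.
            discarded.append((a, b, score))
            continue
        group = ga if ga is not None else gb
        member, other = (a, b) if a in group["members"] else (b, a)
        if member not in group["members"]:
            # Neither side is a confirmed member: the case described above.
            discarded.append((a, b, score))
            continue
        if other in group["members"]:
            continue  # both sides already confirmed in the same group
        group["seen"][other].add(member)
        claimed[other] = group
        if group["seen"][other] >= group["members"]:
            group["members"].add(other)  # matched every member: confirm

    # Retry pass: a discarded match is safe to reuse when neither side
    # ended up as a confirmed member of any group.
    confirmed = set().union(*(g["members"] for g in groups))
    retry = [m for m in discarded if m[0] not in confirmed and m[1] not in confirmed]
    result = [g["members"] for g in groups]
    if retry:
        result += build_groups(retry)
    return result


print(build_groups([("A", "B", 100), ("B", "C", 90), ("C", "D", 80)]))
```

In this example, `C` is claimed by the `A`/`B` group after matching `B` but is never confirmed because it has no match with `A`, so the `C`/`D` match is discarded in the first pass. The retry pass sees that neither `C` nor `D` is a confirmed member anywhere and turns that match into a second group. The recursion terminates because the highest-scoring match of each pass always starts a new group, so each retry list is strictly shorter than the input it came from.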