Better grouping algorithm #51

Closed
hsoft opened this Issue Jun 22, 2013 · 2 comments

hsoft commented Jun 22, 2013

As soon as dupeGuru looks for duplicates that are not exactly the same, the issue of discarded matches comes up. Some matches have to be discarded, because one side of the match is already part of a group that the other side of the match can't be in.

But after a quick glance at the grouping code, it seems possible that a match is discarded on the basis that one side is an unconfirmed part of a group. If that file is never confirmed, it means that some discarded matches could be used to safely make new groups without conflicting with any other group.
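
To make the idea concrete, here is a minimal sketch of such a two-pass grouping. It is not dupeGuru's actual code: the function name `group_matches`, the `(first, second, score)` tuple layout, and the rule that a file is only confirmed in a group when it matches every confirmed member are all assumptions made for illustration. The first pass groups greedily and records what it discards. The second pass regroups the "orphan" matches, meaning discarded matches where neither side ended up confirmed in any group.

```python
from collections import defaultdict


def group_matches(matches):
    """Group (first, second, score) matches into a list of sets (hypothetical sketch)."""
    matches = sorted(matches, key=lambda m: m[2], reverse=True)

    # Assumed confirmation rule: a file joins a group only if it matches
    # every confirmed member, so we need to know each file's partners.
    partners = defaultdict(set)
    for first, second, _ in matches:
        partners[first].add(second)
        partners[second].add(first)

    groups = []     # confirmed members of each group
    claimed = {}    # file -> index of the group that claimed it (confirmed or not)
    discarded = []

    for match in matches:
        first, second, _ = match
        g1, g2 = claimed.get(first), claimed.get(second)
        if g1 is None and g2 is None:
            # Neither side is claimed yet: start a new group.
            claimed[first] = claimed[second] = len(groups)
            groups.append({first, second})
            continue
        if g1 is None or g2 is None:
            # One side is claimed: the other becomes a candidate for that group.
            g = g1 if g1 is not None else g2
            newcomer = first if g1 is None else second
            claimed[newcomer] = g
            if groups[g] <= partners[newcomer]:
                groups[g].add(newcomer)  # matches every member: confirmed
                continue
        elif g1 == g2 and first in groups[g1] and second in groups[g1]:
            continue  # both sides are already confirmed in the same group
        discarded.append(match)

    # Second pass: a discarded match with no confirmed side cannot conflict
    # with any existing group, so it is safe to build new groups from it.
    confirmed = set().union(*groups)
    orphans = [m for m in discarded
               if m[0] not in confirmed and m[1] not in confirmed]
    if orphans:
        groups += group_matches(orphans)
    return groups
```

With matches A-B, B-C and C-D (and no A-C match), the first pass claims C and then D for the A-B group without ever confirming them, so C-D gets discarded. The second pass picks C-D up again and turns it into its own group. The recursion terminates because the best match of any non-empty list always starts a group and is never discarded.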

@hsoft hsoft closed this Jun 22, 2013

hsoft commented Jun 22, 2013

(from [116]) [#51 state:fixed] Improved the grouping algorithm to reduce the number of discarded matches in a scan.

hsoft commented Jun 22, 2013

[not-tagged:"task" bulk edit command]
