Images from reference folders being compared to each other #686
Comments
This is not a bug, but a sub-optimal algorithm. I used this solution because it was simple: it only required adding ~40 lines of code at the end of the functions. For duplicate mode, I don't think the algorithm needs to be modified, since hashes are cached and comparing them is quite cheap.
Do you mean that each added folder is being treated as one? I think this is the way I prefer it to work.
Just my opinion, but for example: if 3 folders are added [a b c] and [a] is checked as reference, then files within folder [a] should only be compared with the ones in folders [b] and [c], and not with each other. Files in [b] and [c] are all compared individually (each file is compared to every other file in [a], [b] and [c]). This is the same system that's implemented in https://github.com/arsenetar/dupeguru/. There are many use cases for that, one of which I already mentioned: if I have 10 images and want to see whether there are any similar images in a folder containing 10000, the program should just compare these 10 images with the 10000, and not additionally compare every image in the 10000 set with each other. Not only is that unnecessary, it also greatly increases processing time. The reference folder feature as it is implemented now is not really needed, because you can already select a custom path with custom select.
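The comparison rule described above can be sketched in a few lines of Python. This is only an illustration of the proposed behavior, not code from this project or from dupeGuru; the function name `candidate_pairs` and the file paths are hypothetical.

```python
from itertools import combinations, product


def candidate_pairs(reference, others):
    """Yield the file pairs that would be compared under the proposed rule.

    `reference` holds files from folders marked as reference,
    `others` holds files from all remaining folders.
    """
    # Every reference file is compared with every non-reference file.
    yield from product(reference, others)
    # Non-reference files are also compared with each other.
    yield from combinations(others, 2)
    # Deliberately absent: pairs of two reference files.


# Hypothetical example: folder a is the reference, b and c are not.
reference = ["a/1.png", "a/2.png"]
others = ["b/3.png", "c/4.png"]
pairs = list(candidate_pairs(reference, others))
```

With two reference files and two other files this yields five pairs, and the pair `("a/1.png", "a/2.png")` is never produced.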
I might be getting something wrong here, but why are images from reference folders being compared to each other? As I understand it, when a reference folder is selected, all images from non-reference folders are compared with one another and with the ones from the reference folders. But in practice even the images inside the reference folder are being compared with each other, for no reason. If the point is to delete files from a specific destination, that can already be achieved with custom select, which makes the reference option unnecessary and just slows things down. In my specific case there is a main dataset with ~500k images, and if I want to check a folder with 100 images for possible duplicates before adding it to the main dataset, the program compares the images within the reference folder with each other, making the whole process take far longer than is actually needed. Again, I might be wrong about this and may have just picked the wrong options, but if not, it really should be fixed.
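To give a rough sense of scale for the case above, here is a small sketch that counts naive pairwise comparisons. It assumes the ~500k dataset is the folder marked as reference and that every pair is compared by brute force; the program may well index hashes more cleverly, so these are upper-bound illustrations rather than measurements. The function name `naive_pair_counts` is hypothetical.

```python
from math import comb


def naive_pair_counts(n_reference, n_other):
    """Return (all_pairs, needed_pairs) for brute-force pairwise comparison."""
    # Comparing every file with every other file, reference or not.
    all_pairs = comb(n_reference + n_other, 2)
    # Skipping pairs where both files come from the reference folder.
    needed_pairs = n_reference * n_other + comb(n_other, 2)
    return all_pairs, needed_pairs


# Assumed sizes taken from the example: ~500k reference images, 100 new ones.
all_pairs, needed = naive_pair_counts(500_000, 100)
skipped = all_pairs - needed  # pairs made only of reference images
```

Under these assumptions `needed` is 50,004,950 pairs, while `skipped` is 124,999,750,000 pairs, which is the work spent comparing reference images with each other.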