Offer a way of deleting / hardlinking / softlinking duplicated files automatically #27
Comments
What do you mean, exactly? I like your approach of using Python. Maybe bash is not enough, although it's more powerful than people would expect, and this could be done with it in a more portable way, while the Python wrapper would need to be an independent project since it would not be just a helper command anymore... But yes, a …
Most of the Python code above deals with reconstructing proper data structures from the fclones output. I guess such data structures are probably already available inside fclones, so a dedicated flag could bypass the need for implementing (and maintaining) a parser. I'm not very happy with the Python dependency either; IMHO the link between an independent Python project and fclones would be so tight that I don't think the split is worth it. I'd prefer a shell-based approach as well. It would be more portable, but I fear it could be rather limiting later (it becomes pretty complex, and neither very readable nor reliable compared to Python, once tests, additional switches, or edge-case handling are needed). Anyhow, a postprocessing step would probably limit (if not defeat) the speed advantage of fclones over jdupes/fdupes.
I think the bottleneck is in the hashes...
@aurelg The postprocessing step would be fast and definitely not a bottleneck. The main bottleneck is I/O for reading files to compute the hashes. I generally agree this feature is much easier to implement inside fclones. This:

```python
if isfile(dst):
    unlink(dst)
link(src, dst)
```

might end up deleting the only existing file. Better to move the file aside first, then create the link, and only if everything is ok drop the moved file.
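That safer sequence (move the file aside, create the link, and only then drop the moved copy) could be sketched as follows; this is an illustration of the idea, not fclones code, and the `.fclones.bak` backup suffix is made up:

```python
import os

def replace_with_hardlink(src: str, dst: str) -> None:
    """Replace dst with a hard link to src, keeping dst recoverable
    until the link has been created successfully."""
    backup = dst + ".fclones.bak"  # hypothetical backup name
    os.rename(dst, backup)         # move aside instead of unlinking
    try:
        os.link(src, dst)          # create the hard link
    except OSError:
        os.rename(backup, dst)     # link failed: restore the original
        raise
    os.unlink(backup)              # all ok: drop the moved copy
```

If `os.link` fails (e.g. src and dst are on different filesystems), the original file is restored, so the "only existing file" scenario from the snippet above cannot occur.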
It might also be nice to avoid creating …
I added a few things - love the code. It assumes you output the CSV file to /tmp for tidiness. Remember to put the primary directory last on the fclones command line to keep those files as the priority (in contrast to rdfind, where it's the first directory that is kept).
And here is a version that just moves files to a duplicates directory ($HOME/Duplicates) for safety:
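The original script is not preserved in this thread, but the move-only variant described above might look roughly like this (a sketch; the flat naming scheme and the collision-avoiding counter are assumptions, and note that a flat move like this loses the original directory structure):

```python
import os
import shutil

def quarantine_duplicates(duplicates: list[str],
                          target: str = os.path.expanduser("~/Duplicates")) -> None:
    """Move duplicate files into a quarantine directory instead of
    deleting them, so they can be reviewed before removal."""
    os.makedirs(target, exist_ok=True)
    for path in duplicates:
        dest = os.path.join(target, os.path.basename(path))
        # prefix with a counter to avoid clobbering same-named files
        n = 0
        while os.path.exists(dest):
            n += 1
            dest = os.path.join(target, f"{n}.{os.path.basename(path)}")
        shutil.move(path, dest)
```

Keeping the files around rather than unlinking them is the whole point here: a wrong match costs a manual restore instead of data loss.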
Better if it gets the info directly from …
I like to check before deleting!! :-) And the move variant loses directory structure, so I equally want to check first.
Implemented in #53 released as v0.12.0. |
fclones should offer a way of deleting / hardlinking / softlinking duplicated files automatically.
In #25:
@pkolaczk wrote:
and @piranna replied:
IMHO, a postprocessing script parsing the fclones output might require more complexity than adding a CLI switch. For instance, here's an (untested) Python implementation that leverages the CSV output (expected in `fclones_out.csv`) to replace duplicates with hard links:

PS: I think this deserves a ticket of its own; feel free to delete it if you don't agree. :-)
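The script itself did not survive in this copy of the thread. As a rough sketch of what such a CSV-driven post-processor could look like: the column layout assumed below (size and hash, followed by one column per member path) is a guess rather than the actual fclones CSV format, and the rename-based backup follows the safer move-then-link sequence suggested later in the thread rather than a bare unlink:

```python
import csv
import os

def hardlink_from_csv(csv_path: str = "fclones_out.csv") -> None:
    """Replace duplicates listed in an fclones-style CSV report with
    hard links to the first file of each group."""
    with open(csv_path, newline="") as fh:
        for row in csv.reader(fh):
            paths = row[2:]                  # assumed: paths start at column 3
            if len(paths) < 2:
                continue
            src, *dups = paths               # keep the first path as the original
            if not os.path.isfile(src):
                continue                     # skip header or stale rows
            for dst in dups:
                if os.path.samefile(src, dst):
                    continue                 # already hard-linked
                backup = dst + ".bak"        # move aside instead of unlinking,
                os.rename(dst, backup)       # so a failed link is recoverable
                try:
                    os.link(src, dst)
                except OSError:
                    os.rename(backup, dst)   # restore on failure
                    raise
                os.unlink(backup)            # link ok: drop the moved copy
```

This is exactly the kind of parsing-and-bookkeeping glue that a built-in CLI switch would make unnecessary.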