timstaley/lofar_data_management
A few quick scripts to ease removal of duplicate files from a multi-user system.

Intended for use with 'findup' from the fslint package:
http://www.pixelbeat.org/fslint/
http://en.flossmanuals.net/FSlint/

'parse_findup_output.py' is a script that organizes fslint/findup output and formats it as CSV. The files are scanned to identify who owns them, and CSV files are then written listing the duplicates for which both/all copies belong to a single user. This identifies the easy cases, where a single user can decide which copy to retain. In the case of LOFAR data, the script also scrapes some minimal tags (subband, obs. id) from the folder name.

Usage:

Run findup, e.g. using:

    ./findup_script.sh /some_big_folder > big_folder_dupes.txt

or

    ./findup_script.sh /some_folder /some_other_folder > multiple_folder_dupes.txt

(You can tweak the search criteria as per the findup --help reference.)

Then run:

    ./parse_findup_output.py folder_dupes.txt

This will output the summary file 'all_dupes.csv', and also a folder 'user_self_dupes' containing CSV files pertaining to single-user duplicates.
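As a rough illustration of the single-user filtering described above, here is a minimal sketch. It is not the code from 'parse_findup_output.py'. It assumes that the findup output lists each group of duplicate paths on consecutive lines, with a blank line between groups. The function names (parse_groups, owner_of, single_user_dupes) are hypothetical and chosen only for this example.

```python
import os
from collections import defaultdict


def parse_groups(text):
    """Split findup-style output into lists of duplicate paths.

    Assumption: groups are separated by blank lines.
    """
    groups, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if line:
            current.append(line)
        elif current:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


def owner_of(path):
    """Return the user name owning `path` (Unix only)."""
    import pwd  # imported here so the rest of the module loads anywhere

    uid = os.stat(path).st_uid
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # No passwd entry for this uid; fall back to the numeric id.
        return str(uid)


def single_user_dupes(groups, get_owner=owner_of):
    """Map each user to the duplicate groups they own entirely."""
    by_user = defaultdict(list)
    for group in groups:
        owners = {get_owner(path) for path in group}
        if len(owners) == 1:
            by_user[owners.pop()].append(group)
    return dict(by_user)
```

The owner lookup is passed in as a parameter so that the grouping logic can be tried out on sample text without touching the filesystem. Groups whose copies are spread across several users are left out of the result, since those need coordination between users.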