Organizer is a tool for organizing your digital assets. Given a directory, it scans the files, de-duplicates them by MD5 hash, and organizes them into YYYY/MM paths.
I developed it to de-duplicate and organize the digital assets in my Google Drive.
- Scans the current folder (recursively), skipping hidden/system folders.
- Computes MD5 to find duplicates. Keeps the first path and marks the rest for deletion.
- Derives Year/Month from the earliest of: metadata (ExifTool when available, else EXIF/PDF), filename dates (YYYY or YYYY-MM), birthtime, ctime, and mtime.
- Generates a CSV report. By default, it is a dry run.
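The duplicate-detection step above can be sketched roughly as follows. This is a minimal illustration, not the code in scanner.py: the function names `md5_of_file` and `find_duplicates` are hypothetical, "hidden" is assumed to mean a dot-prefixed folder name, and "first path" is assumed to mean the first path met in a sorted, top-down walk.

```python
import hashlib
import os
from collections import defaultdict
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are not loaded into memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> dict[Path, list[Path]]:
    """Map each kept file to the duplicates that would be marked for deletion."""
    by_hash: dict[str, list[Path]] = defaultdict(list)
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: hidden folders are dot-prefixed. Pruning in place
        # stops os.walk from descending into them.
        dirnames[:] = sorted(d for d in dirnames if not d.startswith("."))
        for name in sorted(filenames):
            file_path = Path(dirpath) / name
            by_hash[md5_of_file(file_path)].append(file_path)
    # The first path seen for each hash is kept; the rest are duplicates.
    return {paths[0]: paths[1:] for paths in by_hash.values() if len(paths) > 1}
```

In a dry run, the returned mapping would only feed the CSV report; nothing is deleted by this sketch.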
Safety First - Take a backup before running with --organize or --delete-duplicates.
ExifTool is required (used to read original creation timestamps). Install it first:
brew install exiftool
The tool is a small CLI with a clear pipeline:
- Scanner reads files, hashes content, and resolves the earliest creation timestamp.
- Planner groups duplicates by hash and plans target paths.
- Actions move files, delete duplicates, and optionally clean empty folders.
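As a rough sketch of how the Scanner and Planner stages could fit together, the example below picks the earliest of several candidate timestamps and turns it into a YYYY/MM target path. All names here (`FILENAME_DATE`, `date_from_filename`, `earliest_timestamp`, `plan_target`) are hypothetical, and treating a year-only filename date as January is an assumption made for illustration; the real logic lives in scanner.py and planner.py and may differ.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Iterable, Optional

# Matches YYYY or YYYY-MM in a filename, e.g. "scan_2019-07.pdf".
# Longer digit runs such as "20190715" are deliberately not matched.
FILENAME_DATE = re.compile(r"(?<!\d)(19\d{2}|20\d{2})(?:-(0[1-9]|1[0-2]))?(?!\d)")


def date_from_filename(name: str) -> Optional[datetime]:
    """Return the first YYYY or YYYY-MM date found in a filename, if any."""
    match = FILENAME_DATE.search(name)
    if match is None:
        return None
    year = int(match.group(1))
    # Assumption: a year without a month is treated as January.
    month = int(match.group(2)) if match.group(2) else 1
    return datetime(year, month, 1)


def earliest_timestamp(candidates: Iterable[Optional[datetime]]) -> datetime:
    """Pick the earliest known candidate (metadata, filename, birthtime, ...)."""
    known = [c for c in candidates if c is not None]
    if not known:
        raise ValueError("no timestamp candidates available")
    return min(known)


def plan_target(path: Path, output_root: Path,
                candidates: Iterable[Optional[datetime]]) -> Path:
    """Plan the YYYY/MM destination for a file under output_root."""
    stamp = earliest_timestamp(candidates)
    return output_root / f"{stamp.year:04d}" / f"{stamp.month:02d}" / path.name
```

For example, a file named scan_2019-07.pdf with an mtime in 2021 would be planned under 2019/07, because the filename date is the earliest candidate.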
Code layout:
- src/organizer/scanner.py: file discovery, MD5, and timestamp resolution
- src/organizer/planner.py: grouping and output path planning
- src/organizer/actions.py: moves, deletes, timestamp preservation
- src/organizer/cli.py: CLI arguments and report generation
- ExifTool for robust metadata timestamps
- Exif and PDF metadata parsing
- MD5 for duplicate detection
- CSV for reports
- PyInstaller for single-file binaries
- uv for dependency management
- PyPDF2 for PDF metadata fallback
- Pillow for image EXIF fallback
- ExifTool (external binary)
- PyInstaller (build extra)
- pytest (test extra)
From the project root, build a single-file binary:
make
or
uv run --extra build pyinstaller --onefile -n organizer src/organizer/cli.py
Copy dist/organizer into your $PATH and run it directly:
organizer --version
Show version and build time:
organizer --version
Dry run (CSV only, no changes) in the current directory:
organizer
Apply changes (move originals only):
organizer --organize /path/to/folder
Organize into a separate output folder:
organizer --organize /path/to/folder --output-root /path/to/output
Delete duplicates only:
organizer --delete-duplicates /path/to/folder
Delete empty folders in current working directory:
organizer --delete-empty-folders
Custom report path (dry run):
organizer --report /path/to/report.csv
Run the tests:
uv run --extra test pytest
- Default behavior is a dry run (CSV only).
- Use either --organize or --delete-duplicates to apply changes.
- Organizing and deleting are separate runs.
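One way the dry-run default and the "separate runs" rule could be expressed is with an argparse mutually exclusive group. This is only a sketch under assumptions: the flag names come from the usage above, but the optional positional `path`, the `report.csv` default, and the helpers `build_parser` and `is_dry_run` are hypothetical and may not match cli.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="organizer")
    # Assumption: the folder is an optional positional, defaulting to the cwd.
    parser.add_argument("path", nargs="?", default=".", help="folder to scan")
    # Assumption: "report.csv" is a placeholder default, not the real one.
    parser.add_argument("--report", default="report.csv", help="CSV report path")
    # Organizing and deleting are separate runs, so the flags exclude each other.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--organize", action="store_true")
    mode.add_argument("--delete-duplicates", action="store_true")
    return parser


def is_dry_run(args: argparse.Namespace) -> bool:
    """With neither flag set, only the CSV report would be written."""
    return not (args.organize or args.delete_duplicates)
```

With this layout, passing both flags at once makes argparse exit with a usage error, which keeps a single invocation from moving and deleting files in the same pass.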
