orsenthil/organizer

Organizer

Organizer is a tool for organizing your digital assets. Given a directory, it scans the files, de-duplicates them by MD5 hash, and organizes them into YYYY/MM paths.

I developed it for de-duplicating and organizing my digital assets in my Google Drive.

What it does

  • Scans the current folder (recursively), skipping hidden/system folders.
  • Computes an MD5 hash of each file to find duplicates. Keeps the first path and marks the rest for deletion.
  • Derives the year and month from the earliest of: metadata (ExifTool when available, else EXIF/PDF), filename dates (YYYY or YYYY-MM), birthtime, ctime, and mtime.
  • Generates a CSV report. By default, it is a dry run.
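
The duplicate-detection step above can be illustrated with a minimal sketch. This is not the project's actual code: the function names `md5_of_file` and `find_duplicates` are hypothetical, "hidden" is assumed to mean any dot-prefixed name, and "first path" is assumed to mean first in sorted order.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are never fully loaded into memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> dict[Path, list[Path]]:
    """Map each kept file to the other files that share its MD5 hash."""
    by_hash: dict[str, list[Path]] = defaultdict(list)
    # Sorting makes "the first path" deterministic (an assumption of this sketch).
    for path in sorted(root.rglob("*")):
        # Skip hidden files and anything inside a hidden folder
        # (assumption: hidden means a dot-prefixed name).
        relative_parts = path.relative_to(root).parts
        if any(part.startswith(".") for part in relative_parts):
            continue
        if path.is_file():
            by_hash[md5_of_file(path)].append(path)
    # Keep the first path of each group; the rest are deletion candidates.
    return {paths[0]: paths[1:] for paths in by_hash.values() if len(paths) > 1}
```

Nothing is deleted here; the function only reports candidates, which matches the dry-run default described above.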

Safety first: take a backup before running with --organize or --delete-duplicates.

ExifTool is required (used to read original creation timestamps). Install it first:

brew install exiftool

Architecture

The tool is a small CLI with a clear pipeline:

  • Scanner reads files, hashes content, and resolves the earliest creation timestamp.
  • Planner groups duplicates by hash and plans target paths.
  • Actions move files, delete duplicates, and optionally clean empty folders.
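
As a rough illustration of the planning stage, the sketch below picks the earliest known timestamp and builds a YYYY/MM target path from it. The names `earliest_timestamp` and `plan_target` are hypothetical and are not taken from planner.py; the sketch also assumes all timestamps are naive datetimes in the same time zone.

```python
from datetime import datetime
from pathlib import Path
from typing import Iterable, Optional


def earliest_timestamp(candidates: Iterable[Optional[datetime]]) -> Optional[datetime]:
    """Return the earliest available timestamp, ignoring missing sources."""
    known = [ts for ts in candidates if ts is not None]
    return min(known) if known else None


def plan_target(source: Path, taken: datetime, output_root: Path) -> Path:
    """Build output_root/YYYY/MM/<original file name> for one file."""
    return output_root / f"{taken.year:04d}" / f"{taken.month:02d}" / source.name


# Usage: metadata is missing (None), so the older of the two remaining dates wins.
when = earliest_timestamp([None, datetime(2021, 5, 3), datetime(2019, 12, 31)])
target = plan_target(Path("in/photo.jpg"), when, Path("out"))
```

Keeping the planner free of file-system side effects is one way to make a dry run cheap: the same plan can either be written to the report or handed to the actions stage.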

Code layout:

  • src/organizer/scanner.py: file discovery, MD5, and timestamp resolution
  • src/organizer/planner.py: grouping and output path planning
  • src/organizer/actions.py: moves, deletes, timestamp preservation
  • src/organizer/cli.py: CLI arguments and report generation
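
One part of timestamp resolution is reading a YYYY or YYYY-MM date out of a filename. A minimal sketch of that idea follows. The function name `date_from_filename`, the regular expression, and the restriction to years 1900 to 2099 are all assumptions made for illustration and may differ from what scanner.py does.

```python
import re
from typing import Optional

# A four-digit year (1900-2099 in this sketch), optionally followed by -MM,
# and not embedded in a longer run of digits.
_DATE_IN_NAME = re.compile(r"(?<!\d)((?:19|20)\d{2})(?:-(0[1-9]|1[0-2]))?(?!\d)")


def date_from_filename(name: str) -> Optional[tuple[int, Optional[int]]]:
    """Return (year, month) from the first date-like token, or None if absent."""
    match = _DATE_IN_NAME.search(name)
    if match is None:
        return None
    year = int(match.group(1))
    month = int(match.group(2)) if match.group(2) else None
    return year, month
```

When only a year is found, the month is returned as None so the caller can decide how to combine it with other sources.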

Technologies used

  • ExifTool for robust metadata timestamps
  • Exif and PDF metadata parsing
  • MD5 for duplicate detection
  • CSV for reports
  • PyInstaller for single-file binaries
  • uv for dependency management

Dependencies

Build and run

From the project root, build a single-file binary:

make

or

uv run --extra build pyinstaller --onefile -n organizer src/organizer/cli.py

Copy dist/organizer into your $PATH and run it directly:

organizer --version

Common usage

Show version and build time:

organizer --version

Dry run (CSV only, no changes) in the current directory:

organizer

Apply changes (move originals only):

organizer --organize /path/to/folder

Organize into a separate output folder:

organizer --organize /path/to/folder --output-root /path/to/output

Delete duplicates only:

organizer --delete-duplicates /path/to/folder

Delete empty folders in current working directory:

organizer --delete-empty-folders
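
For readers curious how empty-folder cleanup can work, here is a minimal sketch using a bottom-up walk. The function `delete_empty_folders` and its `dry_run` parameter are hypothetical and do not describe the tool's internals.

```python
import os
from pathlib import Path


def delete_empty_folders(root: Path, dry_run: bool = True) -> list[Path]:
    """Return empty folders under root, removing them unless dry_run is set."""
    removed: list[Path] = []
    removed_set: set[Path] = set()
    # Walk bottom-up so children are handled before their parents.
    for current, _dirnames, _filenames in os.walk(root, topdown=False):
        folder = Path(current)
        if folder == root:
            continue  # never remove the starting folder itself
        # Treat folders already marked for removal as gone, so a dry run
        # reports the same set that a real run would delete.
        remaining = [entry for entry in folder.iterdir() if entry not in removed_set]
        if not remaining:
            removed.append(folder)
            removed_set.add(folder)
            if not dry_run:
                folder.rmdir()
    return removed
```

`Path.rmdir` only succeeds on empty directories, which acts as a second safety net if a file appears between the check and the removal.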

Custom report path (dry run):

organizer --report /path/to/report.csv
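
A CSV report of planned actions could be produced along the following lines. This is only a sketch: the column names in `FIELDS` and the function `write_report` are invented for illustration, and the tool's real report may use different columns.

```python
import csv
from pathlib import Path

# Hypothetical column names; the real report may use different ones.
FIELDS = ["source", "md5", "action", "target"]


def write_report(rows: list[dict[str, str]], report_path: Path) -> None:
    """Write one CSV row per planned action, with a header line."""
    # newline="" lets the csv module control line endings on every platform.
    with report_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```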

Tests

uv run --extra test pytest

Safety

  • Default behavior is a dry run (CSV only).
  • Use either --organize or --delete-duplicates to apply changes.
  • Organizing and deleting are separate runs.
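
The safety rules above can be modelled with argparse, as in the sketch below. It assumes that --organize and --delete-duplicates are boolean flags and that the folder is an optional positional argument defaulting to the current directory. Those are guesses based on the usage examples, not a description of cli.py, and `build_parser` and `is_dry_run` are hypothetical names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser where organizing and deleting cannot be combined."""
    parser = argparse.ArgumentParser(prog="organizer")
    # Assumption: the folder is positional and defaults to the current directory.
    parser.add_argument("folder", nargs="?", default=".")
    # Organizing and deleting are separate runs, so the flags exclude each other.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--organize", action="store_true")
    mode.add_argument("--delete-duplicates", action="store_true")
    return parser


def is_dry_run(args: argparse.Namespace) -> bool:
    """With neither flag given, nothing on disk should change."""
    return not (args.organize or args.delete_duplicates)
```

With this layout, passing both flags makes argparse exit with a usage error before any file is touched.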
