Duplicate image finder and remover — Scans folders recursively to find and remove duplicate/near-duplicate images using perceptual hashing (dHash).
✨ Automatic Duplicate Detection — Finds exact and near-duplicate images without manual reference selection
⚡ Multi-threaded Processing — Fast parallel hashing optimized for large collections (tested with 24,000+ images)
📊 Progress Tracking — Real-time progress bar, status updates, and detailed activity log
🎯 Configurable Similarity — Adjustable Hamming threshold for near-duplicate detection (0-64 bits)
🗂️ Safe File Management — Moves duplicates to specified folder with automatic conflict resolution
🚫 Cancellation Support — Stop scanning safely at any time
📈 Summary Reports — Detailed statistics on files scanned, duplicates found, and errors
- Perceptual Hashing — Computes a compact 64-bit dHash (difference hash) for each image
- Grouping — Groups images by identical hash values
- Near-Duplicate Detection — Finds similar images using Hamming distance comparison
- Safe Removal — Keeps one original per group, moves duplicates to the deleted_images folder
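The first two steps above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: it assumes the image has already been converted to grayscale and resized to 9 columns by 8 rows (the resize step is omitted), and the bit convention (a bit is 1 when brightness increases left to right) is an assumption, since dHash implementations differ on this. The names `dhash` and `group_by_hash` are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict


def dhash(pixels: list[list[int]]) -> int:
    """Return a 64-bit difference hash for an 8-row by 9-column grayscale grid."""
    if len(pixels) != 8 or any(len(row) != 9 for row in pixels):
        raise ValueError("expected 8 rows of 9 grayscale values")
    value = 0
    for row in pixels:
        # 9 pixels per row give 8 adjacent pairs, so 8 rows yield 64 bits.
        for left, right in zip(row, row[1:]):
            # Assumed convention: bit is 1 when the right pixel is brighter.
            value = (value << 1) | (1 if right > left else 0)
    return value


def group_by_hash(hashes: dict[str, int]) -> dict[int, list[str]]:
    """Group file paths by identical hash; keep only groups with duplicates."""
    groups: dict[int, list[str]] = defaultdict(list)
    for path, image_hash in hashes.items():
        groups[image_hash].append(path)
    return {h: sorted(paths) for h, paths in groups.items() if len(paths) > 1}
```

Because the hash depends only on the relative brightness of neighboring pixels, re-encoding or mild rescaling of an image tends to leave most of the 64 bits unchanged, which is what makes the later Hamming comparison meaningful.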
- Python 3.10+ (Python 3.11 recommended for best compatibility)
- Windows, macOS, or Linux
Windows (PowerShell):
# Create and activate virtual environment
python -m venv .venv
.\.venv\Scripts\Activate.ps1
# Upgrade pip and install dependencies
python -m pip install --upgrade pip setuptools wheel
pip install -r requirements.txt
# If opencv-python fails to install, try:
python -m pip install --only-binary=:all: numpy
python -m pip install --no-deps opencv-python

macOS / Linux:
# Create and activate virtual environment
python3 -m venv .venv
source .venv/bin/activate
# Install dependencies
pip install --upgrade pip setuptools wheel
pip install -r requirements.txt

Run the application:
python main.pyw

- Select Folder — Click "Select Folder to Scan" and choose the root folder containing your images
- Configure Settings (optional):
  - Deleted Images Folder — Customize where duplicates are moved (default: <scan_folder>\deleted_images)
  - Hamming Threshold — Adjust sensitivity (0=exact matches only, 5=recommended, higher=more permissive)
  - Max Threads — Set number of parallel threads (default: up to 8)
- Start Scan — Click "Start Scan" to begin processing
- Monitor Progress — Watch the progress bar and activity log
- Review Results — View summary dialog with statistics when complete
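Moving duplicates into the deleted images folder can collide with a file of the same name that is already there. One plausible way to resolve such conflicts is to append a numeric suffix until the name is free. The sketch below is an assumption about how this could work, not the application's confirmed behavior; `move_without_overwrite` and the `_1`, `_2` suffix scheme are hypothetical.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def move_without_overwrite(src: Path, dest_dir: Path) -> Path:
    """Move src into dest_dir, renaming it if the target name is taken."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / src.name
    counter = 1
    while target.exists():
        # Hypothetical naming scheme: photo.jpg, photo_1.jpg, photo_2.jpg, ...
        target = dest_dir / f"{src.stem}_{counter}{src.suffix}"
        counter += 1
    shutil.move(str(src), str(target))
    return target
```

Moving rather than deleting keeps the operation reversible: a file that was flagged by mistake can simply be dragged back from the deleted images folder.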
Click the Cancel button at any time to safely stop the scan. Files already moved will remain in the deleted folder.
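A common way to implement this kind of safe cancellation is a shared `threading.Event` that the worker checks between files, so a file is never abandoned halfway through a move. The following is a sketch of that pattern under the assumption that the scan is a simple loop; `process_until_cancelled` and `handle` are hypothetical names, not part of the application.

```python
import threading
from typing import Callable, Iterable


def process_until_cancelled(
    items: Iterable,
    handle: Callable[[object], None],
    cancel_event: threading.Event,
) -> int:
    """Process items one by one, stopping cleanly once cancel_event is set.

    Returns the number of items that were fully processed.
    """
    done = 0
    for item in items:
        # Check only between items so that no item is left half-processed.
        if cancel_event.is_set():
            break
        handle(item)
        done += 1
    return done
```

In a GUI, the Cancel button's callback would only need to call `cancel_event.set()`; the background thread notices the flag before it starts the next file.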
Controls how similar images must be to be considered duplicates:
- 0 — Only exact matches (identical hashes)
- 1-5 — Very similar images (recommended: 5)
- 6-15 — Similar images with minor differences
- 16+ — Increasingly permissive (may group dissimilar images)
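The Hamming distance between two 64-bit hashes is the number of bit positions in which they differ, which can be computed by XOR-ing the hashes and counting the set bits. The sketch below assumes that two images count as near-duplicates when the distance is less than or equal to the threshold, which is consistent with 0 meaning identical hashes only. The function names are hypothetical.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    # XOR leaves a 1 in every position where the two hashes disagree.
    return bin(a ^ b).count("1")


def is_near_duplicate(a: int, b: int, threshold: int = 5) -> bool:
    """Assumed rule: near-duplicate when the distance is within the threshold."""
    return hamming_distance(a, b) <= threshold


# Example: 0b1011 and 0b0001 differ in two positions.
assert hamming_distance(0b1011, 0b0001) == 2
```

With a 64-bit hash, a threshold of 5 tolerates roughly 8% of the bits differing, which explains why values of 16 and above start to group images that are visibly different.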
BMP, DIB, JPEG, JPG, JPE, JP2, PNG, WebP, PBM, PGM, PPM, PXM, PNM, SR, RAS, TIFF, TIF
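A recursive scan restricted to these formats could be sketched with `pathlib`, matching extensions case-insensitively. This is an illustration only: `find_images` is a hypothetical helper, and skipping the deleted images folder during the scan is an assumption made here because the default location lies inside the scanned folder.

```python
from __future__ import annotations

from pathlib import Path

# Lowercase extensions for the formats listed above.
IMAGE_EXTENSIONS = {
    ".bmp", ".dib", ".jpeg", ".jpg", ".jpe", ".jp2", ".png", ".webp",
    ".pbm", ".pgm", ".ppm", ".pxm", ".pnm", ".sr", ".ras", ".tiff", ".tif",
}


def find_images(root: Path, skip_dir: Path | None = None) -> list[Path]:
    """Recursively collect image files under root, optionally skipping one folder."""
    found = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        if path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        # Assumption: files already moved to the deleted folder are not rescanned.
        if skip_dir is not None and skip_dir in path.parents:
            continue
        found.append(path)
    return sorted(found)
```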
- ~24,000 images — Tested and optimized for large collections
- Multi-threaded hashing — Parallel processing speeds up scanning significantly
- Low memory footprint — Stores only hashes and file paths, not full images
- Responsive UI — All processing runs in background threads; UI remains interactive
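The parallel hashing described above maps naturally onto `concurrent.futures.ThreadPoolExecutor`. The sketch below is an assumption about the general shape, not the project's code: `hash_all` and `default_thread_count` are hypothetical names, and reading "up to 8" as the smaller of 8 and the CPU count is a guess. Only the path and its hash (or error message) are kept, which matches the low-memory claim.

```python
from __future__ import annotations

import os
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterable


def default_thread_count(limit: int = 8) -> int:
    """Assumed default: one thread per CPU, capped at `limit`."""
    return max(1, min(limit, os.cpu_count() or 1))


def hash_all(
    paths: Iterable[str],
    hash_func: Callable[[str], int],
    max_workers: int | None = None,
) -> tuple[dict[str, int], dict[str, str]]:
    """Hash every path in parallel; return (hashes, errors) keyed by path."""
    hashes: dict[str, int] = {}
    errors: dict[str, str] = {}
    workers = max_workers or default_thread_count()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(hash_func, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                hashes[path] = future.result()
            except Exception as exc:
                # An unreadable file is recorded and does not stop the scan.
                errors[path] = str(exc)
    return hashes, errors
```

Threads help here mainly when decoding is done by a native library that releases the interpreter lock; collecting results with `as_completed` also gives a natural point to update a progress bar after each file.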
If pip install opencv-python fails with numpy build errors:
# Install numpy binary wheel first
python -m pip install --only-binary=:all: numpy
# Then install OpenCV without rebuilding dependencies
python -m pip install --no-deps opencv-python

Python 3.14+ may have limited wheel availability. Use Python 3.10 or 3.11 for best compatibility.
MIT License
Copyright (c) 2026 QuantumPixelator
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.