AI-powered desktop application that automatically sorts thousands of gymnastics competition photos into folders organised by team and then by individual gymnast — fully offline, no cloud APIs required, NVIDIA CUDA accelerated.
- Features
- Architecture
- Installation
- GPU / CUDA Setup
- Usage Guide
- AI Model Explanations
- Performance Tuning
- Troubleshooting
- Project Structure
- License
| Category | Capability |
|---|---|
| Detection | YOLOv8x-pose person detection — works upside-down, airborne, partial |
| Team ID | Colour analysis + CLIP + OCR + temporal grouping ensemble |
| Person ID | InsightFace + OSNet Re-ID + pose skeleton + hair descriptor |
| Clustering | HDBSCAN / DBSCAN with FAISS nearest-neighbour search |
| Rotation | Automatic upside-down / flip / handstand normalisation |
| Duplicates | MD5 + perceptual hash + embedding similarity + burst detection |
| UI | Gradio web UI — team browser, gymnast browser, live logs |
| Corrections | Merge / split / rename with active-learning centroid updates |
| Output | Copy / Move / Symlink mode, SQLite DB, JSON export |
| Scale | 10 000+ photos, batched GPU processing, resume mode |
Image-Sorter/
├── main.py # CLI / launch entry point
├── requirements.txt
├── pinokio.js # Pinokio app descriptor
├── install.js # Pinokio install script
├── start.js # Pinokio start script
├── update.js # Pinokio update script ← one-click update
├── package.json
│
├── backend/
│ ├── models/
│ │ ├── person_detector.py # YOLOv8-pose wrapper
│ │ ├── team_identifier.py # CLIP + colour + OCR team features
│ │ ├── person_identifier.py # Face + ReID + pose + hair fusion
│ │ └── duplicate_detector.py # Multi-stage duplicate detection
│ │
│ ├── pipeline/
│ │ ├── processor.py # End-to-end orchestration
│ │ ├── clustering.py # HDBSCAN team + person clustering
│ │ └── job_queue.py # Async job queue + progress tracker
│ │
│ ├── ui/
│ │ └── app.py # Gradio Blocks UI
│ │
│ ├── utils/
│ │ ├── config.py # AppConfig dataclass + singleton
│ │ ├── database.py # SQLite WAL database layer
│ │ ├── file_ops.py # Copy / move / symlink organiser
│ │ ├── image_utils.py # Load, hash, blur, quality, colour
│ │ └── logging_utils.py # Coloured rotating-file logging
│ │
│ ├── cache/
│ │ └── embeddings_cache.py # FAISS indices + pickle persistence
│ │
│ └── data/
│ └── schemas.py # Dataclass schemas
│
├── scripts/
│ └── download_models.py # Pre-download all AI weights
│
├── cache/ # Auto-created at runtime
│ ├── models/ # Downloaded model weights
│ ├── thumbnails/
│ └── embeddings/
│
├── data/ # Auto-created at runtime
│ ├── sorter.db # SQLite database
│ └── config.json # Saved settings
│
└── logs/
└── image_sorter.log
Input Folder
│
▼
[collect_images] ──► [duplicate_detector] ──► skip duplicates
│
▼
[load_image_rgb + EXIF + quality]
│
▼
[PersonDetector / YOLOv8-pose] ──► bounding boxes + 17-pt pose
│
├── [TeamIdentifier]
│ ├── Torso colour histogram (HSV 96-d)
│ ├── CLIP ViT-L/14 embedding (768-d)
│ └── EasyOCR jersey text
│
└── [PersonIdentifier]
├── InsightFace 512-d face embedding
├── OSNet Re-ID 512-d body embedding
├── Pose skeleton limb-ratio vector (12-d)
└── Hair colour histogram (48-d)
│
▼
[EmbeddingsCache / FAISS] ── persist to SQLite
│
▼
[TeamClusterer (HDBSCAN)] ──► team_001 … team_N
│
▼
[PersonClusterer (HDBSCAN per team)] ──► gymnast_001 … gymnast_M
│
▼
[organise_file (copy/move/symlink)]
│
▼
OUTPUT/
TEAM_NAME/
GYMNAST_NAME/
image001.jpg …
- Install Pinokio 7.x from pinokio.computer.
- In Pinokio, click Discover → paste the URL of this repository.
- Click Install — Pinokio runs
install.jswhich:- Creates
.venv(Python 3.11 virtual environment) - Detects CUDA and installs the correct PyTorch build
- Installs all dependencies from
requirements.txt - Downloads all AI model weights
- Creates
- Click Start to launch the web UI (opens automatically in your browser).
- To update: click Update — Pinokio runs
update.jswhich:- Stops the app if running
- Pulls the latest code (
git pull --rebase --autostash) - Upgrades pip dependencies
- Re-detects CUDA and upgrades PyTorch if needed
- Refreshes any updated model weights
Requirements: Python 3.11+, Git
git clone https://github.com/Arnold2006/Image-Sorter.git
cd Image-Sorter
python -m venv .venv
# Windows: .venv\Scripts\activate
# Mac/Linux: source .venv/bin/activate
pip install --upgrade pip setuptools wheel
# CUDA 12.1 GPU:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
# CPU only:
pip install torch torchvision torchaudio
pip install -r requirements.txt
python scripts/download_models.py
python main.pyOpen your browser at http://127.0.0.1:7860
| Scenario | Action |
|---|---|
| NVIDIA GPU, CUDA 12.x | pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121 |
| NVIDIA GPU, CUDA 11.8 | pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118 |
| Apple Silicon (M1/M2) | pip install torch torchvision (MPS enabled automatically) |
| CPU only | pip install torch torchvision |
Verify GPU: python -c "import torch; print(torch.cuda.is_available())"
For InsightFace GPU: pip uninstall onnxruntime && pip install onnxruntime-gpu
For FAISS GPU: pip uninstall faiss-cpu && pip install faiss-gpu
- Process tab → set input/output folders → choose file mode → Start
- Teams tab → review and rename auto-detected teams
- Gymnasts tab → browse gymnasts per team
- Corrections tab → merge/split/rename identities (active learning)
- Export tab → JSON export and DB stats
| Model | Role |
|---|---|
| YOLOv8x-pose | Person detection + 17-keypoint skeleton; orientation normalisation |
| CLIP ViT-L/14 | Semantic outfit similarity embeddings for team clustering |
| InsightFace buffalo_l | 512-d ArcFace face embeddings for individual ID |
| OSNet (torchreid) | 512-d body Re-ID embeddings robust to pose/viewpoint changes |
| HDBSCAN | Auto-K density clustering; handles noise/outliers naturally |
| FAISS | Sub-millisecond nearest-neighbour search over millions of vectors |
| EasyOCR | Jersey number / team name text detection |
| Setting | Recommendation |
|---|---|
| Batch Size | 16 default; increase to 32+ if VRAM > 8 GB |
| YOLO Confidence | 0.35 default; lower to 0.25 for distant athletes |
| Clustering Method | HDBSCAN (auto-K); switch to dbscan if too many clusters |
| VRAM < 6 GB | Switch CLIP to ViT-B/32 in settings |
| 10 000+ photos | Resume mode on (default); checkpoint every 500 images |
ModuleNotFoundError → activate .venv then pip install -r requirements.txt
CUDA not detected → reinstall PyTorch with correct CUDA index URL
InsightFace download fails → manually place buffalo_l.zip in cache/models/
No persons detected → lower YOLO confidence to 0.25 in Settings
All photos in one team → lower team_min_cluster_size or switch to dbscan
Port already in use → python main.py --port 7861
| File | Purpose |
|---|---|
main.py |
CLI entry point |
update.js |
Pinokio one-click update script |
install.js |
Pinokio install script |
start.js |
Pinokio launch script |
backend/pipeline/processor.py |
Full pipeline orchestrator |
backend/pipeline/clustering.py |
HDBSCAN team + person clustering |
backend/models/person_detector.py |
YOLOv8-pose wrapper |
backend/models/team_identifier.py |
CLIP + colour + OCR team features |
backend/models/person_identifier.py |
Face + Re-ID + pose + hair |
backend/ui/app.py |
Gradio UI |
backend/utils/database.py |
SQLite persistence layer |
scripts/download_models.py |
Pre-download all AI weights |
MIT License