VisionSort

Rename hundreds of screenshots in minutes instead of hours — powered by local AI.

VisionSort scans a folder of images, sends each one to a vision model running on your machine through Ollama, and proposes descriptive filenames. You review the suggestions in a clean web interface, tweak anything you want, then apply all renames with one click.

Everything runs locally. No uploads, no API keys, no cloud services.

What It Does

  • Analyzes images with AI — A local vision model looks at each image and generates a short, descriptive filename (e.g., Screenshot_2025-03-14_091823.png → vscode-python-debugging.png).
  • Batch processing — Point it at a folder with thousands of images and walk away. A progress bar with ETA keeps you informed.
  • Review before applying — Every suggestion shows up as an editable card. Approve, edit, skip, or re-analyze individual files. Bulk-approve when you're happy.
  • Two modes — Rename files in place, or move them to a target directory.
  • Duplicate detection — Finds visually similar images using perceptual hashing so you can clean up before renaming.
  • NSFW detection — Flags sensitive content automatically with an nsfw- prefix.
  • Persistent image IDs — Embeds a unique identifier into each image's metadata (EXIF for JPEG/WebP, tEXt chunk for PNG). Files keep their identity across renames and moves.
  • Undo support — Every Apply generates a PowerShell undo script and JSON manifest. One click to reverse.
  • Pause & resume — Pause analysis mid-batch and pick up where you left off.
  • History — Full audit trail of all operations in a searchable carousel.
  • Dark & light mode — Follows your system preference, with a manual toggle.
  • System monitoring — Live CPU, RAM, and GPU usage in the header.
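
The duplicate-detection feature above relies on perceptual hashing. VisionSort's exact implementation lives in its duplicates.py service; as a minimal illustration of the technique, here is a pure-Python average hash ("aHash") over an 8×8 grayscale grid — in practice an image library such as Pillow would do the downscaling first:

```python
def average_hash(pixels):
    """Compute a 64-bit average hash from an 8x8 grayscale grid.

    `pixels` is an 8x8 list of lists of 0-255 brightness values. Each bit
    records whether that pixel is brighter than the grid's mean.
    """
    flat = [p for row in pixels for p in row]
    avg = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p >= avg else 0)
    return bits

def hamming_distance(h1, h2):
    """Number of differing bits; a small distance means visually similar images."""
    return bin(h1 ^ h2).count("1")

# Two grids differing in a single pixel hash to nearby values.
grid_a = [[(r * 8 + c) * 4 % 256 for c in range(8)] for r in range(8)]
grid_b = [row[:] for row in grid_a]
grid_b[0][0] = 255 - grid_b[0][0]
assert hamming_distance(average_hash(grid_a), average_hash(grid_b)) <= 2
```

Because near-identical images produce hashes within a few bits of each other, duplicates can be grouped by thresholding the Hamming distance rather than comparing raw pixels.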

Requirements

  • Python 3.11+
  • Ollama running with at least one vision model
  • NVIDIA GPU recommended for faster inference (works on CPU too, just slower)

Quick Start

Windows

git clone https://github.com/IamAngusU/vision-sort.git
cd vision-sort
start.bat

macOS / Linux

git clone https://github.com/IamAngusU/vision-sort.git
cd vision-sort
chmod +x start.sh
./start.sh

The start script creates a virtual environment, installs dependencies, and opens the browser automatically.

Pull a vision model (if you haven't already)

ollama pull qwen2.5vl:7b

VisionSort will also suggest models directly in the UI and can pull them for you.

Manual setup

If you prefer to set things up yourself:

git clone https://github.com/IamAngusU/vision-sort.git
cd vision-sort
pip install -r requirements.txt
python run.py

Then open http://localhost:8899.

How To Use

  1. Start Ollama — Make sure Ollama is running before you start VisionSort.
  2. Select a folder — Browse to the directory with your screenshots using the sidebar.
  3. Pick a model — Choose from the installed vision models or pull a new one.
  4. Start analysis — Hit "Start" and watch the AI work through your images.
  5. Review suggestions — Edit or approve the generated names. Use filters and search to find specific files.
  6. Apply — Click "Apply Approved" to rename/move all approved files at once.
  7. Stop the server — Click "Stop Server" in the sidebar or close the terminal.

Stopping the Server

Three ways to stop VisionSort:

  • Click Stop Server in the sidebar
  • Press Ctrl+C in the terminal
  • Close the terminal window

Configuration

All settings are optional and controlled through environment variables:

| Variable                    | Default       | Description                    |
|-----------------------------|---------------|--------------------------------|
| VISION_SORT_OLLAMA_URL      | auto-detected | Ollama API endpoint            |
| VISION_SORT_DEFAULT_MODEL   | qwen2.5vl:7b  | Default vision model           |
| VISION_SORT_HOST            | 127.0.0.1     | Server bind address            |
| VISION_SORT_PORT            | 8899          | Server port                    |
| VISION_SORT_BASE_PATH       |               | Base path for reverse proxy    |
| VISION_SORT_DATA_DIR        | data          | State and thumbnail storage    |
| VISION_SORT_DEBUG           | false         | Enable debug mode and API docs |
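
Environment-driven settings like these are typically read once at startup with per-variable fallbacks. A minimal sketch of the pattern — the `Settings` dataclass and `load_settings` below are illustrative, not VisionSort's actual config.py:

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    host: str = "127.0.0.1"
    port: int = 8899
    default_model: str = "qwen2.5vl:7b"
    base_path: str = ""
    data_dir: str = "data"
    debug: bool = False

def load_settings(env=os.environ) -> Settings:
    """Build Settings from VISION_SORT_* variables, falling back to defaults."""
    return Settings(
        host=env.get("VISION_SORT_HOST", "127.0.0.1"),
        port=int(env.get("VISION_SORT_PORT", "8899")),
        default_model=env.get("VISION_SORT_DEFAULT_MODEL", "qwen2.5vl:7b"),
        base_path=env.get("VISION_SORT_BASE_PATH", "").rstrip("/"),
        data_dir=env.get("VISION_SORT_DATA_DIR", "data"),
        debug=env.get("VISION_SORT_DEBUG", "false").lower() in ("1", "true", "yes"),
    )
```

Passing the environment as a parameter keeps the loader easy to test with a plain dict.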

Project Structure

vision-sort/
├── start.bat / start.sh      # One-click launcher
├── run.py                     # Entry point
├── requirements.txt
├── src/
│   ├── app.py                 # FastAPI application factory
│   ├── config.py              # Environment-driven configuration
│   ├── models/
│   │   └── schemas.py         # Data models
│   ├── routers/
│   │   ├── events.py          # Server-Sent Events (live updates)
│   │   ├── files.py           # File management & thumbnails
│   │   ├── models.py          # Ollama models & system info
│   │   └── process.py         # Analysis, rename, undo
│   ├── services/
│   │   ├── analyzer.py        # Vision AI pipeline
│   │   ├── duplicates.py      # Perceptual hash duplicate detection
│   │   ├── image_id.py        # Persistent EXIF/PNG metadata IDs
│   │   ├── history.py         # Operation audit trail
│   │   ├── ollama_client.py   # Ollama HTTP client
│   │   ├── renamer.py         # Rename, move, backup, undo
│   │   ├── scanner.py         # Directory scanning
│   │   └── state.py           # Thread-safe state persistence
│   └── static/                # Web interface (vanilla JS, no build step)
└── data/                      # Auto-created: state, thumbnails

Reverse Proxy (nginx)

To serve behind nginx at /vision-sort:

location /vision-sort/ {
    proxy_pass http://127.0.0.1:8899/;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_http_version 1.1;
    proxy_set_header Connection '';
    proxy_buffering off;
    proxy_cache off;
    proxy_read_timeout 86400s;
}

Then start the server with the matching base path:

VISION_SORT_BASE_PATH=/vision-sort python run.py

License

PolyForm Noncommercial 1.0.0 — free for personal and non-commercial use. Commercial use is not permitted.
