Disk Scan Tools

A Python desktop application for Windows that scans drives, identifies large or unnecessary files, and uses AI-assisted analysis to help decide what can be safely deleted. It ships with a Tkinter GUI and a full CLI.

Features

Fast Python-native disk scan — uses os.scandir() for significantly faster traversal than PowerShell, with real-time progress (dirs scanned / files found / GB)
Heuristic classification — automatically categorises files as low / medium / high risk based on extension, age, size, and path patterns (temp, cache, crash dumps, old installers, etc.)
AI analysis via OpenRouter — sends candidates to any LLM (GPT-4o, Claude, Gemini, ...) for enriched recommendations; results include action, risk, confidence, and a plain-language explanation
Model browser — searchable, filterable, sortable tile view of 350+ OpenRouter models with context window and cost per 1M tokens; model list cached locally
Prompt management — write and save custom prompts to ~/.disk-scan-tools/prompts/; select the active prompt from a dropdown
Secure API key storage — OpenRouter key stored in Windows Credential Manager (DPAPI encrypted), never in a plain-text file
Settings cascade — defaults -> ~/.disk-scan-tools/config.json -> CLI flags / GUI overrides
Export & delete — export candidates to CSV (full / low-risk / programs); delete with dry-run mode and a CSV audit log
Full CLI — every operation available from the command line
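
As a rough illustration of the os.scandir() approach described above, the sketch below walks a tree iteratively and keeps the largest files. It is not the code in scanner.py: the function name scan_largest and its return shape are hypothetical, and only the top, min_size_mb and skip_dirs parameters mirror settings named in this README.

```python
import heapq
import os


def scan_largest(root, top=200, min_size_mb=10, skip_dirs=()):
    """Return up to `top` (size_bytes, path) pairs, largest first.

    Hypothetical helper: the real scanner.py may be organised differently.
    """
    min_bytes = min_size_mb * 1024 * 1024
    found = []
    stack = [root]  # explicit stack instead of recursion
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    try:
                        if entry.is_dir(follow_symlinks=False):
                            # Skip directories whose path contains an excluded fragment.
                            if not any(frag in entry.path for frag in skip_dirs):
                                stack.append(entry.path)
                        elif entry.is_file(follow_symlinks=False):
                            size = entry.stat(follow_symlinks=False).st_size
                            if size >= min_bytes:
                                found.append((size, entry.path))
                    except OSError:
                        continue  # entry vanished or is unreadable
        except OSError:
            continue  # directory unreadable, e.g. access denied
    return heapq.nlargest(top, found)
```

Unreadable directories are skipped rather than aborting the scan, which matters on system drives where access-denied errors are common.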

Requirements

Windows 10 / 11
Python 3.10 or later (3.11+ recommended)
No external pip packages required — uses only the standard library (tkinter, winreg, ctypes, urllib, os.scandir, ...)

Installation

git clone https://github.com/prachwal/disk-scan-tools.git
cd disk-scan-tools
python gui.py          # launch GUI

No pip install required.

Project Structure

disk-scan-tools/
|-- config.py       # Settings cascade + Windows Credential Manager helpers
|-- scanner.py      # Python-native disk scanner + registry program list
|-- analyzer.py     # Heuristic engine + OpenRouter LLM enrichment
|-- models.py       # OpenRouter model catalogue with local cache
|-- exporter.py     # CSV export (full / low-risk / programs)
|-- deleter.py      # Safe file deletion with dry-run and audit log
|-- cli.py          # Full command-line interface
`-- gui.py          # Tkinter GUI (calls modules in-process)

User data is stored in ~/.disk-scan-tools/:

~/.disk-scan-tools/
|-- config.json         # Persisted settings (no API key ever stored here)
|-- models_cache.json   # Cached OpenRouter model list (refreshed every 12 h by default)
`-- prompts/            # Custom LLM prompt files (*.txt)
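
One plausible way to decide whether models_cache.json can be reused is to compare its age with the models_cache_hours setting. The sketch below assumes freshness is judged by the file's modification time; models.py may instead store a timestamp inside the JSON, and the function name load_cached_models is hypothetical.

```python
import json
import os
import time


def load_cached_models(cache_path, max_age_hours=12, now=None):
    """Return the cached model list, or None if missing, stale or unreadable.

    Assumption: staleness is measured from the file's mtime.
    """
    now = time.time() if now is None else now
    try:
        age_seconds = now - os.path.getmtime(cache_path)
        if age_seconds > max_age_hours * 3600:
            return None  # too old: caller should refetch from OpenRouter
        with open(cache_path, encoding="utf-8") as fh:
            return json.load(fh)
    except (OSError, json.JSONDecodeError):
        return None  # no cache yet, or the file is corrupt
```

Returning None for every failure mode lets the caller treat "no cache", "stale cache" and "corrupt cache" identically and simply fetch the list again.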

Scan results and candidates are written to C:\Users\Public\Documents\disk-scan-tools\ by default (configurable via the out_dir setting).

GUI Walkthrough

Pliki (Files) tab

Set the root path (or click Przegladaj, "Browse")
Adjust Top (max files returned) and Min MB (minimum file size)
Click Skanuj ("Scan") and watch the live progress bar and log
Click Analizuj ("Analyze"): heuristics run instantly; if a model is selected and an API key is set, LLM enrichment runs in batches of 20 files
The candidate grid loads automatically; click column headers to sort
Use Dry-run low-risk to preview what would be deleted, then Usun low-risk ("Delete low-risk") to actually remove the files
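
The dry-run and delete steps above could be implemented along the following lines. This is a minimal sketch, not the contents of deleter.py: the function name delete_low_risk, the candidate keys "path" and "risk", and the audit log columns (timestamp, path, status) are all assumptions made for illustration.

```python
import csv
import os
from datetime import datetime, timezone


def delete_low_risk(candidates, audit_csv, perform=False):
    """Delete (or only preview) low-risk candidates and log each decision.

    Assumed candidate shape: {"path": str, "risk": "low" | "medium" | "high"}.
    """
    rows = []
    for item in candidates:
        if item.get("risk") != "low":
            continue  # medium and high risk files are never touched
        path = item["path"]
        if not perform:
            status = "dry-run"  # preview only, nothing is removed
        else:
            try:
                os.remove(path)
                status = "deleted"
            except OSError as exc:
                status = f"error: {exc.strerror}"
        rows.append([datetime.now(timezone.utc).isoformat(), path, status])
    # Append so that earlier runs stay in the audit trail.
    with open(audit_csv, "a", newline="", encoding="utf-8") as fh:
        csv.writer(fh).writerows(rows)
    return rows
```

Making the dry run the default means a destructive run has to be requested explicitly, in the same spirit as the --perform flag in the CLI.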

Prompt tab

Edit the system prompt directly in the text editor
Save named prompts to ~/.disk-scan-tools/prompts/
Select the active prompt from the dropdown

Modele AI (AI Models) tab

Paste your OpenRouter API key and click Zapisz klucz ("Save key"); the key is stored in Credential Manager
Click Pobierz modele ("Fetch models")
Filter by text search, select a provider from the dropdown, and sort by:
- Name (A-Z)
- Prompt cost ($/1M tokens)
- Completion cost ($/1M tokens)
- Context length (tokens)
Click Wybierz ("Select") on the model tile you want
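
The filtering and sorting offered by this tab can be sketched as below. The record shape is an assumption: each model is taken to be a dict with "id" (provider/model), "name", "context_length" and a "pricing" dict whose "prompt" and "completion" values are per-token prices as strings. The function names are hypothetical, and sorting context length largest-first is a guess at the GUI's behaviour.

```python
def cost_per_million(price_per_token):
    """Convert a per-token price string to dollars per 1M tokens."""
    return float(price_per_token) * 1_000_000


def filter_models(models, text="", provider="", sort_by="name"):
    """Filter by free text and provider prefix, then sort by one criterion."""
    keys = {
        "name": lambda m: m["name"].lower(),
        "prompt": lambda m: float(m["pricing"]["prompt"]),
        "completion": lambda m: float(m["pricing"]["completion"]),
        "context": lambda m: -m["context_length"],  # largest window first
    }
    text = text.lower()
    selected = [
        m for m in models
        if text in (m["id"].lower() + " " + m["name"].lower())
        and (not provider or m["id"].split("/", 1)[0] == provider)
    ]
    return sorted(selected, key=keys[sort_by])
```

An empty text or provider matches everything, so the same function covers the unfiltered view.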

CLI Reference

# Full pipeline in one command
python cli.py run --path E:\Data --perform-delete

# Individual steps
python cli.py scan     --path E:\Data --top 300 --min-mb 5
python cli.py analyze  --model anthropic/claude-3-haiku
python cli.py export
python cli.py delete   --perform

# Model browser
python cli.py models --filter gpt --refresh

# Prompt management
python cli.py prompts list
python cli.py prompts save my_prompt --text "Your prompt here"
python cli.py prompts show my_prompt

# Configuration
python cli.py config show
python cli.py config set openrouter_model openai/gpt-4o-mini
python cli.py config set-key          # interactive API key prompt (Credential Manager)
python cli.py config delete-key

Settings cascade

Priority	Source
Lowest	Hardcoded defaults in `config.py`
Middle	`~/.disk-scan-tools/config.json`
Highest	CLI flags / GUI inputs
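
The cascade in the table above amounts to three dictionary merges applied in priority order. The following is a minimal sketch under stated assumptions: resolve_settings is a hypothetical name, DEFAULTS lists only a few of the keys from the configuration reference, and an override value of None is taken to mean "flag not supplied".

```python
import json

# A subset of the documented defaults, for illustration only.
DEFAULTS = {"scan_path": "C:\\", "top": 200, "min_size_mb": 10}


def resolve_settings(config_path, overrides=None):
    """Merge defaults, then config.json, then CLI/GUI overrides."""
    settings = dict(DEFAULTS)  # lowest priority
    try:
        with open(config_path, encoding="utf-8") as fh:
            settings.update(json.load(fh))  # middle priority
    except (OSError, json.JSONDecodeError):
        pass  # no usable config file: keep the defaults
    # Highest priority: only values the user actually supplied.
    supplied = {k: v for k, v in (overrides or {}).items() if v is not None}
    settings.update(supplied)
    return settings
```

Filtering out None is what lets an argparse-style namespace, where unset flags default to None, be passed in without clobbering values from config.json.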

Configuration reference

Key	Default	Description
`scan_path`	`C:\`	Root path for scan
`top`	`200`	Max files returned
`min_size_mb`	`10`	Minimum file size filter (MB)
`include_programs`	`true`	Scan installed programs registry
`openrouter_model`	`""`	Default OpenRouter model ID
`models_cache_hours`	`12`	How long to reuse the cached model list
`out_dir`	`C:\Users\Public\Documents\disk-scan-tools\`	Directory for scan results and candidates
`selected_prompt`	`""`	Name of active prompt file
`skip_dirs`	see config.py	Path fragments excluded from scan

How AI analysis works

Files pass through heuristic rules first (no API call needed).
If an API key and model are configured, candidates are sent to the LLM in batches of 20.
The system prompt instructs the model to return a pure JSON array.
Each result is matched back to its candidate by path and merged into the candidate list as the fields ai_action, ai_confidence, and ai_note.
The CSV export includes all AI fields.
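
The batching and merge steps above can be sketched without any network code. The ai_action, ai_confidence and ai_note field names come from this README; the function names and the keys expected in the model's reply ("path", "action", "confidence", "explanation") are assumptions, so the real prompt in analyzer.py may ask for different ones.

```python
import json


def batches(items, size=20):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def merge_ai_results(candidates, reply_text):
    """Merge a JSON-array reply into candidates, matching on path.

    Assumed reply item: {"path", "action", "confidence", "explanation"}.
    """
    by_path = {c["path"]: c for c in candidates}
    for result in json.loads(reply_text):
        target = by_path.get(result.get("path"))
        if target is None:
            continue  # the model returned a path that was never sent
        target["ai_action"] = result.get("action")
        target["ai_confidence"] = result.get("confidence")
        target["ai_note"] = result.get("explanation")
    return candidates
```

Candidates the model omits keep their heuristic classification untouched, so a partial or truncated reply degrades gracefully.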

Security

The OpenRouter API key is stored exclusively in Windows Credential Manager via CredWriteW / CredReadW (DPAPI encryption tied to the Windows user account).
It is never written to a plain-text file on disk.
config.json explicitly excludes the api_key field before writing.
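
Excluding the key before writing might look like the sketch below. The api_key field name comes from the text above; the function name save_config is hypothetical, and the actual logic in config.py may differ.

```python
import json


def save_config(settings, config_path):
    """Persist settings to JSON, dropping the API key first."""
    # Build a copy without api_key so the secret never reaches the file.
    safe = {k: v for k, v in settings.items() if k != "api_key"}
    with open(config_path, "w", encoding="utf-8") as fh:
        json.dump(safe, fh, indent=2)
    return safe
```

Filtering on a copy leaves the in-memory settings intact, so the running application can keep using the key it read from Credential Manager.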

License

MIT -- see LICENSE.
