pratham15541/disktracker


DiskTracker 🚀

DiskTracker is a high-performance incremental filesystem observability engine and CLI tool. Built entirely in Rust, it captures directory snapshots, diffs them historically, and monitors filesystems dynamically to track exactly where, when, and why disk growth occurs.

Unlike simple directory-traversal tools, DiskTracker acts as a persistent indexer and real-time event reconciler, so developers can observe trees with millions of files at low latency.


🗺️ Why DiskTracker?

Standard tools like du, ncdu, or gdu excel at displaying instant disk usage but fall short when answering historical questions: “Which directory grew by 10GB in the last 24 hours?” or “What live process is writing to my disk right now?”

DiskTracker bridges this gap with incremental tracking and live synchronization.

| Feature / Capability | DiskTracker | ncdu | gdu | dua | WizTree |
|---|---|---|---|---|---|
| Warm Snapshot Validation | ✅ Yes (SnapshotTree fingerprints) | ❌ No | ❌ No | ❌ No | ⚠️ Partial |
| Real-Time Watch Mode | ✅ Yes (Active Event Loop) | ❌ No | ❌ No | ❌ No | ❌ No |
| Historical Snapshot Diffs | ✅ Yes (disktracker diff) | ❌ No | ❌ No | ❌ No | ❌ No |
| Zero-Copy mmap Index | ✅ Yes (*.mmap) | ❌ No | ❌ No | ❌ No | ❌ No |
| Attributed Growth Heuristics | ✅ Yes (disktracker explain) | ❌ No | ❌ No | ❌ No | ❌ No |
| Incremental Reconciliation | ✅ Yes ($O(\text{changes})$) | ❌ No | ❌ No | ❌ No | ⚠️ Partial |

📊 Real Benchmark Metrics

These metrics come from the current bench_tool harness on May 22, 2026. The run generated 5,001 directories and 45,000 files under /tmp/disktracker_bench; this machine reported /tmp as rotational storage, so auto parallelism was capped to one worker to avoid seek thrashing.

Run it yourself with:

cargo run --release -p disktracker-cli --bin bench_tool

| Run | Wall Time | Files | Dirs | Total Bytes | Total Syscalls | Peak RSS |
|---|---|---|---|---|---|---|
| Serial cold scan | 143 ms | 45,000 | 5,001 | 45,000 | 65,004 | 12,736 KB |
| Auto cold scan, HDD-capped | 292 ms | 45,000 | 5,001 | 45,000 | 70,006 | 13,556 KB |
| Warm unchanged scan | 130 ms | 45,000 | 5,001 | 45,000 | 70,004 | 15,176 KB |
| Warm scan, 1 file changed | 138 ms | 45,000 | 5,001 | 45,015 | 70,004 | 15,580 KB |
| Warm scan, deep subtree changed | 149 ms | 45,001 | 5,001 | 45,031 | 70,005 | 15,984 KB |
| Warm scan, 100 scattered changes | 151 ms | 45,001 | 5,001 | 47,516 | 70,005 | 16,388 KB |

The same codebase also includes a CI regression baseline for a smaller 51-directory / 450-file fixture at crates/disktracker-core/tests/perf_baseline.json. That test enforces syscall and timing ceilings for cold and warm scans.

Instrumented Syscalls

DiskTracker currently tracks openat, statx, and getdents through atomic telemetry. On non-Linux platforms, storage read/write byte counters may report 0 when the OS does not expose the same /proc/self/io data.
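As a rough illustration of what atomic telemetry for these three syscalls could look like, here is a minimal sketch using only the standard library. The `SyscallCounters` type, the `Syscall` enum, and the method names are hypothetical and are not taken from the DiskTracker codebase.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical counter set; one field per syscall named in the text above.
#[derive(Default)]
pub struct SyscallCounters {
    openat: AtomicU64,
    statx: AtomicU64,
    getdents: AtomicU64,
}

/// The syscalls this sketch knows how to count.
#[derive(Clone, Copy)]
pub enum Syscall {
    Openat,
    Statx,
    Getdents,
}

impl SyscallCounters {
    /// Record one call. Relaxed ordering is enough here because each counter
    /// is an independent tally and no other data is published through it.
    pub fn record(&self, call: Syscall) {
        let counter = match call {
            Syscall::Openat => &self.openat,
            Syscall::Statx => &self.statx,
            Syscall::Getdents => &self.getdents,
        };
        counter.fetch_add(1, Ordering::Relaxed);
    }

    /// Sum of all counters, comparable to a "Total Syscalls" figure.
    pub fn total(&self) -> u64 {
        self.openat.load(Ordering::Relaxed)
            + self.statx.load(Ordering::Relaxed)
            + self.getdents.load(Ordering::Relaxed)
    }
}
```

Because `record` takes `&self`, a single `SyscallCounters` value can be shared across scan threads without a lock.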

Dynamic Concurrency Scaling

DiskTracker automatically identifies the hardware type to maximize data transfer rates without causing hardware thrashing:

| Storage Type | Concurrency Mode | Current Behavior |
|---|---|---|
| NVMe/SATA SSD | Auto-tuned multi-threaded scan | Uses the scheduler's work-stealing path to queue directory reads concurrently. |
| Mechanical/rotational disk | Single-thread cap | Caps auto parallelism at one worker to avoid head thrashing. |
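The policy in the table can be sketched as a small pure function. This is an illustration under stated assumptions, not DiskTracker's actual detection code: `parse_rotational` assumes the flag is read from Linux's `/sys/block/<dev>/queue/rotational` file (which contains `1` for rotational devices and `0` otherwise), and `choose_workers` assumes that an explicit `--parallelism` value bypasses the cap, which the text above does not confirm.

```rust
/// Interpret the contents of a Linux `queue/rotational` sysfs file.
/// Assumption: DiskTracker may detect the storage type by other means.
pub fn parse_rotational(contents: &str) -> Option<bool> {
    match contents.trim() {
        "1" => Some(true),
        "0" => Some(false),
        _ => None,
    }
}

/// Pick a worker count. `requested == 0` means "auto", matching the
/// `--parallelism` convention. `rotational` and `cpus` are inputs the
/// caller is assumed to have detected already.
pub fn choose_workers(requested: usize, rotational: bool, cpus: usize) -> usize {
    if requested > 0 {
        // Assumption: an explicit value is honored as-is.
        return requested;
    }
    if rotational {
        1 // avoid head thrashing on mechanical disks
    } else {
        cpus.max(1)
    }
}
```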

🖥️ Interactive Terminal Walkthroughs

Here is how DiskTracker feels and behaves in your terminal.

📥 Real-Time FS Watch Mode

Monitors the directory tree and records disk writes incrementally as they occur:

$ disktracker watch /home/user/projects --debounce-ms 300
[watch] Starting real-time monitoring of /home/user/projects (Ctrl+C to stop)
[watch] Hydrating instantly from memory-mapped index: /home/user/.disktracker/data.mmap ...
[watch] Reusing snapshot #14 from watch_state.
[watch] Watching /home/user/projects (snapshot #14, 15,243 entries). Watching...

[watch] /home/user/projects/disktracker/crates/disktracker-watch/src/watcher.rs +2.3 KB
[watch] /home/user/projects/disktracker/crates/disktracker-watch/src/mmap_index.rs +8.4 KB
[watch] /home/user/projects/disktracker/temp_file -12.0 KB

⚖️ Historical Snapshot Diffing

Compares size distributions between two snapshot sessions to find growing nodes:

$ disktracker diff --from 12 --to 14
 Comparing Snapshot #12 -> #14
  └─ Base:   2026-05-18 10:14:02
  └─ Target: 2026-05-20 09:00:00

Path                                                    Size Change   Direction
───────────────────────────────────────────────────────────────────────────────
/home/user/projects/disktracker/target                  +124.84 MB    [GROWTH]
/home/user/projects/disktracker/crates                  +14.50 MB     [GROWTH]
/home/user/projects/disktracker/.git/objects            -10.20 MB     [SHRUNK]
───────────────────────────────────────────────────────────────────────────────
Net Delta Change: +129.14 MB

Snapshot diffs are root-isolated. disktracker diff only compares snapshots from the same filesystem root, and fails fast if --from and --to point at different roots.
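The fail-fast behavior can be pictured as a guard that runs before any diff work starts. The sketch below is illustrative only: the `Snapshot` struct, the `DiffError` enum, and `check_same_root` are hypothetical names, and the root is modeled as a plain string.

```rust
/// Hypothetical snapshot record: an id plus the root it was scanned from.
#[derive(Debug, Clone, PartialEq)]
pub struct Snapshot {
    pub id: u64,
    pub scan_root: String,
}

/// Hypothetical error returned when a diff pair spans two roots.
#[derive(Debug, PartialEq)]
pub enum DiffError {
    CrossRoot { from_root: String, to_root: String },
}

/// Reject the pair up front unless both snapshots share a root.
pub fn check_same_root(from: &Snapshot, to: &Snapshot) -> Result<(), DiffError> {
    if from.scan_root == to.scan_root {
        Ok(())
    } else {
        Err(DiffError::CrossRoot {
            from_root: from.scan_root.clone(),
            to_root: to.scan_root.clone(),
        })
    }
}
```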

📊 Hierarchical Report Trees

Renders a directory tree that shows how much each subtree grew over the reporting window:

$ disktracker report --depth 3
📊 Hierarchical Storage Growth Report (Last 7d)
───────────────────────────────────────────────────────────────────────────────
[+] /home/user/projects (+184.20 MB)
├── [+] /home/user/projects/disktracker (+154.20 MB)
│   ├── [+] /home/user/projects/disktracker/target (+140.00 MB)
│   └── [+] /home/user/projects/disktracker/crates (+14.20 MB)
└── [+] /home/user/projects/web-app (+30.00 MB)
    ├── [+] /home/user/projects/web-app/node_modules (+28.00 MB)
    └── [+] /home/user/projects/web-app/src (+2.00 MB)

When a database contains multiple roots or drives, disktracker report --all --last 7d resolves the oldest and newest snapshots inside the window for each root independently, diffs each root on its own timeline, then merges the results. If no path is provided, it automatically scopes to the current working directory ($PWD).
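The per-root resolution step described above can be sketched as follows. This is a simplified model, not the project's implementation: `Snap` and `resolve_pairs` are hypothetical names, timestamps are plain integers, and skipping roots that have fewer than two snapshots in the window is an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical snapshot row: id, owning root, and capture time.
#[derive(Debug, Clone, PartialEq)]
pub struct Snap {
    pub id: u64,
    pub root: String,
    pub taken_at: u64,
}

/// For each root, return `(oldest_id, newest_id)` among the snapshots taken
/// at or after `window_start`. Roots with only one snapshot in the window
/// are skipped because there is nothing to diff (an assumption).
pub fn resolve_pairs(snaps: &[Snap], window_start: u64) -> BTreeMap<String, (u64, u64)> {
    let mut bounds: BTreeMap<String, (&Snap, &Snap)> = BTreeMap::new();
    for s in snaps.iter().filter(|s| s.taken_at >= window_start) {
        bounds
            .entry(s.root.clone())
            .and_modify(|(old, new)| {
                if s.taken_at < old.taken_at {
                    *old = s;
                }
                if s.taken_at > new.taken_at {
                    *new = s;
                }
            })
            .or_insert((s, s));
    }
    bounds
        .into_iter()
        .filter(|(_, (old, new))| old.id != new.id)
        .map(|(root, (old, new))| (root, (old.id, new.id)))
        .collect()
}
```

Each resulting pair can then be diffed on its own timeline before the per-root results are merged.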

Unchanged subtrees with a net delta of exactly 0 bytes are intelligently pruned from the tree layout to remove noise.
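A minimal sketch of that pruning rule, assuming a hypothetical `Node` type whose `delta` field already holds the subtree's net change in bytes:

```rust
/// Hypothetical report node: `delta` is the net byte change of the subtree.
#[derive(Debug, Clone, PartialEq)]
pub struct Node {
    pub path: String,
    pub delta: i64,
    pub children: Vec<Node>,
}

/// Drop every subtree whose net delta is exactly zero, recursively.
/// Returns `None` when the node itself should disappear from the report.
pub fn prune_zero(mut node: Node) -> Option<Node> {
    if node.delta == 0 {
        return None;
    }
    let children = std::mem::take(&mut node.children);
    node.children = children.into_iter().filter_map(prune_zero).collect();
    Some(node)
}
```

Note that a rule based on net delta also hides a subtree in which growth and shrinkage cancel out exactly.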

disktracker report --last 7d --path "D:\"

For a deterministic single-root comparison, pass an explicit pair:

disktracker report --from 12 --to 14

🏷️ Attributed Storage Growth

Auto-categorizes disk consumption using intelligent path heuristics:

$ disktracker explain --last 14d
🏷️ Storage Growth Explainer (Last 14 days)
───────────────────────────────────────────────────────────────────────────────
Category           Matches Heuristic                      Growth Delta
───────────────────────────────────────────────────────────────────────────────
Build Outputs      **/target/**, **/dist/**, **/build/**  +140.00 MB
Dependency Trees   **/node_modules/**, **/vendor/**       +28.00 MB
Git Objects        **/.git/objects/**                     -10.20 MB
Application Src    **/src/**, **/lib/**                   +4.30 MB
───────────────────────────────────────────────────────────────────────────────
Attributed growth explains 99.1% of net delta change (+162.10 MB).

disktracker explain uses the same root-isolated snapshot resolution as report, so attribution heuristics only receive valid same-root diffs. Use --path to explain growth inside one root or subtree, and use explicit snapshot pairs when you need deterministic bounds:

disktracker explain --last 14d --path /home/user/projects
disktracker explain --from 12 --to 14 --path /home/user/projects
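To make the path heuristics concrete, here is a simplified sketch of first-match categorization. It approximates the glob patterns from the sample output by matching whole directory components rather than using a real glob engine. The `RULES` table, its ordering, and the `categorize` function are assumptions for illustration, and only `/`-separated paths are handled.

```rust
/// (category, directory markers). Each marker approximates `**/<marker>/**`.
/// The rule order is an assumption: the first matching category wins.
const RULES: &[(&str, &[&str])] = &[
    ("Build Outputs", &["target", "dist", "build"]),
    ("Dependency Trees", &["node_modules", "vendor"]),
    ("Git Objects", &[".git/objects"]),
    ("Application Src", &["src", "lib"]),
];

/// True when `marker` appears as consecutive directory components of `path`
/// with at least one further component after it.
fn contains_dirs(path: &str, marker: &str) -> bool {
    let parts: Vec<&str> = path.split('/').filter(|p| !p.is_empty()).collect();
    let want: Vec<&str> = marker.split('/').collect();
    // Exclude the final component so the marker must be an ancestor directory.
    let dirs = &parts[..parts.len().saturating_sub(1)];
    dirs.windows(want.len()).any(|w| w == want.as_slice())
}

/// Return the first category whose markers match, or `None` if unattributed.
pub fn categorize(path: &str) -> Option<&'static str> {
    RULES
        .iter()
        .find(|(_, markers)| markers.iter().any(|m| contains_dirs(path, m)))
        .map(|(label, _)| *label)
}
```

Paths that return `None` correspond to the share of growth that the explainer cannot attribute.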

⚑ Install

npm (all platforms)

npm i -g disktracker

Linux/macOS (curl)

curl -fsSL https://raw.githubusercontent.com/pratham15541/disktracker/main/scripts/install.sh | bash

Windows (winget)

winget install --id pratham15541.disktracker

Windows (Chocolatey)

choco install disktracker

Build from source

git clone https://github.com/pratham15541/disktracker.git
cd disktracker
cargo build --release
# Binary available at: target/release/disktracker

📂 Deep Technical Architecture

DiskTracker moves systems optimization details into specialized sub-documents. If you want to explore the technical internals, read the deep dives below:


🛠️ Concrete Examples

Below are quick examples of day-to-day DiskTracker usage:

1. Cold Scan and Warm Scan

# Force a complete physical traversal scan
disktracker scan /home/user/projects --cold

# Run a warm scan using previous snapshot metadata
disktracker scan /home/user/projects --warm

2. Differencing Snapshot States

# Display the top 10 changes that occurred over the last 7 days
disktracker diff --from 7d --top 10

3. Chronological Size Timeline

# Print size history of a target directory
disktracker timeline /home/user/projects/disktracker/target

4. Database Capacity Pruning

# Remove snapshots older than 30 days and reclaim storage space
disktracker prune --older-than 30d

CLI Command Reference

disktracker <command> [options]

Snapshot references used by --from, --to, and --last accept:

  • A numeric snapshot id (example: 42)
  • A relative duration (examples: 7d, 2w, 1m)
  • A calendar date (example: 2026-05-22)
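The three reference forms can be told apart with a small parser. The sketch below is an illustration, not the CLI's actual parser: `SnapshotRef` and `parse_ref` are hypothetical names, the accepted unit letters are taken from the examples above, and the unit is kept as a raw character because the text does not say how `m` is interpreted.

```rust
/// Hypothetical parsed form of a snapshot reference.
#[derive(Debug, PartialEq)]
pub enum SnapshotRef {
    Id(u64),
    /// `unit` is kept verbatim ('d', 'w', or 'm'); its meaning is not decided here.
    Relative { amount: u64, unit: char },
    Date { year: u16, month: u8, day: u8 },
}

pub fn parse_ref(s: &str) -> Option<SnapshotRef> {
    let s = s.trim();
    // 1. A bare number is a snapshot id.
    if let Ok(id) = s.parse::<u64>() {
        return Some(SnapshotRef::Id(id));
    }
    // 2. YYYY-MM-DD is a calendar date (range-checked only loosely).
    let parts: Vec<&str> = s.split('-').collect();
    if let [y, m, d] = parts.as_slice() {
        if y.len() == 4 && m.len() == 2 && d.len() == 2 {
            let year: u16 = y.parse().ok()?;
            let month: u8 = m.parse().ok()?;
            let day: u8 = d.parse().ok()?;
            if (1..=12).contains(&month) && (1..=31).contains(&day) {
                return Some(SnapshotRef::Date { year, month, day });
            }
            return None;
        }
    }
    // 3. Otherwise expect a number followed by a single unit letter.
    let unit = s.chars().last()?;
    if !matches!(unit, 'd' | 'w' | 'm') {
        return None;
    }
    let amount = s[..s.len() - 1].parse::<u64>().ok()?;
    Some(SnapshotRef::Relative { amount, unit })
}
```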

Historical analysis preserves root isolation. diff, report --from/--to, and explain --from/--to require both snapshots to belong to the same scan_root. Windowed --last commands group snapshots by root first, compare each root independently, and then merge the per-root results.

If no path is explicitly provided, most commands will intelligently fall back to your current working directory ($PWD), automatically filtering the results to your current subtree. Use the --all flag to bypass this default and query every tracked root.

scan

Scan the filesystem and store a snapshot.

Arguments:

  • path (optional) Root path to scan. Defaults to / on Unix or the drive root on Windows.

Options:

  • --max-depth <N> Limit traversal depth.
  • --skip <NAME> Skip directories by name. Can be passed multiple times.
  • --one-filesystem Do not cross filesystem boundaries.
  • --db <PATH> Use a custom database path.
  • --quiet Suppress progress output.
  • --json Emit JSON output.
  • --bench Emit benchmark JSON and skip DB writes.
  • --parallelism <N> Number of scan threads (0 = auto). Default: 0.
  • --cold Force a full cold scan.
  • --warm Force a warm scan (fails if no prior snapshot exists).
  • --all Scan all drives/roots on the system.

diff

Show changes between two snapshots.

disktracker diff rejects cross-root snapshot pairs, for example C:\ to D:\. By default, it scopes the diff to your current working directory.

Arguments:

  • path (optional) Directory path to scope. Defaults to the current working directory.

Options:

  • --from <REF> Base snapshot reference. Default: previous snapshot.
  • --to <REF> Target snapshot reference. Default: latest snapshot.
  • --top <N> Max entries to show. Default: 20.
  • --min-delta <BYTES> Minimum size change threshold. Default: 1048576 (1 MiB).
  • --db <PATH> Use a custom database path.
  • --all Compare latest snapshots across all tracked roots instead of scoping to the current directory.
  • --json Emit JSON output.
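One way to picture how --min-delta and --top shape the output is as a filter followed by a sort and a cut. The sketch below assumes the threshold applies to the absolute size change and that rows are ranked by magnitude; neither detail is confirmed by the option descriptions, and `select_rows` is a hypothetical helper.

```rust
/// Keep rows whose absolute change meets `min_delta`, largest change first,
/// limited to `top` entries. Each row is `(path, signed byte delta)`.
pub fn select_rows(mut rows: Vec<(String, i64)>, min_delta: u64, top: usize) -> Vec<(String, i64)> {
    // Assumption: shrinkage counts toward the threshold just like growth.
    rows.retain(|(_, delta)| delta.unsigned_abs() >= min_delta);
    // Stable sort, so equal magnitudes keep their input order.
    rows.sort_by(|a, b| b.1.unsigned_abs().cmp(&a.1.unsigned_abs()));
    rows.truncate(top);
    rows
}
```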

report

Human-readable hierarchical growth report.

By default, --last aggregates per-root diffs within the current working directory ($PWD). Zero-delta branches are pruned to highlight only the data that changed. Use --all to aggregate across all roots, or provide an explicit path.

Arguments:

  • path (optional) Directory path to report. Defaults to the current working directory.

Options:

  • --last <REF> Window start reference. Default: 7d.
  • --from <REF> Explicit base snapshot reference. Must be used with --to.
  • --to <REF> Explicit target snapshot reference. Must be used with --from.
  • --top <N> Max entries to show. Default: 15.
  • --depth <N> Depth limit for reported paths. Default: 4.
  • --db <PATH> Use a custom database path.
  • --all Report on all tracked roots instead of scoping to the current directory.
  • --json Emit JSON output.

list

List stored snapshot timestamps.

Options:

  • --db <PATH> Use a custom database path.
  • --json Emit JSON output.

watch

Watch the filesystem for changes in real time.

Arguments:

  • path (optional) Root path to watch. Defaults to / on Unix or the drive root on Windows.

Options:

  • --db <PATH> Use a custom database path.
  • --quiet Suppress progress output.
  • --one-filesystem Do not cross filesystem boundaries.
  • --skip <NAME> Skip directories by name. Can be passed multiple times.
  • --debounce-ms <MS> Debounce window in milliseconds. Default: 500.
  • --flush-secs <SECS> Flush state to DB every N seconds. Default: 3600.
  • --all Watch all drives/roots on the system.
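The effect of --debounce-ms can be modeled offline as batching: events that arrive close together are merged, and a batch closes once the stream stays quiet for longer than the window. This is a simplified model over a pre-sorted event list rather than a real timer-driven watcher, and `Event` and `debounce` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Hypothetical filesystem event: arrival time, path, and signed size change.
pub struct Event {
    pub at_ms: u64,
    pub path: String,
    pub delta: i64,
}

/// Group time-ordered events into batches. A batch ends when the gap to the
/// next event exceeds `debounce_ms`; deltas for the same path are summed.
pub fn debounce(events: &[Event], debounce_ms: u64) -> Vec<BTreeMap<String, i64>> {
    let mut batches = Vec::new();
    let mut current: BTreeMap<String, i64> = BTreeMap::new();
    let mut last_at: Option<u64> = None;
    for ev in events {
        if let Some(prev) = last_at {
            let quiet_for = ev.at_ms.saturating_sub(prev);
            if quiet_for > debounce_ms && !current.is_empty() {
                // The quiet period elapsed: emit the batch and start a new one.
                batches.push(std::mem::take(&mut current));
            }
        }
        *current.entry(ev.path.clone()).or_insert(0) += ev.delta;
        last_at = Some(ev.at_ms);
    }
    if !current.is_empty() {
        batches.push(current);
    }
    batches
}
```

A larger window produces fewer, bigger batches at the cost of reporting changes later.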

explain

Explain what caused disk growth with human-readable attribution.

By default, --last attributes growth within the current working directory ($PWD). Use --all to aggregate explanations across all roots, or provide an explicit path.

Arguments:

  • path (optional) Directory path to explain. Defaults to the current working directory.

Options:

  • --last <REF> Window start reference. Default: 7d.
  • --from <REF> Explicit base snapshot reference. Must be used with --to.
  • --to <REF> Explicit target snapshot reference. Must be used with --from.
  • --top <N> Max entries to show. Default: 15.
  • --db <PATH> Use a custom database path.
  • --all Explain growth across all tracked roots instead of scoping to the current directory.
  • --json Emit JSON output.

timeline

Show growth history for a specific directory.

Arguments:

  • path (optional) Directory path to inspect. Defaults to the current working directory.

Options:

  • --db <PATH> Use a custom database path.
  • --all Show the timeline for all tracked roots.
  • --json Emit JSON output.

reconcile

Validate watcher consistency and repair stale state using an append-only snapshot.

Options:

  • --db <PATH> Use a custom database path.
  • --full Perform a fresh full scan to detect and fix drift.
  • --all Reconcile all watched roots in the database.
  • --json Emit JSON output.

tui

Open the live terminal dashboard for an existing DiskTracker database.

Arguments:

  • path (optional) Directory path to scope the UI. Defaults to the current working directory.

Options:

  • --db <PATH> Use a custom database path.
  • --refresh-ms <MS> Refresh interval in milliseconds. Default: 500.
  • --growth Start directly in the growth explorer view.
  • --tree Start directly in the collapsible directory tree explorer.
  • --diff Start directly in the historical diff visualizer.
  • --from <REF> Base snapshot reference for --diff.
  • --to <REF> Target snapshot reference for --diff.
  • --top <N> Maximum diff rows to display. Default: 80.
  • --all Load and visualize all roots instead of scoping to the current directory.

prune

Delete old snapshots to reclaim database space.

Options:

  • --keep-last <N> Keep only the N most recent snapshots.
  • --older-than <DURATION> Delete snapshots older than a duration (examples: 90d, 12w, 6m).
  • --dry-run Preview deletions without making any changes.
  • --db <PATH> Use a custom database path.
  • --json Emit JSON output.
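The selection step behind prune can be sketched as a pure function that only returns the ids it would delete, which is also what a --dry-run preview needs. How --keep-last and --older-than combine is not stated above; this sketch assumes a snapshot is selected if either rule matches. `select_for_prune` and the `(id, taken_at)` tuple layout are hypothetical.

```rust
/// Return the ids selected for deletion, in ascending order.
/// `snaps` holds `(id, taken_at)` pairs; `cutoff` is the timestamp that
/// `--older-than` would resolve to. Nothing is deleted here.
pub fn select_for_prune(
    snaps: &[(u64, u64)],
    keep_last: Option<usize>,
    cutoff: Option<u64>,
) -> Vec<u64> {
    // Newest first, so the first `keep_last` ranks are the ones to keep.
    let mut sorted: Vec<(u64, u64)> = snaps.to_vec();
    sorted.sort_by(|a, b| b.1.cmp(&a.1));
    let mut doomed: Vec<u64> = sorted
        .iter()
        .enumerate()
        .filter(|(rank, (_, taken_at))| {
            let beyond_keep = keep_last.map_or(false, |n| *rank >= n);
            let too_old = cutoff.map_or(false, |c| *taken_at < c);
            // Assumption: either rule alone is enough to select a snapshot.
            beyond_keep || too_old
        })
        .map(|(_, (id, _))| *id)
        .collect();
    doomed.sort_unstable();
    doomed
}
```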

📄 License

DiskTracker is open-source software distributed under the MIT License. For details, see the LICENSE file.

Support DiskTracker by starring this repository 🌟
