SuperInstance/uv-cache-guardian

uv-cache-guardian 🛡️

Resource-aware caching for uv package management.

CI · Crates.io · License: MIT OR Apache-2.0

uv is fast. This makes it smart about resources. Your CI cache budget is $50/month. This keeps it there.


The Problem

You use uv, the fast Python package manager (85K stars, 10-100x faster than pip by its own benchmarks). It downloads everything, caches everything, and never asks whether you actually need everything.

The result? Your CI bills are insane because:

  • The cache grows unchecked (disk budget blown)
  • Every PR downloads the same packages (bandwidth budget blown)
  • Pipeline runs creep past 5 minutes (time budget blown)
  • You've got 47 projects and they're all caching independent copies of numpy

The Solution

uv-cache-guardian monitors uv's cache and enforces three conservation laws:

| Law | Default Budget | What Happens When Exceeded |
|---|---|---|
| 💾 Disk | 5 GB | Intelligent eviction of least-shared packages |
| 🌐 Bandwidth | 10 GB/day | Warming suggestions for overlapping PR downloads |
| ⏱ Time | 5 min/CI run | Profiling and dependency reuse analysis |
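
All three laws in the table come down to the same comparison: a measured value against a budget in the same unit. The sketch below illustrates that idea only; `BudgetCheck` and `check_budget` are hypothetical names, not the crate's API, and the sample values are taken from the example status output in this README.

```rust
/// Result of comparing one resource against its budget.
#[derive(Debug, PartialEq)]
struct BudgetCheck {
    name: &'static str,
    utilization: f64, // used / budget, so 1.0 means exactly at budget
    exceeded: bool,
}

/// Compare a measured value with its budget (both in the same unit).
fn check_budget(name: &'static str, used: u64, budget: u64) -> BudgetCheck {
    let utilization = if budget == 0 {
        // A zero budget is exceeded by any usage at all.
        if used == 0 { 0.0 } else { f64::INFINITY }
    } else {
        used as f64 / budget as f64
    };
    BudgetCheck { name, utilization, exceeded: used > budget }
}

fn main() {
    let checks = [
        check_budget("disk", 3_420_000_000, 5_000_000_000),       // bytes
        check_budget("bandwidth", 2_100_000_000, 10_000_000_000), // bytes per day
        check_budget("time", 0, 300),                             // seconds per CI run
    ];
    for c in &checks {
        let mark = if c.exceeded { "over budget" } else { "ok" };
        println!("{}: {:.1}% used ({})", c.name, c.utilization * 100.0, mark);
    }
}
```

Treating "exceeded" as strictly greater than the budget is a design choice made for this sketch; the tool itself may draw the line differently.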

Quick Start

Install

# From crates.io
cargo install uv-cache-guardian

# Or build from source
git clone https://github.com/SuperInstance/uv-cache-guardian.git
cd uv-cache-guardian
cargo build --release
./target/release/uv-cache-guardian --help

Basic Usage

# Check current cache status
uv-cache-guardian status

# Output JSON for CI
uv-cache-guardian status --format json

# Check with custom budgets
uv-cache-guardian status \
  --max-disk 10GB \
  --max-bandwidth 5GB \
  --max-time-secs 180

# Generate a snapshot
uv-cache-guardian snapshot --output cache-state.json

# Analyze project dependency profiles
uv-cache-guardian check --projects requirements.txt uv.lock

# CI optimization from history
uv-cache-guardian analyze --history ci-downloads.csv

Conservation Laws

💾 Disk Conservation

The disk budget caps uv's cache directory size. When exceeded, the guardian recommends evictions using KL divergence between project dependency profiles.

# See what would be evicted
uv-cache-guardian evict --projects requirements.txt --target 1GB
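
To make "cache directory size" concrete, here is a minimal sketch of how a directory's size could be measured with the Rust standard library alone. It is not taken from the crate; `dir_size_bytes` is a hypothetical helper, and it sums apparent file sizes without following symlinks.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Recursively sum the sizes of all regular files under `dir`.
/// Symlinks are not followed, so their targets are not counted.
fn dir_size_bytes(dir: &Path) -> io::Result<u64> {
    let mut total = 0u64;
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // DirEntry::metadata does not traverse symlinks.
        let meta = entry.metadata()?;
        if meta.is_dir() {
            total += dir_size_bytes(&entry.path())?;
        } else if meta.is_file() {
            total += meta.len();
        }
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    // Pass the cache directory as the first argument, for example the
    // path printed by `uv cache dir`. Defaults to the current directory.
    let dir = std::env::args().nth(1).unwrap_or_else(|| ".".to_string());
    let used = dir_size_bytes(Path::new(&dir))?;
    let budget: u64 = 5_000_000_000; // the 5 GB default disk budget
    println!("{} bytes used, over budget: {}", used, used > budget);
    Ok(())
}
```

Hard-linked files are counted once per link here, so the result can overstate real disk usage on caches that rely on hard links.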

🌐 Bandwidth Conservation

The bandwidth budget tracks daily package downloads. When multiple PRs download the same packages, the guardian detects the overlap and suggests cache warming.

# Analyze CI history
uv-cache-guardian analyze --history ci-logs.csv

⏱ Time Conservation

The time budget monitors per-CI-run install duration. When exceeded, the guardian profiles which packages cost the most time and suggests caching them.

Phase Detection

uv-cache-guardian uses a phase model inspired by crackle-runtime to adapt its behavior:

| Phase | Emoji | Condition | Action |
|---|---|---|---|
| Stable | ✅ | < 70% disk utilization | No action needed |
| PreTransition | ⚠️ | 70–90% utilization, growing | Proactive pruning recommended |
| Transitioning | 🔥 | > 90% utilization | Aggressive eviction needed |
| PostTransition | 🔄 | Cache recently pruned | Monitor recovery |

$ uv-cache-guardian status
📦 Cache: 3.42 GB / 5.00 GB (68.4%) | 🌐 Bandwidth: 2.10 GB / 10.00 GB (21.0%) | ⏱ Budget: 300s
Phase: ✅ Stable
Action: No action needed

💾 ✓ Disk Conservation — 68.4% used (3.42 GB / 5.00 GB)
🌐 ✓ Bandwidth Conservation — 21.0% used (2.10 GB / 10.00 GB)
⏱ ✓ Time Conservation — 0.0% used (0.00 / 300.00)
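
The phase table can be read as a small decision function. The sketch below is one possible reading, not the crate's `PhaseDetector`: the `growing` and `recently_pruned` inputs, the order in which conditions are tested, and the fallback to Stable for a 70–90% cache that is not growing are all assumptions made for illustration.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Phase {
    Stable,
    PreTransition,
    Transitioning,
    PostTransition,
}

impl Phase {
    /// Recommended action, using the wording from the phase table.
    fn action(self) -> &'static str {
        match self {
            Phase::Stable => "No action needed",
            Phase::PreTransition => "Proactive pruning recommended",
            Phase::Transitioning => "Aggressive eviction needed",
            Phase::PostTransition => "Monitor recovery",
        }
    }
}

/// Classify the cache phase from disk utilization (1.0 means 100%).
/// Assumed precedence: a cache above 90% is always Transitioning,
/// then a recent prune wins, then the 70-90% growing band.
fn classify(utilization: f64, growing: bool, recently_pruned: bool) -> Phase {
    if utilization > 0.90 {
        Phase::Transitioning
    } else if recently_pruned {
        Phase::PostTransition
    } else if utilization >= 0.70 && growing {
        Phase::PreTransition
    } else {
        Phase::Stable
    }
}

fn main() {
    // 3.42 GB of a 5.00 GB budget, as in the status output above.
    let phase = classify(3.42 / 5.00, false, false);
    println!("Phase: {:?} ({})", phase, phase.action());
}
```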

Intelligent Eviction with KL Divergence

The eviction strategy is the crown jewel. It doesn't just evict the largest packages — it evicts the packages your projects are least likely to share.

How it works

  1. Each project gets a dependency profile — a probability distribution over its package dependencies
  2. The KL divergence between profiles measures how "surprising" one project's dependencies are to another
  3. Projects with high KL divergence from the group mean are outliers — their packages are evicted first
  4. Projects with similar profiles to the majority keep their cache intact

Similar projects:     numpy(0.5) + pandas(0.3) + flask(0.2)
                      numpy(0.4) + pandas(0.4) + flask(0.2)
                      ↕ JS divergence ≈ 0.01 → high similarity → keep cache

Outlier project:      obscure-lib(1.0)
                      ↕ JS divergence = 1.0 (the maximum) → evict first
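
The steps above can be sketched in a few functions. This is an illustration under stated assumptions rather than the crate's implementation: `Profile`, `profile_from_counts`, `js_divergence`, `group_mean`, and `rank_outliers` are hypothetical names, and the sketch measures distance from the group mean with Jensen-Shannon divergence in base 2 (as in the diagram) instead of plain KL divergence.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Package name mapped to its probability within one project.
type Profile = BTreeMap<String, f64>;

/// Turn raw dependency counts into a probability distribution.
/// Assumes each package name appears at most once in `counts`.
fn profile_from_counts(counts: &[(&str, u32)]) -> Profile {
    let total: u32 = counts.iter().map(|(_, c)| *c).sum();
    if total == 0 {
        return Profile::new();
    }
    counts
        .iter()
        .map(|(name, c)| (name.to_string(), f64::from(*c) / f64::from(total)))
        .collect()
}

/// Jensen-Shannon divergence with base-2 logarithms, so it lies in [0, 1].
fn js_divergence(p: &Profile, q: &Profile) -> f64 {
    let keys: BTreeSet<&String> = p.keys().chain(q.keys()).collect();
    let mut js = 0.0;
    for k in keys {
        let pi = p.get(k).copied().unwrap_or(0.0);
        let qi = q.get(k).copied().unwrap_or(0.0);
        let mi = 0.5 * (pi + qi); // midpoint distribution
        if pi > 0.0 {
            js += 0.5 * pi * (pi / mi).log2();
        }
        if qi > 0.0 {
            js += 0.5 * qi * (qi / mi).log2();
        }
    }
    js
}

/// Average of all profiles: the "group mean" distribution.
fn group_mean(profiles: &[Profile]) -> Profile {
    let n = profiles.len() as f64;
    let mut mean = Profile::new();
    for p in profiles {
        for (k, v) in p {
            *mean.entry(k.clone()).or_insert(0.0) += v / n;
        }
    }
    mean
}

/// Project indices ordered from most to least divergent from the mean.
fn rank_outliers(profiles: &[Profile]) -> Vec<(usize, f64)> {
    let mean = group_mean(profiles);
    let mut ranked: Vec<(usize, f64)> = profiles
        .iter()
        .enumerate()
        .map(|(i, p)| (i, js_divergence(p, &mean)))
        .collect();
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
    ranked
}

fn main() {
    let profiles = vec![
        profile_from_counts(&[("numpy", 5), ("pandas", 3), ("flask", 2)]),
        profile_from_counts(&[("numpy", 4), ("pandas", 4), ("flask", 2)]),
        profile_from_counts(&[("obscure-lib", 1)]),
    ];
    for (index, divergence) in rank_outliers(&profiles) {
        println!("project {} diverges from the mean by {:.4}", index, divergence);
    }
}
```

With these three profiles the `obscure-lib` project ranks first, because it shares no packages with the other two.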

The EvictionStrategy provides:

  • KL divergence: asymmetric, sensitive to missing packages
  • Smoothed KL divergence: handles zero probabilities gracefully
  • Jensen-Shannon divergence: symmetric, bounded in [0, 1] (with base-2 logarithms)
  • Similarity metric: 1/(1+JS), which runs from 0.5 (no shared packages) to 1 (identical profiles) when JS is bounded by 1
  • Eviction scoring: combines dissimilarity with cache size for a priority score
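
A small sketch shows why plain KL divergence is "sensitive to missing packages" and what smoothing changes. The function names and the additive smoothing scheme are assumptions for illustration; the crate's own smoothing may differ. Profiles are given as probability vectors over one shared package order.

```rust
/// KL(P || Q) in bits over aligned probability vectors.
/// Returns infinity when P has mass on a package that Q lacks.
fn kl_divergence(p: &[f64], q: &[f64]) -> f64 {
    assert_eq!(p.len(), q.len());
    p.iter()
        .zip(q)
        .map(|(&pi, &qi)| {
            if pi == 0.0 {
                0.0 // the limit of x * log(x) as x approaches 0 is 0
            } else if qi == 0.0 {
                f64::INFINITY
            } else {
                pi * (pi / qi).log2()
            }
        })
        .sum()
}

/// Additive smoothing: add `eps` to every entry, then renormalize.
fn smooth(p: &[f64], eps: f64) -> Vec<f64> {
    let total: f64 = p.iter().sum::<f64>() + eps * p.len() as f64;
    p.iter().map(|&x| (x + eps) / total).collect()
}

/// KL divergence after smoothing both inputs, so it is always finite.
fn smoothed_kl(p: &[f64], q: &[f64], eps: f64) -> f64 {
    kl_divergence(&smooth(p, eps), &smooth(q, eps))
}

/// Similarity as listed above: 1 / (1 + JS).
fn similarity(js: f64) -> f64 {
    1.0 / (1.0 + js)
}

fn main() {
    // Package order: numpy, pandas, flask, scipy.
    let without_scipy = [0.5, 0.3, 0.2, 0.0];
    let with_scipy = [0.4, 0.3, 0.2, 0.1];

    println!("KL(without || with) = {}", kl_divergence(&without_scipy, &with_scipy));
    println!("KL(with || without) = {}", kl_divergence(&with_scipy, &without_scipy));
    println!("smoothed            = {}", smoothed_kl(&with_scipy, &without_scipy, 1e-6));
    println!("similarity at JS=0  = {}", similarity(0.0));
    println!("similarity at JS=1  = {}", similarity(1.0));
}
```

The two KL directions disagree: one is finite and the other is infinite, because only one project depends on scipy. Smoothing makes both directions finite at the cost of a tunable `eps`.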

CI Optimization

When analyzing CI history, the guardian identifies packages downloaded by multiple PRs within a time window:

$ uv-cache-guardian analyze --history ci.csv
📊 CI Optimization Report
========================
Records analyzed: 3
Total downloads: 13
Estimated savings: 40.00 MB

Top packages:
  numpy — downloaded 3 times
  pandas — downloaded 1 times
  flask — downloaded 1 times

Overlapping packages (same pkg, multiple PRs):
  numpy

Cache warming suggestions:
  🔥 Pre-warm numpy in CI cache — 3 PR(s) downloaded it — save 30.00 MB

The history file is a CSV with the columns pr_id, packages, total_bytes, duration_seconds. Quote the packages field, since it contains commas:

PR-1234,"numpy,pandas,python-dateutil,pytz",45000000,45
PR-1235,"numpy,scipy,scikit-learn,joblib",85000000,90
PR-1236,"flask,requests,jinja2,werkzeug,numpy",30000000,35
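
The overlap detection described in this section can be sketched as follows. This is an assumption-laden illustration, not the crate's `ci` module: `CiRecord`, `parse_line`, and `pr_counts` are hypothetical names, and the parser only handles rows shaped exactly like the sample above.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// One row of the CI history file.
struct CiRecord {
    pr_id: String,
    packages: Vec<String>,
    total_bytes: u64,
    duration_seconds: u64,
}

/// Parse a row shaped like: PR-1234,"numpy,pandas",45000000,45
/// Minimal parser for this exact layout, not a general CSV reader.
fn parse_line(line: &str) -> Option<CiRecord> {
    let (pr_id, rest) = line.trim().split_once(",\"")?;
    let (packages, tail) = rest.split_once("\",")?;
    let (bytes, secs) = tail.split_once(',')?;
    Some(CiRecord {
        pr_id: pr_id.to_string(),
        packages: packages
            .split(',')
            .map(|s| s.trim().to_string())
            .filter(|s| !s.is_empty())
            .collect(),
        total_bytes: bytes.trim().parse().ok()?,
        duration_seconds: secs.trim().parse().ok()?,
    })
}

/// For each package, count how many distinct PRs downloaded it.
fn pr_counts(records: &[CiRecord]) -> BTreeMap<String, usize> {
    let mut counts: BTreeMap<String, usize> = BTreeMap::new();
    for r in records {
        let unique: BTreeSet<&String> = r.packages.iter().collect();
        for pkg in unique {
            *counts.entry(pkg.clone()).or_insert(0) += 1;
        }
    }
    counts
}

fn main() {
    let history = "\
PR-1234,\"numpy,pandas,python-dateutil,pytz\",45000000,45
PR-1235,\"numpy,scipy,scikit-learn,joblib\",85000000,90
PR-1236,\"flask,requests,jinja2,werkzeug,numpy\",30000000,35";

    let records: Vec<CiRecord> = history.lines().filter_map(parse_line).collect();
    for r in &records {
        println!("{}: {} packages, {} bytes, {} s",
            r.pr_id, r.packages.len(), r.total_bytes, r.duration_seconds);
    }
    // Packages downloaded by more than one PR are cache warming candidates.
    for (pkg, n) in pr_counts(&records) {
        if n > 1 {
            println!("overlap: {} was downloaded by {} PRs", pkg, n);
        }
    }
}
```

For the three sample rows, only numpy appears in more than one PR. The sketch does not attempt the savings estimate, because the history file carries no per-package sizes.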

Serde Snapshots

Every check generates a JSON snapshot for CI audit trails. Snapshots include cache stats, conservation results, phase detection, project profiles, and eviction records.

# Capture a snapshot
uv-cache-guardian snapshot --output audit-2026-06-01.json

Read it back programmatically:

use uv_cache_guardian::snapshot::CacheSnapshot;

let snapshot = CacheSnapshot::from_json_file("audit-2026-06-01.json")?;
println!("{}", snapshot.summary_line());
// ✅ [2026-06-01 12:00:00 UTC] Cache: 3.42 GB - 3 conservation check(s), phase: Stable (No action needed), 0 project(s)

Library Usage

Add to your Cargo.toml:

[dependencies]
uv-cache-guardian = "0.1.0"

Programmatic API

use uv_cache_guardian::monitor::CacheGuardian;
use uv_cache_guardian::conservation::ConservationChecker;
use uv_cache_guardian::phase::PhaseDetector;

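// Monitor cache
// Note: Rust's standard library does not expand "~"; unless the library
// expands it itself, pass an absolute path here.
let guardian = CacheGuardian::new("~/.cache/uv");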
let stats = guardian.measure()?;
println!("Cache size: {}", CacheGuardian::format_bytes(stats.cache_size_bytes));

// Check conservation
let results = ConservationChecker::check_all(
    stats.cache_size_bytes, 5_000_000_000,  // disk
    0, 10_000_000_000,                      // bandwidth
    0, 300,                                  // time
);
for r in &results {
    println!("{}", r.message);
}

// Detect phase
let detector = PhaseDetector::default();
let disk_util = stats.cache_size_bytes as f64 / 5_000_000_000.0;
let phase = detector.detect(disk_util);
println!("Phase: {} {}", phase.emoji(), phase.label());

Dependency Profiling & Eviction

use uv_cache_guardian::eviction::{
    DependencyProfile, EvictionStrategy, ProjectProfile,
};

let proj_a = ProjectProfile::new(
    "data-science",
    DependencyProfile::from_counts(
        vec![("numpy", 3), ("pandas", 2)],
    ),
    50_000_000,
);

let proj_b = ProjectProfile::new(
    "web-app",
    DependencyProfile::from_counts(
        vec![("flask", 2), ("requests", 1)],
    ),
    20_000_000,
);

// Find which project to evict first
let ranked = EvictionStrategy::rank_eviction_candidates(&[proj_a, proj_b]);
for (project, score) in &ranked {
    println!("{} — eviction score: {:.4}", project.name, score);
}

Architecture

uv-cache-guardian/
├── src/
│   ├── main.rs          — CLI entrypoint
│   ├── lib.rs           — Re-exports
│   ├── monitor.rs       — CacheGuardian, CacheStats, budgets
│   ├── conservation.rs  — Conservation law checking
│   ├── phase.rs         — Phase detection (crackle-runtime model)
│   ├── eviction.rs      — KL divergence eviction strategy
│   ├── snapshot.rs      — Serde snapshots for CI audit
│   ├── ci.rs            — CI optimization & cache warming
│   ├── cli.rs           — Clap CLI definition
│   └── examples/
│       ├── basic_monitoring.rs
│       └── ci_optimizer.rs
├── .github/workflows/ci.yml
├── Cargo.toml
└── README.md

Testing

# Run all tests
cargo test --all-targets

# With clippy
cargo clippy --all-targets -- -D warnings

# Check formatting
cargo fmt --all -- --check

# Run examples
cargo run --example basic_monitoring
cargo run --example ci_optimizer

Costs & ROI

| Scenario | Without Guardian | With Guardian | Savings |
|---|---|---|---|
| 10 devs, daily CI | $120/month | $40/month | $80/month |
| 50 projects, weekly CI | $350/month | $100/month | $250/month |
| Monorepo, 20 PRs/day | $200/month | $50/month | $150/month |

Based on:

  • GitHub Actions cache pricing ($0.008/GB/day)
  • Download bandwidth at $0.12/GB
  • CI runner time at $0.008/min
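
The three rates combine into a simple monthly estimate. The sketch below shows the arithmetic only; `MonthlyUsage` and `monthly_cost_usd` are hypothetical names, and the usage figures in `main` are made-up inputs, not the data behind the table above.

```rust
// Rates as listed above.
const CACHE_USD_PER_GB_DAY: f64 = 0.008;
const BANDWIDTH_USD_PER_GB: f64 = 0.12;
const RUNNER_USD_PER_MIN: f64 = 0.008;

/// Hypothetical monthly usage for one organization.
struct MonthlyUsage {
    avg_cache_gb: f64,   // average cache size held over the month
    download_gb: f64,    // total package downloads
    runner_minutes: f64, // total CI runner time
}

/// Storage is billed per GB per day; bandwidth and runner time per unit.
fn monthly_cost_usd(u: &MonthlyUsage, days: f64) -> f64 {
    u.avg_cache_gb * CACHE_USD_PER_GB_DAY * days
        + u.download_gb * BANDWIDTH_USD_PER_GB
        + u.runner_minutes * RUNNER_USD_PER_MIN
}

fn main() {
    // Illustrative inputs only.
    let usage = MonthlyUsage {
        avg_cache_gb: 20.0,
        download_gb: 300.0,
        runner_minutes: 5000.0,
    };
    // 20 * 0.008 * 30 + 300 * 0.12 + 5000 * 0.008 = 4.80 + 36.00 + 40.00
    println!("estimated cost: ${:.2}/month", monthly_cost_usd(&usage, 30.0));
}
```

At these rates, holding the default 5 GB cache for 30 days costs 5 × 0.008 × 30 = $1.20, so bandwidth and runner time usually dominate the bill in this model.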

License

Licensed under either of:

  • MIT License (MIT)
  • Apache License, Version 2.0 (Apache-2.0)

at your option.

Contributing

PRs welcome! Check our CI pipeline for standards.


Built for the uv ecosystem. uv-cache-guardian is not affiliated with Astral or the uv project.
