v0.6.0 — RAG singleton, retry/backoff, parallel polish, on-disk polish cache
Highlights
Four performance and resilience improvements on the polish/RAG path, plus a new `cache` CLI subcommand for managing the on-disk polish cache introduced in this release.
- `rag_hook`: process-level `RagPipeline` singleton (thread-safe via `threading.Lock` + double-checked locking). Corpus loads once per process instead of once per template kind, a measurable win on `--all-kinds` runs.
- `doc_gen/_anthropic`: retry with 1s / 2s / 4s exponential backoff for 429, 529, and `APIConnectionError`. Non-retryable SDK errors raise immediately. Credential redaction and `__cause__` stripping preserved.
- `generator`: three-phase render → polish → write. Polish is now `ThreadPoolExecutor`-parallel (max 4 workers) so wall-clock time drops to roughly the slowest single LLM call instead of the sum.
- `polish`: on-disk cache at `~/.attune/polish_cache/` (overridable via env). Key includes content + source_summary + template_type + system_prompt + augmented_context + model, so any input or model change invalidates entries. `_cache_get` bumps mtime on hit so heat is observed reliably even on `noatime` mounts. Lazy mtime-based prune (default TTL 30d, env-tunable, `0` disables) piggybacked on `_cache_put`. Manual nuke via `attune-author cache clear`.
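
For readers unfamiliar with the pattern, the double-checked locking used for the pipeline singleton can be sketched roughly as follows. This is an illustration only: the `RagPipeline` class here is a stand-in that just counts constructions, and `get_pipeline`, `_pipeline`, and `_pipeline_lock` are hypothetical names, not necessarily those used in `rag_hook`.

```python
import threading


class RagPipeline:
    """Stand-in for the real pipeline; counts how often it is constructed."""

    load_count = 0

    def __init__(self):
        # In the real code this is where the expensive corpus load would happen.
        RagPipeline.load_count += 1


_pipeline = None                    # hypothetical module-level slot
_pipeline_lock = threading.Lock()


def get_pipeline():
    """Return the process-wide pipeline, constructing it at most once."""
    global _pipeline
    if _pipeline is None:               # first check: lock-free fast path
        with _pipeline_lock:
            if _pipeline is None:       # second check: under the lock
                _pipeline = RagPipeline()
    return _pipeline
```

The unlocked first check keeps the common case cheap once the instance exists; the second check under the lock prevents two threads that both saw `None` from each loading the corpus.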
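
The retry policy can be pictured with a small sketch. It assumes that "1s / 2s / 4s" means up to three retries after the first attempt (four calls in total). The exception classes below are self-contained stand-ins rather than the SDK's own, and `call_with_retry` and `is_retryable` are hypothetical names.

```python
import time


class APIConnectionError(Exception):
    """Stand-in for the SDK's connection error (assumption for this sketch)."""


class APIStatusError(Exception):
    """Stand-in for an SDK error that carries an HTTP status code."""

    def __init__(self, status_code):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


RETRYABLE_STATUS = {429, 529}
BACKOFF_SECONDS = (1, 2, 4)   # delay before retry 1, 2 and 3


def is_retryable(exc):
    """True for connection errors and for 429 / 529 status errors."""
    if isinstance(exc, APIConnectionError):
        return True
    return isinstance(exc, APIStatusError) and exc.status_code in RETRYABLE_STATUS


def call_with_retry(fn, sleep=time.sleep):
    """Call fn(); on a retryable error wait 1s, 2s, then 4s before retrying."""
    for delay in BACKOFF_SECONDS:
        try:
            return fn()
        except Exception as exc:
            if not is_retryable(exc):
                raise             # non-retryable: propagate immediately
            sleep(delay)
    return fn()                   # final attempt; any error propagates
```

Passing `sleep` as a parameter is a design choice for the sketch: it lets tests record the delays instead of actually waiting.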
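
The three-phase flow in `generator` might look something like the sketch below. The `generate` function and its `render`, `polish`, and `write` callables are hypothetical placeholders for illustration; only the `ThreadPoolExecutor` usage and the cap of 4 workers come from the notes above.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_POLISH_WORKERS = 4  # cap stated in the release notes


def generate(kinds, render, polish, write):
    """Render serially, polish in parallel, then write serially."""
    # Phase 1: render every template (assumed local and cheap).
    drafts = [render(kind) for kind in kinds]

    # Phase 2: polish concurrently; Executor.map preserves input order,
    # so polished[i] corresponds to kinds[i].
    with ThreadPoolExecutor(max_workers=MAX_POLISH_WORKERS) as pool:
        polished = list(pool.map(polish, drafts))

    # Phase 3: write results once every polish call has finished.
    for kind, text in zip(kinds, polished):
        write(kind, text)
    return polished
```

Separating the phases means no file is written until every polish call has returned, and an exception from any polish call surfaces when `list(pool.map(...))` is consumed. With more than four items, at most four polish calls run at once, so the wall-clock estimate applies per group rather than to the whole run.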
New CLI surface
`attune-author cache clear` → delete every cached polish entry
New env vars
| Var | Default | Purpose |
|---|---|---|
| ATTUNE_AUTHOR_POLISH_CACHE | ~/.attune/polish_cache | Cache directory override |
| ATTUNE_AUTHOR_POLISH_CACHE_TTL_SECONDS | 2592000 (30d) | Mtime TTL; 0 disables prune |
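
To make the interaction between the two variables and the cache behaviour concrete, here is a minimal sketch. It makes several assumptions: SHA-256 as the key hash, one file per key, and the function names `cache_dir`, `cache_ttl`, `cache_key`, `cache_get`, `cache_put`, and `prune` are all illustrative. The actual implementation may differ in any of these.

```python
import hashlib
import os
import time
from pathlib import Path


def cache_dir():
    """Cache directory, honouring the env override."""
    raw = os.environ.get("ATTUNE_AUTHOR_POLISH_CACHE", "~/.attune/polish_cache")
    return Path(raw).expanduser()


def cache_ttl():
    """TTL in seconds; 0 disables pruning."""
    return int(os.environ.get("ATTUNE_AUTHOR_POLISH_CACHE_TTL_SECONDS", "2592000"))


def cache_key(content, source_summary, template_type,
              system_prompt, augmented_context, model):
    """Hash every input so that changing any one of them changes the key."""
    h = hashlib.sha256()  # hash choice is an assumption
    for part in (content, source_summary, template_type,
                 system_prompt, augmented_context, model):
        data = part.encode("utf-8")
        h.update(len(data).to_bytes(8, "big"))  # length prefix keeps fields distinct
        h.update(data)
    return h.hexdigest()


def cache_get(key):
    """Return cached text or None; bump mtime on a hit."""
    path = cache_dir() / key
    try:
        text = path.read_text(encoding="utf-8")
    except FileNotFoundError:
        return None
    os.utime(path, None)  # sets atime and mtime to now, independent of noatime
    return text


def prune():
    """Delete entries whose mtime is older than the TTL; return how many."""
    ttl = cache_ttl()
    if ttl == 0:
        return 0
    now = time.time()
    removed = 0
    for path in cache_dir().glob("*"):
        if path.is_file() and now - path.stat().st_mtime > ttl:
            path.unlink(missing_ok=True)
            removed += 1
    return removed


def cache_put(key, text):
    """Store an entry, then prune lazily as a side effect of the write."""
    directory = cache_dir()
    directory.mkdir(parents=True, exist_ok=True)
    (directory / key).write_text(text, encoding="utf-8")
    prune()
```

Because a hit refreshes mtime explicitly, an entry that keeps being read never ages past the TTL, while entries that are no longer requested expire on a later write.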
Dependencies
- attune-rag>=0.1.0,<0.2 → resolves to 0.1.10 (prompt caching support)
- attune-help>=0.10.0 → resolves to 0.10.1 (template aliases)
Tests
518 passed, 37 skipped. New: 12 tests for the polish cache, 9 tests for retry/backoff. All 12 CI matrix combinations green.