Release v0.6.0 — RAG singleton, retry/backoff, parallel polish, on-disk polish cache · Smart-AI-Memory/attune-author

Highlights

Four performance and resilience improvements on the polish/RAG path, plus a new cache CLI subcommand for managing the on-disk polish cache introduced in this release.

rag_hook: process-level RagPipeline singleton (thread-safe via threading.Lock + double-checked locking). Corpus loads once per process instead of once per template kind — measurable win on --all-kinds runs.
doc_gen/_anthropic: retry with 1s / 2s / 4s exponential backoff for 429, 529, and APIConnectionError. Non-retryable SDK errors raise immediately. Credential redaction and __cause__ stripping preserved.
generator: three-phase render → polish → write. Polish now runs in parallel on a ThreadPoolExecutor (max 4 workers), so for up to four polish calls wall-clock time drops to roughly that of the slowest single LLM call instead of the sum.
polish: on-disk cache at ~/.attune/polish_cache/ (overridable via env). Key includes content + source_summary + template_type + system_prompt + augmented_context + model, so any input or model change invalidates entries. _cache_get bumps mtime on hit so heat is observed reliably even on noatime mounts. Lazy mtime-based prune (default TTL 30d, env-tunable, 0 disables) piggybacked on _cache_put. Manual nuke via attune-author cache clear.
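
As a rough illustration of the retry policy in the doc_gen/_anthropic item above, here is a minimal sketch. The exception classes, `is_retryable`, and `call_with_retry` are hypothetical stand-ins rather than the project's code or the SDK's real types, and the attempt count (one initial call plus up to three retries) is an assumption; the notes only give the 1s / 2s / 4s delays.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ApiStatusError(Exception):
    """Hypothetical stand-in for an SDK error that carries an HTTP status."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


class ApiConnectionError(Exception):
    """Hypothetical stand-in for the SDK's connection error."""


RETRYABLE_STATUS = {429, 529}        # rate limited / overloaded
BACKOFF_SECONDS = (1.0, 2.0, 4.0)    # delays named in the release notes


def is_retryable(exc: Exception) -> bool:
    if isinstance(exc, ApiConnectionError):
        return True
    return isinstance(exc, ApiStatusError) and exc.status_code in RETRYABLE_STATUS


def call_with_retry(fn: Callable[[], T],
                    sleep: Callable[[float], object] = time.sleep) -> T:
    """Call fn, retrying retryable failures after 1s, 2s, then 4s."""
    for delay in BACKOFF_SECONDS:
        try:
            return fn()
        except Exception as exc:
            if not is_retryable(exc):
                raise          # non-retryable errors surface immediately
            sleep(delay)
    # Assumed final attempt after the last delay; any error propagates.
    return fn()
```

Passing `sleep` as a parameter is only there so the delays can be observed in tests without actually waiting.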

attune-author cache clear -> delete every cached polish entry
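
To make the cache behaviour concrete, the sketch below shows one way the pieces described above could fit together: a key derived from every input, an mtime bump on hit, a lazy prune on write, and a clear operation. All function names, the SHA-256 choice, and the `.txt` file layout are assumptions for illustration; the release notes do not specify them.

```python
import hashlib
import os
import time
from pathlib import Path
from typing import Optional


def cache_key(content: str, source_summary: str, template_type: str,
              system_prompt: str, augmented_context: str, model: str) -> str:
    """Hash every input so that changing any of them yields a new key."""
    h = hashlib.sha256()
    for part in (content, source_summary, template_type,
                 system_prompt, augmented_context, model):
        data = part.encode("utf-8")
        # Length prefix keeps ("ab", "c") distinct from ("a", "bc").
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


def cache_get(cache_dir: Path, key: str) -> Optional[str]:
    path = cache_dir / f"{key}.txt"
    try:
        text = path.read_text(encoding="utf-8")
    except FileNotFoundError:
        return None
    os.utime(path, None)  # bump mtime so recently used entries survive pruning
    return text


def prune(cache_dir: Path, ttl_seconds: int) -> int:
    """Delete entries whose mtime is older than the TTL; 0 disables pruning."""
    if ttl_seconds <= 0:
        return 0
    cutoff = time.time() - ttl_seconds
    removed = 0
    for path in cache_dir.glob("*.txt"):
        try:
            if path.stat().st_mtime < cutoff:
                path.unlink()
                removed += 1
        except FileNotFoundError:
            pass  # another worker removed it first
    return removed


def cache_put(cache_dir: Path, key: str, text: str, ttl_seconds: int) -> None:
    cache_dir.mkdir(parents=True, exist_ok=True)
    (cache_dir / f"{key}.txt").write_text(text, encoding="utf-8")
    prune(cache_dir, ttl_seconds)  # lazy prune piggybacked on writes


def cache_clear(cache_dir: Path) -> int:
    """Delete every cached entry and return how many were removed."""
    removed = 0
    for path in cache_dir.glob("*.txt"):
        path.unlink()
        removed += 1
    return removed
```

Relying on mtime rather than atime means the "recently used" signal does not depend on how the filesystem is mounted, which is the point of the noatime remark above.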

Var	Default	Purpose
ATTUNE_AUTHOR_POLISH_CACHE	~/.attune/polish_cache	Cache directory override
ATTUNE_AUTHOR_POLISH_CACHE_TTL_SECONDS	2592000 (30d)	Mtime TTL; 0 disables prune
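
The two variables above might be resolved along these lines. This is a sketch only: the helper names are hypothetical, and falling back to the default on an unparsable TTL is an assumption, since the notes do not say how invalid values are handled.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_CACHE_DIR = "~/.attune/polish_cache"
DEFAULT_TTL_SECONDS = 30 * 24 * 60 * 60  # 30 days = 2592000 seconds


def resolve_cache_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the cache directory, honouring ATTUNE_AUTHOR_POLISH_CACHE."""
    env = os.environ if env is None else env
    raw = env.get("ATTUNE_AUTHOR_POLISH_CACHE") or DEFAULT_CACHE_DIR
    return Path(raw).expanduser()


def resolve_ttl_seconds(env: Optional[Mapping[str, str]] = None) -> int:
    """Return the prune TTL in seconds; 0 means pruning is disabled."""
    env = os.environ if env is None else env
    raw = env.get("ATTUNE_AUTHOR_POLISH_CACHE_TTL_SECONDS", "").strip()
    if not raw:
        return DEFAULT_TTL_SECONDS
    try:
        ttl = int(raw)
    except ValueError:
        return DEFAULT_TTL_SECONDS  # assumption: ignore unparsable values
    return max(ttl, 0)  # assumption: negative values also disable pruning
```

For example, exporting `ATTUNE_AUTHOR_POLISH_CACHE_TTL_SECONDS=0` before a run would keep entries indefinitely until `attune-author cache clear` is invoked.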

518 passed, 37 skipped. New: 12 tests for the polish cache, 9 tests for retry/backoff. All 12 CI matrix combinations green.

#7