v0.6.0 — RAG singleton, retry/backoff, parallel polish, on-disk polish cache

@silversurfer562 released this 01 May 10:08
f32efba

Highlights

Four performance and resilience improvements on the polish/RAG path, plus a new cache CLI subcommand for managing the on-disk polish cache introduced in this release.

  • rag_hook: process-level RagPipeline singleton (thread-safe via threading.Lock + double-checked locking). Corpus loads once per process instead of once per template kind — measurable win on --all-kinds runs.
  • doc_gen/_anthropic: retry with 1s / 2s / 4s exponential backoff for 429, 529, and APIConnectionError. Non-retryable SDK errors raise immediately. Credential redaction and __cause__ stripping preserved.
  • generator: three-phase render → polish → write. Polish now runs in parallel on a ThreadPoolExecutor (max 4 workers), so for each batch of up to four polish calls, wall-clock time drops from the sum of the LLM calls to roughly the slowest single call.
  • polish: on-disk cache at ~/.attune/polish_cache/ (overridable via env). Key includes content + source_summary + template_type + system_prompt + augmented_context + model, so any input or model change invalidates entries. _cache_get bumps mtime on hit so heat is observed reliably even on noatime mounts. Lazy mtime-based prune (default TTL 30d, env-tunable, 0 disables) piggybacked on _cache_put. Manual nuke via attune-author cache clear.
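The retry behaviour described for doc_gen/_anthropic can be sketched roughly as follows. This is a minimal illustration, not the shipped code: TransientError, is_retryable, and call_with_retry are hypothetical names, the exception class stands in for the SDK's 429 / 529 / APIConnectionError failures, and the sketch assumes "1s / 2s / 4s" means three retries after the first attempt.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for an SDK failure; status=None models a connection error."""

    def __init__(self, status=None):
        super().__init__(f"transient failure (status={status})")
        self.status = status


RETRYABLE_STATUSES = {429, 529}
BACKOFF_SECONDS = (1, 2, 4)  # assumed: delays before the 2nd, 3rd, and 4th attempts


def is_retryable(exc):
    # Only rate-limit (429), overloaded (529), and connection-level failures qualify.
    return isinstance(exc, TransientError) and (
        exc.status is None or exc.status in RETRYABLE_STATUSES
    )


def call_with_retry(fn, sleep=time.sleep):
    """Call fn(); on a retryable failure wait 1s, then 2s, then 4s between attempts."""
    for delay in BACKOFF_SECONDS:
        try:
            return fn()
        except Exception as exc:
            if not is_retryable(exc):
                raise  # non-retryable errors surface immediately
            sleep(delay)
    return fn()  # final attempt; any error propagates to the caller
```

Passing sleep as a parameter keeps the sketch testable without real delays; a caller would normally leave it at the default.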

New CLI surface

attune-author cache clear -> delete every cached polish entry
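As a rough picture of what a clear operation like this involves, the sketch below deletes every file in a cache directory and reports how many were removed. The function name clear_cache and the flat one-file-per-entry layout are assumptions made for illustration; the release notes do not describe the actual on-disk layout.

```python
from pathlib import Path


def clear_cache(cache_dir: Path) -> int:
    """Delete every regular file directly under cache_dir; return how many were removed."""
    if not cache_dir.is_dir():
        return 0  # nothing has been cached yet
    removed = 0
    for entry in cache_dir.iterdir():
        if entry.is_file():  # assumed layout: one file per cached entry
            entry.unlink()
            removed += 1
    return removed
```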

New env vars

Var                                      Default                  Purpose
ATTUNE_AUTHOR_POLISH_CACHE               ~/.attune/polish_cache   Cache directory override
ATTUNE_AUTHOR_POLISH_CACHE_TTL_SECONDS   2592000 (30d)            Mtime TTL; 0 disables prune
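One way these two variables might be consumed is sketched below. The helper names (resolve_cache_dir, resolve_ttl_seconds, prune) are hypothetical, and how the real code treats empty or malformed values is not stated in these notes; here an empty value falls back to the default and a non-integer TTL raises ValueError.

```python
import os
import time
from pathlib import Path

DEFAULT_CACHE_DIR = "~/.attune/polish_cache"
DEFAULT_TTL_SECONDS = 2592000  # 30 days


def resolve_cache_dir(env=os.environ) -> Path:
    # An unset or empty variable falls back to the default directory (assumed).
    raw = env.get("ATTUNE_AUTHOR_POLISH_CACHE") or DEFAULT_CACHE_DIR
    return Path(raw).expanduser()


def resolve_ttl_seconds(env=os.environ) -> int:
    raw = env.get("ATTUNE_AUTHOR_POLISH_CACHE_TTL_SECONDS", "").strip()
    return int(raw) if raw else DEFAULT_TTL_SECONDS


def prune(cache_dir: Path, ttl_seconds: int, now=None) -> int:
    """Remove files whose mtime is older than the TTL; a TTL of 0 disables pruning."""
    if ttl_seconds <= 0 or not cache_dir.is_dir():
        return 0
    cutoff = (time.time() if now is None else now) - ttl_seconds
    removed = 0
    for entry in cache_dir.iterdir():
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            entry.unlink()
            removed += 1
    return removed
```

Because pruning keys off mtime, an entry whose mtime is refreshed on every cache hit stays alive for as long as it keeps being used.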

Dependencies

  • attune-rag>=0.1.0,<0.2 → resolves to 0.1.10 (prompt caching support)
  • attune-help>=0.10.0 → resolves to 0.10.1 (template aliases)

Tests

518 passed, 37 skipped. New: 12 tests for the polish cache, 9 tests for retry/backoff. All 12 CI matrix combinations green.

Pull request

#7