v0.1.10 — prompt caching + protocol kwarg + dep pin fix

silversurfer562 released this 01 May 09:44

· 179 commits to main since this release

be61425

Highlights

Prompt caching support: ClaudeProvider.generate now accepts an optional cached_prefix and sends a two-block message with cache_control: ephemeral on the prefix when supplied. Cache hits drop the per-call input-token bill on warm paths.
Protocol parity: cached_prefix added to the LLMProvider protocol; OpenAIProvider and GeminiProvider accept and ignore it so the pipeline can pass the kwarg uniformly.
Pipeline wiring: run_and_generate splits its augmented prompt at the ### USER REQUEST marker and passes the prefix as cached_prefix only when it clears the 1024-char minimum.
Bug fix: attune-help optional-dep pin >=0.7.0,<0.10 already excluded the live attune-help 0.10.0, so pip install 'attune-rag[attune-help]' was broken at HEAD. Bumped to >=0.10.0,<0.11.
Windows CI fix: _validate_output_path in the dashboard module now uses Path.as_posix() so the system-dir guard works uniformly on Windows. Three previously-red dashboard tests now pass on Windows runners.

Tests

229 passed, 2 xfailed, 1 xpassed
4 new tests for the prompt-caching contract: two-block content shape, plain-string fallback, threshold-driven prefix population
All 12 CI matrix combinations (Win/Mac/Linux × 3.10–3.13) green

Pull request

#3 — #3

Assets 2