Skip to content

v0.1.10 — prompt caching + protocol kwarg + dep pin fix

Choose a tag to compare

@silversurfer562 silversurfer562 released this 01 May 09:44
· 179 commits to main since this release
be61425

Highlights

  • Prompt caching support: ClaudeProvider.generate now accepts an optional cached_prefix and sends a two-block message with cache_control: ephemeral on the prefix when supplied. Cache hits drop the per-call input-token bill on warm paths.
  • Protocol parity: cached_prefix added to the LLMProvider protocol; OpenAIProvider and GeminiProvider accept and ignore it so the pipeline can pass the kwarg uniformly.
  • Pipeline wiring: run_and_generate splits its augmented prompt at the ### USER REQUEST marker and passes the prefix as cached_prefix only when it clears the 1024-char minimum.
  • Bug fix: attune-help optional-dep pin >=0.7.0,<0.10 already excluded the live attune-help 0.10.0, so pip install 'attune-rag[attune-help]' was broken at HEAD. Bumped to >=0.10.0,<0.11.
  • Windows CI fix: _validate_output_path in the dashboard module now uses Path.as_posix() so the system-dir guard works uniformly on Windows. Three previously-red dashboard tests now pass on Windows runners.

Tests

  • 229 passed, 2 xfailed, 1 xpassed
  • 4 new tests for the prompt-caching contract: two-block content shape, plain-string fallback, threshold-driven prefix population
  • All 12 CI matrix combinations (Win/Mac/Linux × 3.10–3.13) green

Pull request

#3#3