LLM Council 0.7.18 Release Notes

Highlights

Added provider-specific prompt-cache support without pretending every provider
uses the same cache model.
Added Gemini API and Vertex Gemini cached-content lifecycle handling:
create, TTL refresh, expiry recreation, best-effort cleanup, billing warnings,
and response metadata.
Added OpenRouter route-gated prompt-cache passthrough for eligible Anthropic
routes.
Added OpenAI cache telemetry parsing without sending unsupported request
controls.
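
As an illustration of the telemetry-only approach for OpenAI, the sketch below reads the cached-token count from a response's usage payload and sends nothing extra in the request. It assumes the usage object has already been converted to a plain dict and that the count sits under prompt_tokens_details.cached_tokens, as in OpenAI's Chat Completions usage object. The function name is hypothetical and is not Council's actual API.

```python
from typing import Any, Mapping


def parse_openai_cache_read_tokens(usage: Mapping[str, Any]) -> int:
    """Return the number of prompt tokens served from cache, or 0 if unknown.

    Hypothetical helper: it only reads response telemetry and never adds
    cache-control fields to a request.
    """
    # The details object may be missing or None on responses without telemetry.
    details = usage.get("prompt_tokens_details") or {}
    cached = details.get("cached_tokens")
    return cached if isinstance(cached, int) and cached > 0 else 0
```

Missing or malformed telemetry is treated as zero cached tokens rather than as an error, so callers can report a cache_read_tokens value uniformly.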

Live Proof

OpenRouter live proof returned cache_read_tokens=8116 on the second
identical call through an Anthropic route.
Gemini API live proof created cachedContents/..., returned
cache_read_tokens=12603, and deleted the resource successfully.
Vertex Gemini live proof created projects/.../cachedContents/..., returned
cache_read_tokens=14002, and deleted the resource successfully.

Live proof currently covers non-streaming generation paths only. Streaming
cached-content lifecycle behavior is covered by regression tests and documented
with a stricter safety contract: an expired cache is recreated and the request
retried only if no chunk has been yielded yet, and successful streams emit
final cleanup metadata after the last chunk.
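
The streaming contract above can be sketched as a small generator wrapper. This is a minimal illustration under stated assumptions, not Council's implementation: CacheExpiredError, stream_with_expiry_retry, and the three callables are hypothetical names, and the sketch assumes a single retry.

```python
from typing import Callable, Dict, Iterator, Union


class CacheExpiredError(Exception):
    """Hypothetical error signalling that the cached-content resource expired."""


def stream_with_expiry_retry(
    open_stream: Callable[[], Iterator[str]],
    recreate_cache: Callable[[], None],
    cleanup: Callable[[], Dict[str, object]],
) -> Iterator[Union[str, Dict[str, object]]]:
    yielded_any = False
    retried = False
    while True:
        try:
            for chunk in open_stream():
                yielded_any = True
                yield chunk
            break  # stream completed successfully
        except CacheExpiredError:
            # Retrying after output has reached the caller would duplicate
            # or corrupt the stream, so re-raise in that case.
            if yielded_any or retried:
                raise
            retried = True
            recreate_cache()
    # Only successful streams reach this point and emit cleanup metadata.
    yield {"cleanup": cleanup()}
```

If expiry surfaces after the first chunk, the error propagates to the caller and no cleanup metadata is emitted.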

Upgrade Notes

Install all provider SDKs with:

pip install 'the-llm-council[all]>=0.7.18'

Gemini/Vertex cached-content creation requires explicit stable source_text;
Council does not infer cacheable source content from dynamic prompts.
Created or refreshed cached-content resources can incur storage-duration
billing until TTL expiry or successful cleanup.
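
Because billing continues until the resource is gone, cleanup is worth attempting even when generation fails. The sketch below shows one way best-effort cleanup with a billing warning could look. The context manager and the create and delete callables are hypothetical stand-ins for provider calls, not Council's API.

```python
import warnings
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def cached_content(
    create: Callable[[], str],
    delete: Callable[[str], None],
) -> Iterator[str]:
    """Create a cached-content resource and try to delete it on exit."""
    name = create()
    try:
        yield name
    finally:
        try:
            delete(name)
        except Exception as exc:
            # Best effort: a failed delete must not mask the caller's result,
            # but the user should know storage billing may continue.
            warnings.warn(
                f"Cleanup of {name} failed ({exc}); storage billing may "
                "continue until TTL expiry.",
                RuntimeWarning,
            )
```

A failed delete is reported as a warning rather than raised, so the generation result is still returned to the caller.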

Validation

Focused cached-content lifecycle tests: 28 passed.
Broader prompt-cache suite: 157 passed.
Full suite: 431 passed.
Final council review: approved with no blocking issues.
