v0.3.0 — middleware complete
The provider-neutral middleware line is complete. Every chat/embedding client can now be composed with the full resilience + observability stack behind the same stable interfaces.
Added since v0.2.0
- llms/middleware, the full set:
  - ValidationChat (structural/app-neutral: text-only System, role/part legality, tool-call↔result pairing)
  - RetryChat / RetryEmbeddings (classify via llms.Retryable, honor RetryAfter, backoff + optional jitter)
  - TimeoutChat / TimeoutEmbeddings (per-attempt deadline)
  - CircuitChat / CircuitEmbeddings (3-state breaker, single-flight half-open, non-retryable *CircuitOpenError)
  - MetricsChat / MetricsEmbeddings (narrow app-neutral Observer, one Event per attempt)
  - RecommendedChat / RecommendedEmbeddings: wires the spec's recommended composition order; ChainChat stays the primitive
- llms.RetryAfter accessor, symmetric with llms.Retryable
- apierr: context.Canceled returned as-is (non-retryable, errors.Is-matchable); context.DeadlineExceeded stays a retryable timeout
- Docs: ADR log adopted (docs/adr/, 0001–0006); MAESTRO_DIVERGENCES extended; README rewritten as a project + usage guide
Decisions of record
ADR-0003 (middleware is Complete/Embed-only; streaming deferred), ADR-0004 (single error classifier), ADR-0005 (non-retryable CircuitOpenError), ADR-0006 (structural ValidationError).
Not in this release — Maestro cut-over readiness gate
v0.3.0 is a complete middleware milestone, not a "cut-over ready" claim. Two tracked, separate readiness gates remain before the Maestro cut-over:
- ToolChoiceRequired: "the model must call one of the offered tools" (general capability; divergences OC2/G2)
- Provider-neutral prompt-cache hint: restores Maestro's CacheControl behavior (divergence A5)
Streaming and vLLM remain deliberately deferred. This is a pre-1.0 project: v0.x minor versions may introduce breaking changes.