
v0.3.0 — middleware complete


Released by @dratner on 17 May, 20:19 · commit 6023418 · 20 commits to main since this release

The provider-neutral middleware line is complete. Every chat and embedding client can now be composed with the full resilience and observability stack behind the same stable interfaces.
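To illustrate what "composed behind the same stable interfaces" can look like, here is a minimal decorator-chain sketch. Everything in it is hypothetical and only for illustration: the `Chat` interface, `ChatFunc`, `Middleware`, `chain`, `tag`, `ask`, and `echo` are not the library's actual types or signatures, and the layer order shown is arbitrary rather than the spec's recommended order.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Chat is a hypothetical, minimal client interface used only in this sketch.
type Chat interface {
	Complete(ctx context.Context, prompt string) (string, error)
}

// ChatFunc adapts a plain function to the Chat interface.
type ChatFunc func(ctx context.Context, prompt string) (string, error)

func (f ChatFunc) Complete(ctx context.Context, prompt string) (string, error) {
	return f(ctx, prompt)
}

// Middleware wraps a Chat and returns another Chat with the same interface.
type Middleware func(Chat) Chat

// chain wraps base so that the first middleware listed is the outermost layer.
func chain(base Chat, mws ...Middleware) Chat {
	for i := len(mws) - 1; i >= 0; i-- {
		base = mws[i](base)
	}
	return base
}

// tag returns a middleware that records its name when a call passes through it.
func tag(name string, trace *[]string) Middleware {
	return func(next Chat) Chat {
		return ChatFunc(func(ctx context.Context, p string) (string, error) {
			*trace = append(*trace, name)
			return next.Complete(ctx, p)
		})
	}
}

// echo is a stand-in provider client that returns the prompt unchanged.
var echo = ChatFunc(func(_ context.Context, p string) (string, error) {
	return "echo: " + p, nil
})

// ask calls the client with a background context.
func ask(c Chat, prompt string) (string, error) {
	return c.Complete(context.Background(), prompt)
}

func main() {
	var trace []string
	// Illustrative order only: each layer sees the call before the next one.
	c := chain(echo, tag("metrics", &trace), tag("retry", &trace), tag("timeout", &trace))
	out, err := ask(c, "hi")
	fmt.Println(out, err, strings.Join(trace, " > "))
}
```

Because every layer returns the same interface it accepts, callers do not need to know how many layers are in place, and layers can be added or removed without touching call sites.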

Added since v0.2.0

  • llms/middleware — the full set:
    • ValidationChat (structural/app-neutral: text-only System, role/part legality, tool-call↔result pairing)
    • RetryChat/RetryEmbeddings (classify via llms.Retryable, honor RetryAfter, backoff + optional jitter)
    • TimeoutChat/TimeoutEmbeddings (per-attempt deadline)
    • CircuitChat/CircuitEmbeddings (3-state breaker, single-flight half-open, non-retryable *CircuitOpenError)
    • MetricsChat/MetricsEmbeddings (narrow app-neutral Observer, one Event per attempt)
    • RecommendedChat/RecommendedEmbeddings — wires the spec's recommended composition order; ChainChat stays the primitive
  • llms.RetryAfter accessor, symmetric with llms.Retryable
  • apierr: context.Canceled returned as-is (non-retryable, errors.Is-matchable); context.DeadlineExceeded stays a retryable timeout
  • Docs: ADR log adopted (docs/adr/, 0001–0006); MAESTRO_DIVERGENCES extended; README rewritten as a project + usage guide
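The retry and apierr items above can be sketched roughly as follows, using only the standard library. This is a sketch under stated assumptions, not the library's implementation: `retryable`, `retryAfter`, `retryAfterError`, `nextDelay`, and `retry` are hypothetical stand-ins (loosely mirroring llms.Retryable and llms.RetryAfter), the backoff is plain exponential with a cap, and jitter is left out to keep the example deterministic.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// retryAfterError is a hypothetical error carrying a server-suggested delay.
type retryAfterError struct{ after time.Duration }

func (e *retryAfterError) Error() string {
	return fmt.Sprintf("rate limited, retry after %s", e.after)
}

// Sample wrapped errors used by main.
var (
	errCanceledCall = fmt.Errorf("chat call: %w", context.Canceled)
	errTimedOutCall = fmt.Errorf("chat call: %w", context.DeadlineExceeded)
)

// retryable is a stand-in classifier: cancellation is never retried,
// a deadline is treated as a retryable timeout, and so is a rate limit.
func retryable(err error) bool {
	if err == nil || errors.Is(err, context.Canceled) {
		return false
	}
	if errors.Is(err, context.DeadlineExceeded) {
		return true
	}
	var ra *retryAfterError
	return errors.As(err, &ra)
}

// retryAfter extracts a server-suggested delay, if the error carries one.
func retryAfter(err error) (time.Duration, bool) {
	var ra *retryAfterError
	if errors.As(err, &ra) {
		return ra.after, true
	}
	return 0, false
}

// nextDelay prefers the server hint; otherwise it doubles base per attempt, capped at max.
func nextDelay(err error, attempt int, base, max time.Duration) time.Duration {
	if d, ok := retryAfter(err); ok {
		return d
	}
	d := base << attempt
	if d > max || d <= 0 { // d <= 0 guards against shift overflow
		return max
	}
	return d
}

// retry runs call up to attempts times, waiting between retryable failures.
func retry(ctx context.Context, attempts int, base, max time.Duration, call func(context.Context) error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = call(ctx); err == nil || !retryable(err) {
			return err
		}
		if i == attempts-1 {
			break
		}
		select {
		case <-time.After(nextDelay(err, i, base, max)):
		case <-ctx.Done():
			return ctx.Err() // cancellation is surfaced unchanged
		}
	}
	return err
}

func main() {
	fmt.Println(retryable(errCanceledCall), retryable(errTimedOutCall))
	calls := 0
	err := retry(context.Background(), 3, time.Millisecond, 10*time.Millisecond, func(context.Context) error {
		calls++
		if calls < 3 {
			return errTimedOutCall
		}
		return nil
	})
	fmt.Println(calls, err)
}
```

Keeping classification in one function means the retry loop never inspects error types itself, which is the property a single error classifier is meant to preserve.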

Decisions of record

ADR-0003 (middleware is Complete/Embed-only; streaming deferred), ADR-0004 (single error classifier), ADR-0005 (non-retryable CircuitOpenError), ADR-0006 (structural ValidationError).
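As a rough illustration of the idea behind ADR-0005, the sketch below shows a three-state breaker that admits a single probe while half-open and rejects other calls with a dedicated error a retry layer can recognize and refuse to retry. All names here (`breaker`, `circuitOpenError`, `isCircuitOpen`, `newBreaker`, `errUpstream`) and the consecutive-failure threshold policy are assumptions made for the example, not the library's types or behavior.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// circuitOpenError is a hypothetical rejection error; callers should not retry it.
type circuitOpenError struct{}

func (e *circuitOpenError) Error() string { return "circuit open" }

// isCircuitOpen reports whether err is a breaker rejection.
func isCircuitOpen(err error) bool {
	var coe *circuitOpenError
	return errors.As(err, &coe)
}

// errUpstream is a sample provider failure used by main.
var errUpstream = errors.New("upstream failure")

type state int

const (
	closed state = iota
	open
	halfOpen
)

type breaker struct {
	mu        sync.Mutex
	state     state
	failures  int           // consecutive failures while closed
	threshold int           // failures needed to open the circuit
	cooldown  time.Duration // time to stay open before allowing a probe
	openedAt  time.Time
	probing   bool // true while the single half-open probe is in flight
}

func newBreaker(threshold int, cooldown time.Duration) *breaker {
	return &breaker{threshold: threshold, cooldown: cooldown}
}

// allow decides whether a call may proceed, moving open to half-open after the cooldown.
func (b *breaker) allow() error {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.state == open && time.Since(b.openedAt) >= b.cooldown {
		b.state = halfOpen
	}
	switch b.state {
	case open:
		return &circuitOpenError{}
	case halfOpen:
		if b.probing {
			return &circuitOpenError{} // only one probe at a time
		}
		b.probing = true
	}
	return nil
}

// record updates the state from a call's outcome.
func (b *breaker) record(err error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if err == nil {
		b.state, b.failures, b.probing = closed, 0, false
		return
	}
	b.failures++
	if b.state == halfOpen || b.failures >= b.threshold {
		b.state, b.openedAt, b.probing = open, time.Now(), false
	}
}

// do runs call through the breaker.
func (b *breaker) do(call func() error) error {
	if err := b.allow(); err != nil {
		return err
	}
	err := call()
	b.record(err)
	return err
}

func main() {
	b := newBreaker(2, time.Minute)
	for i := 0; i < 3; i++ {
		err := b.do(func() error { return errUpstream })
		fmt.Println(i, err, isCircuitOpen(err))
	}
}
```

In this sketch the third call is rejected without reaching the provider. Retrying a rejection would only hit the same open circuit again, which is one plausible reason to classify it as non-retryable.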

Not in this release — Maestro cut-over readiness gate

v0.3.0 completes the middleware milestone; it does not claim cut-over readiness. Two separate, tracked readiness gates remain before the Maestro cut-over:

  • ToolChoiceRequired — "the model must call one of the offered tools" (general capability; divergences OC2/G2)
  • Provider-neutral prompt-cache hint — restores Maestro's CacheControl behavior (divergence A5)

Streaming and vLLM remain deliberately deferred. The project is pre-1.0: v0.x minor versions may introduce breaking changes.