Minor Changes
- #590 fe0d182 Thanks @threepointone! - The AI Gateway delegate gains cross-vendor server-side fallback
  (`fallback: { mode: "server" }`): multiple vendors in one gateway run, with the
  winner selected via `cf-aig-step`.

  The gateway delegate now reaches header parity with the run path: the gateway
  path forwards `cacheKey`, `eventId`, `requestTimeoutMs`, and `retries` from the
  gateway options as `cf-aig-*` headers, and `DelegateCallOptions` gains two new
  universal-endpoint controls: `byokAlias` (`cf-aig-byok-alias`, selects a stored
  BYOK key by alias) and `zdr` (`cf-aig-zdr`, a per-request Zero Data Retention
  override for Unified Billing, applied on both transports).

  Internally, the provider registry, `cf-aig-*` header building, resumable-stream
  engine, and Workers AI SSE helpers are now shared across the Cloudflare AI
  packages (bundled inline, so there is no new dependency for you to install).
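
As a rough illustration of the two new universal-endpoint controls, the sketch below maps `byokAlias` and `zdr` onto the `cf-aig-byok-alias` and `cf-aig-zdr` headers named in this entry. The `ControlOptions` type, the `buildControlHeaders` helper, and the `"true"`/`"false"` serialization of `zdr` are assumptions made for illustration only; they are not taken from the package source.

```typescript
// Hypothetical shape: only the two controls named in the changelog entry.
interface ControlOptions {
  byokAlias?: string; // alias of a stored BYOK key
  zdr?: boolean; // per-request Zero Data Retention override (assumed boolean)
}

// Hypothetical helper: builds the cf-aig-* headers for the two controls.
// Options that are left undefined produce no header at all.
function buildControlHeaders(options: ControlOptions): Record<string, string> {
  const headers: Record<string, string> = {};
  if (options.byokAlias !== undefined) {
    headers["cf-aig-byok-alias"] = options.byokAlias;
  }
  if (options.zdr !== undefined) {
    // Assumption: the flag is sent as the string "true" or "false".
    headers["cf-aig-zdr"] = String(options.zdr);
  }
  return headers;
}
```

Omitting a header when its option is unset keeps the gateway's own defaults in effect, which seems the safest reading of "per-request override".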
- #593 1c6afd0 Thanks @threepointone! - Native Workers AI failures are now surfaced as AI SDK `APICallError`s so the AI
  SDK's built-in retry (`maxRetries`) can engage on transient errors.

  Previously the binding path (`env.AI.run`) threw plain `Error`s and the REST
  path threw a generic `Error`, so the AI SDK never retried them. Most notably,
  the common "out of capacity" failure (internal code `3040`, HTTP `429`) and
  other 5xx blips just failed the call outright.

  - Binding path: errors thrown by `env.AI.run` are normalized into an
    `APICallError` across every Workers AI model: chat, embedding, image, speech,
    transcription, and reranking. The Workers AI internal error code is parsed from
    the message (or a numeric `code` property) and mapped to the documented HTTP
    status (e.g. `3040`/`3036` → `429`, `3007`/`3008` → `408`, `5007` → `400`), and
    `APICallError` derives `isRetryable` from that status (retryable on
    `408`/`409`/`429`/`5xx`). Unrecognized errors get no status and stay non-retryable
    (prior behavior). `AbortError`/`TimeoutError` cancellations propagate
    unchanged.
  - REST path: non-OK responses now throw an `APICallError` carrying the real
    `statusCode`, response headers (so `Retry-After` is honored), and body, instead
    of a generic `Error`. The error message keeps the same
    `Workers AI API error (<status> <statusText>): <body>` shape.

  This means transient capacity/5xx errors are now automatically retried with
  exponential backoff by `generateText`/`streamText` (default 2 retries; tune via
  `maxRetries`). Set `maxRetries: 0` to opt out.
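
The binding-path normalization described in this entry can be sketched roughly as follows. This is a standalone approximation, not the package's implementation: it returns a plain object instead of constructing an AI SDK `APICallError`, the table holds only the code-to-status pairs listed above, and the names `NormalizedFailure`, `extractInternalCode`, and `normalizeBindingError`, along with the four-digit regex used to find the code in the message, are assumptions.

```typescript
// Only the internal-code -> HTTP-status pairs named in the changelog entry.
const WORKERS_AI_CODE_TO_STATUS: Record<number, number> = {
  3040: 429,
  3036: 429,
  3007: 408,
  3008: 408,
  5007: 400,
};

// Hypothetical stand-in for the fields an APICallError would carry.
interface NormalizedFailure {
  message: string;
  statusCode: number | undefined;
  isRetryable: boolean;
}

// Retry rule quoted in the entry: 408, 409, 429, and any 5xx status.
function isRetryableStatus(status: number | undefined): boolean {
  if (status === undefined) return false;
  return status === 408 || status === 409 || status === 429 || (status >= 500 && status < 600);
}

// Assumption: the code is a numeric `code` property, or the first
// standalone four-digit number found in the error message.
function extractInternalCode(error: unknown): number | undefined {
  if (typeof error !== "object" || error === null) return undefined;
  const candidate = error as { code?: unknown; message?: unknown };
  if (typeof candidate.code === "number") return candidate.code;
  if (typeof candidate.message === "string") {
    const match = /\b(\d{4})\b/.exec(candidate.message);
    if (match) return Number(match[1]);
  }
  return undefined;
}

function normalizeBindingError(error: unknown): NormalizedFailure {
  // Cancellations are rethrown untouched rather than normalized.
  if (error instanceof Error && (error.name === "AbortError" || error.name === "TimeoutError")) {
    throw error;
  }
  const code = extractInternalCode(error);
  // Unrecognized codes get no status and therefore stay non-retryable.
  const statusCode = code === undefined ? undefined : WORKERS_AI_CODE_TO_STATUS[code];
  const message = error instanceof Error ? error.message : String(error);
  return { message, statusCode, isRetryable: isRetryableStatus(statusCode) };
}
```

Under these assumptions, an error whose message contains `3040` comes back with status `429` and `isRetryable: true`, while an error with no recognizable code keeps `statusCode` undefined and is not retried.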