Minor Changes
- #573 4f19489 Thanks @threepointone! - Add AI Gateway routing for third-party catalog models to createWorkersAI, with capability-driven transport selection, the full provider registry, a bring-your-own-provider wrapper, typed errors, and client/server fallback.

Experimental. This is a substantial new surface for the package, well beyond its original job of wrapping Workers AI, and several behaviors rely on undocumented AI Gateway internals (the cf-aig-run-id resume buffer, per-provider run-path wire formats). Treat the entire third-party / gateway surface as experimental: the API may change, and provider coverage maturity varies (only the run-catalog providers are live-verified end-to-end). It does not affect the existing stable Workers AI / AI Search APIs.

createWorkersAI is the single public entry point. Pass an optional providers array (wire-format plugins from the sub-paths below). When set, a "<provider>/<model>" catalog slug passed to the provider (or .chat) is routed through AI Gateway automatically, while @cf/... ids continue to build Workers AI models. Each slug is resolved against a registry of every AI Gateway provider, and the transport is picked from the requested options: the run path (env.AI.run) for resumable streaming (cf-aig-run-id, the default, on the unified-billing run catalog), or the gateway path (env.AI.gateway(id).run([…])) for BYOK providers, server-side fallback, and caching. Incompatible option combinations (e.g. resume: true with fallback.mode: "server", or resume / transport: "run" on a BYOK provider) throw a clear GatewayDelegateError; resume-disabling combinations warn loudly. This is fully additive: leaving providers unset preserves the prior behavior exactly, and passing a catalog slug without it throws a helpful error. The chat factory's settings argument is typed from the model id literal: a "<provider>/<model>" slug autocompletes DelegateCallOptions, while a @cf/... id autocompletes WorkersAIChatSettings. gateway is optional for catalog routing: when unset, requests use the account's "default" AI Gateway; set gateway (here or per call) to target a specific one.

New sub-path exports:
  - workers-ai-provider/openai, workers-ai-provider/anthropic, workers-ai-provider/google: provider plugins keyed by wire format. One openai plugin serves the OpenAI-compatible long tail (deepseek, xai/grok, groq, mistral, perplexity, cerebras, openrouter, fireworks) plus the unified-catalog chat providers alibaba (Qwen) and minimax. @ai-sdk/openai, @ai-sdk/anthropic, and @ai-sdk/google are optional peer dependencies; install only the ones whose wire formats you use. The openai plugin is required for the run path (see below). Providers whose gateway-path URL isn't reproducible from the shared builder (cohere, baseten, parallel, azure-openai, google-vertex) and provider-native/non-chat providers are bring-your-own-provider only.
  - workers-ai-provider/gateway: createGatewayFetch / createGatewayProvider wrap any @ai-sdk/* provider so its traffic flows through AI Gateway (provider id detected from the request URL, or set explicitly). Use it for provider-native or non-chat providers the slug routing can't auto-wire (bedrock, replicate, audio/image), or for full control of the underlying provider.
The transport types, error classes (WorkersAIGatewayError, WorkersAIFallbackError, GatewayDelegateError), the registry helpers, DelegateCallOptions, and createResumableStream are re-exported from the package root.

Features:
  - Provider registry (GATEWAY_PROVIDERS, findProviderBySlug, detectProviderByUrl) maps slugs to gateway provider ids, wire formats, billing model, and run-catalog membership. Covers every provider in the AI Gateway directory (OpenAI, Anthropic, Google AI Studio/Vertex, xAI, Groq, DeepSeek, Mistral, Perplexity, Cerebras, OpenRouter, Cohere, Baseten, Parallel, Azure OpenAI, Amazon Bedrock, HuggingFace, Replicate, Fal, Ideogram, Cartesia, Deepgram, ElevenLabs, plus Fireworks), with URL host patterns so createGatewayFetch auto-detects each from the wrapped provider's request URL. Also includes the unified-catalog chat providers alibaba (Qwen) and minimax on the resumable run catalog (verified live: OpenAI-wire, cf-aig-run-id on streams); these are run-path only (gatewayPath: false, i.e. not native gateway providers), so caching, server-side fallback, and transport: "gateway" are rejected with a clear GatewayDelegateError instead of failing upstream.
  - Metadata & logging: metadata (custom log attributes for spend attribution) and collectLog are first-class call options on both transports. On the run path they fold into the typed gateway options; on the gateway path they become cf-aig-metadata / cf-aig-collect-log headers (bigint metadata values are coerced to strings). Call-level metadata merges over (and wins against) any metadata set via gateway: { metadata }.
  - BYOK: set byok: true (and supply the key via extraHeaders) to forward the upstream provider key on the gateway path; otherwise provider auth headers are stripped so unified billing / the gateway's stored key applies.
  - Client-side fallback (fallback.mode: "client") keeps resume per leg: a failed pre-stream dispatch falls through to the next model; if all fail, a WorkersAIFallbackError carries the per-attempt tree. Server-side fallback (fallback.mode: "server") routes same-vendor fallbacks through the gateway path.
  - Typed errors: WorkersAIGatewayError (with a coarse code, a recoverable hint, and the parsed CF/provider envelope) and WorkersAIFallbackError (attempt tree). Helpers classifyStatus / extractErrorMessage are exported.
  - Abort + gateway options are passed through on both transports.
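
As a rough illustration of the metadata handling on the gateway path, here is a minimal sketch. The helper name buildGatewayLogHeaders and its option shape are hypothetical and are not the package's API. Only the header names, the bigint-to-string coercion, and the "call-level wins" merge order come from the notes above. Encoding the metadata header as JSON is an assumption.

```typescript
type MetadataValue = string | number | boolean | bigint | null;
type Metadata = Record<string, MetadataValue>;

// Hypothetical option shape, for illustration only.
interface LogOptions {
  gatewayMetadata?: Metadata; // metadata set once via gateway: { metadata }
  callMetadata?: Metadata; // metadata passed on the individual call
  collectLog?: boolean;
}

function buildGatewayLogHeaders(options: LogOptions): Record<string, string> {
  const headers: Record<string, string> = {};

  // Call-level keys are spread last, so they win on conflicts.
  const merged: Metadata = {
    ...options.gatewayMetadata,
    ...options.callMetadata,
  };

  const serializable: Record<string, string | number | boolean | null> = {};
  let count = 0;
  for (const [key, value] of Object.entries(merged)) {
    // JSON.stringify throws on bigint, so coerce it to a string first.
    serializable[key] = typeof value === "bigint" ? String(value) : value;
    count += 1;
  }
  if (count > 0) {
    // Assumption: the header carries the metadata as a JSON object.
    headers["cf-aig-metadata"] = JSON.stringify(serializable);
  }

  if (options.collectLog !== undefined) {
    headers["cf-aig-collect-log"] = String(options.collectLog);
  }
  return headers;
}

// Usage: the call-level "team" value overrides the gateway-level one.
const exampleHeaders = buildGatewayLogHeaders({
  gatewayMetadata: { team: "platform", env: "prod" },
  callMetadata: { team: "checkout" },
  collectLog: true,
});
console.log(exampleHeaders);
```

Spreading the call-level object last keeps the merge rule in one place, and coercing bigint before serialization avoids the TypeError that JSON.stringify would otherwise raise.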
On the run path, the response stream is wrapped so a transient mid-stream drop reconnects through the gateway resume endpoint (resume?from=N) transparently; the @ai-sdk parser never sees the break. The from parameter is an SSE event index, so the wrapper emits only complete events and realigns on the boundary after a drop (no duplicated or truncated bytes). When the gateway buffer expires (404, ~5.5 min TTL), an onResumeExpired policy controls whether the stream errors ("error", the default) or ends with partial output ("accept-partial").

For cross-invocation recovery (e.g. a new Durable Object invocation after eviction), createResumableStream is exported and accepts no initial body plus a fromEvent offset; it re-attaches by resuming directly from that event index. An onProgress(eventOffset) callback (also surfaced on the delegate as a call option) reports the live SSE event offset so callers can persist { runId, eventOffset } and re-attach later.

Run-path wire format (per-provider): on the resumable run path (env.AI.run), Cloudflare's unified catalog normalizes most providers to OpenAI chat-completions wire (so google/… is parsed with the openai plugin on the run path, even though the gateway path uses the native google plugin), but passes Anthropic through natively (content[].text, native tool shape), so anthropic/… is parsed with the anthropic plugin on both paths. The registry records this as runWireFormat (defaults to "openai"). Include openai for the openai-wire run-path providers (openai, google, xai/grok, groq) and anthropic to use anthropic/…; the delegate throws a clear GatewayDelegateError naming the exact plugin a transport needs if it's missing.
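
The event-index bookkeeping behind the resume behavior described above can be sketched roughly as follows. This is not the internals of createResumableStream, which these notes do not show. The class SseEventAligner is hypothetical, it assumes events are separated by a blank line ("\n\n"), and it assumes the resume offset equals the number of events already delivered.

```typescript
// Hypothetical helper: forwards only complete SSE events and counts them.
class SseEventAligner {
  private pending = "";
  private delivered: number;

  constructor(fromEvent = 0) {
    this.delivered = fromEvent;
  }

  // Feed a raw text chunk; returns only the events that chunk completed.
  push(chunk: string): string[] {
    this.pending += chunk;
    const complete: string[] = [];
    let boundary = this.pending.indexOf("\n\n");
    while (boundary !== -1) {
      complete.push(this.pending.slice(0, boundary + 2));
      this.pending = this.pending.slice(boundary + 2);
      this.delivered += 1;
      boundary = this.pending.indexOf("\n\n");
    }
    return complete;
  }

  // After a dropped connection: discard the half-received event and
  // return the index to resume from (assumed to map to resume?from=N).
  dropPartial(): number {
    this.pending = "";
    return this.delivered;
  }

  // Number of complete events delivered so far.
  eventOffset(): number {
    return this.delivered;
  }
}

// Usage: the connection drops halfway through the second event.
const aligner = new SseEventAligner();
const firstBatch = aligner.push('data: {"t":"Hel"}\n\ndata: {"t":"l');
const resumeFrom = aligner.dropPartial();
// The resumed stream replays from event index `resumeFrom`.
const secondBatch = aligner.push('data: {"t":"lo"}\n\n');
console.log(firstBatch.length, resumeFrom, secondBatch.length, aligner.eventOffset());
```

Because the partial event is never forwarded, a downstream parser sees each event exactly once, which is the property the notes describe as "no duplicated or truncated bytes". Persisting the value of eventOffset() alongside a run id is what would let a later invocation re-attach.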
Patch Changes
- #563 231c19b Thanks @slegarraga! - Validate file parts in chat messages before sending them to Workers AI.

Previously every file part in a user message was unconditionally wrapped as
an image_url, regardless of its mediaType. Non-image files (e.g.
application/pdf, audio/*, video/*, application/octet-stream) were
forwarded as if they were valid vision inputs, and a missing mediaType
silently defaulted to image/png, producing a corrupt data URL.

Now convertToWorkersAIChatMessages:

  - throws an UnsupportedFunctionalityError when a file part has a
    non-image/* mediaType, or no mediaType at all, instead of forwarding
    broken multimodal content;
  - matches the image/ prefix case-insensitively (per RFC 2045), so media
    types such as IMAGE/JPEG are accepted while the caller's original casing
    is preserved in the emitted data URL;
  - preserves the provided image mediaType instead of defaulting missing
    media types to image/png.

This is a behavior change: inputs that previously "succeeded" with broken or
defaulted media types now throw a clear, catchable error. Type-correct callers
(the AI SDK always sets mediaType on file parts) are unaffected for valid
image inputs.
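
The validation rules listed in this entry can be sketched as below. This is an illustration rather than the code in convertToWorkersAIChatMessages: the FilePart shape and the function name filePartToImageUrl are hypothetical, and UnsupportedFilePartError is a local stand-in for the AI SDK's UnsupportedFunctionalityError.

```typescript
// Local stand-in for the AI SDK's UnsupportedFunctionalityError.
class UnsupportedFilePartError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "UnsupportedFilePartError";
  }
}

// Hypothetical, simplified file part: base64 payload plus optional media type.
interface FilePart {
  type: "file";
  mediaType?: string;
  data: string;
}

interface ImageUrlPart {
  type: "image_url";
  image_url: { url: string };
}

function filePartToImageUrl(part: FilePart): ImageUrlPart {
  const mediaType = part.mediaType;

  // Reject a missing media type instead of defaulting to image/png,
  // and compare the "image/" prefix case-insensitively.
  if (!mediaType || !mediaType.toLowerCase().startsWith("image/")) {
    throw new UnsupportedFilePartError(
      `file part with mediaType ${mediaType ?? "(none)"} is not an image`,
    );
  }

  // The caller's original casing is kept in the emitted data URL.
  return {
    type: "image_url",
    image_url: { url: `data:${mediaType};base64,${part.data}` },
  };
}

// Usage: an upper-case image type passes, a PDF is rejected.
const imagePart = filePartToImageUrl({ type: "file", mediaType: "IMAGE/JPEG", data: "AAAA" });
console.log(imagePart.image_url.url);
try {
  filePartToImageUrl({ type: "file", mediaType: "application/pdf", data: "AAAA" });
} catch (error) {
  console.log((error as Error).message);
}
```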
- #575 65e0735 Thanks @threepointone! - Map the AI SDK's forced single-tool choice to the documented named-function form.

Previously toolChoice: { type: "tool", toolName } was downgraded to
tool_choice: "required" (with the tool list filtered to the single function).
Workers AI treats "required" as advisory: on long contexts and reasoning
models (e.g. @cf/google/gemma-4-26b-a4b-it, @cf/qwen/qwq-32b,
@cf/qwen/qwen3-30b-a3b-fp8) the model would "fail open" and answer in prose
instead of calling the requested tool.

Now the provider sends the OpenAI-style named-function form
tool_choice: { type: "function", function: { name } }, which Workers AI
enforces server-side, and keeps the full tool list (matching OpenAI semantics
and preserving tool-result context fidelity).

Note: forcing a tool on a reasoning model with insufficient max_tokens is
validated server-side and now surfaces as a clear error (Workers AI 8006)
rather than silently producing no tool call.

Additionally, recover forced tool calls that gpt-oss models leak as text.
When a tool is forced, gpt-oss (harmony format) sometimes emits the tool call
as raw JSON in message.content with an empty tool_calls array and
finish_reason: "stop". The provider now detects this (only when a tool was
forced and the leaked JSON's name matches a requested tool) and
reinterprets it as a structured tool call (with finishReason: "tool-calls"
and a warning), across both generateText and streamText. Ambiguous leaks
(harmony channel/role names, hallucinated names) are left untouched to avoid
fabricating bogus calls.
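
Both behaviors in this entry can be sketched in a few lines. The functions mapToolChoice and recoverLeakedToolCall are hypothetical names, not the provider's internals. The ToolChoice type is modelled loosely on the AI SDK's tool-choice shapes, and the leaked payload is assumed to look like { name, arguments }. The notes confirm only that a name field is matched.

```typescript
// Loosely modelled on the AI SDK's tool-choice shapes.
type ToolChoice =
  | { type: "auto" }
  | { type: "none" }
  | { type: "required" }
  | { type: "tool"; toolName: string };

type WireToolChoice =
  | "auto"
  | "none"
  | "required"
  | { type: "function"; function: { name: string } };

// A forced single tool becomes the named-function form, not "required".
function mapToolChoice(choice: ToolChoice): WireToolChoice {
  if (choice.type === "tool") {
    return { type: "function", function: { name: choice.toolName } };
  }
  return choice.type;
}

interface RecoveredToolCall {
  toolName: string;
  input: string; // JSON-encoded arguments
}

// Reinterpret leaked text as a tool call only when a tool was forced and
// the leaked name matches a requested tool; otherwise leave it alone.
function recoverLeakedToolCall(
  content: string,
  forcedTool: string | undefined,
  requestedTools: string[],
): RecoveredToolCall | undefined {
  if (forcedTool === undefined) return undefined;

  let parsed: unknown;
  try {
    parsed = JSON.parse(content);
  } catch {
    return undefined; // ordinary prose, not a leaked call
  }
  if (typeof parsed !== "object" || parsed === null) return undefined;

  // Assumption: the leak carries `name` and `arguments` keys.
  const candidate = parsed as { name?: unknown; arguments?: unknown };
  if (typeof candidate.name !== "string") return undefined;
  if (!requestedTools.includes(candidate.name)) return undefined;

  return {
    toolName: candidate.name,
    input: JSON.stringify(candidate.arguments ?? {}),
  };
}

// Usage
const wireChoice = mapToolChoice({ type: "tool", toolName: "getWeather" });
const recovered = recoverLeakedToolCall(
  '{"name":"getWeather","arguments":{"city":"Paris"}}',
  "getWeather",
  ["getWeather"],
);
console.log(wireChoice, recovered);
```

Requiring both conditions (a forced tool and a matching requested name) is what keeps harmony channel names or hallucinated names from being turned into calls.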
- #570 104c4a7 Thanks @threepointone! - Refresh Workers AI model references from the deprecated @cf/moonshotai/kimi-k2.5 to the current @cf/moonshotai/kimi-k2.7-code in the README and inline source documentation.

- #576 a360e7a Thanks @threepointone! - Keep structured-output name/description instead of dropping them on native Workers AI models.

Output.object({ schema, name, description }) and generateObject({ schema, schemaName, schemaDescription }) pass a name/description alongside the JSON
schema. On the native @cf/... path the provider previously forwarded only the
bare schema as response_format.json_schema and silently discarded both.

Native Workers AI expects json_schema to be a bare JSON Schema, not
OpenAI's { name, schema, strict } envelope, so we can't just wrap it (that
would break native models). Instead the name is folded into the schema's
standard title keyword and the description into its description keyword;
the payload stays a valid bare schema while the guidance reaches the model.
Existing schema-level title/description are never overwritten and the input
schema is not mutated.

Note on issue #559: the reported failure was OpenAI partner models (e.g.
openai/gpt-5.4-mini) rejecting requests with Missing required parameter: 'response_format.json_schema.name'.
Partner-model slugs are no longer handled by this code path at all; they
route through the AI Gateway delegate and the real @ai-sdk/* providers, which
build the required json_schema.name envelope themselves (configure them via
createWorkersAI({ binding, providers: [openai] })). This change covers the
remaining native-model gap where that guidance was being dropped.

See #559.
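
The folding rule for native models can be illustrated with a minimal sketch. The function name foldNameAndDescription and the loose JsonSchema type are hypothetical. The behavior shown (use title and description, never overwrite existing values, never mutate the input) follows the description in this entry.

```typescript
// Loose schema type for illustration; real JSON Schema typing is richer.
type JsonSchema = { [key: string]: unknown };

function foldNameAndDescription(
  schema: JsonSchema,
  name?: string,
  description?: string,
): JsonSchema {
  // Shallow copy so the caller's schema object is left untouched.
  const result: JsonSchema = { ...schema };

  // Only fill keywords that the schema does not already define.
  if (name !== undefined && result["title"] === undefined) {
    result["title"] = name;
  }
  if (description !== undefined && result["description"] === undefined) {
    result["description"] = description;
  }
  return result;
}

// Usage: the existing title is kept, the missing description is filled in.
const inputSchema: JsonSchema = {
  type: "object",
  title: "ExistingTitle",
  properties: { name: { type: "string" } },
};
const folded = foldNameAndDescription(inputSchema, "Recipe", "A cooking recipe");
console.log(folded);
```

The result is still a bare JSON Schema, so it can be sent as response_format.json_schema without the OpenAI-style envelope.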