Version Packages #591

Merged
threepointone merged 1 commit into main from changeset-release/main
Jun 29, 2026
@github-actions github-actions Bot commented Jun 27, 2026

This PR was opened by the Changesets release GitHub action. When you're ready to do a release, you can merge this and the packages will be published to npm automatically. If you're not ready to do a release yet, that's fine; whenever you add more changesets to main, this PR will be updated.

Releases

ai-gateway-provider@3.2.0

Minor Changes

  • #590 fe0d182 Thanks @threepointone! - New gateway options. In addition, the provider routing table and
    cf-aig-* header building are now shared with the workers-ai-provider AI Gateway
    delegate (bundled inline, no new dependency), so the two stay in lockstep.

    • AiGatewayOptions gains two universal-endpoint controls: byokAlias
      (cf-aig-byok-alias, select a stored BYOK key by alias) and zdr
      (cf-aig-zdr, per-request Zero Data Retention override for Unified Billing).
    • Cache controls now emit the current cf-aig-cache-ttl / cf-aig-skip-cache
      header names instead of the upstream-deprecated cf-cache-ttl / cf-skip-cache.
    • New opt-in resumable streaming on the binding/run path (coming soon, not
      generally available yet while the AI Gateway resume backend rolls out;
      treat as experimental): pass resume
      ({ binding: env.AI, gateway, onResumeExpired?, maxReconnects? }) and a
      streaming run that surfaces a cf-aig-run-id will transparently reconnect on a
      transient mid-stream drop, reusing the same resumable-stream engine as the
      workers-ai-provider delegate. No-op on the REST/API-key path and on
      non-streaming calls.
    • The misspelled retries option type is renamed from AiGatewayReties to
      AiGatewayRetries; the old name stays exported as a deprecated alias, so this
      is non-breaking.

    Existing behavior is otherwise unchanged.
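
As a rough illustration of the option-to-header mapping described above, here is a minimal sketch. Only the option names byokAlias and zdr and the four cf-aig-* header names come from these release notes; the GatewayHeaderOptions type, the buildGatewayHeaders helper, the cacheTtl / skipCache field names, and the string encoding of the values are assumptions made for the example, not the package's actual internals.

```typescript
// Hypothetical option shape for illustration only. byokAlias and zdr are named
// in the release notes; cacheTtl and skipCache are assumed field names.
interface GatewayHeaderOptions {
  byokAlias?: string;
  zdr?: boolean;
  cacheTtl?: number;
  skipCache?: boolean;
}

// Hypothetical helper: maps each option that is set to its cf-aig-* header.
// Encoding values with String(...) is an assumption of this sketch.
function buildGatewayHeaders(options: GatewayHeaderOptions): Record<string, string> {
  const headers: Record<string, string> = {};
  if (options.byokAlias !== undefined) {
    headers["cf-aig-byok-alias"] = options.byokAlias;
  }
  if (options.zdr !== undefined) {
    headers["cf-aig-zdr"] = String(options.zdr);
  }
  if (options.cacheTtl !== undefined) {
    // Current header name; replaces the deprecated cf-cache-ttl.
    headers["cf-aig-cache-ttl"] = String(options.cacheTtl);
  }
  if (options.skipCache !== undefined) {
    // Current header name; replaces the deprecated cf-skip-cache.
    headers["cf-aig-skip-cache"] = String(options.skipCache);
  }
  return headers;
}
```

Options that are left undefined produce no header in this sketch, so existing callers that set none of them would send the same request as before.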

@cloudflare/tanstack-ai@0.2.0

Minor Changes

  • #590 fe0d182 Thanks @threepointone! - Add resumable streaming to the Workers AI adapter (coming soon, not
    generally available yet while the AI Gateway resume backend rolls out; treat as
    experimental): catalog models dispatch through the AI Gateway run path, so
    transient mid-stream drops reconnect transparently via cf-aig-run-id.
    Configure with resume / onResumeExpired (no-op + warning where no run id is
    available, e.g. REST).

    • Gain the gpt-oss forced tool-call salvage (#560) and non-SSE
      graceful-degradation, now shared with workers-ai-provider.
    • Bump @tanstack/ai and the @tanstack/ai-* adapter peers to current versions
      (adapts to the multimodal MediaPrompt API). @ai-sdk/* is intentionally not
      bumped.
  • #594 12fb307 Thanks @threepointone! - Retry transient Workers AI failures and normalize errors across every adapter.

    • Chat: the binding shim now surfaces binding failures as HTTP responses
      (e.g. "out of capacity" 3040 → 429, "no such model" 5007 → 400) so the
      OpenAI SDK's status-based retry engages and honors Retry-After. Aborts and
      unrecognized errors propagate untouched. Non-OK gateway run-path responses are
      returned verbatim instead of being swallowed into an empty completion.
    • Non-chat adapters (embedding, image, TTS, transcription, summarize) gain a
      bounded exponential-backoff retry (the OpenAI SDK isn't in play for these) and
      normalize binding / REST / gateway failures into a single WorkersAiRequestError
      carrying the HTTP status (and the raw Workers AI code when recognized). The
      retry loop honors a server Retry-After header. Non-OK gateway responses are no
      longer swallowed.
    • Add a maxRetries option to the adapter config: forwarded to the OpenAI SDK on
      the chat path, and used by the non-chat retry loop. Defaults to 2; set to 0
      to disable.
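
The non-chat retry behavior described above (bounded exponential backoff, a server Retry-After taking precedence, maxRetries defaulting to 2) could look roughly like the sketch below. This is not the adapter's code: SketchRequestError, shouldRetryStatus, nextDelayMs, withRetry, and the 250 ms base delay are all hypothetical, and the retryable status set is borrowed from the 408/409/429/5xx rule these notes give for the workers-ai-provider path.

```typescript
// Hypothetical error type for this sketch. The release notes name
// WorkersAiRequestError but do not show its shape.
class SketchRequestError extends Error {
  status: number;
  retryAfterSeconds: number | undefined;
  constructor(message: string, status: number, retryAfterSeconds?: number) {
    super(message);
    this.status = status;
    this.retryAfterSeconds = retryAfterSeconds;
  }
}

// Assumption: retry on 408, 409, 429, and any 5xx status.
function shouldRetryStatus(status: number): boolean {
  return status === 408 || status === 409 || status === 429 || status >= 500;
}

// Delay before retry number `attempt` (0-based). A server-provided Retry-After
// wins; otherwise the delay doubles each attempt from an assumed base.
function nextDelayMs(
  attempt: number,
  retryAfterSeconds: number | undefined,
  baseDelayMs = 250,
): number {
  if (retryAfterSeconds !== undefined) {
    return retryAfterSeconds * 1000;
  }
  return baseDelayMs * 2 ** attempt;
}

// Runs `operation`, retrying at most `maxRetries` times on retryable errors.
// maxRetries = 0 disables retries, matching the option described in the notes.
async function withRetry<T>(operation: () => Promise<T>, maxRetries = 2): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (
        !(error instanceof SketchRequestError) ||
        !shouldRetryStatus(error.status) ||
        attempt >= maxRetries
      ) {
        throw error; // unrecognized, non-retryable, or out of attempts
      }
      const delayMs = nextDelayMs(attempt, error.retryAfterSeconds);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

With the default of 2, an operation that keeps failing with a retryable status is attempted three times in total before the last error is rethrown.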

workers-ai-provider@3.3.0

Minor Changes

  • #590 fe0d182 Thanks @threepointone! - The AI Gateway delegate gains cross-vendor server-side fallback
    (fallback: { mode: "server" }) — multiple vendors in one gateway run, with the
    winner selected via cf-aig-step.

    The gateway delegate now reaches header parity with the run path: the gateway
    path forwards cacheKey, eventId, requestTimeoutMs, and retries from the
    gateway options as cf-aig-* headers, and DelegateCallOptions gains two new
    universal-endpoint controls — byokAlias (cf-aig-byok-alias, select a stored
    BYOK key by alias) and zdr (cf-aig-zdr, per-request Zero Data Retention
    override for Unified Billing, applied on both transports).

    Internally, the provider registry, cf-aig-* header building, resumable-stream
    engine, and Workers AI SSE helpers are now shared across the Cloudflare AI
    packages (bundled inline — no new dependency for you to install).

  • #593 1c6afd0 Thanks @threepointone! - Native Workers AI failures are now surfaced as AI SDK APICallErrors so the AI
    SDK's built-in retry (maxRetries) can engage on transient errors.

    Previously the binding path (env.AI.run) threw plain Errors and the REST
    path threw a generic Error, so the AI SDK never retried them — most notably
    the common "out of capacity" failure (internal code 3040, HTTP 429) and
    other 5xx blips just failed the call outright.

    • Binding path: errors thrown by env.AI.run are normalized into an
      APICallError across every Workers AI model — chat, embedding, image, speech,
      transcription, and reranking. The Workers AI internal error code is parsed from
      the message (or a numeric code property) and mapped to the documented HTTP
      status (e.g. 3040/3036 → 429, 3007/3008 → 408, 5007 → 400), and
      APICallError derives isRetryable from that status (retryable on
      408/409/429/5xx). Unrecognized errors get no status and stay non-retryable
      (prior behavior). AbortError/TimeoutError cancellations propagate
      unchanged.
    • REST path: non-OK responses now throw an APICallError carrying the real
      statusCode, response headers (so Retry-After is honored), and body, instead
      of a generic Error. The error message keeps the same
      Workers AI API error (<status> <statusText>): <body> shape.

    This means transient capacity/5xx errors are now automatically retried with
    exponential backoff by generateText/streamText (default 2 retries; tune via
    maxRetries). Set maxRetries: 0 to opt out.
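
To make the binding-path normalization concrete, the sketch below maps a thrown error to an HTTP status and a retryable flag using only the code pairs listed above. The helper names (extractWorkersAiCode, classifyBindingError), the four-digit regex used to find a code in the message, and the example message text are assumptions; the real provider wraps the result in an AI SDK APICallError, which this sketch does not construct.

```typescript
// Code-to-status pairs taken from the examples in these release notes.
const WORKERS_AI_CODE_TO_STATUS: Record<number, number> = {
  3040: 429,
  3036: 429,
  3007: 408,
  3008: 408,
  5007: 400,
};

// Hypothetical parser: looks for a standalone four-digit code in the message,
// then falls back to a numeric `code` property. The regex is an assumption.
function extractWorkersAiCode(error: unknown): number | undefined {
  const message = error instanceof Error ? error.message : String(error);
  const match = /\b(\d{4})\b/.exec(message);
  if (match) {
    return Number(match[1]);
  }
  if (typeof error === "object" && error !== null && "code" in error) {
    const code = (error as { code: unknown }).code;
    if (typeof code === "number") {
      return code;
    }
  }
  return undefined;
}

// Retryable on 408/409/429/5xx, as stated in the notes.
function isRetryableStatus(status: number): boolean {
  return status === 408 || status === 409 || status === 429 || status >= 500;
}

// Unrecognized errors get no status and stay non-retryable.
function classifyBindingError(error: unknown): { status: number | undefined; isRetryable: boolean } {
  const code = extractWorkersAiCode(error);
  const status = code === undefined ? undefined : WORKERS_AI_CODE_TO_STATUS[code];
  return { status, isRetryable: status !== undefined && isRetryableStatus(status) };
}
```

Under these assumptions, an error whose message contains 3040 classifies as status 429 and retryable, while one containing 5007 classifies as status 400 and is not retried.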

@github-actions github-actions Bot force-pushed the changeset-release/main branch 3 times, most recently from 0fe632b to 9162d08 Compare June 29, 2026 05:38
@github-actions github-actions Bot force-pushed the changeset-release/main branch from 9162d08 to ec1d54b Compare June 29, 2026 17:09
@threepointone threepointone enabled auto-merge June 29, 2026 17:23
@threepointone threepointone disabled auto-merge June 29, 2026 17:24
@threepointone threepointone merged commit 4911155 into main Jun 29, 2026
@threepointone threepointone deleted the changeset-release/main branch June 29, 2026 17:24