Comprehensive cleanup of the workers-ai-provider package. #393
Merged
**Bug fixes:**
- Fixed phantom dependency on `fetch-event-stream` that caused runtime crashes when installed outside the monorepo. Replaced with a built-in SSE parser.
- Fixed streaming buffering: responses now stream token-by-token instead of arriving all at once. The root cause was twofold — an eager `ReadableStream` `start()` pattern that buffered all chunks, and a heuristic that silently fell back to non-streaming `doGenerate` whenever tools were defined. Both are fixed. Streaming now uses a proper `TransformStream` pipeline with backpressure.
- Fixed `reasoning-delta` ID mismatch in simulated streaming — was using `generateId()` instead of the `reasoningId` from the preceding `reasoning-start` event, causing the AI SDK to drop reasoning content.
- Fixed REST API client (`createRun`) silently swallowing HTTP errors. Non-200 responses now throw with status code and response body.
- Fixed `response_format` being sent as `undefined` on every non-JSON request. Now only included when actually set.
- Fixed `json_schema` field evaluating to `false` (a boolean) instead of `undefined` when schema was missing.
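The last two fixes come down to omitting keys rather than sending `undefined` or `false`. A minimal sketch of that pattern follows; the names `buildRunInputs`, `RunInputs`, and `ResponseFormat` are hypothetical and chosen for illustration, not taken from the provider's source.

```typescript
// Hypothetical shapes for illustration; not the provider's real types.
interface ResponseFormat {
  type: "json_schema";
  json_schema?: Record<string, unknown>;
}

interface RunInputs {
  messages: unknown[];
  response_format?: ResponseFormat;
}

// Builds the request inputs, adding `response_format` only for JSON requests
// and `json_schema` only when a schema was actually supplied.
function buildRunInputs(
  messages: unknown[],
  wantsJson: boolean,
  schema?: Record<string, unknown>,
): RunInputs {
  const inputs: RunInputs = { messages };
  if (wantsJson) {
    const format: ResponseFormat = { type: "json_schema" };
    if (schema !== undefined) {
      format.json_schema = schema;
    }
    inputs.response_format = format;
  }
  return inputs;
}

// A plain text request carries no `response_format` key at all.
const textInputs = buildRunInputs([{ role: "user", content: "hi" }], false);
// A JSON request without a schema omits `json_schema` instead of sending `false`.
const jsonInputs = buildRunInputs([{ role: "user", content: "hi" }], true);
```

Building the object step by step avoids expressions of the form `condition && value`, which evaluate to `false` when the condition fails.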
**Workers AI quirk workarounds:**
- Added `sanitizeToolCallId()`, which strips non-alphanumeric characters and pads/truncates to 9 chars. This fixes tool call round-trips through the binding, which rejects its own generated IDs.
- Added `normalizeMessagesForBinding()` — converts `content: null` to `""` and sanitizes tool call IDs before every binding call. Only applied on the binding path (REST preserves original IDs).
- Added null-finalization chunk filtering for streaming tool calls.
- Added numeric value coercion in native-format streams (Workers AI sometimes returns numbers instead of strings for the `response` field).
- Improved image model to handle all output types from `binding.run()`: `ReadableStream`, `Uint8Array`, `ArrayBuffer`, `Response`, and `{ image: base64 }` objects.
- Graceful degradation: if `binding.run()` returns a non-streaming response despite `stream: true`, it wraps the complete response as a simulated stream instead of throwing.
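The two normalization helpers above can be sketched roughly as follows. This is not the package's implementation: the pad character (`"0"`), keeping the first 9 characters rather than the last, and the OpenAI-style message shape are all assumptions made for the example.

```typescript
// Sketch only. Assumptions: IDs are truncated from the front, padded with "0",
// and messages use an OpenAI-style shape. The real helpers may differ.
const TOOL_CALL_ID_LENGTH = 9;

function sanitizeToolCallId(id: string): string {
  // Keep only [a-zA-Z0-9], then cut to at most 9 characters.
  let clean = id.replace(/[^a-zA-Z0-9]/g, "").slice(0, TOOL_CALL_ID_LENGTH);
  // Pad short IDs so the result is always exactly 9 characters.
  while (clean.length < TOOL_CALL_ID_LENGTH) {
    clean += "0";
  }
  return clean;
}

interface ToolCall {
  id: string;
  function: { name: string; arguments: string };
}

interface ChatMessage {
  role: string;
  content: string | null;
  tool_call_id?: string;
  tool_calls?: ToolCall[];
}

// Returns new message objects with `content: null` replaced by "" and
// every tool call ID passed through `sanitizeToolCallId`.
function normalizeMessagesForBinding(messages: ChatMessage[]): ChatMessage[] {
  return messages.map((message) => {
    const normalized: ChatMessage = {
      ...message,
      content: message.content === null ? "" : message.content,
    };
    if (message.tool_call_id !== undefined) {
      normalized.tool_call_id = sanitizeToolCallId(message.tool_call_id);
    }
    if (message.tool_calls !== undefined) {
      normalized.tool_calls = message.tool_calls.map((call) => ({
        ...call,
        id: sanitizeToolCallId(call.id),
      }));
    }
    return normalized;
  });
}
```

Because the function is pure, the ID on an assistant tool call and the `tool_call_id` on the matching tool result map to the same value. Which end of the ID is kept is an assumption here, and it matters for IDs that share a long prefix.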
**Premature stream termination detection:**
- Streams that end without a `[DONE]` sentinel now report `finishReason: "error"` with `raw: "stream-truncated"` instead of silently reporting `"stop"`.
- Stream read errors are caught and emit `finishReason: "error"` with `raw: "stream-error"`.
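As a rough illustration of the two rules above, the decision can be written as a small pure function. The `FinishInfo` shape, the `resolveFinish` name, and the `raw: "stop"` value for a clean finish are assumptions; in the real provider the normal finish reason would presumably come from the model's response.

```typescript
// Hypothetical result shape; the field names mirror the wording above
// but are not taken from the provider's source.
interface FinishInfo {
  finishReason: "stop" | "error";
  raw: string;
}

// Decides how a stream ended from the SSE `data:` payloads that were read.
// `readFailed` stands in for an exception thrown while reading the stream.
function resolveFinish(payloads: string[], readFailed: boolean): FinishInfo {
  if (readFailed) {
    return { finishReason: "error", raw: "stream-error" };
  }
  if (payloads.indexOf("[DONE]") === -1) {
    // The stream closed without the sentinel, so the output may be incomplete.
    return { finishReason: "error", raw: "stream-truncated" };
  }
  // Assumed value for a clean finish.
  return { finishReason: "stop", raw: "stop" };
}
```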
**AI Search (formerly AutoRAG):**
- Added `createAISearch` and `AISearchChatLanguageModel` as the canonical exports, reflecting the rename from AutoRAG to AI Search.
- `createAutoRAG` still works but emits a one-time deprecation warning pointing to `createAISearch`.
- `createAutoRAG` preserves `"autorag.chat"` as the provider name for backward compatibility.
- AI Search now warns when tools or JSON response format are provided (unsupported by the `aiSearch` API).
- Simplified AI Search internals — removed dead tool/response-format processing code.
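The warn-once alias pattern described above might look like the following sketch. The `*Sketch` functions and the `SearchProvider` shape are hypothetical stand-ins, and `"aisearch.chat"` is an assumed provider name; only `"autorag.chat"` and the one-time warning come from the notes above.

```typescript
// Hypothetical stand-ins for the real factories; only the warn-once pattern
// and the "autorag.chat" provider name come from the notes above.
interface SearchProvider {
  provider: string;
  binding: string;
}

// "aisearch.chat" is an assumed default, used here for illustration.
function createAISearchSketch(binding: string, provider = "aisearch.chat"): SearchProvider {
  return { provider, binding };
}

// Module-level flag so the warning is emitted at most once per process.
let hasWarnedAutoRAG = false;

function createAutoRAGSketch(binding: string): SearchProvider {
  if (!hasWarnedAutoRAG) {
    hasWarnedAutoRAG = true;
    console.warn("createAutoRAG is deprecated; use createAISearch instead.");
  }
  // Keep the legacy provider name so existing checks on it keep working.
  return createAISearchSketch(binding, "autorag.chat");
}
```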
**Code quality:**
- Removed dead code: `workersai-error.ts` (never imported), `workersai-image-config.ts` (inlined).
- Consistent file naming: renamed `workers-ai-embedding-model.ts` to `workersai-embedding-model.ts`.
- Replaced `StringLike` catch-all index signatures with `[key: string]: unknown` on settings types.
- Replaced `any` types with proper interfaces (`FlatToolCall`, `OpenAIToolCall`, `PartialToolCall`).
- Tightened `processToolCall` format detection to check `function.name` instead of just the presence of a `function` property.
- Removed `@ai-sdk/provider-utils` and `zod` peer dependencies (no longer used in source).
- Added `imageModel` to the `WorkersAI` interface type for consistency.
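To illustrate the tightened format detection, here is a sketch of a type guard that checks `function.name` rather than only the presence of a `function` property. The field layouts of `FlatToolCall` and `OpenAIToolCall` below are guesses based on their names, and `isOpenAIToolCall` and `toolNameOf` are hypothetical helpers.

```typescript
// Hypothetical shapes named after the interfaces listed above; the real
// field layouts may differ.
interface FlatToolCall {
  name: string;
  arguments: unknown;
}

interface OpenAIToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

// Type guard: only treat a call as OpenAI-style when `function.name` is a string.
function isOpenAIToolCall(call: unknown): call is OpenAIToolCall {
  if (typeof call !== "object" || call === null) return false;
  const fn = (call as { function?: unknown }).function;
  if (typeof fn !== "object" || fn === null) return false;
  return typeof (fn as { name?: unknown }).name === "string";
}

// Reads the tool name from either format.
function toolNameOf(call: FlatToolCall | OpenAIToolCall): string {
  return isOpenAIToolCall(call) ? call.function.name : call.name;
}
```

With the looser check, a flat call that happens to carry an empty `function` object would be misread as OpenAI-style. The stricter guard rejects it.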
**Tests:**
- 149 unit tests across 10 test files (up from 82).
- New test coverage: `sanitizeToolCallId`, `normalizeMessagesForBinding`, `prepareToolsAndToolChoice`, `processText`, `mapWorkersAIUsage`, image model output types, streaming error scenarios (malformed SSE, premature termination, empty stream), backpressure verification, graceful degradation (non-streaming fallback with text/tools/reasoning), REST API error handling (401/404/500), AI Search warnings, embedding `TooManyEmbeddingValuesForCallError`, message conversion with images and reasoning.
- Integration tests for REST API and binding across 12 models and 7 categories (chat, streaming, multi-turn, tool calling, tool round-trip, structured output, image generation, embeddings).
- All tests use the AI SDK's public APIs (`generateText`, `streamText`, `generateImage`, `embedMany`) instead of internal `.doGenerate()`/`.doStream()` methods.
**README:**
- Rewritten from scratch with concise examples, model recommendations, configuration guide, and known limitations section.
- Updated to use current AI SDK v6 APIs (`generateText` + `Output.object` instead of deprecated `generateObject`, `generateImage` instead of `experimental_generateImage`, `stopWhen: stepCountIs(2)` instead of `maxSteps`).
- Added sections for tool calling, structured output, embeddings, image generation, and AI Search.
- Uses `wrangler.jsonc` format for configuration examples.
🦋 Changeset detected. Latest commit: b0e91e6. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Update demos/structured-output-node/biome.json to use Biome schema 2.3.13. Add explicit type="button" to tab buttons in examples/workers-ai/src/client/App.tsx to avoid accidental form submission. Remove the unused HttpResponse import from packages/workers-ai-provider/test/stream-text.test.ts to address linter/test warnings.
This PR is a ground-up cleanup of the `workers-ai-provider` package. It fixes real bugs (streaming was broken, tool round-trips through the binding failed), adds workarounds for documented Workers AI quirks, renames AutoRAG to AI Search, rewrites the README, and brings test coverage from 82 to 149 unit tests + full integration tests across 12 models.

Also migrates all 13 demos off deprecated AI SDK APIs (`generateObject` -> `generateText` + `Output.object`, `experimental_generateImage` -> `generateImage`), and adds a new `examples/workers-ai` demo app.

**What was broken:**
- **Streaming didn't actually stream.** Two bugs: (a) the `ReadableStream` used an eager `async start()` that buffered everything before the consumer pulled, and (b) a heuristic silently fell back to non-streaming `doGenerate` whenever tools were defined. Every `streamText` call with tools was secretly a `generateText` wrapped in a fake stream.
- **`fetch-event-stream` was a phantom dependency.** It was listed in the root `package.json` but not in the package's own, so anyone installing from npm outside the monorepo got a runtime crash on any streaming call.
- **Tool round-trips through the binding were broken.** The binding generates IDs like `chatcmpl-tool-875d3ec6179676ae` but validates them against `[a-zA-Z0-9]{9}`. It also rejects `content: null` on tool-calling messages. Both are now normalized before every `binding.run()` call.
- **REST API errors were silently swallowed.** A 401 or 404 would flow through to `response.json()` and throw an opaque parse error.
- **`reasoning-delta` events had wrong IDs** in the simulated streaming path, causing reasoning content to be dropped by the AI SDK.

**What's new:**
- Streaming now uses a `TransformStream` pipeline (`SSEDecoder` -> event mapper) with backpressure. Verified with a timing-based test.
- Workers AI quirk workarounds: `sanitizeToolCallId`, `normalizeMessagesForBinding`, null-finalization chunk filtering, numeric value coercion, dual stream format detection (native vs OpenAI-compatible).
- `createAISearch` replaces `createAutoRAG` (which still works with a deprecation warning). Reflects the AutoRAG -> AI Search rename.
- Graceful degradation: if `binding.run()` returns a non-streaming response despite `stream: true`, the provider wraps the response as a simulated stream instead of throwing.
- Premature termination detection: streams that end without a `[DONE]` sentinel report `finishReason: "error"` instead of silently reporting `"stop"`.
- `examples/workers-ai`: Vite + React + Cloudflare Workers with chat (streaming + tool calling + reasoning), image generation, and embeddings. Model dropdowns for each.

**Breaking changes:**
None. `createAutoRAG` preserves `"autorag.chat"` as the provider name. Models that don't support streaming with tools get graceful degradation (same output, just not token-by-token). Removed peer deps (`zod`, `@ai-sdk/provider-utils`) were unused in source.

**Demo migrations:**
All 13 demos updated from deprecated AI SDK APIs:
- `generateObject` -> `generateText` + `Output.object` (12 demos, ~25 call sites)
- `experimental_generateImage` -> `generateImage` (1 demo)

**Notes for reviewers:**
- The streaming fix is the most important change. Before: everything was buffered. After: proper `TransformStream` pipeline. The backpressure test in `stream-text.test.ts` ("should deliver chunks incrementally") verifies this with timing assertions.
- The `simulateStreamFromGenerate` fallback is gone but replaced by graceful degradation at the response level. If `binding.run()` returns an object instead of a stream, `doStream` wraps it. Three dedicated tests cover this (text, tool calls, reasoning). This is better than the old approach because it tries streaming first, only falling back when the binding refuses.
- `sanitizeToolCallId` is deterministic: the same input always produces the same output. This matters because the AI SDK stores the tool call ID from step 1 and passes it back in step 2. Both sides go through `sanitizeToolCallId`, so they match. The normalization is only applied on the binding path (`isBinding` flag), not REST.
- The `SSEDecoder` class in `streaming.ts` replaces both the `fetch-event-stream` dependency and the old `parseSSEStream` async generator. It's a `TransformStream` subclass that handles line buffering, partial chunks, and both `data: ` and `data:` (no space) formats.
- Integration tests require credentials. REST tests need `CLOUDFLARE_ACCOUNT_ID` + `CLOUDFLARE_API_TOKEN` in `.env`. Binding tests start a `wrangler dev` server. Both skip gracefully without credentials. Run with `pnpm test:e2e:rest` / `pnpm test:e2e:binding`.
- The example app at `examples/workers-ai` is a standalone Vite + React app, not a demo. It uses the Cloudflare Vite plugin in SPA mode with a plain Workers `fetch` handler (no Hono). Good for testing the provider end-to-end with a real UI.
- The `workersai-models.ts` TODO (`// This needs to be fixed to allow more models`) is still there. The `Exclude<TextGen, TextToImage>` type could incorrectly exclude models that satisfy both interfaces. Worth investigating separately.
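For reviewers who want a feel for the line handling attributed to `SSEDecoder` in the notes above, here is a simplified, synchronous sketch. It is not the class from `streaming.ts`: `SseLineBuffer` is a hypothetical name, and the real decoder is described as a `TransformStream` subclass, whereas this version only shows the buffering and prefix logic.

```typescript
// Simplified, synchronous sketch of SSE line buffering. The class name and
// API are hypothetical; the real decoder is a TransformStream subclass.
class SseLineBuffer {
  // Holds the trailing partial line between chunks.
  private buffer = "";

  // Accepts a decoded text chunk and returns the `data` payloads of every
  // complete line seen so far. Incomplete lines stay buffered.
  push(chunk: string): string[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is either "" or an unfinished line; keep it for later.
    this.buffer = lines.pop() ?? "";

    const payloads: string[] = [];
    for (const rawLine of lines) {
      const line = rawLine.replace(/\r$/, ""); // tolerate CRLF line endings
      if (line.slice(0, 5) !== "data:") continue; // skip blanks, comments, other fields
      const value = line.slice(5);
      // Accept both "data: value" and "data:value".
      payloads.push(value.charAt(0) === " " ? value.slice(1) : value);
    }
    return payloads;
  }
}
```

A payload split across two network chunks is only emitted once its terminating newline arrives, which is the behaviour the "partial chunks" wording refers to.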