fix: three tool calling bugs causing multi-turn agentic loops to fail by mchenco · Pull Request #454 · cloudflare/ai

mchenco · 2026-03-22T06:19:50Z

Summary

Fixes three bugs in workers-ai-provider that collectively caused multi-turn tool calling to fail for new models (@cf/moonshotai/kimi-k2.5, @cf/meta/llama-4-scout-17b-16e-instruct, @cf/zai-org/glm-4.7-flash).

Fixes

1. Tool result output not unwrapped (primary cause of broken agentic loops)

convert-to-workersai-chat-messages.ts was calling JSON.stringify(toolResponse.output) on the full LanguageModelV3ToolResultOutput wrapper ({ type: 'text', value: '...' }), sending the wrapper object as the tool message content instead of just the value. Models received garbled tool results and stopped after the first tool call instead of continuing the loop.

Fix: a switch over the six known LanguageModelV3ToolResultOutput variants, with a default branch for unknown types (seven cases in total):

switch (output.type) {
  case "text":
  case "error-text":
    content = output.value;
    break;
  case "json":
  case "error-json":
    content = JSON.stringify(output.value);
    break;
  case "execution-denied":
    content = output.reason
      ? `Tool execution denied: ${output.reason}`
      : "Tool execution was denied.";
    break;
  case "content":
    content = output.value
      .filter((p) => p.type === "text")
      .map((p) => p.text)
      .join("\n");
    break;
  default:
    content = "";
    break;
}
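To make the difference concrete, here is a minimal self-contained sketch of the unwrapping, compared against the old JSON.stringify behavior. The ToolResultOutput type below is a simplified stand-in for LanguageModelV3ToolResultOutput, and the function name unwrapToolResultOutput and the "media" part type are hypothetical; the real types and the actual code in convert-to-workersai-chat-messages.ts may differ.

```typescript
// Simplified stand-in for LanguageModelV3ToolResultOutput (assumption:
// the real AI SDK type has more fields and part kinds than shown here).
type TextPart = { type: "text"; text: string };
type OtherPart = { type: "media"; data: string; mediaType: string };

type ToolResultOutput =
  | { type: "text"; value: string }
  | { type: "error-text"; value: string }
  | { type: "json"; value: unknown }
  | { type: "error-json"; value: unknown }
  | { type: "execution-denied"; reason?: string }
  | { type: "content"; value: Array<TextPart | OtherPart> };

// Hypothetical helper: turns the wrapper into the plain string that is
// sent as the tool message content.
function unwrapToolResultOutput(output: ToolResultOutput): string {
  switch (output.type) {
    case "text":
    case "error-text":
      return output.value;
    case "json":
    case "error-json":
      return JSON.stringify(output.value);
    case "execution-denied":
      return output.reason
        ? `Tool execution denied: ${output.reason}`
        : "Tool execution was denied.";
    case "content":
      // Keep only the text parts and join them with newlines.
      return output.value
        .filter((p): p is TextPart => p.type === "text")
        .map((p) => p.text)
        .join("\n");
    default:
      // Unknown variant: fall back to an empty string.
      return "";
  }
}

const output: ToolResultOutput = { type: "text", value: "72F and sunny" };

// Old behavior: the whole wrapper object is serialized.
const before = JSON.stringify(output);
// New behavior: only the value reaches the model.
const after = unwrapToolResultOutput(output);

console.log(before); // {"type":"text","value":"72F and sunny"}
console.log(after); // 72F and sunny
```

The "before" string is what models previously received as the tool result, which is consistent with the loop stopping after the first tool call.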

2. toolChoice: "required" mapped to "any" instead of "required"

All models returned 8001: Invalid input for tool_choice: "any". Anyone using toolChoice: "required" or toolChoice: { type: "tool", toolName: "..." } got a hard error.

// Before
case "required": return { tool_choice: "any", tools: mappedTools };
case "tool":     return { tool_choice: "any", tools: filteredTools };

// After
case "required": return { tool_choice: "required", tools: mappedTools };
case "tool":     return { tool_choice: "required", tools: filteredTools };
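A fuller sketch of the mapping may help show where the two cases sit. Only the "required" and "tool" branches come from this PR; the "auto" and "none" branches, the MappedTool shape, the function name mapToolChoice, and the way filteredTools is computed (filtering by tool name) are assumptions made for illustration.

```typescript
// Hypothetical shape of a tool after it has been mapped for Workers AI.
type MappedTool = {
  type: "function";
  function: { name: string; description?: string; parameters?: unknown };
};

// Tool choice variants as used by the AI SDK.
type ToolChoice =
  | { type: "auto" }
  | { type: "none" }
  | { type: "required" }
  | { type: "tool"; toolName: string };

type PreparedTools = { tool_choice: string | undefined; tools: MappedTool[] };

function mapToolChoice(
  toolChoice: ToolChoice | undefined,
  mappedTools: MappedTool[],
): PreparedTools {
  if (toolChoice === undefined) {
    return { tool_choice: undefined, tools: mappedTools };
  }
  switch (toolChoice.type) {
    case "auto": // assumption: passed through unchanged
      return { tool_choice: "auto", tools: mappedTools };
    case "none": // assumption: passed through unchanged
      return { tool_choice: "none", tools: mappedTools };
    case "required":
      // Previously "any", which the API rejected with error 8001.
      return { tool_choice: "required", tools: mappedTools };
    case "tool": {
      // Assumption: a specific tool is forced by narrowing the tool list
      // to that one tool and requiring a tool call.
      const filteredTools = mappedTools.filter(
        (t) => t.function.name === toolChoice.toolName,
      );
      return { tool_choice: "required", tools: filteredTools };
    }
  }
}

const exampleTools: MappedTool[] = [
  { type: "function", function: { name: "getWeather" } },
  { type: "function", function: { name: "getTime" } },
];

const requiredChoice = mapToolChoice({ type: "required" }, exampleTools);
const specificChoice = mapToolChoice(
  { type: "tool", toolName: "getTime" },
  exampleTools,
);

console.log(requiredChoice.tool_choice); // required
console.log(specificChoice.tools.map((t) => t.function.name)); // [ 'getTime' ]
```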

3. description: false and parameters: false in tool definitions

For non-function tool types, the && short-circuit evaluated to false rather than undefined, which would cause 8001: Invalid input if the ai-sdk ever sent a non-function tool type.

// Before
description: tool.type === "function" && tool.description,

// After
description: tool.type === "function" ? tool.description : undefined,
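The practical difference shows up once the tool definition is serialized: false is kept in the JSON payload, while undefined properties are omitted. The sketch below illustrates this with a hypothetical SdkTool union and hypothetical helper names; the "provider" variant is only a stand-in for any non-function tool type.

```typescript
// Hypothetical tool union: one function tool and one non-function tool.
type SdkTool =
  | { type: "function"; name: string; description?: string }
  | { type: "provider"; id: string; name: string };

// Before: && yields `false` when the left-hand side is false.
function describeBefore(tool: SdkTool) {
  return {
    name: tool.name,
    description: tool.type === "function" && tool.description,
  };
}

// After: the ternary yields `undefined`, which JSON.stringify drops.
function describeAfter(tool: SdkTool) {
  return {
    name: tool.name,
    description: tool.type === "function" ? tool.description : undefined,
  };
}

const providerTool: SdkTool = {
  type: "provider",
  id: "example.search",
  name: "search",
};

const serializedBefore = JSON.stringify(describeBefore(providerTool));
const serializedAfter = JSON.stringify(describeAfter(providerTool));

console.log(serializedBefore); // {"name":"search","description":false}
console.log(serializedAfter); // {"name":"search"}
```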

Testing

Unit tests

222 unit tests passing, including new tests covering:

All 7 LanguageModelV3ToolResultOutput variants (text, json, error-text, error-json, execution-denied, content, unknown)
toolChoice: "required" mapping
Tool definition field correctness

E2E tests

E2E tests against real models via both the Workers AI binding and REST API, including two new test scenarios:

Multi-step agentic loop (T-MS): Model must make 2+ sequential tool calls, correctly read each result, and produce a final text answer. Directly exercises the tool result unwrapping fix.
toolChoice: "required" (T-Rq): Validates the tool_choice mapping fix — verifies no API error and at least one tool call is made.
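As a rough illustration of what the multi-step scenario checks, the sketch below runs a tiny agentic loop against a scripted stand-in for a model. Everything here is hypothetical (fakeModel, the getCity and getWeather tools, the message shapes); the real E2E tests call actual Workers AI models and are not reproduced here. The point is only the shape of the check: two or more sequential tool calls, each result passed back as a plain string, then a final text answer.

```typescript
type ToolCall = { name: string; args: Record<string, unknown> };

type Message =
  | { role: "user"; content: string }
  | { role: "assistant"; content: string; toolCall?: ToolCall }
  | { role: "tool"; name: string; content: string };

type ModelStep = { text: string; toolCall?: ToolCall };

// Scripted stand-in for a model: decides its next step from how many
// tool results it has seen so far.
function fakeModel(messages: Message[]): ModelStep {
  const [city = "", weather = ""] = messages
    .filter((m) => m.role === "tool")
    .map((m) => m.content);
  if (city === "") return { text: "", toolCall: { name: "getCity", args: {} } };
  if (weather === "") {
    return { text: "", toolCall: { name: "getWeather", args: { city } } };
  }
  return { text: `Weather in ${city}: ${weather}` };
}

// Hypothetical tools that return plain strings.
const tools: Record<string, (args: Record<string, unknown>) => string> = {
  getCity: () => "Lisbon",
  getWeather: (args) => `sunny in ${String(args["city"])}`,
};

function runLoop(prompt: string, maxSteps = 5) {
  const messages: Message[] = [{ role: "user", content: prompt }];
  let toolCalls = 0;
  for (let step = 0; step < maxSteps; step++) {
    const result = fakeModel(messages);
    // No tool call means the model produced its final answer.
    if (!result.toolCall) return { text: result.text, toolCalls };
    const { name, args } = result.toolCall;
    const tool = tools[name];
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    messages.push({ role: "assistant", content: result.text, toolCall: result.toolCall });
    // The tool result goes back as a plain string, not a wrapper object.
    messages.push({ role: "tool", name, content: tool(args) });
    toolCalls++;
  }
  throw new Error("Loop did not finish within maxSteps");
}

const loopResult = runLoop("What is the weather where I am?");
console.log(loopResult.toolCalls); // 2
console.log(loopResult.text); // Weather in Lisbon: sunny in Lisbon
```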

Binding E2E results:

Model	Chat	Strm	Turn	Tool	T-RT	T-MS	T-Rq	JSON
Llama 4 Scout 17B	✅	✅	✅	✅	✅	✅	✅	✅
Llama 3.3 70B	✅	✅	✅	✅	~	✅	✅	✅
GPT-OSS 120B	✅	✅	✅	✅	✅	✅	✅	✅
QwQ 32B (reasoning)	✅	✅	✅	✅	✅	✅	❌	✅
Llama 3.1 8B Fast	✅	✅	✅	✅	~	~	✅	✅
GPT-OSS 20B	✅	✅	✅	✅	✅	✅	✅	✅
Qwen3 30B	✅	✅	✅	✅	✅	✅	✅	✅
Kimi K2.5	✅	✅	✅	✅	✅	✅	✅	✅

T-MS = multi-step agentic loop, T-Rq = toolChoice required, ~ = partial (model behavior)

QwQ 32B's T-Rq failure is a server-side model limitation, not a provider bug.

changeset-bot · 2026-03-22T06:19:54Z

🦋 Changeset detected

Latest commit: cff5c89

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package

Name	Type
workers-ai-provider	Patch

pkg-pr-new · 2026-03-22T06:20:45Z

npx https://pkg.pr.new/cloudflare/ai/ai-gateway-provider@454

npx https://pkg.pr.new/cloudflare/ai/@cloudflare/tanstack-ai@454

npx https://pkg.pr.new/cloudflare/ai/workers-ai-provider@454

commit: cff5c89

Improve handling of tool-result outputs by unwrapping different output types (text, error-text, json, error-json, execution-denied, and content parts) and returning appropriate plain strings instead of serializing wrapper objects. Add unit tests covering error-text/error-json/execution-denied/content cases and adjust existing expectations. Extend E2E fixtures and tests to exercise multi-step agentic tool loops and toolChoice="required" behavior (including new routes in the binding worker, summary table updates, and per-model checks). Also update a test description in text-generation.test and remove a couple of models from E2E model lists.

threepointone

added some more checks and updated e2e tests, looking good to me now

fix: tool result unwrapping, tool_choice required mapping, and description false in tool prep

29087ad

threepointone approved these changes Mar 22, 2026

threepointone merged commit 2e7a282 into cloudflare:main Mar 22, 2026
3 checks passed

github-actions bot mentioned this pull request Mar 22, 2026

Version Packages #456

Merged

threepointone merged 2 commits into cloudflare:main from mchenco:mchen/fix-tool-calling
