
fix: three tool calling bugs causing multi-turn agentic loops to fail #454

Merged
threepointone merged 2 commits into cloudflare:main from mchenco:mchen/fix-tool-calling
Mar 22, 2026


@mchenco (Contributor) commented Mar 22, 2026

Summary

Fixes three bugs in workers-ai-provider that collectively caused multi-turn tool calling to fail for new models (@cf/moonshotai/kimi-k2.5, @cf/meta/llama-4-scout-17b-16e-instruct, @cf/zai-org/glm-4.7-flash).

Fixes

1. Tool result output not unwrapped (primary cause of broken agentic loops)

convert-to-workersai-chat-messages.ts was calling JSON.stringify(toolResponse.output) on the full LanguageModelV3ToolResultOutput wrapper ({ type: 'text', value: '...' }), sending the wrapper object as the tool message content instead of just the value. Models received garbled tool results and stopped after the first tool call instead of continuing the loop.

Fix: exhaustive switch over all 7 LanguageModelV3ToolResultOutput variants (six known types plus a default fallback for unknown ones):

switch (output.type) {
  case "text":
  case "error-text":
    content = output.value;
    break;
  case "json":
  case "error-json":
    content = JSON.stringify(output.value);
    break;
  case "execution-denied":
    content = output.reason
      ? `Tool execution denied: ${output.reason}`
      : "Tool execution was denied.";
    break;
  case "content":
    content = output.value
      .filter((p) => p.type === "text")
      .map((p) => p.text)
      .join("\n");
    break;
  default:
    content = "";
    break;
}
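As a minimal, self-contained sketch of the unwrapping above: the `ToolResultOutput` and `ContentPart` types below are local stand-ins for the AI SDK types (their exact shape is an assumption based on the switch), and `unwrapToolResultOutput` is a hypothetical helper name, not necessarily what the provider uses.

```typescript
// Local stand-ins for the AI SDK types. Assumed shapes, inferred from the switch above.
type TextPart = { type: "text"; text: string };
type MediaPart = { type: "media"; data: string; mediaType: string };
type ContentPart = TextPart | MediaPart;

type ToolResultOutput =
  | { type: "text"; value: string }
  | { type: "error-text"; value: string }
  | { type: "json"; value: unknown }
  | { type: "error-json"; value: unknown }
  | { type: "execution-denied"; reason?: string }
  | { type: "content"; value: ContentPart[] };

// Hypothetical helper: turn the wrapper into the plain string sent as tool message content.
function unwrapToolResultOutput(output: ToolResultOutput): string {
  switch (output.type) {
    case "text":
    case "error-text":
      return output.value;
    case "json":
    case "error-json":
      return JSON.stringify(output.value);
    case "execution-denied":
      return output.reason
        ? `Tool execution denied: ${output.reason}`
        : "Tool execution was denied.";
    case "content":
      // Keep only text parts; non-text parts are dropped.
      return output.value
        .filter((p): p is TextPart => p.type === "text")
        .map((p) => p.text)
        .join("\n");
    default:
      // Unknown variants fall back to an empty string.
      return "";
  }
}

const wrapped: ToolResultOutput = { type: "text", value: "72F and sunny" };
console.log(JSON.stringify(wrapped));         // old behavior: {"type":"text","value":"72F and sunny"}
console.log(unwrapToolResultOutput(wrapped)); // new behavior: 72F and sunny
```

The two log lines show the difference the model sees: the serialized wrapper before the fix, the bare value after it.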

2. toolChoice: "required" mapped to "any" instead of "required"

The Workers AI API does not accept "any" as a tool_choice value: all models returned 8001: Invalid input for tool_choice: "any". Anyone using toolChoice: "required" or toolChoice: { type: "tool", toolName: "..." } got a hard error.

// Before
case "required": return { tool_choice: "any", tools: mappedTools };
case "tool":     return { tool_choice: "any", tools: filteredTools };

// After
case "required": return { tool_choice: "required", tools: mappedTools };
case "tool":     return { tool_choice: "required", tools: filteredTools };
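A rough sketch of the corrected mapping as a standalone function, covering only the two cases shown above. The `ToolChoice`, `MappedTool`, and `ToolConfig` types and the `mapToolChoice` name are illustrative assumptions, and filtering by tool name is a guess at how `filteredTools` is derived.

```typescript
// Reduced, assumed shapes; the real SDK and provider types have more variants and fields.
type ToolChoice = { type: "required" } | { type: "tool"; toolName: string };
type MappedTool = { type: "function"; function: { name: string; description?: string } };
type ToolConfig = { tool_choice: "required"; tools: MappedTool[] };

// Hypothetical helper mirroring the "After" snippet above.
function mapToolChoice(choice: ToolChoice, mappedTools: MappedTool[]): ToolConfig {
  switch (choice.type) {
    case "required":
      return { tool_choice: "required", tools: mappedTools };
    case "tool": {
      // Assumption: a specific tool is forced by sending only that tool
      // together with tool_choice "required".
      const name = choice.toolName;
      return {
        tool_choice: "required",
        tools: mappedTools.filter((t) => t.function.name === name),
      };
    }
  }
}

const tools: MappedTool[] = [
  { type: "function", function: { name: "getWeather" } },
  { type: "function", function: { name: "getTime" } },
];

console.log(mapToolChoice({ type: "tool", toolName: "getTime" }, tools).tools.length); // 1
```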

3. description: false and parameters: false in tool definitions

The && short-circuit evaluated to false (not undefined) for non-function tool types, so the serialized tool definition would carry description: false and parameters: false. That would cause 8001: Invalid input if the AI SDK ever sends a non-function tool type.

// Before
description: tool.type === "function" && tool.description,

// After
description: tool.type === "function" ? tool.description : undefined,
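The difference matters because of how JSON serialization treats the two values: `JSON.stringify` keeps a `false` property but omits an `undefined` one. A small illustration, where the `SdkTool` type and its "provider" variant are hypothetical stand-ins for whatever non-function tool type the SDK might send:

```typescript
// Hypothetical tool shapes, for illustration only.
type SdkTool =
  | { type: "function"; name: string; description?: string }
  | { type: "provider"; name: string };

// Before: && yields the boolean false when the left side is false.
function descriptionWithAnd(tool: SdkTool): string | false | undefined {
  return tool.type === "function" && tool.description;
}

// After: the ternary yields undefined for non-function tools.
function descriptionWithTernary(tool: SdkTool): string | undefined {
  return tool.type === "function" ? tool.description : undefined;
}

const providerTool: SdkTool = { type: "provider", name: "example" };

console.log(JSON.stringify({ description: descriptionWithAnd(providerTool) }));
// {"description":false}
console.log(JSON.stringify({ description: descriptionWithTernary(providerTool) }));
// {}
```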

Testing

Unit tests

222 unit tests passing, including new tests covering:

  • All 7 LanguageModelV3ToolResultOutput variants (text, json, error-text, error-json, execution-denied, content, unknown)
  • toolChoice: "required" mapping
  • Tool definition field correctness

E2E tests

E2E tests against real models via both the Workers AI binding and REST API, including two new test scenarios:

  • Multi-step agentic loop (T-MS): Model must make 2+ sequential tool calls, correctly read each result, and produce a final text answer. Directly exercises the tool result unwrapping fix.
  • toolChoice: "required" (T-Rq): Validates the tool_choice mapping fix — verifies no API error and at least one tool call is made.
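The T-MS pass condition can be sketched as a pure check over the steps of a run. The `Step` shape and the `passesMultiStepCheck` name are assumptions made for this sketch; the actual E2E tests may record and assert results differently.

```typescript
// Assumed per-step record: which tools were called and any text produced.
type Step = { toolCalls: { toolName: string }[]; text: string };

// Hypothetical check: at least two steps made tool calls, and the final
// step is a plain text answer with no further tool calls.
function passesMultiStepCheck(steps: Step[]): boolean {
  const toolSteps = steps.filter((s) => s.toolCalls.length > 0).length;
  const last = steps[steps.length - 1];
  return (
    toolSteps >= 2 &&
    last !== undefined &&
    last.toolCalls.length === 0 &&
    last.text.trim().length > 0
  );
}

// A run that stops after the first tool call (the pre-fix symptom) fails the check.
const stalled: Step[] = [{ toolCalls: [{ toolName: "lookup" }], text: "" }];
console.log(passesMultiStepCheck(stalled)); // false
```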

Binding E2E results:

Model Chat Strm Turn Tool T-RT T-MS T-Rq JSON
Llama 4 Scout 17B
Llama 3.3 70B ~
GPT-OSS 120B
QwQ 32B (reasoning)
Llama 3.1 8B Fast ~ ~
GPT-OSS 20B
Qwen3 30B
Kimi K2.5

T-MS = multi-step agentic loop, T-Rq = toolChoice required, ~ = partial (model behavior)

QwQ 32B's T-Rq failure is a server-side model limitation, not a provider bug.


changeset-bot bot commented Mar 22, 2026

🦋 Changeset detected

Latest commit: cff5c89

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
workers-ai-provider Patch



pkg-pr-new bot commented Mar 22, 2026


npx https://pkg.pr.new/cloudflare/ai/ai-gateway-provider@454
npx https://pkg.pr.new/cloudflare/ai/@cloudflare/tanstack-ai@454
npx https://pkg.pr.new/cloudflare/ai/workers-ai-provider@454

commit: cff5c89

Improve handling of tool-result outputs by unwrapping different output types (text, error-text, json, error-json, execution-denied, and content parts) and returning appropriate plain strings instead of serializing wrapper objects. Add unit tests covering error-text/error-json/execution-denied/content cases and adjust existing expectations. Extend E2E fixtures and tests to exercise multi-step agentic tool loops and toolChoice="required" behavior (including new routes in the binding worker, summary table updates, and per-model checks). Also update a test description in text-generation.test and remove a couple of models from E2E model lists.
@threepointone (Collaborator) left a comment


Added some more checks and updated the E2E tests; looking good to me now.

@threepointone threepointone merged commit 2e7a282 into cloudflare:main Mar 22, 2026
3 checks passed
@github-actions github-actions bot mentioned this pull request Mar 22, 2026