fix(ai): stop emitting duplicate TOOL_CALL_END for server-executed tools by valeriudev · Pull Request #696 · TanStack/ai

valeriudev · 2026-06-03T15:53:48Z

Closes #519.

What

chat() emitted a duplicate TOOL_CALL_END for every server-executed tool. Adapters already stream TOOL_CALL_START / TOOL_CALL_ARGS / TOOL_CALL_END for each call, but the post-execution phase pushed a second END with no matching START. AG-UI-strict consumers (e.g. @ag-ui/client's verifyEvents, which CopilotKit's runtime pipes through) reject an orphan END and abort the stream.

Fix

buildToolResultChunks now emits END only together with the START/ARGS it already gates on argsMap. That gate is open only on the continuation re-execution path, which reconstructs a tool call the adapter never streamed. The normal post-execution path contributes only TOOL_CALL_RESULT. This also makes the triple atomic: START and ARGS were already conditional on argsMap; only END was leaking out unconditionally.

Result and error state ride on the spec-compliant TOOL_CALL_RESULT, so nothing is lost. I verified that the client StreamProcessor delivers both via handleToolCallResultEvent.
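The gating described above can be sketched roughly as follows. This is a simplified illustration, not the code in `packages/ai/src/activities/chat/index.ts`: the chunk shapes, the `ExecutedTool` type, the function signature, the field names (`delta`, `content`), and the chunk order are all assumptions made for the example.

```typescript
// Hypothetical, simplified chunk shapes; the real @tanstack/ai types carry more fields.
type Chunk =
  | { type: 'TOOL_CALL_START'; toolCallId: string; toolName: string }
  | { type: 'TOOL_CALL_ARGS'; toolCallId: string; delta: string }
  | { type: 'TOOL_CALL_END'; toolCallId: string }
  | { type: 'TOOL_CALL_RESULT'; toolCallId: string; content: string }

// Hypothetical record of a tool the server has already executed.
interface ExecutedTool {
  toolCallId: string
  toolName: string
  result: unknown
}

// `argsMap` is assumed to be passed only on the continuation re-execution
// path, where the adapter never streamed the tool call itself.
function buildToolResultChunksSketch(
  executed: ExecutedTool[],
  argsMap?: Map<string, string>,
): Chunk[] {
  const chunks: Chunk[] = []
  for (const tool of executed) {
    if (argsMap) {
      // Reconstruct the whole START/ARGS/END triple, or none of it.
      const args = argsMap.get(tool.toolCallId) ?? '{}'
      chunks.push({
        type: 'TOOL_CALL_START',
        toolCallId: tool.toolCallId,
        toolName: tool.toolName,
      })
      chunks.push({ type: 'TOOL_CALL_ARGS', toolCallId: tool.toolCallId, delta: args })
      chunks.push({ type: 'TOOL_CALL_END', toolCallId: tool.toolCallId })
    }
    // The result always travels on TOOL_CALL_RESULT.
    chunks.push({
      type: 'TOOL_CALL_RESULT',
      toolCallId: tool.toolCallId,
      content: JSON.stringify(tool.result),
    })
  }
  return chunks
}

const executed: ExecutedTool[] = [
  { toolCallId: 'call_1', toolName: 'getWeather', result: { tempC: 21 } },
]

// Normal post-execution path: only the RESULT chunk is contributed.
console.log(buildToolResultChunksSketch(executed).map((c) => c.type))

// Continuation path: the full reconstructed lifecycle plus the RESULT.
const continuationArgs = new Map([['call_1', '{"city":"Oslo"}']])
console.log(buildToolResultChunksSketch(executed, continuationArgs).map((c) => c.type))
```

Keeping END inside the same branch as START and ARGS is what makes the triple atomic in this sketch: there is no code path that can emit one of the three without the other two.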

Tests

- New regression test asserting exactly one START/END/RESULT per server-executed tool, plus the "every END has a matching START" invariant verifyEvents enforces.
- Updated the tool-loop mocks to stream their own TOOL_CALL_END (matching real adapters) and moved the error-state assertion onto TOOL_CALL_RESULT.
- Full @tanstack/ai unit suite + affected E2E specs green.
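As an illustration of the "exactly one START/END/RESULT per server-executed tool" assertion, a counting helper might look like the sketch below. The helper names (`countLifecycle`, `isExactlyOnce`) and the chunk shape are hypothetical, and the position of the duplicate END in the pre-fix stream is illustrative rather than taken from the actual test suite.

```typescript
// Hypothetical minimal chunk shape: only the fields the counter needs.
type LifecycleChunk = { type: string; toolCallId?: string }

interface LifecycleCounts {
  start: number
  end: number
  result: number
}

// Tally START/END/RESULT events per toolCallId.
function countLifecycle(chunks: LifecycleChunk[]): Record<string, LifecycleCounts> {
  const counts: Record<string, LifecycleCounts> = {}
  for (const chunk of chunks) {
    const id = chunk.toolCallId
    if (id === undefined) continue
    let entry = counts[id]
    if (!entry) {
      entry = { start: 0, end: 0, result: 0 }
      counts[id] = entry
    }
    if (chunk.type === 'TOOL_CALL_START') entry.start += 1
    if (chunk.type === 'TOOL_CALL_END') entry.end += 1
    if (chunk.type === 'TOOL_CALL_RESULT') entry.result += 1
  }
  return counts
}

// True only when every tool call saw exactly one of each event.
function isExactlyOnce(counts: Record<string, LifecycleCounts>): boolean {
  return Object.keys(counts).every((id) => {
    const entry = counts[id]
    return entry !== undefined && entry.start === 1 && entry.end === 1 && entry.result === 1
  })
}

// Stream shape after the fix: the adapter's triple plus one RESULT.
const fixedStream: LifecycleChunk[] = [
  { type: 'TOOL_CALL_START', toolCallId: 'call_1' },
  { type: 'TOOL_CALL_ARGS', toolCallId: 'call_1' },
  { type: 'TOOL_CALL_END', toolCallId: 'call_1' },
  { type: 'TOOL_CALL_RESULT', toolCallId: 'call_1' },
]

// Stream shape before the fix: a second END from the post-execution phase.
const duplicatedStream: LifecycleChunk[] = [
  ...fixedStream.slice(0, 3),
  { type: 'TOOL_CALL_END', toolCallId: 'call_1' },
  { type: 'TOOL_CALL_RESULT', toolCallId: 'call_1' },
]

console.log(isExactlyOnce(countLifecycle(fixedStream))) // true
console.log(isExactlyOnce(countLifecycle(duplicatedStream))) // false
```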

Summary by CodeRabbit

New Features
- Devtools now surfaces server-executed tool outputs using a dedicated tool-result event so tool results appear correctly in the UI.
Bug Fixes
- Fixed duplicate/orphan TOOL_CALL_END events after server-executed tool calls so each tool call emits a single, well-formed lifecycle sequence.
Tests
- Added and strengthened unit and end-to-end regression tests to enforce exactly one START/END/RESULT per server-executed tool and validate event-stream invariants.

Adapters already stream START/ARGS/END for every tool call, but the post-execution phase in chat() pushed a second TOOL_CALL_END with no preceding TOOL_CALL_START. AG-UI-strict consumers (e.g. @ag-ui/client's verifyEvents) reject an orphan END and abort the stream. buildToolResultChunks now emits END only alongside the matching START it already gates on argsMap (the continuation re-execution path, which reconstructs a never-streamed call); the normal post-execution path contributes only TOOL_CALL_RESULT. The result/error state lives on the spec-compliant TOOL_CALL_RESULT, so nothing is lost. Fixes TanStack#519.

coderabbitai · 2026-06-03T15:54:04Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 41bf3622-ef99-4eee-bdd7-02dcbbd5a506

📥 Commits

Reviewing files that changed from the base of the PR and between e365766 and 40ca041.

📒 Files selected for processing (2)

.changeset/devtools-tool-result-from-result-event.md
packages/ai-event-client/src/devtools-middleware.ts

✅ Files skipped from review due to trivial changes (1)

.changeset/devtools-tool-result-from-result-event.md

📝 Walkthrough

The PR fixes issue #519 by making buildToolResultChunks emit TOOL_CALL_END only when reconstructing continuation re-executions (argsMap present). Adapters that already streamed END will now only produce TOOL_CALL_RESULT post-execution. Tests, an e2e route/spec, devtools middleware, and changesets were added/updated to enforce the one-START/one-END/one-RESULT invariant.

Chat Duplication Fix

| Layer / File(s) | Summary |
| --- | --- |
| Conditional TOOL_CALL_END emission in buildToolResultChunks `packages/ai/src/activities/chat/index.ts` | Clarifies `argsMap` is only provided for continuation re-executions and makes `TOOL_CALL_END` emission conditional on `argsMap`, so adapters that already emitted END are not duplicated. |
| Unit test updates and regression `packages/ai/tests/chat.test.ts`, `packages/ai/tests/middleware.test.ts` | Mocks now emit `toolEnd` for tool calls where appropriate; assertions updated to verify error state on `TOOL_CALL_RESULT`, ensure a single `TOOL_CALL_END`, and add a `#519` regression test ensuring one START/END/RESULT per server tool call. |
| E2E route harness and specs `testing/e2e/src/routes/api.tool-call-lifecycle-wire.ts`, `testing/e2e/src/routeTree.gen.ts`, `testing/e2e/tests/tool-call-lifecycle.spec.ts` | Add a deterministic server adapter route that streams balanced tool-call lifecycles; include route in generated route tree and add Playwright spec asserting event parity for a server-executed tool. |
| Devtools and release notes `.changeset/chat-no-duplicate-tool-call-end.md`, `.changeset/devtools-tool-result-from-result-event.md`, `packages/ai-event-client/src/devtools-middleware.ts` | Record the patch in changesets and update devtools middleware to surface `TOOL_CALL_RESULT` as tool results. |

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

AlemTuzlak
tombeckenham
crutchcorn

"🐰
I hopped through events, calm and spry,
Removed a double END that made streams cry.
Once from the adapter's steady beat,
Now RESULT completes the story neat—
Hooray, the event stream's tidy and dry!"

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title clearly and concisely summarizes the main change: fixing duplicate TOOL_CALL_END emission for server-executed tools. |
| Description check | ✅ Passed | The PR description covers the What, Fix, and Tests sections, explaining the problem, solution, and validation approach comprehensively. |
| Linked Issues check | ✅ Passed | All code changes directly address issue `#519`: removing duplicate TOOL_CALL_END from buildToolResultChunks, updating tests, and adjusting devtools middleware to use TOOL_CALL_RESULT for result delivery. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to fixing the duplicate TOOL_CALL_END issue: core logic fixes, comprehensive test updates, devtools alignment, and supporting route/infrastructure updates for E2E validation. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@testing/e2e/tests/anthropic-server-tool.spec.ts`:
- Around line 37-38: Add a dedicated Playwright+aimock E2E test in
testing/e2e/tests/anthropic-server-tool.spec.ts that reproduces `#519` by
exercising the chat/tools-test path and asserting streaming event invariants:
simulate a server-executed tool invocation and verify exactly one TOOL_CALL_END
is emitted per server tool execution and that each TOOL_CALL_START pairs with a
subsequent TOOL_CALL_END (preserve ordering). Use the existing aimock setup in
this spec to mock the tool response, create a test case named to reference the
regression (e.g., "regression `#519`: single TOOL_CALL_END per tool"), and add
assertions that count TOOL_CALL_START and TOOL_CALL_END events and validate
start/end pairing in the streamed output.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a9fbd824-91a5-4756-b759-5a74efedb274

📥 Commits

Reviewing files that changed from the base of the PR and between d02cbb4 and a48d109.

📒 Files selected for processing (5)

.changeset/chat-no-duplicate-tool-call-end.md
packages/ai/src/activities/chat/index.ts
packages/ai/tests/chat.test.ts
packages/ai/tests/middleware.test.ts
testing/e2e/tests/anthropic-server-tool.spec.ts

Drive chat() with a server tool through an adapter that streams its own START/ARGS/END (as real adapters do) and assert the emitted lifecycle is balanced: exactly one TOOL_CALL_END per tool call, each preceded by a matching START — the invariant @ag-ui/client's verifyEvents enforces. Reproduces TanStack#519 (the orphan duplicate END) end-to-end.

coderabbitai

🧹 Nitpick comments (1)

testing/e2e/tests/tool-call-lifecycle.spec.ts (1)

44-50: ⚡ Quick win

Tighten END/START pairing invariant.

Lines 45-50 currently check only that a START existed previously. The check should also consume the START (lines 48-49) so that a duplicate END for the same toolCallId fails the invariant.

Suggested diff

     const open = new Set<string>()
     for (const c of chunks) {
       if (c.type === 'TOOL_CALL_START') open.add(idOf(c))
       if (c.type === 'TOOL_CALL_END') {
-        expect(open.has(idOf(c))).toBe(true)
+        const id = idOf(c)
+        expect(open.has(id)).toBe(true)
+        open.delete(id)
       }
     }
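To see why consuming the START matters, the two variants of the check can be compared side by side. The sketch below is a standalone illustration, not the spec file itself: it reads `toolCallId` directly instead of using the test's `idOf` helper, and the `pairingHolds` function with its `consumeStart` flag is a hypothetical name introduced only for this comparison.

```typescript
// Hypothetical minimal event shape; the spec derives the id via its own idOf helper.
type ToolEvent = {
  type: 'TOOL_CALL_START' | 'TOOL_CALL_END'
  toolCallId: string
}

// Returns true when every END is preceded by an open START.
// With consumeStart = false this mirrors the original check;
// with consumeStart = true it mirrors the suggested diff.
function pairingHolds(events: ToolEvent[], consumeStart: boolean): boolean {
  const open = new Set<string>()
  for (const event of events) {
    if (event.type === 'TOOL_CALL_START') {
      open.add(event.toolCallId)
    } else {
      if (!open.has(event.toolCallId)) return false
      // Closing the call means a second END for the same id finds nothing open.
      if (consumeStart) open.delete(event.toolCallId)
    }
  }
  return true
}

// One START followed by two ENDs for the same tool call.
const duplicateEnd: ToolEvent[] = [
  { type: 'TOOL_CALL_START', toolCallId: 'call_1' },
  { type: 'TOOL_CALL_END', toolCallId: 'call_1' },
  { type: 'TOOL_CALL_END', toolCallId: 'call_1' },
]

console.log(pairingHolds(duplicateEnd, false)) // true: the duplicate slips through
console.log(pairingHolds(duplicateEnd, true)) // false: the duplicate is caught
```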

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@testing/e2e/tests/tool-call-lifecycle.spec.ts` around lines 44 - 50, The
invariant currently only checks that a TOOL_CALL_END has a prior TOOL_CALL_START
by using open.has(idOf(c)) but does not remove the start, allowing duplicate
ENDs to pass; update the loop handling of c.type === 'TOOL_CALL_END' to both
assert open.has(idOf(c)) and then remove that id from the open Set
(open.delete(idOf(c))) so each START is consumed and a second END for the same
toolCallId will fail; reference the variables/chunk types used in this test:
chunks, open (Set<string>), idOf(c), and the c.type values 'TOOL_CALL_START' /
'TOOL_CALL_END'.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 66e47d63-8009-45a7-81c2-ccf096da26f5

📥 Commits

Reviewing files that changed from the base of the PR and between a48d109 and 4d1586c.

📒 Files selected for processing (3)

testing/e2e/src/routeTree.gen.ts
testing/e2e/src/routes/api.tool-call-lifecycle-wire.ts
testing/e2e/tests/tool-call-lifecycle.spec.ts

✅ Files skipped from review due to trivial changes (1)

testing/e2e/src/routeTree.gen.ts

A second TOOL_CALL_END for the same toolCallId now fails the pairing check directly, matching how @ag-ui/client's verifyEvents closes open tool calls. Addresses CodeRabbit review feedback on TanStack#696.

The devtools middleware sourced tool results from TOOL_CALL_END, which the adapter emits before execution (no result attached). With the TanStack#519 fix in @tanstack/ai, the post-execution END is no longer re-emitted, so results now travel on the spec-compliant TOOL_CALL_RESULT event. Handle it so server-executed tool output keeps reaching devtools.
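A rough sketch of that change in result sourcing is shown below. It is not the code in `packages/ai-event-client/src/devtools-middleware.ts`: the `collectToolResults` helper, the chunk shapes, and the `content` field name are assumptions chosen for illustration.

```typescript
// Hypothetical, simplified chunk shapes for illustration only.
type DevtoolsChunk =
  | { type: 'TOOL_CALL_START'; toolCallId: string; toolName: string }
  | { type: 'TOOL_CALL_END'; toolCallId: string }
  | { type: 'TOOL_CALL_RESULT'; toolCallId: string; content: string }

// Collect tool output keyed by toolCallId, reading it from TOOL_CALL_RESULT.
// TOOL_CALL_END is ignored here because the adapter emits it before the
// tool has run, so it has no result attached.
function collectToolResults(chunks: DevtoolsChunk[]): Record<string, string> {
  const results: Record<string, string> = {}
  for (const chunk of chunks) {
    if (chunk.type === 'TOOL_CALL_RESULT') {
      results[chunk.toolCallId] = chunk.content
    }
  }
  return results
}

const observed: DevtoolsChunk[] = [
  { type: 'TOOL_CALL_START', toolCallId: 'call_1', toolName: 'getWeather' },
  { type: 'TOOL_CALL_END', toolCallId: 'call_1' },
  { type: 'TOOL_CALL_RESULT', toolCallId: 'call_1', content: '{"tempC":21}' },
]

console.log(collectToolResults(observed))
```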

valeriudev added 2 commits June 3, 2026 19:14

test(e2e): consume open START on END in pairing invariant

e365766

fix(ai): stop emitting duplicate TOOL_CALL_END for server-executed tools #696
valeriudev wants to merge 4 commits into TanStack:main from valeriudev:fix/chat-duplicate-tool-call-end
