fix(ai): stop emitting duplicate TOOL_CALL_END for server-executed tools#696

Open
valeriudev wants to merge 4 commits into TanStack:main from valeriudev:fix/chat-duplicate-tool-call-end

Conversation

@valeriudev

@valeriudev valeriudev commented Jun 3, 2026

Closes #519.

What

chat() emitted a duplicate TOOL_CALL_END for every server-executed tool. Adapters already stream TOOL_CALL_START / TOOL_CALL_ARGS / TOOL_CALL_END for each call, but the post-execution phase pushed a second END with no matching START. AG-UI-strict consumers (e.g. @ag-ui/client's verifyEvents, which CopilotKit's runtime pipes through) reject an orphan END and abort the stream.

Fix

buildToolResultChunks now emits END only alongside the START/ARGS it already gates on argsMap — the continuation re-execution path, which reconstructs a tool call the adapter never streamed. The normal post-execution path contributes only TOOL_CALL_RESULT. This also makes the triple atomic: START and ARGS were already conditional on argsMap; only END was leaking out unconditionally.

Result and error state ride on the spec-compliant TOOL_CALL_RESULT, so nothing is lost — verified the client StreamProcessor delivers both via handleToolCallResultEvent.
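
The gating described above can be sketched roughly as follows. This is a simplified illustration, not the actual code in packages/ai/src/activities/chat/index.ts: the `Chunk` and `ToolResult` shapes, the field names, and the function name `buildToolResultChunksSketch` are assumptions made for the example.

```typescript
// Hypothetical, simplified chunk shapes; the real types carry more fields.
type Chunk =
  | { type: 'TOOL_CALL_START'; toolCallId: string; toolName: string }
  | { type: 'TOOL_CALL_ARGS'; toolCallId: string; delta: string }
  | { type: 'TOOL_CALL_END'; toolCallId: string }
  | { type: 'TOOL_CALL_RESULT'; toolCallId: string; content: string }

// Hypothetical result record for one executed tool call.
interface ToolResult {
  toolCallId: string
  toolName: string
  content: string
}

// argsMap is assumed to be passed only on the continuation re-execution path,
// where the adapter never streamed the call and it must be reconstructed.
function buildToolResultChunksSketch(
  results: ToolResult[],
  argsMap?: Map<string, string>,
): Chunk[] {
  const chunks: Chunk[] = []
  for (const r of results) {
    if (argsMap) {
      // START, ARGS and END are emitted together or not at all.
      chunks.push({
        type: 'TOOL_CALL_START',
        toolCallId: r.toolCallId,
        toolName: r.toolName,
      })
      chunks.push({
        type: 'TOOL_CALL_ARGS',
        toolCallId: r.toolCallId,
        delta: argsMap.get(r.toolCallId) ?? '{}',
      })
      chunks.push({ type: 'TOOL_CALL_END', toolCallId: r.toolCallId })
    }
    // The result is always emitted, on both paths.
    chunks.push({
      type: 'TOOL_CALL_RESULT',
      toolCallId: r.toolCallId,
      content: r.content,
    })
  }
  return chunks
}
```

Called without `argsMap`, the sketch yields a single TOOL_CALL_RESULT per tool, leaving the adapter's own START/ARGS/END as the only lifecycle events on the wire.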

Tests

  • New regression test asserting exactly one START/END/RESULT per server-executed tool, plus the "every END has a matching START" invariant verifyEvents enforces.
  • Updated the tool-loop mocks to stream their own TOOL_CALL_END (matching real adapters) and moved the error-state assertion onto TOOL_CALL_RESULT.
  • Full @tanstack/ai unit suite + affected E2E specs green.
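
The pairing invariant the regression test targets can be illustrated with a small standalone checker. This is a sketch under assumptions: the `LifecycleEvent` shape and the `findOrphanEnds` helper are hypothetical names for illustration, and the code is not taken from @tanstack/ai or from @ag-ui/client's verifyEvents.

```typescript
// Hypothetical minimal event shape: only the fields the check needs.
type LifecycleEvent = { type: string; toolCallId: string }

// Returns the toolCallId of every TOOL_CALL_END that has no open START.
// Each END consumes its START, so a second END for the same id is an orphan.
function findOrphanEnds(events: LifecycleEvent[]): string[] {
  const open = new Set<string>()
  const orphans: string[] = []
  for (const e of events) {
    if (e.type === 'TOOL_CALL_START') open.add(e.toolCallId)
    if (e.type === 'TOOL_CALL_END') {
      // Set.delete returns false when the id was not open.
      if (!open.delete(e.toolCallId)) orphans.push(e.toolCallId)
    }
  }
  return orphans
}

// Sequence as described in the bug report: the adapter streams one END,
// then the post-execution phase pushes a second one.
const beforeFix: LifecycleEvent[] = [
  { type: 'TOOL_CALL_START', toolCallId: 'call_1' },
  { type: 'TOOL_CALL_ARGS', toolCallId: 'call_1' },
  { type: 'TOOL_CALL_END', toolCallId: 'call_1' },
  { type: 'TOOL_CALL_END', toolCallId: 'call_1' },
  { type: 'TOOL_CALL_RESULT', toolCallId: 'call_1' },
]

// Sequence after the fix: post-execution contributes only the RESULT.
const afterFix: LifecycleEvent[] = [
  { type: 'TOOL_CALL_START', toolCallId: 'call_1' },
  { type: 'TOOL_CALL_ARGS', toolCallId: 'call_1' },
  { type: 'TOOL_CALL_END', toolCallId: 'call_1' },
  { type: 'TOOL_CALL_RESULT', toolCallId: 'call_1' },
]
```

Here `findOrphanEnds(beforeFix)` reports `call_1` once, while `findOrphanEnds(afterFix)` returns an empty array.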

Summary by CodeRabbit

  • New Features

    • Devtools now surfaces server-executed tool outputs using a dedicated tool-result event so tool results appear correctly in the UI.
  • Bug Fixes

    • Fixed duplicate/orphan TOOL_CALL_END events after server-executed tool calls so each tool call emits a single, well-formed lifecycle sequence.
  • Tests

    • Added and strengthened unit and end-to-end regression tests to enforce exactly one START/END/RESULT per server-executed tool and validate event-stream invariants.

Adapters already stream START/ARGS/END for every tool call, but the
post-execution phase in chat() pushed a second TOOL_CALL_END with no
preceding TOOL_CALL_START. AG-UI-strict consumers (e.g. @ag-ui/client's
verifyEvents) reject an orphan END and abort the stream.

buildToolResultChunks now emits END only alongside the matching START it
already gates on argsMap (the continuation re-execution path, which
reconstructs a never-streamed call); the normal post-execution path
contributes only TOOL_CALL_RESULT. The result/error state lives on the
spec-compliant TOOL_CALL_RESULT, so nothing is lost.

Fixes TanStack#519.
@coderabbitai
Contributor

coderabbitai Bot commented Jun 3, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 41bf3622-ef99-4eee-bdd7-02dcbbd5a506

📥 Commits

Reviewing files that changed from the base of the PR and between e365766 and 40ca041.

📒 Files selected for processing (2)
  • .changeset/devtools-tool-result-from-result-event.md
  • packages/ai-event-client/src/devtools-middleware.ts
✅ Files skipped from review due to trivial changes (1)
  • .changeset/devtools-tool-result-from-result-event.md

📝 Walkthrough

The PR fixes issue #519 by making buildToolResultChunks emit TOOL_CALL_END only when reconstructing continuation re-executions (argsMap present). Adapters that already streamed END will now only produce TOOL_CALL_RESULT post-execution. Tests, an e2e route/spec, devtools middleware, and changesets were added/updated to enforce the one-START/one-END/one-RESULT invariant.

Chat Duplication Fix

| Layer / File(s) | Summary |
| --- | --- |
| **Conditional TOOL_CALL_END emission in buildToolResultChunks**<br>packages/ai/src/activities/chat/index.ts | Clarifies argsMap is only provided for continuation re-executions and makes TOOL_CALL_END emission conditional on argsMap, so adapters that already emitted END are not duplicated. |
| **Unit test updates and regression**<br>packages/ai/tests/chat.test.ts, packages/ai/tests/middleware.test.ts | Mocks now emit toolEnd for tool calls where appropriate; assertions updated to verify error state on TOOL_CALL_RESULT, ensure a single TOOL_CALL_END, and add a #519 regression test ensuring one START/END/RESULT per server tool call. |
| **E2E route harness and specs**<br>testing/e2e/src/routes/api.tool-call-lifecycle-wire.ts, testing/e2e/src/routeTree.gen.ts, testing/e2e/tests/tool-call-lifecycle.spec.ts | Add a deterministic server adapter route that streams balanced tool-call lifecycles; include route in generated route tree and add Playwright spec asserting event parity for a server-executed tool. |
| **Devtools and release notes**<br>.changeset/chat-no-duplicate-tool-call-end.md, .changeset/devtools-tool-result-from-result-event.md, packages/ai-event-client/src/devtools-middleware.ts | Record the patch in changesets and update devtools middleware to surface TOOL_CALL_RESULT as tool results. |

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • AlemTuzlak
  • tombeckenham
  • crutchcorn

"🐰
I hopped through events, calm and spry,
Removed a double END that made streams cry.
Once from the adapter's steady beat,
Now RESULT completes the story neat—
Hooray, the event stream's tidy and dry!"

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title clearly and concisely summarizes the main change: fixing duplicate TOOL_CALL_END emission for server-executed tools. |
| Description check | ✅ Passed | The PR description covers the What, Fix, and Tests sections, explaining the problem, solution, and validation approach comprehensively. |
| Linked Issues check | ✅ Passed | All code changes directly address issue #519: removing duplicate TOOL_CALL_END from buildToolResultChunks, updating tests, and adjusting devtools middleware to use TOOL_CALL_RESULT for result delivery. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to fixing the duplicate TOOL_CALL_END issue: core logic fixes, comprehensive test updates, devtools alignment, and supporting route/infrastructure updates for E2E validation. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@testing/e2e/tests/anthropic-server-tool.spec.ts`:
- Around line 37-38: Add a dedicated Playwright+aimock E2E test in
testing/e2e/tests/anthropic-server-tool.spec.ts that reproduces `#519` by
exercising the chat/tools-test path and asserting streaming event invariants:
simulate a server-executed tool invocation and verify exactly one TOOL_CALL_END
is emitted per server tool execution and that each TOOL_CALL_START pairs with a
subsequent TOOL_CALL_END (preserve ordering). Use the existing aimock setup in
this spec to mock the tool response, create a test case named to reference the
regression (e.g., "regression `#519`: single TOOL_CALL_END per tool"), and add
assertions that count TOOL_CALL_START and TOOL_CALL_END events and validate
start/end pairing in the streamed output.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a9fbd824-91a5-4756-b759-5a74efedb274

📥 Commits

Reviewing files that changed from the base of the PR and between d02cbb4 and a48d109.

📒 Files selected for processing (5)
  • .changeset/chat-no-duplicate-tool-call-end.md
  • packages/ai/src/activities/chat/index.ts
  • packages/ai/tests/chat.test.ts
  • packages/ai/tests/middleware.test.ts
  • testing/e2e/tests/anthropic-server-tool.spec.ts

Comment thread testing/e2e/tests/anthropic-server-tool.spec.ts
Drive chat() with a server tool through an adapter that streams its own
START/ARGS/END (as real adapters do) and assert the emitted lifecycle is
balanced: exactly one TOOL_CALL_END per tool call, each preceded by a
matching START — the invariant @ag-ui/client's verifyEvents enforces.
Reproduces TanStack#519 (the orphan duplicate END) end-to-end.
Contributor

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
testing/e2e/tests/tool-call-lifecycle.spec.ts (1)

44-50: ⚡ Quick win

Tighten END/START pairing invariant.

Lines 45-50 currently check only that a START existed previously. The END branch (lines 48-49) should also consume the START so that a duplicate END for the same toolCallId fails the invariant check.

Suggested diff
     const open = new Set<string>()
     for (const c of chunks) {
       if (c.type === 'TOOL_CALL_START') open.add(idOf(c))
       if (c.type === 'TOOL_CALL_END') {
-        expect(open.has(idOf(c))).toBe(true)
+        const id = idOf(c)
+        expect(open.has(id)).toBe(true)
+        open.delete(id)
       }
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@testing/e2e/tests/tool-call-lifecycle.spec.ts` around lines 44 - 50, The
invariant currently only checks that a TOOL_CALL_END has a prior TOOL_CALL_START
by using open.has(idOf(c)) but does not remove the start, allowing duplicate
ENDs to pass; update the loop handling of c.type === 'TOOL_CALL_END' to both
assert open.has(idOf(c)) and then remove that id from the open Set
(open.delete(idOf(c))) so each START is consumed and a second END for the same
toolCallId will fail; reference the variables/chunk types used in this test:
chunks, open (Set<string>), idOf(c), and the c.type values 'TOOL_CALL_START' /
'TOOL_CALL_END'.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 66e47d63-8009-45a7-81c2-ccf096da26f5

📥 Commits

Reviewing files that changed from the base of the PR and between a48d109 and 4d1586c.

📒 Files selected for processing (3)
  • testing/e2e/src/routeTree.gen.ts
  • testing/e2e/src/routes/api.tool-call-lifecycle-wire.ts
  • testing/e2e/tests/tool-call-lifecycle.spec.ts
✅ Files skipped from review due to trivial changes (1)
  • testing/e2e/src/routeTree.gen.ts

A second TOOL_CALL_END for the same toolCallId now fails the pairing
check directly, matching how @ag-ui/client's verifyEvents closes open
tool calls. Addresses CodeRabbit review feedback on TanStack#696.
The devtools middleware sourced tool results from TOOL_CALL_END, which the
adapter emits before execution (no result attached). With the TanStack#519 fix in
@tanstack/ai, the post-execution END is no longer re-emitted, so results
now travel on the spec-compliant TOOL_CALL_RESULT event. Handle it so
server-executed tool output keeps reaching devtools.
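
A rough sketch of sourcing tool output from TOOL_CALL_RESULT rather than TOOL_CALL_END is shown below. The `StreamEvent` shape, the `content` field, and the `collectToolResults` helper are assumptions for illustration only; they are not the actual code in packages/ai-event-client/src/devtools-middleware.ts.

```typescript
// Hypothetical, simplified event shapes for the two relevant event types.
type StreamEvent =
  | { type: 'TOOL_CALL_END'; toolCallId: string }
  | { type: 'TOOL_CALL_RESULT'; toolCallId: string; content: string }

// Collects tool output keyed by toolCallId.
// TOOL_CALL_END is ignored as a result source: the adapter emits it before
// the tool runs, so it has no output attached.
function collectToolResults(events: StreamEvent[]): Map<string, string> {
  const results = new Map<string, string>()
  for (const e of events) {
    if (e.type === 'TOOL_CALL_RESULT') {
      results.set(e.toolCallId, e.content)
    }
  }
  return results
}
```

With this shape, a stream containing one END followed by one RESULT for the same toolCallId produces a single map entry holding the RESULT content.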

Development

Successfully merging this pull request may close these issues.

chat(): duplicate TOOL_CALL_END (no preceding START) for server-executed tools breaks AG-UI verify

1 participant