Skip to content

fix: use toolCallId for tool status tracking, prevent duplicate messages on long responses#53

Closed
ruan330 wants to merge 2 commits intoopenabdev:mainfrom
ruan330:fix/tool-status-display
Closed

fix: use toolCallId for tool status tracking, prevent duplicate messages on long responses#53
ruan330 wants to merge 2 commits intoopenabdev:mainfrom
ruan330:fix/tool-status-display

Conversation

@ruan330
Copy link
Copy Markdown

@ruan330 ruan330 commented Apr 5, 2026

Problem

Two related display bugs in the Discord output pipeline, both stemming from the same architectural gap: tool call events lack stable identity tracking, and the streaming/final-edit lifecycle creates orphaned messages.

Bug 1: tool status lines show wrong or empty titles

When using Claude Code (via claude-agent-acp), the ACP protocol's tool_call (start) and tool_call_update (done) events frequently carry different title values for the same tool invocation:

  • Read tool starts as "Read CLAUDE.md", completes as "Read File"
  • Bash tool starts as "find /workspace -name '*.py'", completes as "Terminal"
  • Subagent child tools often complete with an empty title

The current matching logic (tool_lines.find(|l| l.contains(&title))) fails to correlate these, producing:

✅ ``         ← empty: ToolDone title was empty, no match found
✅ ``         ← empty: same issue
🔧 `Read CLAUDE.md`...  ← stale: ToolDone had different title, never matched

Bug 2: long responses appear duplicated in Discord

When agent output exceeds ~1900 characters, the response is posted 2-3 times:

[tool status lines]
[agent response text]       ← from streaming overflow (channel.say)
[tool status lines]         ← duplicated
[agent response text]       ← from final edit overflow (channel.say again)

Root cause: the edit-streaming task (1.5s interval) calls channel.say() to create overflow messages when content exceeds 1900 chars. These message IDs are tracked only inside the spawned task. The final edit uses the original message ID and creates its own overflow — the streaming messages are never cleaned up.

Solution

Fix 1: match tool calls by toolCallId

Every ACP tool_call and tool_call_update notification carries a unique toolCallId field. This PR extracts it and uses it as the matching key instead of title text.

On ToolDone, we find the corresponding ToolStart line by ID and swap only the icon, preserving the original title from the start event — which always has the most descriptive text.

Before: ToolStart("Read CLAUDE.md") → ToolDone("Read File") → no match → broken
After:  ToolStart(id=abc, "Read CLAUDE.md") → ToolDone(id=abc, "") → match → ✅ Read CLAUDE.md

Fix 2: streaming edits a single message, no overflow

During streaming, only edit the single placeholder message. If content exceeds 1900 chars, truncate with "…" — this is a live preview, not the final output. The final edit (after streaming completes) handles proper multi-message splitting via channel.say().

This cleanly separates responsibilities:

  • Streaming = live preview in one message (truncated if needed)
  • Final edit = authoritative delivery, split across messages if needed

No orphaned messages, no duplication.

Alternatives considered

For tool matching:

  • Keep title matching but normalize titles → fragile, depends on CLI-specific title formats
  • Match by index (last pending tool) → breaks with parallel tool calls

For duplicate messages:

  • Share message IDs between streaming task and final edit via Arc<Mutex<Vec<MessageId>>> → adds complexity and potential race conditions
  • Delete streaming overflow messages before final edit → fragile, depends on Discord API timing
  • Truncate during streaming, let final edit handle splitting ← chosen, simplest, zero race conditions

Changes

File Lines What
protocol.rs +7/-5 Add id field to ToolStart/ToolDone, extract toolCallId
discord.rs +24/-16 Track tool_ids parallel to tool_lines, match by ID; simplify streaming to single-message edit with truncation

Testing

Tested with claude-agent-acp backend:

  • Tool status lines now correctly show titles for all tool types (Read, Bash, Edit, Agent subagents)
  • Long responses (>2000 chars with 30+ tool calls) display once, not duplicated
  • Streaming preview works normally for responses under 1900 chars

…ges on long responses

Two related bugs in the Discord display pipeline, both caused by the
same architectural gap: tool call events lack stable identity tracking,
and the streaming/final-edit lifecycle creates orphaned messages.

--- Bug 1: tool status lines show wrong or empty titles ---

Root cause: tool_call (start) and tool_call_update (done) events from
the ACP protocol frequently carry different `title` values for the same
tool invocation. For example, a Read tool starts with title
"Read CLAUDE.md" but completes with "Read File" — or an empty string
for subagent child tools. The current title-based string matching
(`tool_lines.find(|l| l.contains(&title))`) fails to correlate these,
producing empty `✅ ` lines or stale `🔧` lines that never resolve.

Fix: extract `toolCallId` from ACP notifications (every tool_call and
tool_call_update carries this unique identifier) and use it as the
primary matching key. On ToolDone, locate the corresponding ToolStart
line by ID and swap only the icon (🔧→✅/❌), preserving the original
title from the start event — which always has the most descriptive text.

--- Bug 2: long responses appear duplicated in Discord ---

Root cause: when streamed content exceeds 1900 characters, the
edit-streaming task (running every 1.5s) calls `channel.say()` to
create overflow messages. These message IDs are tracked only inside the
spawned streaming task. When streaming completes, the final edit uses
the *original* message ID and calls `channel.say()` again for its own
overflow — producing a second copy of the same content. The streaming
overflow messages are never cleaned up.

Considered alternatives:
- Share message IDs between streaming task and final edit via
  Arc<Mutex<Vec<MessageId>>> — adds complexity and race conditions
- Delete streaming overflow messages before final edit — fragile,
  depends on Discord API timing
- Don't create overflow during streaming; truncate to single message
  ← chosen approach

Fix: during streaming, only edit the single placeholder message. If
content exceeds 1900 chars, truncate with "…" — this is a live preview,
not the final output. The final edit (after streaming completes) handles
proper multi-message splitting. This cleanly separates responsibilities:
streaming = live preview in one message, final edit = authoritative
multi-message delivery.

Changes:
  protocol.rs (+7/-5): add `id` field to ToolStart/ToolDone variants,
    extract toolCallId from ACP update notifications
  discord.rs (+24/-16): track tool_ids vec parallel to tool_lines,
    match ToolDone by ID preserving original title; simplify streaming
    edit to single-message truncation
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0684e6dfa8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

src/discord.rs Outdated
// Truncate if over Discord's limit — the final edit handles
// proper multi-message splitting after streaming completes.
let display = if content.len() > 1900 {
format!("{}…", &content[..1900])
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Truncate streaming preview at UTF-8 boundaries

The new streaming truncation path slices with &content[..1900] after checking content.len(), but len() is bytes, not character boundaries. For any response over 1900 bytes where byte 1900 lands inside a multibyte character (for example emoji or CJK text), this will panic at runtime, which terminates the spawned edit task and stops live message updates for the rest of that prompt. This regression is input-dependent but user-visible for non-ASCII long outputs.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Collaborator

@chaodu-agent chaodu-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good — toolCallId matching and single-message streaming are clean fixes. One concern below.

// proper multi-message splitting after streaming completes.
let display = if content.len() > 1900 {
format!("{}…", &content[..1900])
} else {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ &content[..1900] will panic if byte 1900 falls in the middle of a multi-byte UTF-8 character (e.g. Chinese/Japanese text is 3 bytes per char). This is likely to hit in production since many openab users write in zh-TW.

Suggestion:

let end = (1900..content.len())
    .find(|&i| content.is_char_boundary(i))
    .unwrap_or(content.len());
format!("{}…", &content[..end])

Or if targeting Rust 1.80+: content.floor_char_boundary(1900)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 3cbc865 — both the streaming truncation and the split_message() hard-split path now use floor_char_boundary() (stable since Rust 1.80).

  • discord.rs: &content[..1900]content.floor_char_boundary(1900)
  • format.rs: as_bytes().chunks(limit) + from_utf8_lossyfloor_char_boundary loop, no more replacement chars

Thanks for catching this.

Streaming truncation (`&content[..1900]`) panics when byte 1900 falls
inside a multi-byte UTF-8 character (CJK, emoji). Use
`floor_char_boundary()` (stable since Rust 1.80) to find the nearest
valid boundary.

Also fix `split_message()` hard-split path which used
`as_bytes().chunks()` + `from_utf8_lossy`, silently corrupting
multi-byte characters with replacement chars (�).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ruan330 ruan330 requested a review from thepagent as a code owner April 10, 2026 07:18
@ruan330
Copy link
Copy Markdown
Author

ruan330 commented Apr 10, 2026

The UTF-8 review feedback from @chaodu-agent has been addressed in 3cbc865. Both the streaming truncation and split_message() hard-split path now use floor_char_boundary().

This should be ready for another look when you have a chance. Thanks!

@chaodu-agent
Copy link
Copy Markdown
Collaborator

Closing in favor of #135 and #138, which together cover both fixes from this PR (message dedup + toolCallId matching + UTF-8 boundary handling).

@ruan330 — thank you for the excellent root cause analysis and the early fix. Your investigation in both this PR and the #81 discussion was instrumental in shaping the final solution. Much appreciated! 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants