fix(chat): stop duplicating assistant replies on multi-segment turns (#1648)
The reconciliation path in ChatRuntimeProvider treated a segmented turn as "incomplete" whenever the concatenation of received segments did not byte-for-byte equal `chat_done.full_response`, then appended `full_response` as an extra assistant message. That equality essentially never held in practice — the server-side segmenter trims each segment and normalises paragraph breaks to `"\n\n"` (`presentation.rs::segment_for_delivery`), while `chat_done.full_response` ships the raw, untrimmed LLM text. So every multi-paragraph reply produced N segment bubbles + one duplicate full-text bubble.

Trust the count instead of content: delivery is complete iff every expected `segment_index` arrived. Per-segment dedup (`markChatEventSeen` on `segment:thread:request:index`) already guarantees Map size = expected only when all distinct indices have been seen, so the count + index-presence check is sufficient. The reconciliation path still fires when a `segment_index` is genuinely missing, which is what it was meant to cover.

Updated the "out-of-order full_response" test (which asserted the buggy content-equality behaviour) to assert the new contract, and added a regression test that exercises a missing `segment_index`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
graycyrus left a comment
PR Review — fix(chat): stop duplicating assistant replies on multi-segment turns
Walkthrough
This PR fixes a genuine product bug: every assistant reply long enough to trigger the Rust-side segmenter produced a spurious extra bubble in the chat UI. The root cause was a byte-equality check in hasCompleteSegmentDelivery that compared client-reconstructed segment text against full_response. Because segment_for_delivery in presentation.rs trims each segment and joins on \n\n while full_response preserves raw LLM whitespace, the strings almost never matched — so the reconciliation path appended full_response as a new message on nearly every multi-segment turn, and that message was persisted to the backend via appendMessage, meaning duplicates survived page reload.
The fix is correct and minimal: replace the string-equality gate with a count + index-presence loop. The genuine missing-segment recovery (socket drop / partial delivery) is preserved and now has its own test.
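The whitespace mismatch is easy to demonstrate in isolation. A minimal sketch with hypothetical input, mirroring the trim-and-join behaviour the PR attributes to `segment_for_delivery` (not the actual Rust code):

```typescript
// Raw LLM text as shipped in chat_done.full_response: trailing space and a
// triple newline between paragraphs (hypothetical example).
const fullResponse = "First paragraph. \n\n\nSecond paragraph.\n";

// Mirrors the segmenter behaviour described in the PR: split on paragraph
// breaks and trim each piece, which discards the original whitespace.
const segments = fullResponse
  .split(/\n{2,}/)
  .map((s) => s.trim())
  .filter((s) => s.length > 0);

// The old client-side check concatenated segments with no separator and
// compared byte-for-byte against full_response: a guaranteed mismatch here.
const reconstructed = segments.join("");
console.log(reconstructed === fullResponse); // false
```

Any trimmed whitespace or normalised separator makes the reconstruction diverge, so the equality gate could only ever pass on replies the segmenter left untouched.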
Changes
| File | Summary |
|---|---|
| `app/src/providers/ChatRuntimeProvider.tsx` | `hasCompleteSegmentDelivery`: drop reconstructed-string concatenation and `=== full_response` equality check; replace with count guard + 0..N-1 index-presence loop. Add 6-line comment explaining removal. |
| `app/src/providers/__tests__/ChatRuntimeProvider.test.tsx` | Rename and invert old "out-of-order" test → regression test asserting reconciliation does NOT fire. Add new 'reconciles when a segment is missing' test for genuine drop scenario. |
Actionable comments
[minor] ChatRuntimeProvider.tsx:136 — segments.size < expected guard is now redundant
The early-exit if (delivery.segments.size < expected) return false was load-bearing when the function also relied on the reconstructed string — it short-circuited an expensive equality check. Now that the body is a loop that returns false on the first missing index, this guard adds no new information. Consider labeling it as a fast-path optimization:
```ts
// Fast path: if fewer entries than expected, at least one index is absent.
if (delivery.segments.size < expected) return false;
```

[minor] ChatRuntimeProvider.tsx:771 — no debug log when segment delivery is complete (happy path)
The chat_done_segment_reconcile log fires on reconciliation. There is no corresponding log when completeSegmentDelivery === true. Before this fix, that branch was never hit for multi-segment turns; now it will be the dominant case. Per the project's debug-logging requirement, consider:
```ts
if (completeSegmentDelivery) {
  rtLog('chat_done_segment_complete', {
    thread: event.thread_id,
    request: event.request_id,
    segments: event.segment_total,
  });
}
```

[minor] Test: 'reconciles when a segment is missing' — intermediate assertion could be more specific
The await waitFor(() => expect(threadApi.appendMessage).toHaveBeenCalledTimes(1)) before onDone asserts count but not content. Making it content-aware would strengthen the test:
```ts
await waitFor(() =>
  expect(threadApi.appendMessage).toHaveBeenCalledWith(
    't-missing',
    expect.objectContaining({ content: 'Part one.', sender: 'agent' })
  )
);
```

Verified / looks good
- `segment_for_delivery` in `presentation.rs` confirms the PR description: it calls `trim()` on each paragraph during the split, so byte-equality was a guaranteed false-negative.
- `takeSegmentDelivery` removes the delivery from the map before `hasCompleteSegmentDelivery` is called — no double-fire risk.
- The `segment:${thread}:${request}:${segment_index}` dedupe key ensures `delivery.segments.size` can only reach `segment_total` when all distinct indices arrived exactly once. The count guard is therefore sufficient.
- No dynamic imports, no direct `import.meta.env`, no `window.__TAURI__` checks — all clean.
- Test data uses generic IDs (`t-trim`, `r-missing`) — no hardcoded real names/emails.
- Coverage gate passed. All 22 tests pass.
CI note
The only CI failure is "PR Submission Checklist" — 6 N/A items need [x] marking. Not a code issue.
Overall: clean, well-scoped fix with good test coverage. All comments are minor — nothing blocking merge.
Summary
Problem
When the server splits a long assistant reply into multiple bubbles via `presentation.rs::segment_for_delivery`, the client receives N `chat_segment` events followed by a `chat_done`. Each segment is persisted as its own message in `onSegment`. `onDone` then runs a "did all segments arrive?" check and, if not, falls back to appending `chat_done.full_response` as one more message.

That fallback check compared `reconstructed` — the received segments joined with no separator — for byte equality against `chat_done.full_response`. The server-side segmenter `.trim()`s each segment and joins paragraphs with a normalised `\n\n`, while `chat_done.full_response` ships the raw LLM text (leading/trailing whitespace, original separators). The strings basically never matched — so the fallback fired on every multi-segment turn and produced N segment bubbles + one duplicate full-text bubble. The duplicates were persisted to the backend via `threadApi.appendMessage`, so they stick around in thread history on reload.

The bug only fires when segmentation kicks in (response ≥ 80 chars, no code fences, not predominantly list/table content), so short replies and structured responses were unaffected — easy to miss in unit tests that used clean concatenable inputs.
Solution
Trust the count, not the content. Delivery is complete iff every expected `segment_index` arrived:

- `delivery.segments.size === segment_total`
- every index in `[0, segment_total)` is present in the Map

Per-segment dedup already runs at the cache key `segment:${thread}:${request}:${segment_index}`, so the Map can only reach size `segment_total` when all distinct indices have been seen exactly once. The content-equality check added nothing except a guaranteed false-negative because of the lossy server-side normalisation.

Reconciliation still fires when a `segment_index` is genuinely missing (socket drop / partial delivery), which is the case it was designed for — covered by a new test.

Added a leading comment to `hasCompleteSegmentDelivery` explaining why the equality check was removed, so it doesn't get reintroduced.

Submission Checklist
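The count + index-presence contract described above can be sketched as follows. The `SegmentDelivery` shape here is assumed for illustration; the real types live in `ChatRuntimeProvider.tsx`:

```typescript
// Assumed shape for illustration: segment_index -> segment text.
interface SegmentDelivery {
  segments: Map<number, string>;
}

function hasCompleteSegmentDelivery(
  delivery: SegmentDelivery,
  expected: number
): boolean {
  // Fast path: fewer entries than expected means at least one index is absent.
  if (delivery.segments.size < expected) return false;
  // Complete iff every index in [0, expected) is present.
  for (let i = 0; i < expected; i++) {
    if (!delivery.segments.has(i)) return false;
  }
  return true;
}

// A delivery missing index 1 is incomplete; a full 0..N-1 set is complete.
const missing: SegmentDelivery = { segments: new Map([[0, "a"], [2, "c"]]) };
const full: SegmentDelivery = { segments: new Map([[0, "a"], [1, "b"], [2, "c"]]) };
console.log(hasCompleteSegmentDelivery(missing, 3)); // false
console.log(hasCompleteSegmentDelivery(full, 3)); // true
```

Note the check never reads segment text, so server-side trimming and separator normalisation cannot cause a false-negative.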
- (`diff-cover`) meet the gate enforced by `.github/workflows/coverage.yml`. Run `pnpm test:coverage` and `pnpm test:rust` locally; PRs below 80% on changed lines will not merge.
- `docs/TEST-COVERAGE-MATRIX.md`
- `docs/RELEASE-MANUAL-SMOKE.md` — existing "send a chat message" smoke covers it

Impact
`appendMessage` RPC + one fewer Redux dispatch per affected turn.

Related
`scripts/run-dev-win.sh` has two unrelated Windows build-tooling bugs (greedy-regex SDK detection, leaked `VSINSTALLDIR` causing `Generator Ninja does not support instance specification` on fresh worktrees) — separate PR.

AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
`debug/double-messages-chat` @ `37b9cd4667c98046ddb80cd95756f141f87ab243`

Validation Run
- `pnpm --filter openhuman-app format:check`
- `pnpm typecheck`
- `pnpm --filter openhuman-app test:unit src/providers/__tests__/ChatRuntimeProvider.test.tsx` — 22/22 passed

Validation Blocked
- command: N/A
- error: N/A
- impact: N/A

Behavior Changes
Parity Contract
- (`segment_index` never arrives) is unchanged — `onDone` still appends `full_response` in that case. Covered by the new 'reconciles when a segment is missing' test.
- `!event.segment_total` branch (single-bubble path) untouched; `onSegment` per-segment dedup untouched; segment-delivery TTL/eviction untouched.

Duplicate / Superseded PR Handling
🤖 Generated with Claude Code