
fix(chat): stop duplicating assistant replies on multi-segment turns #1648

Merged
graycyrus merged 1 commit into tinyhumansai:main from sanil-23:debug/double-messages-chat
May 13, 2026

Conversation

Contributor

@sanil-23 sanil-23 commented May 13, 2026

Summary

  • Stop duplicating every multi-paragraph assistant reply in the chat UI.
  • Replace a content-equality check in segment reconciliation with a count + index-presence check.
  • Update the "out-of-order full_response" unit test (which had asserted the buggy behaviour) and add a regression test for a genuinely missing segment.

Problem

When the server splits a long assistant reply into multiple bubbles via presentation.rs::segment_for_delivery, the client receives N chat_segment events followed by a chat_done. Each segment is persisted as its own message in onSegment. onDone then runs a "did all segments arrive?" check and, if not, falls back to appending chat_done.full_response as one more message.
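
For reference, a rough sketch of the event shapes involved; the field names (segment_index, segment_total, full_response) come from this PR, but the interfaces themselves are illustrative, not the project's actual type definitions:

// Illustrative event shapes only; not copied from the codebase.
interface ChatSegmentEvent {
  thread_id: string;
  request_id: string;
  segment_index: number;  // 0-based index of this bubble
  segment_total: number;  // number of bubbles expected for the turn
  content: string;        // trimmed, normalised paragraph text (assumed field name)
}

interface ChatDoneEvent {
  thread_id: string;
  request_id: string;
  segment_total: number;
  full_response: string;  // raw, untrimmed LLM text
}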

That fallback check used:

return reconstructed === event.full_response;

where reconstructed is the received segments joined with no separator. The server-side segmenter .trim()s each segment and joins paragraphs with a normalised \n\n, while chat_done.full_response ships the raw LLM text (leading/trailing whitespace, original separators). The strings almost never matched, so the fallback fired on every multi-segment turn and produced N segment bubbles plus one duplicate full-text bubble. The duplicates were persisted to the backend via threadApi.appendMessage, so they survive in thread history across reloads.

The bug only fires when segmentation kicks in (response ≥ 80 chars, no code fences, not predominantly list/table content), so short replies and structured responses were unaffected — easy to miss in unit tests that used clean concatenable inputs.
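
To make the mismatch concrete, a hypothetical example (the strings are invented; only the trim-and-join behaviour is taken from the description above):

// Server-side segmenter delivers trimmed paragraphs; full_response keeps the raw text.
const segments = ['First paragraph.', 'Second paragraph.'];          // as received
const reconstructed = segments.join('');                             // joined with no separator
const fullResponse = ' First paragraph.\n\n\nSecond paragraph.\n';   // raw LLM text
console.log(reconstructed === fullResponse);                         // false, so the fallback fired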

Solution

Trust the count, not the content. Delivery is complete iff every expected segment_index arrived:

  • delivery.segments.size === segment_total
  • every index in [0, segment_total) is present in the Map

Per-segment dedup already runs at the cache key segment:${thread}:${request}:${segment_index}, so the Map can only reach size segment_total when all distinct indices have been seen exactly once. The content-equality check added nothing except a guaranteed false-negative because of the lossy server-side normalisation.
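
A minimal sketch of the resulting check, assuming the delivery holds received segments in a Map keyed by segment_index (the exact signature in ChatRuntimeProvider.tsx may differ):

function hasCompleteSegmentDelivery(
  delivery: { segments: Map<number, string> },
  segmentTotal: number,
): boolean {
  // Byte-comparing joined segments against full_response is unreliable:
  // the server trims segments and normalises separators to "\n\n".
  // Fast path: fewer entries than expected means at least one index is absent.
  if (delivery.segments.size < segmentTotal) return false;
  // Delivery is complete iff every index in [0, segment_total) arrived.
  for (let index = 0; index < segmentTotal; index++) {
    if (!delivery.segments.has(index)) return false;
  }
  return true;
}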

Reconciliation still fires when a segment_index is genuinely missing (socket drop / partial delivery), which is the case it was designed for — covered by a new test.

Added a leading comment to hasCompleteSegmentDelivery explaining why the equality check was removed, so it doesn't get reintroduced.

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • Diff coverage ≥ 80% — changed lines (Vitest + cargo-llvm-cov merged via diff-cover) meet the gate enforced by .github/workflows/coverage.yml. Run pnpm test:coverage and pnpm test:rust locally; PRs below 80% on changed lines will not merge.
  • N/A: behaviour-only change — no new/removed/renamed feature rows for docs/TEST-COVERAGE-MATRIX.md
  • N/A: behaviour-only change, no matrix feature IDs affected
  • No new external network dependencies introduced (mock backend used per Testing Strategy)
  • N/A: touches chat-runtime client glue, no new surface in docs/RELEASE-MANUAL-SMOKE.md — existing "send a chat message" smoke covers it
  • N/A: no linked issue (regression caught during live triage)

Impact

  • Desktop: every multi-paragraph assistant reply previously produced one redundant bubble persisted to backend; now produces exactly N segment bubbles.
  • No protocol change — purely a client-side check tightening.
  • No migration: existing duplicate messages already persisted in user threads stay until manually deleted; future turns are clean.
  • Performance: one fewer appendMessage RPC + one fewer Redux dispatch per affected turn.
  • Backwards compatibility: the genuine "segment dropped on the wire" recovery path still works (covered by new test).

Related

  • Closes: no GitHub issue — regression found during live debugging session
  • Follow-up PR(s)/TODOs: scripts/run-dev-win.sh has two unrelated Windows build-tooling bugs (greedy-regex SDK detection, and a leaked VSINSTALLDIR causing "Generator Ninja does not support instance specification" on fresh worktrees); to be fixed in a separate PR.

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: debug/double-messages-chat
  • Commit SHA: 37b9cd4667c98046ddb80cd95756f141f87ab243

Validation Run

  • pnpm --filter openhuman-app format:check
  • pnpm typecheck
  • Focused tests: pnpm --filter openhuman-app test:unit src/providers/__tests__/ChatRuntimeProvider.test.tsx — 22/22 passed
  • N/A: no Rust changes
  • N/A: no Tauri shell changes

Validation Blocked

  • command: N/A
  • error: N/A
  • impact: N/A

Behavior Changes

  • Intended behavior change: multi-segment assistant replies no longer produce a duplicate full-response bubble.
  • User-visible effect: in the chat list, an assistant reply that previously rendered as e.g. 2 segment bubbles followed by a 3rd duplicate-content bubble now renders as exactly 2 bubbles.

Parity Contract

  • Legacy behavior preserved: the genuine missing-segment recovery path (when a segment_index never arrives) is unchanged — onDone still appends full_response in that case. Covered by the new 'reconciles when a segment is missing' test.
  • Guard/fallback/dispatch parity checks: !event.segment_total branch (single-bubble path) untouched; onSegment per-segment dedup untouched; segment-delivery TTL/eviction untouched. The branch structure is sketched below.
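
A rough sketch of the onDone branch structure these parity claims refer to; takeSegmentDelivery and hasCompleteSegmentDelivery are named elsewhere in this PR, while appendFullResponse and the event/delivery shapes are placeholders for illustration:

// Placeholder declarations for the sketch; real implementations live in ChatRuntimeProvider.tsx.
declare function takeSegmentDelivery(
  threadId: string,
  requestId: string,
): { segments: Map<number, string> } | undefined;
declare function hasCompleteSegmentDelivery(
  delivery: { segments: Map<number, string> },
  segmentTotal: number,
): boolean;
declare function appendFullResponse(event: { full_response: string }): void;

function onDone(event: {
  thread_id: string;
  request_id: string;
  segment_total?: number;
  full_response: string;
}): void {
  if (!event.segment_total) {
    appendFullResponse(event); // single-bubble path: unchanged by this PR
    return;
  }
  // The delivery is removed from the map here, so the completeness check cannot double-fire.
  const delivery = takeSegmentDelivery(event.thread_id, event.request_id);
  if (delivery && hasCompleteSegmentDelivery(delivery, event.segment_total)) {
    return; // all segment bubbles were already persisted in onSegment
  }
  // Genuine missing-segment recovery (socket drop / partial delivery): unchanged.
  appendFullResponse(event);
}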

Duplicate / Superseded PR Handling

  • Duplicate PR(s): none
  • Canonical PR: N/A
  • Resolution (closed/superseded/updated): N/A

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Improved chat streaming completion detection to prevent unnecessary reconciliation when segment delivery is complete, even if response formatting differs.
  • Tests

    • Updated test coverage for streaming chat scenarios, including regression tests for segment delivery verification and reconciliation behavior.

Review Change Stack

The reconciliation path in ChatRuntimeProvider treated a segmented turn as
"incomplete" whenever the concatenation of received segments did not byte-
for-byte equal chat_done.full_response, then appended full_response as an
extra assistant message. That equality essentially never held in practice
— the server-side segmenter trims each segment and normalises paragraph
breaks to "\n\n" (presentation.rs::segment_for_delivery), while
chat_done.full_response ships the raw, untrimmed LLM text. So every
multi-paragraph reply produced N segment bubbles + one duplicate full-text
bubble.

Trust the count instead of content: delivery is complete iff every
expected segment_index arrived. Per-segment dedup (markChatEventSeen on
segment:thread:request:index) already guarantees Map size = expected only
when all distinct indices have been seen, so the count + index-presence
check is sufficient. The reconciliation path still fires when a
segment_index is genuinely missing, which is what it was meant to cover.

Updated the "out-of-order full_response" test (which asserted the buggy
content-equality behaviour) to assert the new contract, and added a
regression test that exercises a missing segment_index.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sanil-23 sanil-23 requested a review from a team May 13, 2026 14:27
Contributor

coderabbitai Bot commented May 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6212c8ad-cb88-4ef3-a2b7-2b239473ac43

📥 Commits

Reviewing files that changed from the base of the PR and between 9160317 and 37b9cd4.

📒 Files selected for processing (2)
  • app/src/providers/ChatRuntimeProvider.tsx
  • app/src/providers/__tests__/ChatRuntimeProvider.test.tsx

📝 Walkthrough


ChatRuntimeProvider's segment delivery completion check is refactored to verify that all expected segment indices (0 through segment_total - 1) have been received, rather than reconstructing the full response and byte-comparing it against event.full_response. Tests add a regression case and update assertions to confirm reconciliation is skipped when segments are complete, and to validate reconciliation behavior when segments are missing.

Changes

Segment Completion Verification

  • app/src/providers/ChatRuntimeProvider.tsx (segment index coverage verification): hasCompleteSegmentDelivery now checks that all expected segment_index values (0 to segment_total - 1) are present in the delivery, rather than reconstructing the full response and byte-comparing it to event.full_response. Comments explain that segment trimming and joiner normalization make byte-equality unreliable.
  • app/src/providers/__tests__/ChatRuntimeProvider.test.tsx (segment delivery and reconciliation tests): A regression test verifies reconciliation does not occur when all segments arrive, even if chat_done.full_response formatting differs. The missing-segment test is updated to ensure reconciliation occurs only after chat_done when segments are incomplete, producing the full joined message content with the agent sender.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • tinyhumansai/openhuman#1469: Both PRs modify ChatRuntimeProvider's segmented-response reconciliation logic and update tests to cover reconciliation behavior when all expected segments have arrived despite formatting differences.
  • tinyhumansai/openhuman#1051: Directly related refactor to stop byte-equality-based reconciliation in favor of segment-index coverage verification to handle joiner/trim formatting divergence in full_response.
  • tinyhumansai/openhuman#1261: Both PRs modify ChatRuntimeProvider segment completion logic and update tests to avoid unnecessary reconciliation based on segment presence verification.

Suggested reviewers

  • senamakel

Poem

🐰 A rabbit hops through segments bright,
No longer haunted by byte-equal plight!
Coverage counts, not string comparison's way,
Streaming flows smoothly, hooray hooray! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 33.33%, which is below the required 80.00% threshold. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title 'fix(chat): stop duplicating assistant replies on multi-segment turns' directly and clearly summarizes the main change: fixing duplicate assistant replies on multi-segment chat responses.
  • Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.


Contributor

@graycyrus graycyrus left a comment


PR Review — fix(chat): stop duplicating assistant replies on multi-segment turns

Walkthrough

This PR fixes a genuine product bug: every assistant reply long enough to trigger the Rust-side segmenter produced a spurious extra bubble in the chat UI. The root cause was a byte-equality check in hasCompleteSegmentDelivery that compared client-reconstructed segment text against full_response. Because segment_for_delivery in presentation.rs trims each segment and joins on \n\n while full_response preserves raw LLM whitespace, the strings almost never matched — so the reconciliation path appended full_response as a new message on nearly every multi-segment turn, and that message was persisted to the backend via appendMessage, meaning duplicates survived page reload.

The fix is correct and minimal: replace the string-equality gate with a count + index-presence loop. The genuine missing-segment recovery (socket drop / partial delivery) is preserved and now has its own test.

Changes

  • app/src/providers/ChatRuntimeProvider.tsx: in hasCompleteSegmentDelivery, drop the reconstructed-string concatenation and the === full_response equality check; replace with a count guard plus a 0..N-1 index-presence loop. Add a 6-line comment explaining the removal.
  • app/src/providers/__tests__/ChatRuntimeProvider.test.tsx: rename and invert the old "out-of-order" test into a regression test asserting reconciliation does NOT fire. Add a new 'reconciles when a segment is missing' test for the genuine drop scenario.

Actionable comments

[minor] ChatRuntimeProvider.tsx:136 — segments.size < expected guard is now redundant

The early-exit if (delivery.segments.size < expected) return false was load-bearing when the function also relied on the reconstructed string — it short-circuited an expensive equality check. Now that the body is a loop that returns false on the first missing index, this guard adds no new information. Consider labeling it as a fast-path optimization:

// Fast path: if fewer entries than expected, at least one index is absent.
if (delivery.segments.size < expected) return false;

[minor] ChatRuntimeProvider.tsx:771 — no debug log when segment delivery is complete (happy path)

The chat_done_segment_reconcile log fires on reconciliation. There is no corresponding log when completeSegmentDelivery === true. Before this fix, that branch was never hit for multi-segment turns; now it will be the dominant case. Per the project's debug-logging requirement, consider:

if (completeSegmentDelivery) {
  rtLog('chat_done_segment_complete', {
    thread: event.thread_id,
    request: event.request_id,
    segments: event.segment_total,
  });
}

[minor] Test: 'reconciles when a segment is missing' — intermediate assertion could be more specific

The await waitFor(() => expect(threadApi.appendMessage).toHaveBeenCalledTimes(1)) before onDone asserts count but not content. Making it content-aware would strengthen the test:

await waitFor(() =>
  expect(threadApi.appendMessage).toHaveBeenCalledWith(
    't-missing',
    expect.objectContaining({ content: 'Part one.', sender: 'agent' })
  )
);

Verified / looks good

  • segment_for_delivery in presentation.rs confirms the PR description: it calls trim() on each paragraph during the split, so byte-equality was a guaranteed false-negative.
  • takeSegmentDelivery removes the delivery from the map before hasCompleteSegmentDelivery is called — no double-fire risk.
  • The segment:${thread}:${request}:${segment_index} dedupe key ensures delivery.segments.size can only reach segment_total when all distinct indices arrived exactly once. The count guard is therefore sufficient.
  • No dynamic imports, no direct import.meta.env, no window.__TAURI__ checks — all clean.
  • Test data uses generic IDs (t-trim, r-missing) — no hardcoded real names/emails.
  • Coverage gate passed. All 22 tests pass.

CI note

The only CI failure is "PR Submission Checklist" — 6 N/A items need [x] marking. Not a code issue.

Overall: clean, well-scoped fix with good test coverage. All comments are minor — nothing blocking merge.

@graycyrus graycyrus merged commit bddfbb1 into tinyhumansai:main May 13, 2026
18 of 20 checks passed