fix(agent): bound cached resume transcript by max_history_messages by YellowSnnowmann · Pull Request #2224 · tinyhumansai/openhuman

YellowSnnowmann · 2026-05-19T15:01:02Z

Summary

Cap the cached transcript prefix at config.max_history_messages whenever a resumed session is primed, so resume paths can no longer ship an unbounded message log to the provider on iteration 1.
Applied at both resume entry points: seed_resume_from_messages (cold-boot priming from caller-supplied messages) and try_load_session_transcript (transcript-file load).
A leading system message is preserved when present, followed by the last N-1 messages; otherwise the last N messages are kept (N = config.max_history_messages).
Added bound_cached_transcript_messages helper on Agent with three unit tests covering: system-prefix bound on seed, system-prefix bound on transcript load, and the no-system-message tail-keep branch.
Resume paths now emit a warn log when the bound actually trims, so over-long transcripts are visible in diagnostics.

Problem

Sentry issue OPENHUMAN-TAURI-QX — custom_openai 400 Bad Request: "Requested token count exceeds the model's maximum context length of 202752 tokens. You requested a total of 203783 tokens". 48 events, first seen 2026-05-15, last 2026-05-17, release openhuman@0.53.43.

Root cause: on resume, cached_transcript_messages is consumed verbatim on the first iteration of the agent loop (turn.rs:561) — only a single new-tail user message is appended. This path bypasses both trim_history and reduce_before_call, so a long persisted transcript blows straight through the model's context window on the very first provider call after a resume.

Solution

Introduce Agent::bound_cached_transcript_messages(Vec<ChatMessage>) -> Vec<ChatMessage> that caps the slice at config.max_history_messages (.max(1) floor). When the first message is system, it is preserved and the last max-1 messages are kept; otherwise the last max messages are kept.
Wire the helper at both resume entry points so the bound applies before the cached transcript is handed to the provider.
Diagnostics: warn-level log when bounding actually trims, plus the existing info log now reports the post-bound count so logs don't overstate what was primed.
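
For readers who want to see the shape of the bound, here is a minimal sketch of the rule described above. It is not the code from this PR: the real helper is a method on Agent that reads config.max_history_messages, whereas this version is a free function that takes the limit as a parameter, and the ChatMessage struct is a stand-in that assumes the role is a plain string equal to "system" for system messages.

```rust
/// Stand-in for the harness message type (role + content only).
/// The field types are an assumption made for this sketch.
#[derive(Clone, Debug, PartialEq)]
struct ChatMessage {
    role: String,
    content: String,
}

/// Caps `messages` at `max_history_messages` (with a floor of 1).
/// A leading system message is kept and followed by the last `max - 1`
/// messages; without one, the last `max` messages are kept.
fn bound_cached_transcript_messages(
    messages: Vec<ChatMessage>,
    max_history_messages: usize,
) -> Vec<ChatMessage> {
    let max = max_history_messages.max(1);
    if messages.len() <= max {
        // Already inside the window: nothing to trim.
        return messages;
    }

    let has_system = messages.first().map_or(false, |m| m.role == "system");
    if has_system {
        // len > max here, so tail_start >= 2 and the tail never
        // overlaps the system message at index 0.
        let tail_start = messages.len() - (max - 1);
        let mut bounded = Vec::with_capacity(max);
        bounded.push(messages[0].clone());
        bounded.extend_from_slice(&messages[tail_start..]);
        bounded
    } else {
        let tail_start = messages.len() - max;
        messages[tail_start..].to_vec()
    }
}
```

With a system message followed by ten user/assistant pairs and a limit of 5, this sketch keeps the system message plus the last two pairs, which matches the window the integration test in this PR is described as asserting.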

Design notes / tradeoffs:

Bound is by message count, mirroring the existing trim_history semantics — not by token count. A single oversized tool-result message can still in theory exceed context; the autocompactor (reduce_before_call) is what handles token-level pressure on subsequent iterations. A token-aware bound for the cached path is a sensible follow-up but was intentionally deferred to keep this change behavior-conservative and Sentry-targeted.
ChatMessage is role + content only (no structured tool_calls), so slicing in the middle is structurally safe at the wire level; same orphan-tool-reference risk that trim_history already carries.
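
The deferred token-aware bound could look roughly like the sketch below. Everything here is hypothetical and not part of this PR: the function name bound_by_token_budget, the token_budget parameter, and especially estimate_tokens, which uses a crude "about 4 bytes per token" guess rather than a real tokenizer. The ChatMessage struct is the same stand-in assumption (string role, string content).

```rust
/// Stand-in for the harness message type (role + content only).
#[derive(Clone, Debug, PartialEq)]
struct ChatMessage {
    role: String,
    content: String,
}

/// Hypothetical estimator: roughly 4 bytes per token, rounded up.
/// A real implementation would use the provider's tokenizer.
fn estimate_tokens(message: &ChatMessage) -> usize {
    (message.content.len() + 3) / 4
}

/// Hypothetical token-aware bound: keeps a leading system message, then
/// the longest run of most-recent messages whose estimated tokens still
/// fit in what is left of `token_budget`.
fn bound_by_token_budget(messages: Vec<ChatMessage>, token_budget: usize) -> Vec<ChatMessage> {
    let has_system = messages.first().map_or(false, |m| m.role == "system");
    let (head, rest) = if has_system {
        messages.split_at(1)
    } else {
        messages.split_at(0)
    };

    let head_cost: usize = head.iter().map(estimate_tokens).sum();
    let mut remaining = token_budget.saturating_sub(head_cost);

    // Walk from the newest message backwards until the budget runs out.
    let mut start = rest.len();
    for (idx, message) in rest.iter().enumerate().rev() {
        let cost = estimate_tokens(message);
        if cost > remaining {
            break;
        }
        remaining -= cost;
        start = idx;
    }

    let mut bounded = head.to_vec();
    bounded.extend_from_slice(&rest[start..]);
    bounded
}
```

Note one open design question this sketch leaves unresolved: if the newest message alone exceeds the budget, nothing from the tail is kept, so a real follow-up would need to decide whether to truncate that message or fail early.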

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
Diff coverage ≥ 80% — changed lines (Vitest + cargo-llvm-cov merged via diff-cover) meet the gate enforced by .github/workflows/coverage.yml. Run pnpm test:coverage and pnpm test:rust locally; PRs below 80% on changed lines will not merge.
Coverage matrix updated — added/removed/renamed feature rows in docs/TEST-COVERAGE-MATRIX.md reflect this change (or N/A: behaviour-only change) — N/A: behaviour-only change to an existing resume code path; no new feature surface.
All affected feature IDs from the matrix are listed in the PR description under ## Related — N/A: no matrix rows affected.
No new external network dependencies introduced (mock backend used per Testing Strategy)
Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md) — N/A: internal agent harness bound; no release-cut surface touched.
Linked issue closed via Closes #NNN in the ## Related section — N/A: Sentry issue OPENHUMAN-TAURI-QX, no GitHub issue tracking item.

Impact

Runtime/platform: Rust core only (src/openhuman/agent/harness/session/). Affects desktop (Tauri host) and openhuman-core CLI equally — both paths share the agent harness. No frontend, no mobile, no web surface touched.
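Performance: net positive. Resume requests with long transcripts are now far more likely to fit inside the context window instead of failing with a 400 (the bound is by message count, so a few very large messages can still overflow). The bounding cost is a single Vec slice/clone on resume entry, which is negligible.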
Security: none.
Migration / compatibility: no schema, RPC, or persistence changes. Cached transcripts on disk are unaffected — they're just truncated in memory at load time when oversized.
Behavioral change for resumed sessions: a resumed agent that previously sent a 200k+-token transcript (and had it rejected by the provider) will now send only the most recent max_history_messages messages of that transcript on iteration 1. Older context is dropped at the wire layer, mirroring what trim_history already does for in-process histories. A warn-level log is emitted when the bound actually trims.
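
The "log only when the bound actually trims" behavior can be sketched as a before/after count comparison around the bounding step. This is an illustrative assumption, not the PR's code: prime_with_bound is a hypothetical name, the bound is passed in as a closure so the example stays generic, and the warning is returned as a String where the real resume paths would call their logging macro.

```rust
/// Hypothetical call-site wrapper: applies `bound` to a loaded transcript
/// and returns the bounded messages plus a warning line if anything was
/// dropped. The post-bound length is what an info log should report.
fn prime_with_bound<M>(
    loaded: Vec<M>,
    bound: impl FnOnce(Vec<M>) -> Vec<M>,
) -> (Vec<M>, Option<String>) {
    let before = loaded.len();
    let bounded = bound(loaded);
    let after = bounded.len();

    // Only produce a warning when the bound actually removed messages.
    let warning = if after < before {
        Some(format!(
            "resume transcript trimmed to history window: before={} after={}",
            before, after
        ))
    } else {
        None
    };

    (bounded, warning)
}
```

Returning the warning instead of logging it directly keeps the sketch free of external crates and makes the trim condition easy to assert in a test.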

Summary by CodeRabbit

Bug Fixes
- Session transcript history is now properly limited to the configured maximum message count when resuming conversations. This ensures that long prior conversation histories don't unnecessarily consume memory or impact performance during session restoration.
Tests
- Added tests covering transcript history bounding and session resume behavior.

Added a new method to limit the number of cached transcript messages to the configured maximum while preserving the leading system message if present. Updated the resume logic to utilize this method, ensuring that the cached messages do not exceed the defined history window. Added logging to warn when messages are trimmed.

…essages Introduced a new test to verify that the transcript resume functionality correctly limits the number of cached messages to the configured maximum, ensuring the leading system message is preserved. This test checks the integrity of the resumed messages after persisting a session transcript with more messages than the limit.

Enhanced the test suite by adding two new tests to verify the behavior of the transcript message bounding functionality. The first test ensures that the history window limit is respected when resuming messages, while the second test checks that the cached messages retain the correct tail when exceeding the maximum limit. These tests help confirm the integrity of message handling in various scenarios.

coderabbitai · 2026-05-19T15:01:12Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 29437015-b149-439c-831a-029a3f4efcd1

📥 Commits

Reviewing files that changed from the base of the PR and between 1f98614 and 3cee419.

📒 Files selected for processing (4)

src/openhuman/agent/harness/session/runtime.rs
src/openhuman/agent/harness/session/tests.rs
src/openhuman/agent/harness/session/turn.rs
src/openhuman/agent/harness/session/turn_tests.rs

📝 Walkthrough

Walkthrough

This PR enforces history window limits on resumed cached transcripts. A new helper method bounds ChatMessage vectors to max_history_messages while preserving system messages. The helper is integrated into session resume and seed resume paths, each with logging and tests confirming tail-message preservation.

Changes

Transcript history window bounding

| Layer / File(s) | Summary |
| --- | --- |
| Bounded transcript cache helper: `src/openhuman/agent/harness/session/turn.rs`, `src/openhuman/agent/harness/session/tests.rs` | New `bound_cached_transcript_messages` method limits `ChatMessage` vectors to `max_history_messages`, preserving a leading system message and otherwise keeping only the most recent messages. Unit test validates behavior when input lacks system prefix. |
| Session transcript resume bounding: `src/openhuman/agent/harness/session/turn.rs`, `src/openhuman/agent/harness/session/turn_tests.rs` | `try_load_session_transcript` applies the bounding helper to loaded `session.messages` and logs when truncation occurs. Integration test persists an overlong transcript, resumes with `max_history_messages=5`, and asserts the cached window contains system message plus last two user/assistant pairs. |
| Seed resume context bounding: `src/openhuman/agent/harness/session/runtime.rs`, `src/openhuman/agent/harness/session/tests.rs` | `seed_resume_from_messages` bounds the prepared cached transcript before storage and logs with before/after counts when history is reduced. Unit test configures the history limit, seeds multiple prior turns, and asserts truncation preserves system message and tail turns only. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 When old transcripts seek to return,
We trim their tales at history's turn,
System whispers stay pristine and clear,
While recent echoes ring most dear,
Messages bounded by the cap,
No lengthy past can close that gap!

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: bounding cached resume transcripts by the max_history_messages configuration, which is the primary objective of the PR. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

CodeGhost21

Non-blocking review — fix is sound, scoped, and tested. Two observations worth surfacing:

1. Message-count bound vs. token-count root cause. The Sentry case is a 203,783 / 202,752 token overflow. With any non-trivial max_history_messages, a handful of large tool-result messages can still exceed the context window on iteration 1, because reduce_before_call only fires on iteration 2+. This change reduces the failure rate substantially but doesn't fully eliminate the class of bug. The PR description acknowledges this and defers a token-aware bound — worth filing a follow-up issue for that path if one doesn't already exist, so it doesn't get lost.

2. max_history_messages = 1 + system prefix is degenerate. With max=1 and a leading system message, bound_cached_transcript_messages returns [system] only — no user/assistant message at all, which most providers will reject with a different 400. The .max(1) floor protects against underflow but the implicit contract is really "max ≥ 2 when a system message is present." Nobody sets max=1 in practice, so this is theoretical, but a .max(2) floor in the system-prefix branch would make the helper self-consistent. Trivial change if desired.
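
A small sketch of the floor suggested in point 2, for illustration only. The function names effective_window and tail_messages_kept are hypothetical, and this shows the reviewer's proposed `.max(2)` variant for the system-prefix branch, not the behavior that was merged (which uses a `.max(1)` floor in both branches).

```rust
/// Hypothetical variant of the window computation: floor of 2 when a
/// leading system message is present, floor of 1 otherwise.
fn effective_window(max_history_messages: usize, has_leading_system: bool) -> usize {
    let floor = if has_leading_system { 2 } else { 1 };
    max_history_messages.max(floor)
}

/// Number of non-system tail messages that survive the bound.
/// With the floor above this is always at least 1.
fn tail_messages_kept(max_history_messages: usize, has_leading_system: bool) -> usize {
    let window = effective_window(max_history_messages, has_leading_system);
    if has_leading_system {
        window - 1
    } else {
        window
    }
}
```

Under this variant, max_history_messages = 1 with a system prefix keeps the system message and one tail message rather than the system message alone.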

LGTM otherwise — helper is well-placed, doc-commented for the non-obvious "why," and the symmetric application at both resume entry points is the right shape for the fix.

YellowSnnowmann added 3 commits May 19, 2026 20:25

YellowSnnowmann marked this pull request as ready for review May 19, 2026 15:52

YellowSnnowmann requested a review from a team May 19, 2026 15:52

coderabbitai Bot approved these changes May 19, 2026

CodeGhost21 self-requested a review May 19, 2026 18:08

CodeGhost21 reviewed May 19, 2026

senamakel merged commit 525d7c7 into tinyhumansai:main May 19, 2026
27 checks passed

coderabbitai Bot mentioned this pull request May 19, 2026

fix(agent): guard agent prompts against model max-context limit (#2074) #2100

Merged

12 tasks

CodeGhost21 pushed a commit to CodeGhost21/openhuman that referenced this pull request May 22, 2026

fix(agent): bound cached resume transcript by max_history_messages (t…

c5f59fd

…inyhumansai#2224)

senamakel merged 3 commits into tinyhumansai:main from YellowSnnowmann:fix/agent-resume-bound-cached-transcript