fix(core): sanitize orphaned tool_use/tool_result on history restore (#1360) by bug-ops · Pull Request #1362 · bug-ops/zeph

bug-ops · 2026-03-08T21:27:42Z

Summary

Cross-session history restore could produce invalid tool_use/tool_result sequences at history boundaries, causing Claude API 400 errors on session resume.
Add sanitize_tool_pairs(), a post-load sanitization step in load_history() that removes orphaned tool messages at both ends of the restored history.
Add 6 unit tests covering the boundary cases listed under Tests.

Root causes

RC-1: load_history_filtered() LIMIT clause can split a tool_use/tool_result pair at the boundary, leaving an orphaned tool_use as the last restored message.

RC-2: Session interruption (Ctrl+C, timeout, crash) between persisting the assistant tool_use message and the user tool_result message leaves an orphaned tool_use in SQLite. On next session restore this triggers a Claude API 400.
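To make the boundary problem concrete, here is a minimal sketch that models a restored history as a window over stored rows. The `Row` and `Boundary` types and the `window_boundaries` function are hypothetical illustrations, not part of zeph-core, and the sketch makes no claim about the ordering the real query in load_history_filtered() uses. It only shows that a window cut between a tool_use and its tool_result leaves an orphan at one end.

```rust
/// Hypothetical, simplified view of one stored message (not the zeph-core type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Row {
    Text,
    ToolUse,
    ToolResult,
}

/// Which ends of a restored window carry an unpaired tool message.
#[derive(Debug, PartialEq)]
struct Boundary {
    leading_orphan: bool,
    trailing_orphan: bool,
}

/// Takes `limit` rows starting at `start` and reports orphans at the edges.
fn window_boundaries(rows: &[Row], start: usize, limit: usize) -> Boundary {
    let end = (start + limit).min(rows.len());
    let start = start.min(end);
    let window = &rows[start..end];
    Boundary {
        // A window that starts on a ToolResult has lost the ToolUse before it.
        leading_orphan: window.first() == Some(&Row::ToolResult),
        // A window that ends on a ToolUse has lost the ToolResult after it.
        trailing_orphan: window.last() == Some(&Row::ToolUse),
    }
}
```

With the rows `[Text, ToolUse, ToolResult, Text]`, a window of two rows starting at index 0 ends on the ToolUse (the trailing orphan described in RC-1), while a window of two rows starting at index 2 begins on the ToolResult.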

Changes

crates/zeph-core/src/agent/persistence.rs: Add private sanitize_tool_pairs(messages: &mut Vec<Message>) -> usize. Called in load_history() after the loading loop on the just-loaded slice (split off from self.messages to exclude the system prompt). The function loops until stable, removing:
1. Trailing assistant messages that have ToolUse parts with no following user ToolResult
2. Leading user messages that have ToolResult parts with no preceding assistant ToolUse
Each removal logs a tracing::warn! with the affected tool IDs.
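The loop described above can be sketched roughly as follows. This is not the code from persistence.rs: the `Role`, `Part`, and `Message` types are simplified stand-ins assumed for illustration, the `tool_ids` helper is hypothetical, and `eprintln!` substitutes for `tracing::warn!` so the block has no external dependencies.

```rust
// Simplified stand-ins for the crate's message types (assumptions for this sketch).
#[derive(Debug, Clone, PartialEq)]
enum Role {
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
enum Part {
    Text(String),
    ToolUse { id: String },
    ToolResult { tool_use_id: String },
}

#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    parts: Vec<Part>,
}

impl Message {
    fn has_tool_use(&self) -> bool {
        self.parts.iter().any(|p| matches!(p, Part::ToolUse { .. }))
    }
    fn has_tool_result(&self) -> bool {
        self.parts.iter().any(|p| matches!(p, Part::ToolResult { .. }))
    }
}

/// Hypothetical helper: collects the tool IDs referenced by a message for logging.
fn tool_ids(m: &Message) -> Vec<&str> {
    m.parts
        .iter()
        .filter_map(|p| match p {
            Part::ToolUse { id } => Some(id.as_str()),
            Part::ToolResult { tool_use_id } => Some(tool_use_id.as_str()),
            Part::Text(_) => None,
        })
        .collect()
}

/// Removes orphaned tool messages at both ends and returns how many were removed.
fn sanitize_tool_pairs(messages: &mut Vec<Message>) -> usize {
    let mut removed = 0;
    loop {
        let mut changed = false;
        // A trailing assistant ToolUse has nothing after it, so no ToolResult answers it.
        if messages.last().is_some_and(|m| m.role == Role::Assistant && m.has_tool_use()) {
            if let Some(m) = messages.pop() {
                eprintln!("warn: removed trailing orphan tool_use {:?}", tool_ids(&m));
                removed += 1;
                changed = true;
            }
        }
        // A leading user ToolResult has nothing before it, so no ToolUse requested it.
        if messages.first().is_some_and(|m| m.role == Role::User && m.has_tool_result()) {
            let m = messages.remove(0);
            eprintln!("warn: removed leading orphan tool_result {:?}", tool_ids(&m));
            removed += 1;
            changed = true;
        }
        // Stable: neither end changed in this pass.
        if !changed {
            break;
        }
    }
    removed
}
```

Looping until neither end changes is what lets a single call handle several consecutive orphans as well as orphans at both ends. The sketch only inspects the first and last messages, which matches the scope stated in this PR; orphans in the middle of the history are left untouched.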

Tests

6 new unit tests in agent::persistence::tests:

Test	Scenario
`load_history_removes_trailing_orphan_tool_use`	Trailing orphan removed
`load_history_removes_leading_orphan_tool_result`	Leading orphan removed
`load_history_preserves_complete_tool_pairs`	Valid pair preserved
`load_history_handles_multiple_trailing_orphans`	Multiple consecutive orphans removed
`load_history_no_tool_messages_unchanged`	Plain messages pass through
`load_history_removes_both_leading_and_trailing_orphans`	Loop handles both ends in one call

Test plan

cargo +nightly fmt --check passes
cargo clippy --workspace --features full -- -D warnings passes
cargo nextest run --workspace --features full --lib --bins passes (4693 tests)
All 10 load_history tests pass
config_default_snapshot failure is pre-existing on main (confirmed), unrelated to this PR

Notes

No SQL queries modified
O(n) remove(0) for leading orphan removal is acceptable given history_limit bounds; can be optimized with VecDeque in a future pass (R-1)
Mid-sequence orphan detection (RC-4) deferred as separate low-severity issue (R-2)
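The VecDeque follow-up mentioned in R-1 might look something like the sketch below. It is an assumption about one possible shape, not a planned implementation: `Kind` is a hypothetical stand-in that collapses role and content into a single tag, and `trim_orphans` is an illustrative name.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a message: only the property that matters at the edges.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Text,
    ToolUse,    // assistant message carrying ToolUse parts
    ToolResult, // user message carrying ToolResult parts
}

/// Trims orphans from both ends with O(1) pops and returns the cleaned
/// history together with the number of removed messages.
fn trim_orphans(history: Vec<Kind>) -> (Vec<Kind>, usize) {
    // Vec -> VecDeque reuses the existing allocation.
    let mut dq: VecDeque<Kind> = history.into();
    let mut removed = 0;
    loop {
        let before = removed;
        if dq.back() == Some(&Kind::ToolUse) {
            dq.pop_back();
            removed += 1;
        }
        if dq.front() == Some(&Kind::ToolResult) {
            dq.pop_front(); // O(1), unlike Vec::remove(0)
            removed += 1;
        }
        if removed == before {
            break;
        }
    }
    // VecDeque -> Vec may shift the remaining elements once, but does not reallocate.
    (dq.into(), removed)
}
```

Compared with repeated `remove(0)`, which shifts the whole vector on every call, this moves the remaining elements at most once when converting back to a `Vec`. With a small history_limit the difference is unlikely to matter, which is consistent with deferring the change.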

Closes #1360

…1360)

Cross-session history restore could produce invalid tool_use/tool_result sequences at history boundaries, causing Claude API 400 errors.

Two root causes:
- RC-1: load_history_filtered() LIMIT clause can split a tool_use/tool_result pair at the boundary, leaving an orphaned tool_use as the last restored message.
- RC-2: Session interruption between persisting the assistant tool_use message and the user tool_result message leaves an orphaned tool_use in SQLite.

Add sanitize_tool_pairs() called in load_history() after the loading loop. The function loops until stable, removing:
1. Trailing assistant messages that have ToolUse parts with no following user message containing ToolResult parts.
2. Leading user messages that have ToolResult parts with no preceding assistant message containing ToolUse parts.

Each removal is logged via tracing::warn with the affected tool IDs.

Add 6 unit tests covering all cases: trailing orphan, leading orphan, complete pair preserved, multiple consecutive orphans, plain messages unchanged, and combined leading+trailing orphans in one history.

crates/zeph-core/src/agent/persistence.rs

bug-ops added 2 commits March 8, 2026 22:25

merge: sync with main, resolve CHANGELOG conflict

2c2e264

github-actions bot added the bug (Something isn't working), size/L (Large PR, 201-500 lines), documentation (Improvements or additions to documentation), rust (Rust code changes), and core (zeph-core crate) labels and removed the size/L label on Mar 8, 2026

bug-ops enabled auto-merge (squash) March 8, 2026 21:28

github-advanced-security bot found potential problems Mar 8, 2026

crates/zeph-core/src/agent/persistence.rs (dismissed)

bug-ops merged commit e356654 into main Mar 8, 2026
40 of 42 checks passed

bug-ops deleted the fix-1360-history-restore branch March 8, 2026 21:46
