Lazy watermark-based View cache on ConversationState#3335
Conversation
Refs: #3053 Alternative to the push-based callback approach in #3324. Instead of eagerly syncing the cached View via EventLog._on_append callbacks, the View is lazily brought up to date on read using an integer watermark that tracks how many events have been incorporated. Changes: - state.py: _view, _view_watermark, view property, rebuild_view() - agent.py: rebuild_view() in step/astep malformed-history handlers - local_conversation.py: rebuild_view() after fork event copy - condenser/base.py: docstring read-only warning Files NOT changed (unlike #3324): - event_store.py: no _on_append callback, no set_on_append() - local_conversation.py: _default_callback unchanged, close() unchanged, condense() unchanged — no lock-wrapping changes needed Co-authored-by: openhands <openhands@all-hands.dev>
Python API breakage checks — ✅ PASSEDResult: ✅ PASSED |
REST API breakage checks (OpenAPI) — ✅ PASSEDResult: ✅ PASSED |
Coverage Report •
|
||||||||||||||||||||||||||||||||||||||||||||||||||
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified the new SDK ConversationState.view cache behavior through real SDK usage; it works for incremental updates, condensation, persistence resume, fork, and malformed-history recovery.
Does this PR achieve its stated goal?
Yes. The PR set out to add a lazily-updated, watermark-maintained View cache on ConversationState plus rebuild hooks for cold load/fork/recovery; exercising the SDK directly showed the cache starts empty, catches up after real event appends, matches View.from_events, applies condensation, survives resume/fork, and is rebuilt before malformed-history condensation retry. No functional issues were found.
| Phase | Result |
|---|---|
| Environment Setup | ✅ uv run python created/used the project environment and imported openhands.sdk successfully. |
| CI Status | 🟡 Most checks green, including SDK/tests/pre-commit/API checks; image publish, PR review, and this QA workflow were still in progress when checked. |
| Functional Verification | ✅ Real SDK scripts exercised the changed behavior successfully; no test suite, linter, or formatter was run. |
Functional Verification
Test 1: ConversationState.view cache, condensation, resume, and rebuild
Step 1 — Establish baseline on origin/main:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache.py after checking out origin/main:
created_state=c88009b0-32a8-4123-8511-75e751db1308
AttributeError: 'ConversationState' object has no attribute 'view'
BASE_EXIT=1
This confirms the base branch does not expose the new cached view surface.
Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at 91d20cba325c9ed9af689e2af3a1e520cd45a883.
Step 3 — Re-run with the PR in place:
Ran the same SDK workflow:
initial_view_events=0
after_appends_ids=0c61...,f17b...,13fa...
view_matches_full_rebuild=True
after_condensation_ids=f17b...,13fa...
first_removed=True
condensation_request_flag=True
resumed_event_count=2
rebuild_replaced_view=True
RESULT=PASS
This shows the lazy cache is initialized, catches up after appends, stays equivalent to a full rebuild, applies condensation, restores on resume, and can be explicitly rebuilt.
Test 2: Conversation fork gets a rebuilt cached view
Step 1 — Baseline:
On origin/main, the cached state.view surface is absent as shown in Test 1, so there is no forked cached view to inspect.
Step 2 — Apply the PR's changes:
Used the PR branch and created a real local Conversation, appended a message event, then called convo.fork().
Step 3 — Verify fork behavior:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_fork_view.py:
source_view_events=1
fork_type=LocalConversation
fork_view_events=1
fork_contains_source_event=True
RESULT=PASS
This confirms a user-level forked conversation exposes a cached view containing the copied source event.
Test 3: Malformed-history recovery rebuilds a stale cached view
Step 1 — Baseline:
The base branch has no cached state.view; this recovery hook is specific to the PR's new cache.
Step 2 — Apply the PR's changes:
On the PR branch, I created a real Conversation with a custom LLM that raises LLMMalformedConversationHistoryError, deliberately corrupted the cached view, and invoked agent.step(...).
Step 3 — Verify recovery behavior:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_malformed_recovery.py:
LLM raised malformed conversation history error, triggering condensation retry with condensed history: qa malformed history
corrupted_cached_view_events=0
view_after_recovery_events=2
view_contains_original_event=True
condensation_request_emitted=True
RESULT=PASS
This shows Agent.step recovered the stale cache before emitting the CondensationRequest path, matching the PR's stated error-recovery goal.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified the SDK-facing lazy ConversationState.view cache behavior end-to-end via real SDK calls; the PR achieves its stated goal with no QA issues found.
Does this PR achieve its stated goal?
Yes. The PR sets out to add a lazily-updated, watermark-based View cache on ConversationState, plus rebuild behavior for cold load, fork, and malformed-history recovery. Running the SDK on the PR commit showed the new API exists and works for incremental reads, condensation projection, full rebuild parity, persistence resume, conversation fork, and malformed-history recovery (rebuild_calls: 1 before CondensationRequest).
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully and installed the uv workspace dependencies. |
| CI Status | 🟡 31 checks successful, 2 skipped, 2 automation checks still in progress at review time (pr-review, qa-changes). |
| Functional Verification | ✅ Real SDK operations passed on PR commit; baseline confirmed the new cache API was absent on origin/main. |
Functional Verification
Test 1: Establish baseline without the PR
Step 1 — Reproduce / establish baseline (without the fix):
Ran:
git checkout --quiet origin/main
OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache.py > /tmp/qa_base_final.raw 2>&1
grep '^{"event"' /tmp/qa_base_final.rawOutput:
{"event": "sdk_import_ok"}
{"event": "api_surface", "has_view": false, "has_rebuild_view": false}
{"event": "baseline_stop", "reason": "ConversationState view cache API is unavailable"}This confirms the baseline SDK does not expose the new ConversationState.view / rebuild_view() surface, so there is no lazy view cache to exercise before the PR.
Step 2 — Apply the PR's changes:
Checked out the PR commit:
git checkout --quiet cb9d2fd0143300e4b3b07691b26694b34a08aa76Step 3 — Re-run with the fix in place:
Ran:
OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache.py > /tmp/qa_pr_final.raw 2>&1
grep '^{"event"' /tmp/qa_pr_final.rawOutput:
{"event": "sdk_import_ok"}
{"event": "api_surface", "has_view": true, "has_rebuild_view": true}
{"event": "first_read", "view_len": 1, "matches_first": true}
{"event": "incremental_read", "same_view_instance": true, "view_len": 2}
{"event": "condensation_projection", "view_len": 2, "dropped_first": true, "kept_second": true, "kept_third": true}
{"event": "rebuild", "replaced_instance": true, "matches_from_events": true, "watermark_len": 4}
{"event": "resume", "view_len": 1, "same_event_id": true}
{"event": "fork", "source_view_len": 2, "fork_view_len": 2, "same_count": true}
{"event": "malformed_history_recovery", "rebuild_calls": 1, "emitted_condensation_request": true}This exercised the changed behavior as SDK users would: creating ConversationState, appending real events, reading state.view, applying Condensation, calling rebuild_view(), resuming persisted state, forking a Conversation, and driving Agent.step() through malformed-history recovery. The output shows the lazy cache catches up incrementally, condensation is reflected in the cached projection, rebuild matches View.from_events, cold resume and fork populate the view, and malformed-history recovery rebuilds before emitting a condensation request.
Issues Found
None.
This QA review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
Solid engineering. The pull-based watermark is the right call — it eliminates every synchronization concern from #3324 by simply not touching the view on the write path. The implementation is minimal (123 lines vs 557), the tests are comprehensive, and the additive-only surface means no behavior change until #3328 lands.
Two issues flagged below: one about thread-safety guarantees that should be documented before the hot path is wired up in the follow-up, and one about the stale-reference semantics of rebuild_view().
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Taste Rating: Acceptable, but the cache currently has ownership/concurrency holes that can corrupt state.
[RISK ASSESSMENT] 🟡 MEDIUM — this changes SDK conversation state / condenser cache semantics and could affect LLM history. Since there is no eval evidence in the PR, I’m leaving a COMMENT for human maintainer/eval follow-up rather than approval.
This review was generated by an AI agent (OpenHands) on behalf of the user.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26195669127
- Guard view property and rebuild_view() with a threading.Lock so concurrent readers cannot double-append into the cached View. - Document in rebuild_view() that previously returned View references are invalidated after the call. - Extend test_resume_path_rebuilds_view to verify the watermark is correctly positioned after resume by appending an event and checking it appears incrementally. Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
The pull-based watermark design is the right call — it eliminates all the synchronization complexity of #3324 by simply not touching the view on the write path. The implementation is minimal and clean (123 lines vs 557), the additive-only surface means no behavior change until #3328 lands, and the previous review's thread-safety concerns have been properly addressed with _view_lock.
Two new observations below: one maintenance-safety flag on the lock type, and a minor test style note.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.
all-hands-bot
left a comment
There was a problem hiding this comment.
⚠️ QA Report: PASS WITH ISSUES
Functional SDK verification passed for the lazy ConversationState.view cache, including append catch-up, condensation, rebuild, resume, fork, and malformed-history recovery; one CI job is currently failing.
Does this PR achieve its stated goal?
Yes. The PR set out to add a lazily-updated, watermark-based View cache on ConversationState as additive groundwork without changing normal conversation behavior. I verified the base branch lacks state.view/rebuild_view, then exercised the PR branch through real SDK scripts: state.view tracks appended events, applies condensation, survives persistence resume, is rebuilt on demand, is rebuilt for forks, and malformed-history recovery restores a corrupted cached view before emitting CondensationRequest.
| Phase | Result |
|---|---|
| Environment Setup | ✅ uv sync --frozen --dev completed successfully |
| CI Status | Consolidate Build Information is failing; qa-changes/pr-review are still pending |
| Functional Verification | ✅ Real SDK usage matched the PR's stated behavior |
Functional Verification
Test 1 — Base branch establishes the old behavior
Step 1 — Reproduce / establish baseline without the fix:
Ran git checkout main && uv run python - <<'PY' ... with a script that created a ConversationState, appended two MessageEvents, and attempted to use the new cache API:
baseline_manual_view_events= 2
baseline_has_state_view= False
baseline_has_rebuild_view= False
baseline_state_view_access= AttributeError
This shows the old SDK required rebuilding a View from events manually and did not expose the new cached state.view or rebuild_view() APIs.
Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at 01dd45a60839695b2ef1fc6d2a24ce25b503d6e4.
Step 3 — Re-run with the fix in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python - <<'PY' ... with a script that created ConversationState instances and used the new cache API with real SDK event appends:
cache.empty_view_events 0
cache.after_two_events 2
cache.repeated_read_same_instance True
cache.after_third_events 3
cache.condensation_removed_first True
cache.condensation_kept_others True
cache.rebuild_replaced_view True
cache.rebuild_matches_full_view True
cache.resumed_view_events 2
cache.resumed_after_append_events 3
This confirms the new cached view starts empty, lazily catches up after appends, repeated reads are idempotent, condensation affects the cached view, rebuild_view() re-derives the cache to match View.from_events, and cold resume initializes the cache so later appends continue working.
Test 2 — Conversation fork rebuilds the copied view
Ran the PR branch SDK script using Conversation(..., visualizer=None), send_message(...), and fork():
fork.source_view_events 2
fork.forked_view_events 2
fork.fork_after_append_events 3
fork.source_after_fork_append_events 2
This shows a forked conversation gets a rebuilt cached view matching the source history, and subsequent fork-only events update only the fork's cache without mutating the source conversation's cache.
Test 3 — Malformed-history recovery rebuilds before condensation retry
Ran the PR branch SDK script with a real Agent.step(...) path using an LLM subclass that raises LLMMalformedConversationHistoryError and a condenser that handles condensation requests. Before stepping, I intentionally removed one event from the cached view to simulate the cache corruption the recovery path is intended to repair:
malformed.corrupted_view_events_before_step 1
malformed.view_events_when_request_emitted 2
malformed.original_view_count 2
malformed.removed_event_restored True
malformed.emitted_condensation_request True
This confirms Agent.step() rebuilt the cached view back to the full event count before emitting CondensationRequest, and the malformed-history recovery path continued instead of raising.
Issues Found
- 🟡 CI:
Consolidate Build Informationis currently failing on the PR. I did not find functional SDK issues in manual QA, but this CI failure should be resolved or confirmed unrelated before merge.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Taste Rating: Acceptable — the pull-based cache is simpler than the previous push approach, but one cache/watermark race still needs tightening.
[RISK ASSESSMENT] 🟡 MEDIUM — this touches SDK conversation history/condenser semantics and could affect agent/eval behavior; I’m leaving a COMMENT for maintainer/eval follow-up rather than approval.
VERDICT: Worth merging after the bounded rebuild issue is fixed.
This review was generated by an AI agent (OpenHands) on behalf of the user.
Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:
- Add a
.agents/skills/custom-codereview-guide.mdfile to your branch (or edit it if one already exists) with the/codereviewtrigger and the context the reviewer is missing (e.g., "Security concerns about X do not apply here because Y"). See the customization docs for the required frontmatter format.- Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
- When your PR is merged, the guideline file goes through normal code review by repository maintainers.
Resolve with AI? Install the iterate skill in your agent and run
/iterateto automatically drive this PR through CI, review, and QA until it's merge-ready.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26233370659
- Switch _view_lock from threading.Lock to threading.RLock so future reentrant paths (e.g. ViewProperty touching state.view) cannot deadlock silently. - Snapshot events in rebuild_view() before building the View so the watermark is tied to exactly the events that were materialized, closing a theoretical TOCTOU race with concurrent appenders. - Move 'from unittest.mock import patch' to module-level in test_agent_context_window_condensation.py (was duplicated in two function bodies with no circular-import justification). Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified the SDK behavior introduced by this PR by running standalone ConversationState/LocalConversation exercises; the lazy view cache, resume rebuild, fork rebuild, and malformed-history recovery path behaved as claimed.
Does this PR achieve its stated goal?
Yes. The PR sets out to add a lazily-updated, watermark-based ConversationState.view cache plus rebuild hooks for cold load/fork/error recovery. On origin/main, the same SDK exercises show ConversationState.view/rebuild_view are absent; on lazy-view-watermark at 5a96980498f695448beafb7d6edad54ee2bc4206, appended events incrementally appeared in state.view, condensation state matched View.from_events(...), resume continued incrementally after rebuilding, forked conversations had matching cached views, and malformed-history recovery called rebuild_view() before emitting CondensationRequest.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully; no test suite, linter, formatter, or pre-commit run was performed. |
| CI Status | 🟡 GitHub checks showed 31 successful, 2 skipped, 2 pending (PR Review by OpenHands, QA Changes by OpenHands) at verification time. |
| Functional Verification | ✅ Standalone SDK scripts exercised the new public state cache API and affected conversation paths successfully. |
Functional Verification
Test 1: Lazy ConversationState.view cache, condensation, rebuild, and resume
Step 1 — Establish baseline without the PR:
Checked out origin/main and ran uv run python /tmp/qa_view_cache_exercise.py:
BASE_HEAD=8f406a88
state_view_property=missing
result=baseline lacks lazy view cache API
view_exit=2
This confirms the base branch does not expose the new lazy view cache API, so the PR is adding the behavior under review.
Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at 5a96980498f695448beafb7d6edad54ee2bc4206.
Step 3 — Re-run with the PR in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache_exercise.py:
initial_view_event_count=3
matches_full_rebuild=True
repeated_reads_same_object=True
condensation_removed_first=True
condensed_view_event_count=2
condensation_request_flag=True
rebuild_replaced_cached_instance=True
rebuild_matches_full_rebuild=True
resume_view_event_count=2
resume_incremental_event_count=3
result=lazy view cache behavior verified
view_exit=0
This shows appended events are reflected lazily, repeated reads reuse the cached instance, condensation semantics match a full View.from_events(...) rebuild, explicit rebuild replaces the cached view with a fully-derived view, and resumed persisted state continues incremental updates correctly.
Test 2: Malformed-history recovery rebuilds the cached view
Step 1 — Establish baseline without the PR:
On origin/main, ran uv run python /tmp/qa_malformed_recovery_exercise.py:
rebuild_view_method=missing
result=baseline cannot rebuild cached view during malformed-history recovery
malformed_exit=2
This confirms the base branch cannot perform the new recovery behavior because the rebuild API does not exist there.
Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at 5a96980498f695448beafb7d6edad54ee2bc4206.
Step 3 — Re-run with the PR in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_malformed_recovery_exercise.py with an SDK LLM subclass that raises LLMMalformedConversationHistoryError:
rebuild_view_calls=1
emitted_condensation_request=True
result=malformed-history recovery path verified
malformed_exit=0
This shows the real Agent.step(...) recovery path calls ConversationState.rebuild_view() and still emits the expected CondensationRequest.
Test 3: LocalConversation fork rebuild path
Step 1 — Establish baseline / affected path:
The PR description calls out fork creation as a cache rebuild path. I exercised the public LocalConversation.fork() flow with a conversation containing a user message.
Step 2 — Apply the PR's changes:
Ran on lazy-view-watermark at 5a96980498f695448beafb7d6edad54ee2bc4206.
Step 3 — Exercise the fork path:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_fork_view_cache_exercise.py:
source_view_event_count=2
fork_view_event_count=2
fork_view_matches_source=True
result=fork view cache behavior verified
This shows the forked conversation's cached view is populated and matches the source conversation after events are copied.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Taste Rating: Acceptable — the pull-based cache is much simpler, but there is one cache-consistency bug to fix before this is safe to build on.
[RISK ASSESSMENT] 🟡 MEDIUM — this touches conversation history/view state and condenser recovery semantics, so maintainer/eval judgment is still appropriate before merge.
This review was generated by an AI agent (OpenHands) on behalf of the user.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26252792628
all-hands-bot
left a comment
There was a problem hiding this comment.
The pull-based watermark is the right call — eliminating the push-based callback entirely removes every synchronization concern that plagued #3324. All 8 previous review threads are resolved in the current code, and the implementation is clean: the RLock, the snapshot approach in rebuild_view(), and the watermark-position-after-rebuild test are all solid. Two minor fresh observations below.
This review was generated by an AI agent (OpenHands) on behalf of the user via OpenHands Automation.
…ocstring - Advance _view_watermark after each successful append_event in the view property loop, so a mid-loop failure cannot leave already- appended events eligible for replay on the next read. - Document the implicit exception-safety of rebuild_view(): if View.from_events raises, the cache is left unchanged. - Decline instance-level patch.object suggestion: Pydantic's __setattr__ prevents setting arbitrary attributes on model instances, so class-level patching is the correct approach here. Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified the new ConversationState view cache through real SDK usage: lazy reads, condensation, persistence resume, rebuild recovery, malformed-history recovery, and conversation fork all behaved as described.
Does this PR achieve its stated goal?
Yes. The PR set out to add a lazily-updated, watermark-maintained View cache on ConversationState as an additive groundwork change, with rebuild paths for cold load, fork, and malformed-history recovery. I exercised those SDK paths directly: main has no state.view, while this PR exposes it and keeps it in sync across appended events, condensation, resume, explicit rebuild, malformed-history recovery, and Conversation.fork().
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully and installed the uv dev environment. |
| CI Status | unresolved-review-threads failing, pr-review/qa-changes pending, cleanup/artifact review checks skipped. |
| Functional Verification | ✅ Real SDK scripts returned ok: true for PR behavior; no test suite, linters, or type checks were run locally. |
Functional Verification
Test 1: ConversationState lazy view cache, condensation, resume, and rebuild
Step 1 — Establish baseline on main:
Ran git checkout main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_state_view_cache_quiet.py:
{
"ok": false,
"checks": {
"has_view_attr": false
},
"error_type": "AttributeError",
"error": "'ConversationState' object has no attribute 'view'"
}This shows the cached ConversationState.view API is not present on the base branch, so the PR is adding the intended new surface.
Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at 26711fff.
Step 3 — Re-run with the PR in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_state_view_cache_quiet.py:
{
"ok": true,
"checks": {
"has_view_attr": true,
"initial_view_count": 3,
"repeated_read_same_object": true,
"incremental_matches_from_events": true,
"condensation_removed_first": true,
"condensation_view_count": 2,
"condensation_request_flag": true,
"resume_view_count": 2,
"resume_incremental_count": 3,
"rebuild_replaces_view": true,
"rebuild_matches_from_events": true
}
}This confirms the new cache lazily catches up from appended events, repeated reads are idempotent, incremental state matches View.from_events, condensation is reflected, persisted resume initializes the cache, post-resume appends are tracked, and rebuild_view() replaces/re-derives the cache.
Test 2: Malformed-history recovery rebuilds the cached view before retry
Step 1 — Baseline context:
The base branch result above shows there is no cached state.view to rebuild. The PR-specific claim is that malformed-history recovery refreshes the new cache before emitting a condensation retry.
Step 2 — Apply the PR's changes:
Used lazy-view-watermark at 26711fff with a real Agent.step() call and a local LLM subclass that raises LLMMalformedConversationHistoryError.
Step 3 — Exercise recovery with the PR in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_agent_malformed_recovery_quiet.py:
{
"ok": true,
"checks": {
"condensation_request_emitted": true,
"cached_view_rebuilt": true
},
"seen_events": [
"CondensationRequest"
]
}This confirms the malformed-history path emits the expected retry request and invalidates/rebuilds the cached view before continuing.
Test 3: Conversation fork preserves and continues the cached view
Step 1 — Baseline context:
As in Test 1, main has no state.view, so forked cached-view behavior is additive in this PR.
Step 2 — Apply the PR's changes:
Used lazy-view-watermark at 26711fff and exercised Conversation.send_message() followed by Conversation.fork().
Step 3 — Exercise fork with the PR in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_conversation_fork_view_quiet.py:
{
"ok": true,
"checks": {
"source_view_had_message": true,
"fork_view_rebuilt_from_source": true,
"fork_view_tracks_new_message": true,
"fork_view_matches_full_rebuild": true
}
}This confirms a forked conversation receives a rebuilt view from copied events and continues tracking new events consistently with a full View.from_events rebuild.
Issues Found
None from functional QA.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Taste Rating: Acceptable, but there’s one cache-consistency hole in rebuild watermarking (inline).
[RISK ASSESSMENT] 🟡 MEDIUM — this touches conversation history/condenser state and could affect LLM message construction once state.view is wired into the hot path, so I’m leaving a COMMENT for maintainer/eval follow-up rather than approval.
This review was generated by an AI agent (OpenHands) on behalf of the user.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26254102500
Covers the gap where test_view_matches_full_rebuild only used MessageEvents. The new test asserts state.view == View.from_events on a mixed sequence that includes a Condensation event. Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Taste Rating: Acceptable — the pull-based design is clean, but one cold-load watermark consistency bug needs attention.
[RISK ASSESSMENT] 🟡 MEDIUM — this touches conversation history/view cache semantics and the future condenser hot path, so maintainer/eval judgment is still appropriate before merge.
VERDICT: Worth merging after the inline watermark bug is fixed.
This review was generated by an AI agent (OpenHands) on behalf of the user.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26297602298
The snapshot was added to guard a theoretical TOCTOU race between from_events iteration and watermark assignment, but rebuild_view holds the view lock and event appends happen on the same thread. Passing self._events directly is simpler and avoids the copy. Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
The PR delivers the claimed lazy watermark-based ConversationState.view cache: I exercised the SDK as a caller, verified incremental cache updates, persistence resume, fork behavior, and malformed-history recovery.
Does this PR achieve its stated goal?
Yes. On main, ConversationState has no cached view / rebuild_view API; on the PR commit, the same SDK operations expose the new API, update the view incrementally after appended events, apply condensation, rebuild with enforcement on demand, cold-load persisted state, and keep forked conversations' cached view in sync. I also triggered a real Agent.step() malformed-history path and observed rebuild_view() being called once before a CondensationRequest was emitted.
| Phase | Result |
|---|---|
| Environment Setup | ✅ uv run python created/used the project env and imported the SDK successfully; no test suite was run. |
| CI Status | ✅ Existing completed checks are green; the current PR review/QA automation checks were still in progress when checked. |
| Functional Verification | ✅ SDK scripts exercised the changed behavior end-to-end on base vs PR commit. |
Functional Verification
Test 1: ConversationState.view cache, condensation, rebuild, and persistence resume
Step 1 — Establish baseline without the PR:
Ran git switch --detach origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache_exercise.py:
STATE_VIEW_PRESENT=False
STATE_REBUILD_VIEW_PRESENT=False
BASELINE_RESULT=ConversationState has no cached view API to exercise
This confirms the base branch does not provide the new cached view surface.
Step 2 — Apply the PR's changes:
Checked out bea43ba2d905a6987d56f4efeb3b1dd2d2b540e7.
Step 3 — Re-run with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache_exercise.py:
STATE_VIEW_PRESENT=True
STATE_REBUILD_VIEW_PRESENT=True
INCREMENTAL_VIEW_COUNT=2
REPEATED_READ_SAME_INSTANCE=True
INCREMENTAL_ENFORCE_CALLS=0
CONDENSED_VIEW_COUNT=1
CONDENSED_CONTAINS_ALPHA=False
CONDENSED_CONTAINS_BETA=True
REBUILD_REPLACED_INSTANCE=True
REBUILD_ENFORCE_CALLS=1
REBUILD_VIEW_COUNT=1
RESUMED_VIEW_COUNT=1
RESUMED_AFTER_APPEND_VIEW_COUNT=2
This shows the PR adds the API, lazy reads catch up to appended events without invoking full property enforcement, condensation is reflected in the cached view, rebuild_view() replaces the cached instance and runs enforcement, and resumed persisted state has a usable cache/watermark for future appends.
Test 2: malformed-history recovery rebuilds the cache before requesting condensation
Step 1 — Establish baseline without the PR:
Ran git switch --detach origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_malformed_recovery_exercise.py:
STATE_REBUILD_VIEW_PRESENT=False
BASELINE_EVENTS=['CondensationRequest']
BASELINE_RESULT=Malformed-history recovery emits CondensationRequest but has no cached view to rebuild
This confirms the existing recovery path emits the condensation request, but there is no cached view API to refresh.
Step 2 — Apply the PR's changes:
Checked out bea43ba2d905a6987d56f4efeb3b1dd2d2b540e7.
Step 3 — Re-run with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_malformed_recovery_exercise.py:
STATE_REBUILD_VIEW_PRESENT=True
REBUILD_VIEW_CALLS=1
EVENTS=['CondensationRequest']
CONDENSATION_REQUEST_EMITTED=True
This shows a real Agent.step() call through a malformed-history LLM error invokes rebuild_view() once and still emits the expected CondensationRequest.
Test 3: conversation fork initializes and maintains the forked cached view
Step 1 — Establish baseline without the PR:
Ran git switch --detach origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_fork_view_exercise.py:
SOURCE_EVENTS=2
SOURCE_VIEW_PRESENT=False
FORK_EVENTS=2
FORK_VIEW_PRESENT=False
This confirms fork behavior existed, but without the cached view API.
Step 2 — Apply the PR's changes:
Checked out bea43ba2d905a6987d56f4efeb3b1dd2d2b540e7.
Step 3 — Re-run with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_fork_view_exercise.py:
SOURCE_EVENTS=2
SOURCE_VIEW_PRESENT=True
FORK_EVENTS=2
FORK_VIEW_PRESENT=True
FORK_VIEW_COUNT=2
FORK_VIEW_AFTER_APPEND_COUNT=3
This confirms a forked conversation starts with a rebuilt cached view matching copied events and continues to update after new fork-local user messages.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
Code Review
The pull-based watermark design is the right call — completely removing the push-based _on_append callback eliminates every synchronization concern from #3324, and the resulting diff is dramatically simpler (123 lines added vs 557). The core view property is elegant: O(k) incremental updates on reads, full enforcement only on the three well-defined cold paths (cold load, fork, error recovery), and the watermark update inside the loop is correctly defensive (if append_event raises on event i, the watermark stays at i so the next read retries rather than silently skipping). One suggestion below on the lock type; everything else looks solid.
This review was generated by an AI agent (OpenHands) on behalf of the reviewer, via OpenHands Automation.
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Taste Rating: Acceptable — the pull-based design is clean, but the bounded rebuild watermark issue below should be fixed before building on this cache.
[RISK ASSESSMENT] 🟡 MEDIUM — this touches SDK conversation history/condenser cache semantics and can affect future LLM message construction, so I’m leaving a COMMENT for maintainer/eval follow-up rather than approval.
This review was generated by an AI agent (OpenHands) on behalf of the user.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26302652293
all-hands-bot
left a comment
There was a problem hiding this comment.
Clean, well-motivated PR. The pull-based watermark pattern eliminates every synchronization concern from #3324 and cuts the diff from 557 → 123 lines. The implementation is correct, the test coverage is thorough (12 + 4 new tests), and the PR description is one of the clearest I've seen on a caching change. Three points worth addressing before #3328 lands (two are documentation gaps, one is a missing test case).
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
The PR’s lazy ConversationState.view cache works in real SDK usage across append, condensation, rebuild, persistence resume, fork, and malformed-history recovery paths.
Does this PR achieve its stated goal?
Yes. The base branch has no ConversationState.view, while the PR branch exposes a cached view that incrementally reflects appended events, applies condensation, survives cold resume with the watermark positioned correctly, and is rebuilt before malformed-history condensation recovery. I exercised these behaviors by importing and using the SDK directly with real ConversationState, Conversation, EventLog, View, and Agent.step() calls; no test suite, linter, or formatter was run.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully and installed the uv-managed environment. |
| CI Status | ✅ 31 checks passing; only this QA/pr-review automation currently pending, with expected skipped artifact cleanup jobs. |
| Functional Verification | ✅ SDK behavior matched the PR claims in before/after branch runs. |
Functional Verification
Test 1: ConversationState.view lazy cache, condensation, rebuild, resume, and fork
Step 1 — Establish baseline without the PR:
Ran on origin/main:
git switch --detach origin/main
uv run python -u /tmp/qa_state_view_cache.pyRelevant output:
initial_append: event_log_count=3
after_initial_append: state.view unavailable (AttributeError: 'ConversationState' object has no attribute 'view')
conversation_fork: state.view unavailable (AttributeError: 'ConversationState' object has no attribute 'view')
conversation_fork: view_available=False
This confirms the base branch has no cached view surface to maintain.
Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at bea43ba2d905a6987d56f4efeb3b1dd2d2b540e7.
Step 3 — Re-run with the PR in place:
Ran:
uv run python -u /tmp/qa_state_view_cache.pyRelevant output:
initial_append: event_log_count=3
after_initial_append: view_count=3 event_log_count=3 unhandled_condensation_request=False
after_initial_append: view_event_texts=['first', 'second', 'third']
repeated_read_same_instance=True
after_condensation: view_count=2 event_log_count=4 unhandled_condensation_request=False
after_condensation: view_event_texts=['second', 'third']
after_condensation_request: view_count=2 event_log_count=5 unhandled_condensation_request=True
after_rebuild: rebuilt_matches_fresh=True rebuilt_count=2 fresh_count=2
after_resume: view_count=2 event_log_count=5 unhandled_condensation_request=True
after_resume_append: view_count=3 event_log_count=6 unhandled_condensation_request=True
conversation_fork: view_count=2 event_log_count=2 unhandled_condensation_request=False
conversation_fork: view_event_texts=['SystemPromptEvent', 'fork seed']
conversation_fork: view_available=True
This shows the new state.view catches up to appended events, repeated reads return the same cached instance when unchanged, condensation removes forgotten events, condensation requests update the view flag, rebuild_view() matches a fresh View.from_events(...), resumed persisted state continues incrementally after cold-load rebuild, and a forked conversation has a usable rebuilt view.
Test 2: Malformed-history recovery rebuilds the cached view before retry
Step 1 — Establish baseline without the PR:
Ran on origin/main:
git switch --detach origin/main
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery.pyRelevant output:
rebuild_view_available=False
rebuild_view_call_count=unavailable
condensation_request_seen=True
events_emitted=['CondensationRequest']
This confirms the old branch can emit the condensation retry but has no cached-view rebuild hook.
Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at bea43ba2d905a6987d56f4efeb3b1dd2d2b540e7.
Step 3 — Re-run with the PR in place:
Ran:
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery.pyRelevant output:
rebuild_view_available=True
rebuild_view_call_count=1
condensation_request_seen=True
events_emitted=['CondensationRequest']
This verifies the actual Agent.step() malformed-history path still emits CondensationRequest and now calls rebuild_view() exactly once before retry handling.
Issues Found
None.
This QA review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
The PR’s lazy ConversationState.view cache works in real SDK usage across append, condensation, rebuild, persistence resume, fork, and malformed-history recovery paths.
Does this PR achieve its stated goal?
Yes. The base branch has no ConversationState.view, while the PR branch exposes a cached view that incrementally reflects appended events, applies condensation, survives cold resume with the watermark positioned correctly, and is rebuilt before malformed-history condensation recovery. I exercised these behaviors by importing and using the SDK directly with real ConversationState, Conversation, EventLog, View, and Agent.step() calls; no test suite, linter, or formatter was run.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully and installed the uv-managed environment. |
| CI Status | ✅ 31 checks passing; only this QA/pr-review automation currently pending, with expected skipped artifact cleanup jobs. |
| Functional Verification | ✅ SDK behavior matched the PR claims in before/after branch runs. |
Functional Verification
Test 1: ConversationState.view lazy cache, condensation, rebuild, resume, and fork
Step 1 — Establish baseline without the PR:
Ran on origin/main:
git switch --detach origin/main
uv run python -u /tmp/qa_state_view_cache.pyRelevant output:
initial_append: event_log_count=3
after_initial_append: state.view unavailable (AttributeError: 'ConversationState' object has no attribute 'view')
conversation_fork: state.view unavailable (AttributeError: 'ConversationState' object has no attribute 'view')
conversation_fork: view_available=False
This confirms the base branch has no cached view surface to maintain.
Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at bea43ba2d905a6987d56f4efeb3b1dd2d2b540e7.
Step 3 — Re-run with the PR in place:
Ran:
uv run python -u /tmp/qa_state_view_cache.pyRelevant output:
initial_append: event_log_count=3
after_initial_append: view_count=3 event_log_count=3 unhandled_condensation_request=False
after_initial_append: view_event_texts=['first', 'second', 'third']
repeated_read_same_instance=True
after_condensation: view_count=2 event_log_count=4 unhandled_condensation_request=False
after_condensation: view_event_texts=['second', 'third']
after_condensation_request: view_count=2 event_log_count=5 unhandled_condensation_request=True
after_rebuild: rebuilt_matches_fresh=True rebuilt_count=2 fresh_count=2
after_resume: view_count=2 event_log_count=5 unhandled_condensation_request=True
after_resume_append: view_count=3 event_log_count=6 unhandled_condensation_request=True
conversation_fork: view_count=2 event_log_count=2 unhandled_condensation_request=False
conversation_fork: view_event_texts=['SystemPromptEvent', 'fork seed']
conversation_fork: view_available=True
This shows the new state.view catches up to appended events, repeated reads return the same cached instance when unchanged, condensation removes forgotten events, condensation requests update the view flag, rebuild_view() matches a fresh View.from_events(...), resumed persisted state continues incrementally after cold-load rebuild, and a forked conversation has a usable rebuilt view.
Test 2: Malformed-history recovery rebuilds the cached view before retry
Step 1 — Establish baseline without the PR:
Ran on origin/main:
git switch --detach origin/main
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery.pyRelevant output:
rebuild_view_available=False
rebuild_view_call_count=unavailable
condensation_request_seen=True
events_emitted=['CondensationRequest']
This confirms the old branch can emit the condensation retry but has no cached-view rebuild hook.
Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at bea43ba2d905a6987d56f4efeb3b1dd2d2b540e7.
Step 3 — Re-run with the PR in place:
Ran:
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery.pyRelevant output:
rebuild_view_available=True
rebuild_view_call_count=1
condensation_request_seen=True
events_emitted=['CondensationRequest']
This verifies the actual Agent.step() malformed-history path still emits CondensationRequest and now calls rebuild_view() exactly once before retry handling.
Issues Found
None.
This QA review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
🟢 Taste Rating: Good taste — the pull-based cache keeps the design simple, and I don't see fresh actionable code issues in the current diff.
[RISK ASSESSMENT] 🟡 MEDIUM — this touches SDK conversation View/condenser semantics and may affect future LLM message construction/eval behavior. I’m leaving a COMMENT rather than an approval so a human maintainer can decide after appropriate eval validation.
VERDICT: No code changes requested from this pass.
This review was generated by an AI agent (OpenHands) on behalf of the user.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26304956030
- view property: catch append_event failures and fall back to a full rebuild via View.from_events, so callers never see a stuck view. - view property docstring: cross-reference rebuild_view() invalidation (relevant once #3328 lands and callers hold view references across condensation cycles). - Add test_rebuild_view_leaves_cache_unchanged_on_error: verifies the documented contract that a failed rebuild preserves the old cache. Co-authored-by: openhands <openhands@all-hands.dev>
|
Note to the review bot: this PR changes how views are maintained, but the semantics by which they're constructed/used has not. Purely a performance change. If the tests are all still passing that is sufficient evidence. |
all-hands-bot
left a comment
There was a problem hiding this comment.
The pull-based watermark is the right architectural choice, and the implementation is clean. All 21 prior review threads are resolved; a fresh pass over the current diff turns up two minor points worth noting (both non-blocking).
This review was generated by an AI agent (OpenHands) on behalf of the reviewer via OpenHands Automation.
Without this, the bare except silently swallows the root cause, making unexpected append_event failures undiagnosable in production. Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
The PR delivers the lazy watermark-based ConversationState.view cache behavior described, with persistence/fork rebuilds and sync/async malformed-history recovery verified by running SDK scripts against real SDK objects.
Does this PR achieve its stated goal?
Yes. On origin/main, ConversationState.view is absent; on the PR branch, reading state.view after appending events lazily exposes the expected events, matches a full View.from_events(...) rebuild after condensation, and rebuild_view() refreshes the cached instance. I also verified cold resume, conversation fork, and both Agent.step()/Agent.astep() malformed-history recovery restore a deliberately stale cached view before emitting CondensationRequest.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed and installed the uv dev environment. |
| CI Status | unresolved-review-threads was failing. |
| Functional Verification | ✅ SDK scripts exercised the changed behavior directly; no test suite, linter, formatter, or pre-commit run was used. |
Functional Verification
Test 1: Base branch lacks the new view cache; PR branch provides lazy incremental view behavior
Step 1 — Establish baseline without the fix:
Ran:
git switch --detach origin/main
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_view.py
# then switched back to lazy-view-watermarkRelevant output:
ConversationState.view present: False
Reading state.view result: AttributeError: 'ConversationState' object has no attribute 'view'
base_script_exit=0
This confirms the base branch did not expose the new ConversationState.view surface.
Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at f550790d58668265101797a00baa4293cdd198bc.
Step 3 — Re-run with the fix in place:
Ran:
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_view.pyRelevant output:
ConversationState.view present: True
Incremental view event count after 3 appends: 3
Incremental view texts: message-0,message-1,message-2
View matches full rebuild after condensation: True
First message removed by condensation: True
rebuild_view replaces cached instance: True
View count after rebuild: 2
This shows the cache lazily catches up after appends, applies condensation consistently with a full rebuild, and rebuild_view() refreshes the cached projection.
Test 2: Cold resume and conversation fork rebuild the cached view
Step 1 — Baseline:
Before this PR there is no state.view API to verify on resumed/forked state, as shown in Test 1.
Step 2 — Run PR behavior:
Ran:
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_resume_fork.pyRelevant output:
Resumed state view event count: 1
Resumed first event preserved: True
Resumed view event count after append: 2
Fork source view count: 2
Forked view count: 2
Forked event IDs match source: True
resume_fork_exit=0
This shows persisted conversations cold-load the view correctly and continue incrementally updating after resume; forked conversations also receive a rebuilt view matching the source event IDs.
Test 3: Malformed-history recovery rebuilds stale cached views before condensation retry
Step 1 — Baseline:
The base branch lacks state.view, so the cached-view recovery behavior does not exist there.
Step 2 — Run PR sync recovery behavior:
Ran:
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery.pyRelevant output:
Cached view count after deliberate stale mutation: 0
LLM raised malformed conversation history error, triggering condensation retry with condensed history: qa malformed history
Original event-log-derived count before mutation: 2
Cached view count after malformed-history step: 2
CondensationRequest emitted: True
malformed_exit=0
This shows Agent.step() recovered the cached view from the event log and emitted the condensation retry request.
Step 3 — Run PR async recovery behavior:
Ran:
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery_async.pyRelevant output:
Async cached view count after stale mutation: 0
LLM raised malformed conversation history error, triggering condensation retry with condensed history: qa malformed history
Async original event-log-derived count: 2
Async cached view count after malformed-history astep: 2
Async CondensationRequest emitted: True
async_malformed_exit=0
This confirms async parity for Agent.astep().
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
The current PR head delivers the lazy watermark-based ConversationState.view cache behavior described, verified through real SDK usage rather than the test suite.
Does this PR achieve its stated goal?
Yes. The PR aims to add a pull-based, lazily updated View cache to ConversationState, with rebuilds on cold load/fork/error recovery. I verified origin/main lacks ConversationState.view, then on the current PR head (86d75dfa) verified lazy event catch-up, condensation parity with View.from_events(...), explicit rebuild behavior, persistence resume, conversation fork, and both sync/async malformed-history recovery restoring a deliberately stale cached view before emitting CondensationRequest.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed and installed the uv dev environment. |
| CI Status | unresolved-review-threads was failing. |
| Functional Verification | ✅ SDK scripts exercised the changed behavior directly; I did not run pytest, linters, formatters, type checkers, or pre-commit. |
Functional Verification
Test 1: Base branch lacks the new view cache; PR branch provides lazy incremental behavior
Step 1 — Establish baseline without the fix:
Ran:
git switch --detach origin/main
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_view.py
# switched back to lazy-view-watermark afterwardRelevant output:
ConversationState.view present: False
Reading state.view result: AttributeError: 'ConversationState' object has no attribute 'view'
base_script_exit=0
This confirms the base branch did not expose the new ConversationState.view surface.
Step 2 — Apply the PR's changes:
Fetched and checked out current PR head 86d75dfa653c55dbc7ca7133abbb985734f5c9ba.
Step 3 — Re-run with the fix in place:
Ran:
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_view.pyRelevant output:
ConversationState.view present: True
Incremental view event count after 3 appends: 3
Incremental view texts: message-0,message-1,message-2
View matches full rebuild after condensation: True
First message removed by condensation: True
rebuild_view replaces cached instance: True
View count after rebuild: 2
This shows the cache lazily catches up after event appends, preserves condensation semantics, and rebuild_view() refreshes the cached projection.
Test 2: Cold resume and conversation fork rebuild the cached view
Step 1 — Baseline:
Before this PR there is no state.view API to verify on resumed/forked state, as shown in Test 1.
Step 2 — Run PR behavior:
Ran:
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_resume_fork.pyRelevant output:
Resumed state view event count: 1
Resumed first event preserved: True
Resumed view event count after append: 2
Fork source view count: 2
Forked view count: 2
Forked event IDs match source: True
This shows persisted conversations cold-load the view correctly and continue incrementally updating after resume; forked conversations also receive a rebuilt view matching the source event IDs.
Test 3: Malformed-history recovery rebuilds stale cached views before condensation retry
Step 1 — Baseline:
The base branch lacks state.view, so the cache-specific recovery behavior does not exist there.
Step 2 — Run PR sync recovery behavior:
Ran:
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery.pyRelevant output:
Cached view count after deliberate stale mutation: 0
Original event-log-derived count before mutation: 2
Cached view count after malformed-history step: 2
CondensationRequest emitted: True
This shows Agent.step() recovered the cached view from the event log and emitted the condensation retry request.
Step 3 — Run PR async recovery behavior:
Ran:
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery_async.pyRelevant output:
Async cached view count after stale mutation: 0
Async original event-log-derived count: 2
Async cached view count after malformed-history astep: 2
Async CondensationRequest emitted: True
This confirms async parity for Agent.astep().
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
The current PR head delivers the lazy watermark-based ConversationState.view cache behavior described, verified through real SDK usage rather than the test suite.
Does this PR achieve its stated goal?
Yes. The PR aims to add a pull-based, lazily updated View cache to ConversationState, with rebuilds on cold load/fork/error recovery. I verified origin/main lacks ConversationState.view, then on the current PR head (86d75dfa) verified lazy event catch-up, condensation parity with View.from_events(...), explicit rebuild behavior, persistence resume, conversation fork, and both sync/async malformed-history recovery restoring a deliberately stale cached view before emitting CondensationRequest.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed and installed the uv dev environment. |
| CI Status | unresolved-review-threads was failing. |
| Functional Verification | ✅ SDK scripts exercised the changed behavior directly; I did not run pytest, linters, formatters, type checkers, or pre-commit. |
Functional Verification
Test 1: Base branch lacks the new view cache; PR branch provides lazy incremental behavior
Step 1 — Establish baseline without the fix:
Ran:
git switch --detach origin/main
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_view.py
# switched back to lazy-view-watermark afterwardRelevant output:
ConversationState.view present: False
Reading state.view result: AttributeError: 'ConversationState' object has no attribute 'view'
base_script_exit=0
This confirms the base branch did not expose the new ConversationState.view surface.
Step 2 — Apply the PR's changes:
Fetched and checked out current PR head 86d75dfa653c55dbc7ca7133abbb985734f5c9ba.
Step 3 — Re-run with the fix in place:
Ran:
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_view.pyRelevant output:
ConversationState.view present: True
Incremental view event count after 3 appends: 3
Incremental view texts: message-0,message-1,message-2
View matches full rebuild after condensation: True
First message removed by condensation: True
rebuild_view replaces cached instance: True
View count after rebuild: 2
This shows the cache lazily catches up after event appends, preserves condensation semantics, and rebuild_view() refreshes the cached projection.
Test 2: Cold resume and conversation fork rebuild the cached view
Step 1 — Baseline:
Before this PR there is no state.view API to verify on resumed/forked state, as shown in Test 1.
Step 2 — Run PR behavior:
Ran:
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_resume_fork.pyRelevant output:
Resumed state view event count: 1
Resumed first event preserved: True
Resumed view event count after append: 2
Fork source view count: 2
Forked view count: 2
Forked event IDs match source: True
This shows persisted conversations cold-load the view correctly and continue incrementally updating after resume; forked conversations also receive a rebuilt view matching the source event IDs.
Test 3: Malformed-history recovery rebuilds stale cached views before condensation retry
Step 1 — Baseline:
The base branch lacks state.view, so the cache-specific recovery behavior does not exist there.
Step 2 — Run PR sync recovery behavior:
Ran:
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery.pyRelevant output:
Cached view count after deliberate stale mutation: 0
Original event-log-derived count before mutation: 2
Cached view count after malformed-history step: 2
CondensationRequest emitted: True
This shows Agent.step() recovered the cached view from the event log and emitted the condensation retry request.
Step 3 — Run PR async recovery behavior:
Ran:
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery_async.pyRelevant output:
Async cached view count after stale mutation: 0
Async original event-log-derived count: 2
Async cached view count after malformed-history astep: 2
Async CondensationRequest emitted: True
This confirms async parity for Agent.astep().
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
Review summary for the botWe have been through 7 review rounds with 24 total threads. Here is where things stand, for context on the next review pass. Accepted & implemented (14 threads, 58%)All of these improved the code and are reflected in the current diff:
Declined with rationale (5 threads, 21%)These were evaluated and declined for documented reasons:
Circular / contradictory (5 threads, 21%)These were re-raises of previously addressed or declined feedback:
Current state
The implementation is stable. We would appreciate an approval if there are no further substantive concerns. This comment was generated by an AI agent (OpenHands) on behalf of the PR author. |
all-hands-bot
left a comment
There was a problem hiding this comment.
🟢 Taste Rating: Good taste — the pull-based cache keeps the design simple, and I don’t see fresh actionable code issues in the current diff.
[RISK ASSESSMENT] 🟡 MEDIUM — this touches SDK conversation View/condenser history semantics and could affect future LLM message construction/eval behavior once wired into the hot path. I’m leaving a COMMENT rather than approval so a human maintainer can decide after appropriate eval validation.
VERDICT: No code changes requested from this pass.
This review was generated by an AI agent (OpenHands) on behalf of the user.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26305908143
|
Review bot: I'm a maintainer, and I know the risk is low. These are not changes that will impact performance of the agent, these changes are there to improve the efficiency of the harness. No evals are necessary -- if the tests pass, then we're good. |
all-hands-bot
left a comment
There was a problem hiding this comment.
The pull-based watermark design is the right call, and the implementation at HEAD is clean and correct. All 24 review threads across eight prior rounds are now resolved.
A fresh read of the current diff confirms the key properties hold:
viewproperty:_view_watermarkadvances only on a successfulappend_event; when that fails, theexcepthandler logs withexc_info=Trueand falls back to a fullView.from_eventsrebuild. If the rebuild itself fails, the exception propagates and neither_viewnor_view_watermarkis mutated — clean exception safety in both code paths.rebuild_view(): assignsself._viewbefore advancingself._view_watermarkinside the lock; iffrom_eventsraises, neither field is mutated and the existing cache continues to serve callers.- Cold-load path in
create(): callsrebuild_view()after attaching theEventLog, so resumed states start with a fully-enforced view. - Malformed-history recovery: both
step()andastep()callrebuild_view()before emittingCondensationRequest, ensuring the condenser sees a clean view on the retry. condenser/base.pydocstring: correctly contracts thatviewis read-only and explains why, directing implementations to return a newViewor aCondensation.
Test coverage is solid: 12 focused unit tests in test_state_view_cache.py — including the new test_rebuild_view_leaves_cache_unchanged_on_error that directly verifies the exception-safety guarantee — plus 4 agent-level tests covering both sync and async malformed-history paths. No gaps visible.
The author's decisions to decline the class-level patching suggestion (Pydantic's __setattr__ prevents instance-level patching for PrivateAttr fields) and the bounded-watermark hole flag (concurrent append during rebuild is not a typical usage scenario for the three call sites: cold load, fork, recovery) are both reasonable.
Ready to merge once the unresolved-review-threads CI gate clears.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
The PR delivers the claimed lazy ConversationState.view cache behavior across direct SDK use, persistence resume, fork, and malformed-history recovery paths.
Does this PR achieve its stated goal?
Yes. On main, ConversationState has no view or rebuild_view API and users must rebuild a View manually from events. On the PR commit, SDK scripts that create real conversations show state.view lazily reflects appended events, condensation, persisted resume, and forked conversations; sync and async malformed-history recovery each call rebuild_view() before emitting a CondensationRequest.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed and installed the uv-managed dev environment |
| CI Status | unresolved-review-threads), 2 pending automation checks, 4 skipped at review time |
| Functional Verification | ✅ Real SDK scripts exercised the changed library behavior without running the test suite |
Functional Verification
Test 1: Baseline vs PR lazy ConversationState.view API
Step 1 — Establish baseline on main:
Ran git checkout origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_state_view_behavior.py:
has_state_view False
has_rebuild_view False
manual_view_event_count 1
manual_view_first_text baseline-one
This shows the base branch does not expose the new cached view API; a developer can only derive a view manually with View.from_events(...).
Step 2 — Apply the PR's changes:
Checked out 86d75dfa653c55dbc7ca7133abbb985734f5c9ba.
Step 3 — Re-run with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_state_view_behavior.py and extracted the behavior markers:
has_state_view True
has_rebuild_view True
initial_view_event_count 0
same_view_object_after_append True
after_append_event_texts ['one', 'two', 'three']
after_condense_event_texts ['two', 'three']
condensation_removed_first True
condensation_request_flag True
rebuild_replaced_view True
rebuilt_matches_full True
This confirms the PR adds the new public surface and that the cached view updates after direct event appends, applies condensation, records condensation requests, and can be fully rebuilt to match View.from_events(...).
Test 2: Cold-load resume and forked conversations
Step 1 — Baseline:
The same main run above showed no state.view API to inspect after resume/fork.
Step 2 — Apply the PR's changes:
Used the same PR checkout as Test 1.
Step 3 — Exercise persistence and fork paths:
The PR script created a persisted ConversationState, resumed it by ID, appended another event after resume, then created a real Conversation, sent a message, and forked it:
resume_view_event_texts ['persisted']
resume_after_append_texts ['persisted', 'after-resume']
fork_view_event_types ['SystemPromptEvent', 'MessageEvent']
fork_view_message_texts ['fork-source']
This confirms the cold-load/rebuild path produces an immediately usable cached view and that forked conversations expose the copied event history through forked.state.view.
Test 3: Malformed-history recovery rebuilds the cached view before retry
Step 1 — Baseline:
On main, there is no ConversationState.rebuild_view API, so malformed-history recovery cannot refresh a cached ConversationState.view because the cache does not exist yet.
Step 2 — Apply the PR's changes:
Used the same PR checkout as Test 1.
Step 3 — Exercise sync and async agent paths:
Ran custom SDK scripts with an LLM subclass that raises LLMMalformedConversationHistoryError through both completion and responses APIs:
$ OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_malformed_recovery.py > /tmp/qa_malformed_recovery.out 2>&1; grep 'malformed_recovery_case' /tmp/qa_malformed_recovery.out
malformed_recovery_case completion rebuild_calls 1 condensation_requests 1
malformed_recovery_case responses rebuild_calls 1 condensation_requests 1
$ OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_malformed_recovery_async.py > /tmp/qa_malformed_recovery_async.out 2>&1; grep 'malformed_recovery_async_case' /tmp/qa_malformed_recovery_async.out
malformed_recovery_async_case completion rebuild_calls 1 condensation_requests 1
malformed_recovery_async_case responses rebuild_calls 1 condensation_requests 1
This confirms both Agent.step() and Agent.astep() rebuild the cached view once and then emit the condensation retry request when malformed history is reported.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
🟢 Taste Rating: Good taste — the pull-based watermark cache keeps the design simple, and the current fixes address the prior cache consistency concerns I checked.
[RISK ASSESSMENT] 🟡 MEDIUM — this touches SDK conversation View/condenser history semantics and may affect future LLM message construction/eval behavior once wired into the hot path. I’m leaving a COMMENT rather than approval so a human maintainer can decide after appropriate eval validation.
VERDICT: No code changes requested from this pass.
This review was generated by an AI agent (OpenHands) on behalf of the user.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26306623251
Refs: #3053
Replaces: #3324
Follow-up: #3328
Summary
Adds a lazily-updated, incrementally-maintained
Viewcache toConversationState, laying the groundwork for #3053. This is a drop-in replacement for #3324 that eliminates the synchronization complexity that was causing repeated review cycles.The key difference: pull instead of push
PR #3324 used a push-based callback on
EventLog._on_appendto eagerly update the cached View on every write. That design coupledEventLogtoViewlifecycle management and created lock-ordering, callback-timing, and reentrancy problems that surfaced across 6+ review rounds (19 threads, 4 still unresolved).This PR uses a pull-based watermark: the
Viewis lazily brought up to date whenstate.viewis read, by replaying only the events appended since the last read.This eliminates every synchronization concern from #3324:
synced_countdiverges from_sync_from_disklen(events)EventLog.append()bypasses cacheFIFOLockreentrancy incondense()/arun()_wire_view_sync()orderingclose()needs state-lock wrappingcondense()event emission moved inside lockComparison
EventLogchanges_on_append,set_on_append,synced_count_default_callbackchangedclose()/condense()changedenforce_propertieson hot pathChanges
openhands-sdk/openhands/sdk/conversation/state.py_view: Viewand_view_watermark: int— derived, not serialized.viewread-only property — lazily catches up via watermark.rebuild_view()— fullView.from_eventsre-derivation for cold load, fork, error recovery.create()resume path callsrebuild_view()after attaching theEventLog.openhands-sdk/openhands/sdk/agent/agent.pystep()andastep()malformed-history handlers callstate.rebuild_view()before emittingCondensationRequest.openhands-sdk/openhands/sdk/conversation/impl/local_conversation.pyfork()callsfork_conv._state.rebuild_view()after deep-copying events.openhands-sdk/openhands/sdk/context/condenser/base.pyFiles NOT changed (unlike #3324)
event_store.py— no_on_appendcallback, noset_on_append()._default_callback— stays asself._state.events.append(e).close()— no lock-wrapping changes.condense()— no lock-wrapping changes.Tests
tests/sdk/conversation/test_state_view_cache.py(12 tests): empty view, append tracking, parity withView.from_events, condensation, hot-pathenforce_propertiesavoidance, rebuild, idempotent reads, persistence resume.tests/sdk/agent/test_agent_context_window_condensation.py(4 new tests):rebuild_viewcalled on malformed-history recovery in bothstep()andastep().tests/sdk/conversation/,tests/sdk/agent/,tests/sdk/context/.Backward compatibility
ConversationState.view,ConversationState.rebuild_view.Agent.stepstill callsView.from_events(state.events)viaprepare_llm_messages. The cached view is maintained but unused on the hot path — the follow-up PR (Consume cached View in prepare_llm_messages #3328) swaps the call site.Follow-up
#3328 swaps
Agent.stepto read fromstate.viewinstead of rebuilding from scratch, delivering the actual perf win. That PR will need to be rebased onto this branch once this merges.This PR was created by an AI agent (OpenHands) on behalf of @csmith49.
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.13-nodejs22-slimgolang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:4e78239-pythonRun
All tags pushed for this build
About Multi-Architecture Support
4e78239-python) is a multi-arch manifest supporting both amd64 and arm644e78239-python-amd64) are also available if needed