Lazy watermark-based View cache on ConversationState by csmith49 · Pull Request #3335 · OpenHands/software-agent-sdk

csmith49 · 2026-05-20T22:49:50Z

Refs: #3053
Replaces: #3324
Follow-up: #3328

Summary

Adds a lazily-updated, incrementally-maintained View cache to ConversationState, laying the groundwork for #3053. This is a drop-in replacement for #3324 that eliminates the synchronization complexity that was causing repeated review cycles.

The key difference: pull instead of push

PR #3324 used a push-based callback on EventLog._on_append to eagerly update the cached View on every write. That design coupled EventLog to View lifecycle management and created lock-ordering, callback-timing, and reentrancy problems that surfaced across 6+ review rounds (19 threads, 4 still unresolved).

This PR uses a pull-based watermark: the View is lazily brought up to date when state.view is read, by replaying only the events appended since the last read.

@property
def view(self) -> View:
    n = len(self._events)
    if n > self._view_watermark:
        for i in range(self._view_watermark, n):
            self._view.append_event(self._events[i])
        self._view_watermark = n
    return self._view

This eliminates every synchronization concern from #3324:

Concern from #3324	Why it disappears here
Callback fires inside file lock → deadlock risk	No callbacks
`synced_count` diverges from `_sync_from_disk`	Watermark just checks `len(events)`
Direct `EventLog.append()` bypasses cache	Watermark detects any new events automatically
`FIFOLock` reentrancy in `condense()` / `arun()`	No write-path coupling
`_wire_view_sync()` ordering	No callbacks to wire
`close()` needs state-lock wrapping	No write-path view mutation
`condense()` event emission moved inside lock	No write-path view mutation

Comparison

Metric	#3324 (push)	This PR (pull)
Files changed	7	5
Lines added	557	123
`EventLog` changes	`_on_append`, `set_on_append`, `synced_count`	None
`_default_callback` changed	Yes	No
`close()` / `condense()` changed	Yes (lock wrapping)	No
Unresolved review threads	4	N/A
Hot-path cost	O(1) per event	O(k) per step (k ≈ 2–4 new events)
`enforce_properties` on hot path	Never	Never

Changes

`openhands-sdk/openhands/sdk/conversation/state.py`

_view: View and _view_watermark: int — derived, not serialized.
view read-only property — lazily catches up via watermark.
rebuild_view() — full View.from_events re-derivation for cold load, fork, error recovery.
create() resume path calls rebuild_view() after attaching the EventLog.

`openhands-sdk/openhands/sdk/agent/agent.py`

step() and astep() malformed-history handlers call state.rebuild_view() before emitting CondensationRequest.

`openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py`

fork() calls fork_conv._state.rebuild_view() after deep-copying events.

`openhands-sdk/openhands/sdk/context/condenser/base.py`

Docstring: view parameter is now documented as read-only.

Files NOT changed (unlike #3324)

event_store.py — no _on_append callback, no set_on_append().
_default_callback — stays as self._state.events.append(e).
close() — no lock-wrapping changes.
condense() — no lock-wrapping changes.

Tests

tests/sdk/conversation/test_state_view_cache.py (12 tests): empty view, append tracking, parity with View.from_events, condensation, hot-path enforce_properties avoidance, rebuild, idempotent reads, persistence resume.
tests/sdk/agent/test_agent_context_window_condensation.py (4 new tests): rebuild_view called on malformed-history recovery in both step() and astep().
All 1,254 tests pass across tests/sdk/conversation/, tests/sdk/agent/, tests/sdk/context/.

Backward compatibility

All public APIs preserved. New additive surface only: ConversationState.view, ConversationState.rebuild_view.
No behavior change: Agent.step still calls View.from_events(state.events) via prepare_llm_messages. The cached view is maintained but unused on the hot path — the follow-up PR (Consume cached View in prepare_llm_messages #3328) swaps the call site.

Follow-up

#3328 swaps Agent.step to read from state.view instead of rebuilding from scratch, delivering the actual perf win. That PR will need to be rebased onto this branch once this merges.

This PR was created by an AI agent (OpenHands) on behalf of @csmith49.

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.13-nodejs22-slim`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:4e78239-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-4e78239-python \
  ghcr.io/openhands/agent-server:4e78239-python

All tags pushed for this build

ghcr.io/openhands/agent-server:4e78239-golang-amd64
ghcr.io/openhands/agent-server:4e7823955fa5f74c27b9b7e2f7cd1e0edcfd464d-golang-amd64
ghcr.io/openhands/agent-server:lazy-view-watermark-golang-amd64
ghcr.io/openhands/agent-server:4e78239-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:4e78239-golang-arm64
ghcr.io/openhands/agent-server:4e7823955fa5f74c27b9b7e2f7cd1e0edcfd464d-golang-arm64
ghcr.io/openhands/agent-server:lazy-view-watermark-golang-arm64
ghcr.io/openhands/agent-server:4e78239-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:4e78239-java-amd64
ghcr.io/openhands/agent-server:4e7823955fa5f74c27b9b7e2f7cd1e0edcfd464d-java-amd64
ghcr.io/openhands/agent-server:lazy-view-watermark-java-amd64
ghcr.io/openhands/agent-server:4e78239-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:4e78239-java-arm64
ghcr.io/openhands/agent-server:4e7823955fa5f74c27b9b7e2f7cd1e0edcfd464d-java-arm64
ghcr.io/openhands/agent-server:lazy-view-watermark-java-arm64
ghcr.io/openhands/agent-server:4e78239-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:4e78239-python-amd64
ghcr.io/openhands/agent-server:4e7823955fa5f74c27b9b7e2f7cd1e0edcfd464d-python-amd64
ghcr.io/openhands/agent-server:lazy-view-watermark-python-amd64
ghcr.io/openhands/agent-server:4e78239-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:4e78239-python-arm64
ghcr.io/openhands/agent-server:4e7823955fa5f74c27b9b7e2f7cd1e0edcfd464d-python-arm64
ghcr.io/openhands/agent-server:lazy-view-watermark-python-arm64
ghcr.io/openhands/agent-server:4e78239-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:4e78239-golang
ghcr.io/openhands/agent-server:4e7823955fa5f74c27b9b7e2f7cd1e0edcfd464d-golang
ghcr.io/openhands/agent-server:lazy-view-watermark-golang
ghcr.io/openhands/agent-server:4e78239-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:4e78239-java
ghcr.io/openhands/agent-server:4e7823955fa5f74c27b9b7e2f7cd1e0edcfd464d-java
ghcr.io/openhands/agent-server:lazy-view-watermark-java
ghcr.io/openhands/agent-server:4e78239-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:4e78239-python
ghcr.io/openhands/agent-server:4e7823955fa5f74c27b9b7e2f7cd1e0edcfd464d-python
ghcr.io/openhands/agent-server:lazy-view-watermark-python
ghcr.io/openhands/agent-server:4e78239-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

Each variant tag (e.g., 4e78239-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., 4e78239-python-amd64) are also available if needed

Refs: #3053 Alternative to the push-based callback approach in #3324. Instead of eagerly syncing the cached View via EventLog._on_append callbacks, the View is lazily brought up to date on read using an integer watermark that tracks how many events have been incorporated. Changes: - state.py: _view, _view_watermark, view property, rebuild_view() - agent.py: rebuild_view() in step/astep malformed-history handlers - local_conversation.py: rebuild_view() after fork event copy - condenser/base.py: docstring read-only warning Files NOT changed (unlike #3324): - event_store.py: no _on_append callback, no set_on_append() - local_conversation.py: _default_callback unchanged, close() unchanged, condense() unchanged — no lock-wrapping changes needed Co-authored-by: openhands <openhands@all-hands.dev>

github-actions · 2026-05-20T22:50:18Z

Python API breakage checks — ✅ PASSED

Result: ✅ PASSED

Action log

github-actions · 2026-05-20T22:50:25Z

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: ✅ PASSED

Action log

github-actions · 2026-05-20T22:52:19Z

Coverage Report •

File	Stmts	Miss	Cover	Missing
openhands-sdk/openhands/sdk/agent
agent.py	365	39	89%	101, 215, 340, 344, 612–613, 620–621, 707, 711–712, 717–719, 721, 731–732, 749–750, 757–758, 781, 788, 792, 796, 799–800, 802–803, 817–818, 1105–1106, 1108, 1136, 1144–1145, 1179, 1186
openhands-sdk/openhands/sdk/context/condenser
base.py	69	22	68%	77, 191–192, 210–213, 215–216, 218–219, 221–223, 226–229, 231, 233, 242, 253
openhands-sdk/openhands/sdk/conversation
state.py	227	10	95%	260–261, 267–269, 448, 496–498, 639
openhands-sdk/openhands/sdk/conversation/impl
local_conversation.py	531	44	91%	308, 313, 460, 506, 543, 559, 624, 844–845, 848, 961, 972–975, 982–983, 986, 992–993, 996, 1002, 1017, 1020, 1024–1025, 1029–1031, 1038, 1124, 1129, 1239, 1241, 1245–1246, 1257–1258, 1283, 1478, 1482, 1552, 1559–1560
TOTAL	27835	8244	70%

all-hands-bot

✅ QA Report: PASS

Verified the new SDK ConversationState.view cache behavior through real SDK usage; it works for incremental updates, condensation, persistence resume, fork, and malformed-history recovery.

Does this PR achieve its stated goal?

Yes. The PR set out to add a lazily-updated, watermark-maintained View cache on ConversationState plus rebuild hooks for cold load/fork/recovery; exercising the SDK directly showed the cache starts empty, catches up after real event appends, matches View.from_events, applies condensation, survives resume/fork, and is rebuilt before malformed-history condensation retry. No functional issues were found.

Phase	Result
Environment Setup	✅ `uv run python` created/used the project environment and imported `openhands.sdk` successfully.
CI Status	🟡 Most checks green, including SDK/tests/pre-commit/API checks; image publish, PR review, and this QA workflow were still in progress when checked.
Functional Verification	✅ Real SDK scripts exercised the changed behavior successfully; no test suite, linter, or formatter was run.

Functional Verification

Test 1: `ConversationState.view` cache, condensation, resume, and rebuild

Step 1 — Establish baseline on origin/main:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache.py after checking out origin/main:

created_state=c88009b0-32a8-4123-8511-75e751db1308
AttributeError: 'ConversationState' object has no attribute 'view'
BASE_EXIT=1

This confirms the base branch does not expose the new cached view surface.

Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at 91d20cba325c9ed9af689e2af3a1e520cd45a883.

Step 3 — Re-run with the PR in place:
Ran the same SDK workflow:

initial_view_events=0
after_appends_ids=0c61...,f17b...,13fa...
view_matches_full_rebuild=True
after_condensation_ids=f17b...,13fa...
first_removed=True
condensation_request_flag=True
resumed_event_count=2
rebuild_replaced_view=True
RESULT=PASS

This shows the lazy cache is initialized, catches up after appends, stays equivalent to a full rebuild, applies condensation, restores on resume, and can be explicitly rebuilt.

Test 2: Conversation fork gets a rebuilt cached view

Step 1 — Baseline:
On origin/main, the cached state.view surface is absent as shown in Test 1, so there is no forked cached view to inspect.

Step 2 — Apply the PR's changes:
Used the PR branch and created a real local Conversation, appended a message event, then called convo.fork().

Step 3 — Verify fork behavior:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_fork_view.py:

source_view_events=1
fork_type=LocalConversation
fork_view_events=1
fork_contains_source_event=True
RESULT=PASS

This confirms a user-level forked conversation exposes a cached view containing the copied source event.

Test 3: Malformed-history recovery rebuilds a stale cached view

Step 1 — Baseline:
The base branch has no cached state.view; this recovery hook is specific to the PR's new cache.

Step 2 — Apply the PR's changes:
On the PR branch, I created a real Conversation with a custom LLM that raises LLMMalformedConversationHistoryError, deliberately corrupted the cached view, and invoked agent.step(...).

Step 3 — Verify recovery behavior:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_malformed_recovery.py:

LLM raised malformed conversation history error, triggering condensation retry with condensed history: qa malformed history
corrupted_cached_view_events=0
view_after_recovery_events=2
view_contains_original_event=True
condensation_request_emitted=True
RESULT=PASS

This shows Agent.step recovered the stale cache before emitting the CondensationRequest path, matching the PR's stated error-recovery goal.

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot

✅ QA Report: PASS

Verified the SDK-facing lazy ConversationState.view cache behavior end-to-end via real SDK calls; the PR achieves its stated goal with no QA issues found.

Does this PR achieve its stated goal?

Yes. The PR sets out to add a lazily-updated, watermark-based View cache on ConversationState, plus rebuild behavior for cold load, fork, and malformed-history recovery. Running the SDK on the PR commit showed the new API exists and works for incremental reads, condensation projection, full rebuild parity, persistence resume, conversation fork, and malformed-history recovery (rebuild_calls: 1 before CondensationRequest).

Phase	Result
Environment Setup	✅ `make build` completed successfully and installed the uv workspace dependencies.
CI Status	🟡 31 checks successful, 2 skipped, 2 automation checks still in progress at review time (`pr-review`, `qa-changes`).
Functional Verification	✅ Real SDK operations passed on PR commit; baseline confirmed the new cache API was absent on `origin/main`.

Functional Verification

Test 1: Establish baseline without the PR

Step 1 — Reproduce / establish baseline (without the fix):
Ran:

git checkout --quiet origin/main
OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache.py > /tmp/qa_base_final.raw 2>&1
grep '^{"event"' /tmp/qa_base_final.raw

Output:

{"event": "sdk_import_ok"}
{"event": "api_surface", "has_view": false, "has_rebuild_view": false}
{"event": "baseline_stop", "reason": "ConversationState view cache API is unavailable"}

This confirms the baseline SDK does not expose the new ConversationState.view / rebuild_view() surface, so there is no lazy view cache to exercise before the PR.

Step 2 — Apply the PR's changes:
Checked out the PR commit:

git checkout --quiet cb9d2fd0143300e4b3b07691b26694b34a08aa76

Step 3 — Re-run with the fix in place:
Ran:

OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache.py > /tmp/qa_pr_final.raw 2>&1
grep '^{"event"' /tmp/qa_pr_final.raw

Output:

{"event": "sdk_import_ok"}
{"event": "api_surface", "has_view": true, "has_rebuild_view": true}
{"event": "first_read", "view_len": 1, "matches_first": true}
{"event": "incremental_read", "same_view_instance": true, "view_len": 2}
{"event": "condensation_projection", "view_len": 2, "dropped_first": true, "kept_second": true, "kept_third": true}
{"event": "rebuild", "replaced_instance": true, "matches_from_events": true, "watermark_len": 4}
{"event": "resume", "view_len": 1, "same_event_id": true}
{"event": "fork", "source_view_len": 2, "fork_view_len": 2, "same_count": true}
{"event": "malformed_history_recovery", "rebuild_calls": 1, "emitted_condensation_request": true}

This exercised the changed behavior as SDK users would: creating ConversationState, appending real events, reading state.view, applying Condensation, calling rebuild_view(), resuming persisted state, forking a Conversation, and driving Agent.step() through malformed-history recovery. The output shows the lazy cache catches up incrementally, condensation is reflected in the cached projection, rebuild matches View.from_events, cold resume and fork populate the view, and malformed-history recovery rebuilds before emitting a condensation request.

Issues Found

None.

This QA review was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot

Solid engineering. The pull-based watermark is the right call — it eliminates every synchronization concern from #3324 by simply not touching the view on the write path. The implementation is minimal (123 lines vs 557), the tests are comprehensive, and the additive-only surface means no behavior change until #3328 lands.

Two issues flagged below: one about thread-safety guarantees that should be documented before the hot path is wired up in the follow-up, and one about the stale-reference semantics of rebuild_view().

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.

all-hands-bot

🟡 Taste Rating: Acceptable, but the cache currently has ownership/concurrency holes that can corrupt state.

[RISK ASSESSMENT] 🟡 MEDIUM — this changes SDK conversation state / condenser cache semantics and could affect LLM history. Since there is no eval evidence in the PR, I’m leaving a COMMENT for human maintainer/eval follow-up rather than approval.

This review was generated by an AI agent (OpenHands) on behalf of the user.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26195669127

- Guard view property and rebuild_view() with a threading.Lock so concurrent readers cannot double-append into the cached View. - Document in rebuild_view() that previously returned View references are invalidated after the call. - Extend test_resume_path_rebuilds_view to verify the watermark is correctly positioned after resume by appending an event and checking it appears incrementally. Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot

The pull-based watermark design is the right call — it eliminates all the synchronization complexity of #3324 by simply not touching the view on the write path. The implementation is minimal and clean (123 lines vs 557), the additive-only surface means no behavior change until #3328 lands, and the previous review's thread-safety concerns have been properly addressed with _view_lock.

Two new observations below: one maintenance-safety flag on the lock type, and a minor test style note.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.

all-hands-bot

⚠️ QA Report: PASS WITH ISSUES

Functional SDK verification passed for the lazy ConversationState.view cache, including append catch-up, condensation, rebuild, resume, fork, and malformed-history recovery; one CI job is currently failing.

Does this PR achieve its stated goal?

Yes. The PR set out to add a lazily-updated, watermark-based View cache on ConversationState as additive groundwork without changing normal conversation behavior. I verified the base branch lacks state.view/rebuild_view, then exercised the PR branch through real SDK scripts: state.view tracks appended events, applies condensation, survives persistence resume, is rebuilt on demand, is rebuilt for forks, and malformed-history recovery restores a corrupted cached view before emitting CondensationRequest.

Phase	Result
Environment Setup	✅ `uv sync --frozen --dev` completed successfully
CI Status	⚠️ Most checks pass, but `Consolidate Build Information` is failing; `qa-changes`/`pr-review` are still pending
Functional Verification	✅ Real SDK usage matched the PR's stated behavior

Functional Verification

Test 1 — Base branch establishes the old behavior

Step 1 — Reproduce / establish baseline without the fix:
Ran git checkout main && uv run python - <<'PY' ... with a script that created a ConversationState, appended two MessageEvents, and attempted to use the new cache API:

baseline_manual_view_events= 2
baseline_has_state_view= False
baseline_has_rebuild_view= False
baseline_state_view_access= AttributeError

This shows the old SDK required rebuilding a View from events manually and did not expose the new cached state.view or rebuild_view() APIs.

Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at 01dd45a60839695b2ef1fc6d2a24ce25b503d6e4.

Step 3 — Re-run with the fix in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python - <<'PY' ... with a script that created ConversationState instances and used the new cache API with real SDK event appends:

cache.empty_view_events 0
cache.after_two_events 2
cache.repeated_read_same_instance True
cache.after_third_events 3
cache.condensation_removed_first True
cache.condensation_kept_others True
cache.rebuild_replaced_view True
cache.rebuild_matches_full_view True
cache.resumed_view_events 2
cache.resumed_after_append_events 3

This confirms the new cached view starts empty, lazily catches up after appends, repeated reads are idempotent, condensation affects the cached view, rebuild_view() re-derives the cache to match View.from_events, and cold resume initializes the cache so later appends continue working.

Test 2 — Conversation fork rebuilds the copied view

Ran the PR branch SDK script using Conversation(..., visualizer=None), send_message(...), and fork():

fork.source_view_events 2
fork.forked_view_events 2
fork.fork_after_append_events 3
fork.source_after_fork_append_events 2

This shows a forked conversation gets a rebuilt cached view matching the source history, and subsequent fork-only events update only the fork's cache without mutating the source conversation's cache.

Test 3 — Malformed-history recovery rebuilds before condensation retry

Ran the PR branch SDK script with a real Agent.step(...) path using an LLM subclass that raises LLMMalformedConversationHistoryError and a condenser that handles condensation requests. Before stepping, I intentionally removed one event from the cached view to simulate the cache corruption the recovery path is intended to repair:

malformed.corrupted_view_events_before_step 1
malformed.view_events_when_request_emitted 2
malformed.original_view_count 2
malformed.removed_event_restored True
malformed.emitted_condensation_request True

This confirms Agent.step() rebuilt the cached view back to the full event count before emitting CondensationRequest, and the malformed-history recovery path continued instead of raising.

Issues Found

🟡 CI: Consolidate Build Information is currently failing on the PR. I did not find functional SDK issues in manual QA, but this CI failure should be resolved or confirmed unrelated before merge.

This review was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot

🟡 Taste Rating: Acceptable — the pull-based cache is simpler than the previous push approach, but one cache/watermark race still needs tightening.

[RISK ASSESSMENT] 🟡 MEDIUM — this touches SDK conversation history/condenser semantics and could affect agent/eval behavior; I’m leaving a COMMENT for maintainer/eval follow-up rather than approval.

VERDICT: Worth merging after the bounded rebuild issue is fixed.

This review was generated by an AI agent (OpenHands) on behalf of the user.

Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:

Add a .agents/skills/custom-codereview-guide.md file to your branch (or edit it if one already exists) with the /codereview trigger and the context the reviewer is missing (e.g., "Security concerns about X do not apply here because Y"). See the customization docs for the required frontmatter format.

Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.

When your PR is merged, the guideline file goes through normal code review by repository maintainers.

Resolve with AI? Install the iterate skill in your agent and run /iterate to automatically drive this PR through CI, review, and QA until it's merge-ready.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26233370659

- Switch _view_lock from threading.Lock to threading.RLock so future reentrant paths (e.g. ViewProperty touching state.view) cannot deadlock silently. - Snapshot events in rebuild_view() before building the View so the watermark is tied to exactly the events that were materialized, closing a theoretical TOCTOU race with concurrent appenders. - Move 'from unittest.mock import patch' to module-level in test_agent_context_window_condensation.py (was duplicated in two function bodies with no circular-import justification). Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot

✅ QA Report: PASS

Verified the SDK behavior introduced by this PR by running standalone ConversationState/LocalConversation exercises; the lazy view cache, resume rebuild, fork rebuild, and malformed-history recovery path behaved as claimed.

Does this PR achieve its stated goal?

Yes. The PR sets out to add a lazily-updated, watermark-based ConversationState.view cache plus rebuild hooks for cold load/fork/error recovery. On origin/main, the same SDK exercises show ConversationState.view/rebuild_view are absent; on lazy-view-watermark at 5a96980498f695448beafb7d6edad54ee2bc4206, appended events incrementally appeared in state.view, condensation state matched View.from_events(...), resume continued incrementally after rebuilding, forked conversations had matching cached views, and malformed-history recovery called rebuild_view() before emitting CondensationRequest.

Phase	Result
Environment Setup	✅ `make build` completed successfully; no test suite, linter, formatter, or pre-commit run was performed.
CI Status	🟡 GitHub checks showed 31 successful, 2 skipped, 2 pending (`PR Review by OpenHands`, `QA Changes by OpenHands`) at verification time.
Functional Verification	✅ Standalone SDK scripts exercised the new public state cache API and affected conversation paths successfully.

Functional Verification

Test 1: Lazy `ConversationState.view` cache, condensation, rebuild, and resume

Step 1 — Establish baseline without the PR:
Checked out origin/main and ran uv run python /tmp/qa_view_cache_exercise.py:

BASE_HEAD=8f406a88
state_view_property=missing
result=baseline lacks lazy view cache API
view_exit=2

This confirms the base branch does not expose the new lazy view cache API, so the PR is adding the behavior under review.

Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at 5a96980498f695448beafb7d6edad54ee2bc4206.

Step 3 — Re-run with the PR in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache_exercise.py:

initial_view_event_count=3
matches_full_rebuild=True
repeated_reads_same_object=True
condensation_removed_first=True
condensed_view_event_count=2
condensation_request_flag=True
rebuild_replaced_cached_instance=True
rebuild_matches_full_rebuild=True
resume_view_event_count=2
resume_incremental_event_count=3
result=lazy view cache behavior verified
view_exit=0

This shows appended events are reflected lazily, repeated reads reuse the cached instance, condensation semantics match a full View.from_events(...) rebuild, explicit rebuild replaces the cached view with a fully-derived view, and resumed persisted state continues incremental updates correctly.

Test 2: Malformed-history recovery rebuilds the cached view

Step 1 — Establish baseline without the PR:
On origin/main, ran uv run python /tmp/qa_malformed_recovery_exercise.py:

rebuild_view_method=missing
result=baseline cannot rebuild cached view during malformed-history recovery
malformed_exit=2

This confirms the base branch cannot perform the new recovery behavior because the rebuild API does not exist there.

Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at 5a96980498f695448beafb7d6edad54ee2bc4206.

Step 3 — Re-run with the PR in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_malformed_recovery_exercise.py with an SDK LLM subclass that raises LLMMalformedConversationHistoryError:

rebuild_view_calls=1
emitted_condensation_request=True
result=malformed-history recovery path verified
malformed_exit=0

This shows the real Agent.step(...) recovery path calls ConversationState.rebuild_view() and still emits the expected CondensationRequest.

Test 3: LocalConversation fork rebuild path

Step 1 — Establish baseline / affected path:
The PR description calls out fork creation as a cache rebuild path. I exercised the public LocalConversation.fork() flow with a conversation containing a user message.

Step 2 — Apply the PR's changes:
Ran on lazy-view-watermark at 5a96980498f695448beafb7d6edad54ee2bc4206.

Step 3 — Exercise the fork path:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_fork_view_cache_exercise.py:

source_view_event_count=2
fork_view_event_count=2
fork_view_matches_source=True
result=fork view cache behavior verified

This shows the forked conversation's cached view is populated and matches the source conversation after events are copied.

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot

🟡 Taste Rating: Acceptable — the pull-based cache is much simpler, but there is one cache-consistency bug to fix before this is safe to build on.

[RISK ASSESSMENT] 🟡 MEDIUM — this touches conversation history/view state and condenser recovery semantics, so maintainer/eval judgment is still appropriate before merge.

This review was generated by an AI agent (OpenHands) on behalf of the user.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26252792628

all-hands-bot

The pull-based watermark is the right call — eliminating the push-based callback entirely removes every synchronization concern that plagued #3324. All 8 previous review threads are resolved in the current code, and the implementation is clean: the RLock, the snapshot approach in rebuild_view(), and the watermark-position-after-rebuild test are all solid. Two minor fresh observations below.

This review was generated by an AI agent (OpenHands) on behalf of the user via OpenHands Automation.

…ocstring - Advance _view_watermark after each successful append_event in the view property loop, so a mid-loop failure cannot leave already- appended events eligible for replay on the next read. - Document the implicit exception-safety of rebuild_view(): if View.from_events raises, the cache is left unchanged. - Decline instance-level patch.object suggestion: Pydantic's __setattr__ prevents setting arbitrary attributes on model instances, so class-level patching is the correct approach here. Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot

✅ QA Report: PASS

Verified the new ConversationState view cache through real SDK usage: lazy reads, condensation, persistence resume, rebuild recovery, malformed-history recovery, and conversation fork all behaved as described.

Does this PR achieve its stated goal?

Yes. The PR set out to add a lazily-updated, watermark-maintained View cache on ConversationState as an additive groundwork change, with rebuild paths for cold load, fork, and malformed-history recovery. I exercised those SDK paths directly: main has no state.view, while this PR exposes it and keeps it in sync across appended events, condensation, resume, explicit rebuild, malformed-history recovery, and Conversation.fork().

Phase	Result
Environment Setup	✅ `make build` completed successfully and installed the uv dev environment.
CI Status	⚠️ Latest check glance: most checks passing; `unresolved-review-threads` failing, `pr-review`/`qa-changes` pending, cleanup/artifact review checks skipped.
Functional Verification	✅ Real SDK scripts returned `ok: true` for PR behavior; no test suite, linters, or type checks were run locally.

Functional Verification

Test 1: ConversationState lazy view cache, condensation, resume, and rebuild

Step 1 — Establish baseline on main:
Ran git checkout main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_state_view_cache_quiet.py:

{
  "ok": false,
  "checks": {
    "has_view_attr": false
  },
  "error_type": "AttributeError",
  "error": "'ConversationState' object has no attribute 'view'"
}

This shows the cached ConversationState.view API is not present on the base branch, so the PR is adding the intended new surface.

Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at 26711fff.

Step 3 — Re-run with the PR in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_state_view_cache_quiet.py:

{
  "ok": true,
  "checks": {
    "has_view_attr": true,
    "initial_view_count": 3,
    "repeated_read_same_object": true,
    "incremental_matches_from_events": true,
    "condensation_removed_first": true,
    "condensation_view_count": 2,
    "condensation_request_flag": true,
    "resume_view_count": 2,
    "resume_incremental_count": 3,
    "rebuild_replaces_view": true,
    "rebuild_matches_from_events": true
  }
}

This confirms the new cache lazily catches up from appended events, repeated reads are idempotent, incremental state matches View.from_events, condensation is reflected, persisted resume initializes the cache, post-resume appends are tracked, and rebuild_view() replaces/re-derives the cache.

Test 2: Malformed-history recovery rebuilds the cached view before retry

Step 1 — Baseline context:
The base branch result above shows there is no cached state.view to rebuild. The PR-specific claim is that malformed-history recovery refreshes the new cache before emitting a condensation retry.

Step 2 — Apply the PR's changes:
Used lazy-view-watermark at 26711fff with a real Agent.step() call and a local LLM subclass that raises LLMMalformedConversationHistoryError.

Step 3 — Exercise recovery with the PR in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_agent_malformed_recovery_quiet.py:

{
  "ok": true,
  "checks": {
    "condensation_request_emitted": true,
    "cached_view_rebuilt": true
  },
  "seen_events": [
    "CondensationRequest"
  ]
}

This confirms the malformed-history path emits the expected retry request and invalidates/rebuilds the cached view before continuing.

Test 3: Conversation fork preserves and continues the cached view

Step 1 — Baseline context:
As in Test 1, main has no state.view, so forked cached-view behavior is additive in this PR.

Step 2 — Apply the PR's changes:
Used lazy-view-watermark at 26711fff and exercised Conversation.send_message() followed by Conversation.fork().

Step 3 — Exercise fork with the PR in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_conversation_fork_view_quiet.py:

{
  "ok": true,
  "checks": {
    "source_view_had_message": true,
    "fork_view_rebuilt_from_source": true,
    "fork_view_tracks_new_message": true,
    "fork_view_matches_full_rebuild": true
  }
}

This confirms a forked conversation receives a rebuilt view from copied events and continues tracking new events consistently with a full View.from_events rebuild.

Issues Found

None from functional QA.

This review was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot

🟡 Taste Rating: Acceptable, but there’s one cache-consistency hole in rebuild watermarking (inline).

[RISK ASSESSMENT] 🟡 MEDIUM — this touches conversation history/condenser state and could affect LLM message construction once state.view is wired into the hot path, so I’m leaving a COMMENT for maintainer/eval follow-up rather than approval.

This review was generated by an AI agent (OpenHands) on behalf of the user.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26254102500

Covers the gap where test_view_matches_full_rebuild only used MessageEvents. The new test asserts state.view == View.from_events on a mixed sequence that includes a Condensation event. Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot

🟡 Taste Rating: Acceptable — the pull-based design is clean, but one cold-load watermark consistency bug needs attention.

[RISK ASSESSMENT] 🟡 MEDIUM — this touches conversation history/view cache semantics and the future condenser hot path, so maintainer/eval judgment is still appropriate before merge.

VERDICT: Worth merging after the inline watermark bug is fixed.

This review was generated by an AI agent (OpenHands) on behalf of the user.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26297602298

The snapshot was added to guard a theoretical TOCTOU race between from_events iteration and watermark assignment, but rebuild_view holds the view lock and event appends happen on the same thread. Passing self._events directly is simpler and avoids the copy. Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot

✅ QA Report: PASS

The PR delivers the claimed lazy watermark-based ConversationState.view cache: I exercised the SDK as a caller, verified incremental cache updates, persistence resume, fork behavior, and malformed-history recovery.

Does this PR achieve its stated goal?

Yes. On main, ConversationState has no cached view / rebuild_view API; on the PR commit, the same SDK operations expose the new API, update the view incrementally after appended events, apply condensation, rebuild with enforcement on demand, cold-load persisted state, and keep forked conversations' cached view in sync. I also triggered a real Agent.step() malformed-history path and observed rebuild_view() being called once before a CondensationRequest was emitted.

Phase	Result
Environment Setup	✅ `uv run python` created/used the project env and imported the SDK successfully; no test suite was run.
CI Status	✅ Existing completed checks are green; the current PR review/QA automation checks were still in progress when checked.
Functional Verification	✅ SDK scripts exercised the changed behavior end-to-end on base vs PR commit.

Functional Verification

Test 1: `ConversationState.view` cache, condensation, rebuild, and persistence resume

Step 1 — Establish baseline without the PR:
Ran git switch --detach origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache_exercise.py:

STATE_VIEW_PRESENT=False
STATE_REBUILD_VIEW_PRESENT=False
BASELINE_RESULT=ConversationState has no cached view API to exercise

This confirms the base branch does not provide the new cached view surface.

Step 2 — Apply the PR's changes:
Checked out bea43ba2d905a6987d56f4efeb3b1dd2d2b540e7.

Step 3 — Re-run with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache_exercise.py:

STATE_VIEW_PRESENT=True
STATE_REBUILD_VIEW_PRESENT=True
INCREMENTAL_VIEW_COUNT=2
REPEATED_READ_SAME_INSTANCE=True
INCREMENTAL_ENFORCE_CALLS=0
CONDENSED_VIEW_COUNT=1
CONDENSED_CONTAINS_ALPHA=False
CONDENSED_CONTAINS_BETA=True
REBUILD_REPLACED_INSTANCE=True
REBUILD_ENFORCE_CALLS=1
REBUILD_VIEW_COUNT=1
RESUMED_VIEW_COUNT=1
RESUMED_AFTER_APPEND_VIEW_COUNT=2

This shows the PR adds the API, lazy reads catch up to appended events without invoking full property enforcement, condensation is reflected in the cached view, rebuild_view() replaces the cached instance and runs enforcement, and resumed persisted state has a usable cache/watermark for future appends.

Test 2: malformed-history recovery rebuilds the cache before requesting condensation

Step 1 — Establish baseline without the PR:
Ran git switch --detach origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_malformed_recovery_exercise.py:

STATE_REBUILD_VIEW_PRESENT=False
BASELINE_EVENTS=['CondensationRequest']
BASELINE_RESULT=Malformed-history recovery emits CondensationRequest but has no cached view to rebuild

This confirms the existing recovery path emits the condensation request, but there is no cached view API to refresh.

Step 2 — Apply the PR's changes:
Checked out bea43ba2d905a6987d56f4efeb3b1dd2d2b540e7.

Step 3 — Re-run with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_malformed_recovery_exercise.py:

STATE_REBUILD_VIEW_PRESENT=True
REBUILD_VIEW_CALLS=1
EVENTS=['CondensationRequest']
CONDENSATION_REQUEST_EMITTED=True

This shows a real Agent.step() call through a malformed-history LLM error invokes rebuild_view() once and still emits the expected CondensationRequest.

Test 3: conversation fork initializes and maintains the forked cached view

Step 1 — Establish baseline without the PR:
Ran git switch --detach origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_fork_view_exercise.py:

SOURCE_EVENTS=2
SOURCE_VIEW_PRESENT=False
FORK_EVENTS=2
FORK_VIEW_PRESENT=False

This confirms fork behavior existed, but without the cached view API.

Step 2 — Apply the PR's changes:
Checked out bea43ba2d905a6987d56f4efeb3b1dd2d2b540e7.

Step 3 — Re-run with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_fork_view_exercise.py:

SOURCE_EVENTS=2
SOURCE_VIEW_PRESENT=True
FORK_EVENTS=2
FORK_VIEW_PRESENT=True
FORK_VIEW_COUNT=2
FORK_VIEW_AFTER_APPEND_COUNT=3

This confirms a forked conversation starts with a rebuilt cached view matching copied events and continues to update after new fork-local user messages.

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot

Code Review

The pull-based watermark design is the right call — completely removing the push-based _on_append callback eliminates every synchronization concern from #3324, and the resulting diff is dramatically simpler (123 lines added vs 557). The core view property is elegant: O(k) incremental updates on reads, full enforcement only on the three well-defined cold paths (cold load, fork, error recovery), and the watermark update inside the loop is correctly defensive (if append_event raises on event i, the watermark stays at i so the next read retries rather than silently skipping). One suggestion below on the lock type; everything else looks solid.

This review was generated by an AI agent (OpenHands) on behalf of the reviewer, via OpenHands Automation.

all-hands-bot

🟡 Taste Rating: Acceptable — the pull-based design is clean, but the bounded rebuild watermark issue below should be fixed before building on this cache.

[RISK ASSESSMENT] 🟡 MEDIUM — this touches SDK conversation history/condenser cache semantics and can affect future LLM message construction, so I’m leaving a COMMENT for maintainer/eval follow-up rather than approval.

This review was generated by an AI agent (OpenHands) on behalf of the user.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26302652293

all-hands-bot

Clean, well-motivated PR. The pull-based watermark pattern eliminates every synchronization concern from #3324 and cuts the diff from 557 → 123 lines. The implementation is correct, the test coverage is thorough (12 + 4 new tests), and the PR description is one of the clearest I've seen on a caching change. Three points worth addressing before #3328 lands (two are documentation gaps, one is a missing test case).

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.

all-hands-bot

✅ QA Report: PASS

The PR’s lazy ConversationState.view cache works in real SDK usage across append, condensation, rebuild, persistence resume, fork, and malformed-history recovery paths.

Does this PR achieve its stated goal?

Yes. The base branch has no ConversationState.view, while the PR branch exposes a cached view that incrementally reflects appended events, applies condensation, survives cold resume with the watermark positioned correctly, and is rebuilt before malformed-history condensation recovery. I exercised these behaviors by importing and using the SDK directly with real ConversationState, Conversation, EventLog, View, and Agent.step() calls; no test suite, linter, or formatter was run.

Phase	Result
Environment Setup	✅ `make build` completed successfully and installed the uv-managed environment.
CI Status	✅ 31 checks passing; only this QA/pr-review automation currently pending, with expected skipped artifact cleanup jobs.
Functional Verification	✅ SDK behavior matched the PR claims in before/after branch runs.

Functional Verification

Test 1: `ConversationState.view` lazy cache, condensation, rebuild, resume, and fork

Step 1 — Establish baseline without the PR:
Ran on origin/main:

git switch --detach origin/main
uv run python -u /tmp/qa_state_view_cache.py

Relevant output:

initial_append: event_log_count=3
after_initial_append: state.view unavailable (AttributeError: 'ConversationState' object has no attribute 'view')
conversation_fork: state.view unavailable (AttributeError: 'ConversationState' object has no attribute 'view')
conversation_fork: view_available=False

This confirms the base branch has no cached view surface to maintain.

Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at bea43ba2d905a6987d56f4efeb3b1dd2d2b540e7.

Step 3 — Re-run with the PR in place:
Ran:

uv run python -u /tmp/qa_state_view_cache.py

Relevant output:

initial_append: event_log_count=3
after_initial_append: view_count=3 event_log_count=3 unhandled_condensation_request=False
after_initial_append: view_event_texts=['first', 'second', 'third']
repeated_read_same_instance=True
after_condensation: view_count=2 event_log_count=4 unhandled_condensation_request=False
after_condensation: view_event_texts=['second', 'third']
after_condensation_request: view_count=2 event_log_count=5 unhandled_condensation_request=True
after_rebuild: rebuilt_matches_fresh=True rebuilt_count=2 fresh_count=2
after_resume: view_count=2 event_log_count=5 unhandled_condensation_request=True
after_resume_append: view_count=3 event_log_count=6 unhandled_condensation_request=True
conversation_fork: view_count=2 event_log_count=2 unhandled_condensation_request=False
conversation_fork: view_event_texts=['SystemPromptEvent', 'fork seed']
conversation_fork: view_available=True

This shows the new state.view catches up to appended events, repeated reads return the same cached instance when unchanged, condensation removes forgotten events, condensation requests update the view flag, rebuild_view() matches a fresh View.from_events(...), resumed persisted state continues incrementally after cold-load rebuild, and a forked conversation has a usable rebuilt view.

Test 2: Malformed-history recovery rebuilds the cached view before retry

Step 1 — Establish baseline without the PR:
Ran on origin/main:

git switch --detach origin/main
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery.py

Relevant output:

rebuild_view_available=False
rebuild_view_call_count=unavailable
condensation_request_seen=True
events_emitted=['CondensationRequest']

This confirms the old branch can emit the condensation retry but has no cached-view rebuild hook.

Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at bea43ba2d905a6987d56f4efeb3b1dd2d2b540e7.

Step 3 — Re-run with the PR in place:
Ran:

OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery.py

Relevant output:

rebuild_view_available=True
rebuild_view_call_count=1
condensation_request_seen=True
events_emitted=['CondensationRequest']

This verifies the actual Agent.step() malformed-history path still emits CondensationRequest and now calls rebuild_view() exactly once before retry handling.

Issues Found

None.

This QA review was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot

✅ QA Report: PASS

The PR’s lazy ConversationState.view cache works in real SDK usage across append, condensation, rebuild, persistence resume, fork, and malformed-history recovery paths.

Does this PR achieve its stated goal?

Yes. The base branch has no ConversationState.view, while the PR branch exposes a cached view that incrementally reflects appended events, applies condensation, survives cold resume with the watermark positioned correctly, and is rebuilt before malformed-history condensation recovery. I exercised these behaviors by importing and using the SDK directly with real ConversationState, Conversation, EventLog, View, and Agent.step() calls; no test suite, linter, or formatter was run.

Phase	Result
Environment Setup	✅ `make build` completed successfully and installed the uv-managed environment.
CI Status	✅ 31 checks passing; only this QA/pr-review automation currently pending, with expected skipped artifact cleanup jobs.
Functional Verification	✅ SDK behavior matched the PR claims in before/after branch runs.

Functional Verification

Test 1: `ConversationState.view` lazy cache, condensation, rebuild, resume, and fork

Step 1 — Establish baseline without the PR:
Ran on origin/main:

git switch --detach origin/main
uv run python -u /tmp/qa_state_view_cache.py

Relevant output:

initial_append: event_log_count=3
after_initial_append: state.view unavailable (AttributeError: 'ConversationState' object has no attribute 'view')
conversation_fork: state.view unavailable (AttributeError: 'ConversationState' object has no attribute 'view')
conversation_fork: view_available=False

This confirms the base branch has no cached view surface to maintain.

Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at bea43ba2d905a6987d56f4efeb3b1dd2d2b540e7.

Step 3 — Re-run with the PR in place:
Ran:

uv run python -u /tmp/qa_state_view_cache.py

Relevant output:

initial_append: event_log_count=3
after_initial_append: view_count=3 event_log_count=3 unhandled_condensation_request=False
after_initial_append: view_event_texts=['first', 'second', 'third']
repeated_read_same_instance=True
after_condensation: view_count=2 event_log_count=4 unhandled_condensation_request=False
after_condensation: view_event_texts=['second', 'third']
after_condensation_request: view_count=2 event_log_count=5 unhandled_condensation_request=True
after_rebuild: rebuilt_matches_fresh=True rebuilt_count=2 fresh_count=2
after_resume: view_count=2 event_log_count=5 unhandled_condensation_request=True
after_resume_append: view_count=3 event_log_count=6 unhandled_condensation_request=True
conversation_fork: view_count=2 event_log_count=2 unhandled_condensation_request=False
conversation_fork: view_event_texts=['SystemPromptEvent', 'fork seed']
conversation_fork: view_available=True

This shows the new state.view catches up to appended events, repeated reads return the same cached instance when unchanged, condensation removes forgotten events, condensation requests update the view flag, rebuild_view() matches a fresh View.from_events(...), resumed persisted state continues incrementally after cold-load rebuild, and a forked conversation has a usable rebuilt view.

Test 2: Malformed-history recovery rebuilds the cached view before retry

Step 1 — Establish baseline without the PR:
Ran on origin/main:

git switch --detach origin/main
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery.py

Relevant output:

rebuild_view_available=False
rebuild_view_call_count=unavailable
condensation_request_seen=True
events_emitted=['CondensationRequest']

This confirms the old branch can emit the condensation retry but has no cached-view rebuild hook.

Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at bea43ba2d905a6987d56f4efeb3b1dd2d2b540e7.

Step 3 — Re-run with the PR in place:
Ran:

OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery.py

Relevant output:

rebuild_view_available=True
rebuild_view_call_count=1
condensation_request_seen=True
events_emitted=['CondensationRequest']

This verifies the actual Agent.step() malformed-history path still emits CondensationRequest and now calls rebuild_view() exactly once before retry handling.

Issues Found

None.

This QA review was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot

🟢 Taste Rating: Good taste — the pull-based cache keeps the design simple, and I don't see fresh actionable code issues in the current diff.

[RISK ASSESSMENT] 🟡 MEDIUM — this touches SDK conversation View/condenser semantics and may affect future LLM message construction/eval behavior. I’m leaving a COMMENT rather than an approval so a human maintainer can decide after appropriate eval validation.

VERDICT: No code changes requested from this pass.

This review was generated by an AI agent (OpenHands) on behalf of the user.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26304956030

- view property: catch append_event failures and fall back to a full rebuild via View.from_events, so callers never see a stuck view. - view property docstring: cross-reference rebuild_view() invalidation (relevant once #3328 lands and callers hold view references across condensation cycles). - Add test_rebuild_view_leaves_cache_unchanged_on_error: verifies the documented contract that a failed rebuild preserves the old cache. Co-authored-by: openhands <openhands@all-hands.dev>

csmith49 · 2026-05-22T18:47:25Z

Note to the review bot: this PR changes how views are maintained, but the semantics by which they're constructed/used has not. Purely a performance change. If the tests are all still passing that is sufficient evidence.

all-hands-bot

The pull-based watermark is the right architectural choice, and the implementation is clean. All 21 prior review threads are resolved; a fresh pass over the current diff turns up two minor points worth noting (both non-blocking).

This review was generated by an AI agent (OpenHands) on behalf of the reviewer via OpenHands Automation.

Without this, the bare except silently swallows the root cause, making unexpected append_event failures undiagnosable in production. Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot

✅ QA Report: PASS

The PR delivers the lazy watermark-based ConversationState.view cache behavior described, with persistence/fork rebuilds and sync/async malformed-history recovery verified by running SDK scripts against real SDK objects.

Does this PR achieve its stated goal?

Yes. On origin/main, ConversationState.view is absent; on the PR branch, reading state.view after appending events lazily exposes the expected events, matches a full View.from_events(...) rebuild after condensation, and rebuild_view() refreshes the cached instance. I also verified cold resume, conversation fork, and both Agent.step()/Agent.astep() malformed-history recovery restore a deliberately stale cached view before emitting CondensationRequest.

Phase	Result
Environment Setup	✅ `make build` completed and installed the uv dev environment.
CI Status	⚠️ Several core checks were passing when observed; some build/test jobs were still pending/in progress, and `unresolved-review-threads` was failing.
Functional Verification	✅ SDK scripts exercised the changed behavior directly; no test suite, linter, formatter, or pre-commit run was used.

Functional Verification

Test 1: Base branch lacks the new view cache; PR branch provides lazy incremental view behavior

Step 1 — Establish baseline without the fix:
Ran:

git switch --detach origin/main
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_view.py
# then switched back to lazy-view-watermark

Relevant output:

ConversationState.view present: False
Reading state.view result: AttributeError: 'ConversationState' object has no attribute 'view'
base_script_exit=0

This confirms the base branch did not expose the new ConversationState.view surface.

Step 2 — Apply the PR's changes:
Checked out lazy-view-watermark at f550790d58668265101797a00baa4293cdd198bc.

Step 3 — Re-run with the fix in place:
Ran:

OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_view.py

Relevant output:

ConversationState.view present: True
Incremental view event count after 3 appends: 3
Incremental view texts: message-0,message-1,message-2
View matches full rebuild after condensation: True
First message removed by condensation: True
rebuild_view replaces cached instance: True
View count after rebuild: 2

This shows the cache lazily catches up after appends, applies condensation consistently with a full rebuild, and rebuild_view() refreshes the cached projection.

Test 2: Cold resume and conversation fork rebuild the cached view

Step 1 — Baseline:
Before this PR there is no state.view API to verify on resumed/forked state, as shown in Test 1.

Step 2 — Run PR behavior:
Ran:

OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_resume_fork.py

Relevant output:

Resumed state view event count: 1
Resumed first event preserved: True
Resumed view event count after append: 2
Fork source view count: 2
Forked view count: 2
Forked event IDs match source: True
resume_fork_exit=0

This shows persisted conversations cold-load the view correctly and continue incrementally updating after resume; forked conversations also receive a rebuilt view matching the source event IDs.

Test 3: Malformed-history recovery rebuilds stale cached views before condensation retry

Step 1 — Baseline:
The base branch lacks state.view, so the cached-view recovery behavior does not exist there.

Step 2 — Run PR sync recovery behavior:
Ran:

OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery.py

Relevant output:

Cached view count after deliberate stale mutation: 0
LLM raised malformed conversation history error, triggering condensation retry with condensed history: qa malformed history
Original event-log-derived count before mutation: 2
Cached view count after malformed-history step: 2
CondensationRequest emitted: True
malformed_exit=0

This shows Agent.step() recovered the cached view from the event log and emitted the condensation retry request.

Step 3 — Run PR async recovery behavior:
Ran:

OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery_async.py

Relevant output:

Async cached view count after stale mutation: 0
LLM raised malformed conversation history error, triggering condensation retry with condensed history: qa malformed history
Async original event-log-derived count: 2
Async cached view count after malformed-history astep: 2
Async CondensationRequest emitted: True
async_malformed_exit=0

This confirms async parity for Agent.astep().

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot

✅ QA Report: PASS

The current PR head delivers the lazy watermark-based ConversationState.view cache behavior described, verified through real SDK usage rather than the test suite.

Does this PR achieve its stated goal?

Yes. The PR aims to add a pull-based, lazily updated View cache to ConversationState, with rebuilds on cold load/fork/error recovery. I verified origin/main lacks ConversationState.view, then on the current PR head (86d75dfa) verified lazy event catch-up, condensation parity with View.from_events(...), explicit rebuild behavior, persistence resume, conversation fork, and both sync/async malformed-history recovery restoring a deliberately stale cached view before emitting CondensationRequest.

Phase	Result
Environment Setup	✅ `make build` completed and installed the uv dev environment.
CI Status	⚠️ At latest observation, several checks were passing, several build/test jobs were still pending/in progress, and `unresolved-review-threads` was failing.
Functional Verification	✅ SDK scripts exercised the changed behavior directly; I did not run pytest, linters, formatters, type checkers, or pre-commit.

Functional Verification

Test 1: Base branch lacks the new view cache; PR branch provides lazy incremental behavior

Step 1 — Establish baseline without the fix:
Ran:

git switch --detach origin/main
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_view.py
# switched back to lazy-view-watermark afterward

Relevant output:

ConversationState.view present: False
Reading state.view result: AttributeError: 'ConversationState' object has no attribute 'view'
base_script_exit=0

This confirms the base branch did not expose the new ConversationState.view surface.

Step 2 — Apply the PR's changes:
Fetched and checked out current PR head 86d75dfa653c55dbc7ca7133abbb985734f5c9ba.

Step 3 — Re-run with the fix in place:
Ran:

OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_view.py

Relevant output:

ConversationState.view present: True
Incremental view event count after 3 appends: 3
Incremental view texts: message-0,message-1,message-2
View matches full rebuild after condensation: True
First message removed by condensation: True
rebuild_view replaces cached instance: True
View count after rebuild: 2

This shows the cache lazily catches up after event appends, preserves condensation semantics, and rebuild_view() refreshes the cached projection.

Test 2: Cold resume and conversation fork rebuild the cached view

Step 1 — Baseline:
Before this PR there is no state.view API to verify on resumed/forked state, as shown in Test 1.

Step 2 — Run PR behavior:
Ran:

OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_resume_fork.py

Relevant output:

Resumed state view event count: 1
Resumed first event preserved: True
Resumed view event count after append: 2
Fork source view count: 2
Forked view count: 2
Forked event IDs match source: True

This shows persisted conversations cold-load the view correctly and continue incrementally updating after resume; forked conversations also receive a rebuilt view matching the source event IDs.

Test 3: Malformed-history recovery rebuilds stale cached views before condensation retry

Step 1 — Baseline:
The base branch lacks state.view, so the cache-specific recovery behavior does not exist there.

Step 2 — Run PR sync recovery behavior:
Ran:

OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery.py

Relevant output:

Cached view count after deliberate stale mutation: 0
Original event-log-derived count before mutation: 2
Cached view count after malformed-history step: 2
CondensationRequest emitted: True

This shows Agent.step() recovered the cached view from the event log and emitted the condensation retry request.

Step 3 — Run PR async recovery behavior:
Ran:

OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery_async.py

Relevant output:

Async cached view count after stale mutation: 0
Async original event-log-derived count: 2
Async cached view count after malformed-history astep: 2
Async CondensationRequest emitted: True

This confirms async parity for Agent.astep().

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot

✅ QA Report: PASS

The current PR head delivers the lazy watermark-based ConversationState.view cache behavior described, verified through real SDK usage rather than the test suite.

Does this PR achieve its stated goal?

Yes. The PR aims to add a pull-based, lazily updated View cache to ConversationState, with rebuilds on cold load/fork/error recovery. I verified origin/main lacks ConversationState.view, then on the current PR head (86d75dfa) verified lazy event catch-up, condensation parity with View.from_events(...), explicit rebuild behavior, persistence resume, conversation fork, and both sync/async malformed-history recovery restoring a deliberately stale cached view before emitting CondensationRequest.

Phase	Result
Environment Setup	✅ `make build` completed and installed the uv dev environment.
CI Status	⚠️ At latest observation, several checks were passing, several build/test jobs were still pending/in progress, and `unresolved-review-threads` was failing.
Functional Verification	✅ SDK scripts exercised the changed behavior directly; I did not run pytest, linters, formatters, type checkers, or pre-commit.

Functional Verification

Test 1: Base branch lacks the new view cache; PR branch provides lazy incremental behavior

Step 1 — Establish baseline without the fix:
Ran:

git switch --detach origin/main
OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_view.py
# switched back to lazy-view-watermark afterward

Relevant output:

ConversationState.view present: False
Reading state.view result: AttributeError: 'ConversationState' object has no attribute 'view'
base_script_exit=0

This confirms the base branch did not expose the new ConversationState.view surface.

Step 2 — Apply the PR's changes:
Fetched and checked out current PR head 86d75dfa653c55dbc7ca7133abbb985734f5c9ba.

Step 3 — Re-run with the fix in place:
Ran:

OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_view.py

Relevant output:

ConversationState.view present: True
Incremental view event count after 3 appends: 3
Incremental view texts: message-0,message-1,message-2
View matches full rebuild after condensation: True
First message removed by condensation: True
rebuild_view replaces cached instance: True
View count after rebuild: 2

This shows the cache lazily catches up after event appends, preserves condensation semantics, and rebuild_view() refreshes the cached projection.

Test 2: Cold resume and conversation fork rebuild the cached view

Step 1 — Baseline:
Before this PR there is no state.view API to verify on resumed/forked state, as shown in Test 1.

Step 2 — Run PR behavior:
Ran:

OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_resume_fork.py

Relevant output:

Resumed state view event count: 1
Resumed first event preserved: True
Resumed view event count after append: 2
Fork source view count: 2
Forked view count: 2
Forked event IDs match source: True

This shows persisted conversations cold-load the view correctly and continue incrementally updating after resume; forked conversations also receive a rebuilt view matching the source event IDs.

Test 3: Malformed-history recovery rebuilds stale cached views before condensation retry

Step 1 — Baseline:
The base branch lacks state.view, so the cache-specific recovery behavior does not exist there.

Step 2 — Run PR sync recovery behavior:
Ran:

OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery.py

Relevant output:

Cached view count after deliberate stale mutation: 0
Original event-log-derived count before mutation: 2
Cached view count after malformed-history step: 2
CondensationRequest emitted: True

This shows Agent.step() recovered the cached view from the event log and emitted the condensation retry request.

Step 3 — Run PR async recovery behavior:
Ran:

OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery_async.py

Relevant output:

Async cached view count after stale mutation: 0
Async original event-log-derived count: 2
Async cached view count after malformed-history astep: 2
Async CondensationRequest emitted: True

This confirms async parity for Agent.astep().

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

csmith49 · 2026-05-22T19:00:05Z

Review summary for the bot

We have been through 7 review rounds with 24 total threads. Here is where things stand, for context on the next review pass.

Accepted & implemented (14 threads, 58%)

All of these improved the code and are reflected in the current diff:

#	What	Commit
1	Thread-safety: add `_view_lock`	`eb08421c8`
2	Invalidation note on `rebuild_view()` docstring	`eb08421c8`
3	Post-resume incremental-append assertion	`eb08421c8`
4	Lock both `view` and `rebuild_view()`	`eb08421c8`
6	`Lock` → `RLock` for reentrancy safety	`20f74d3a9`
7	Hoist `unittest.mock` imports to top-level	`20f74d3a9`
8	Snapshot events before rebuild, watermark from snapshot	`20f74d3a9`
9	Per-event watermark advance in incremental loop	`26711fff4`
10	Exception-safety docstring on `rebuild_view()`	`26711fff4`
15	Parity test with `Condensation` in sequence	`34ab59189`
19	Cross-reference `rebuild_view()` invalidation in `view` docstring	`f550790d5`
20	Self-healing fallback: catch `append_event` failure → full rebuild	`f550790d5`
21	`test_rebuild_view_leaves_cache_unchanged_on_error`	`f550790d5`
22	`logger.warning` with `exc_info=True` on self-healing path	`86d75dfa6`

Declined with rationale (5 threads, 21%)

These were evaluated and declined for documented reasons:

Update README and add repo.md #5 (expose-mutable-View): View is already documented as read-only; adding a defensive copy adds cost on the hot path for no practical benefit.
Add minimal config & port over the LLM class #12 (watermark skips recoverable events): The scenario is real but self-correcting — orphan tool_result triggers LLMMalformedConversationHistoryError → rebuild_view().
Port over tests for config #13 (duplicate cross-ref docstring): Already documented on rebuild_view() itself. Later re-raised as Set litellm logging to WARNING by default #19 with better justification (Consume cached View in prepare_llm_messages #3328 landing), and we accepted that version.
Port over tests for LLM #14 (rebuild_view double-raise): Python exception chaining already surfaces both errors; wrapping adds noise.

Circular / contradictory (5 threads, 21%)

These were re-raises of previously addressed or declined feedback:

Update ci badge on pr only #11 → logger: set LiteLLM loggers to WARNING by default; enable DEBUG with DEBUG_LLM confirmation #23 → Remove json repair #24 (instance-level patch.object): Suggested 3 separate times. Fails on Pydantic models because __setattr__ blocks arbitrary attribute assignment. Declined each time with the same explanation.
Add ruff configuration for import sorting #17 (RLock → Lock): Direct reversal of ci: add coverage report and summary to tests workflow (closes #3) #6, which asked for Lock → RLock.
Port over minimal set of agent, config, context, microagent; adjust existing tool definition #18 (TOCTOU watermark hole): Same concern as Port over remaining tests for execute_bash #8. We implemented the snapshot, then removed it because event appends happen on the same thread. Re-raised identically.
Add configs for ruff to sort the imports #16 (EventLog skips empty files): Premise is factually wrong — EventLog has consistent __len__/__getitem__/__iter__ semantics.

Current state

0 unresolved threads
14/14 tests passing (view cache) + 16/16 (condensation)
All CI checks green
No code changes pending

The implementation is stable. We would appreciate an approval if there are no further substantive concerns.

This comment was generated by an AI agent (OpenHands) on behalf of the PR author.

all-hands-bot

🟢 Taste Rating: Good taste — the pull-based cache keeps the design simple, and I don’t see fresh actionable code issues in the current diff.

[RISK ASSESSMENT] 🟡 MEDIUM — this touches SDK conversation View/condenser history semantics and could affect future LLM message construction/eval behavior once wired into the hot path. I’m leaving a COMMENT rather than approval so a human maintainer can decide after appropriate eval validation.

VERDICT: No code changes requested from this pass.

This review was generated by an AI agent (OpenHands) on behalf of the user.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26305908143

csmith49 · 2026-05-22T19:02:21Z

Review bot: I'm a maintainer, and I know the risk is low. These are not changes that will impact performance of the agent, these changes are there to improve the efficiency of the harness. No evals are necessary -- if the tests pass, then we're good.

all-hands-bot

The pull-based watermark design is the right call, and the implementation at HEAD is clean and correct. All 24 review threads across eight prior rounds are now resolved.

A fresh read of the current diff confirms the key properties hold:

view property: _view_watermark advances only on a successful append_event; when that fails, the except handler logs with exc_info=True and falls back to a full View.from_events rebuild. If the rebuild itself fails, the exception propagates and neither _view nor _view_watermark is mutated — clean exception safety in both code paths.
rebuild_view(): assigns self._view before advancing self._view_watermark inside the lock; if from_events raises, neither field is mutated and the existing cache continues to serve callers.
Cold-load path in create(): calls rebuild_view() after attaching the EventLog, so resumed states start with a fully-enforced view.
Malformed-history recovery: both step() and astep() call rebuild_view() before emitting CondensationRequest, ensuring the condenser sees a clean view on the retry.
condenser/base.py docstring: correctly contracts that view is read-only and explains why, directing implementations to return a new View or a Condensation.

Test coverage is solid: 12 focused unit tests in test_state_view_cache.py — including the new test_rebuild_view_leaves_cache_unchanged_on_error that directly verifies the exception-safety guarantee — plus 4 agent-level tests covering both sync and async malformed-history paths. No gaps visible.

The author's decisions to decline the class-level patching suggestion (Pydantic's __setattr__ prevents instance-level patching for PrivateAttr fields) and the bounded-watermark hole flag (concurrent append during rebuild is not a typical usage scenario for the three call sites: cold load, fork, recovery) are both reasonable.

Ready to merge once the unresolved-review-threads CI gate clears.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.

all-hands-bot

✅ QA Report: PASS

The PR delivers the claimed lazy ConversationState.view cache behavior across direct SDK use, persistence resume, fork, and malformed-history recovery paths.

Does this PR achieve its stated goal?

Yes. On main, ConversationState has no view or rebuild_view API and users must rebuild a View manually from events. On the PR commit, SDK scripts that create real conversations show state.view lazily reflects appended events, condensation, persisted resume, and forked conversations; sync and async malformed-history recovery each call rebuild_view() before emitting a CondensationRequest.

Phase	Result
Environment Setup	✅ `make build` completed and installed the uv-managed dev environment
CI Status	⚠️ 30 passing, 1 failing (`unresolved-review-threads`), 2 pending automation checks, 4 skipped at review time
Functional Verification	✅ Real SDK scripts exercised the changed library behavior without running the test suite

Functional Verification

Test 1: Baseline vs PR lazy `ConversationState.view` API

Step 1 — Establish baseline on main:
Ran git checkout origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_state_view_behavior.py:

has_state_view False
has_rebuild_view False
manual_view_event_count 1
manual_view_first_text baseline-one

This shows the base branch does not expose the new cached view API; a developer can only derive a view manually with View.from_events(...).

Step 2 — Apply the PR's changes:
Checked out 86d75dfa653c55dbc7ca7133abbb985734f5c9ba.

Step 3 — Re-run with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_state_view_behavior.py and extracted the behavior markers:

has_state_view True
has_rebuild_view True
initial_view_event_count 0
same_view_object_after_append True
after_append_event_texts ['one', 'two', 'three']
after_condense_event_texts ['two', 'three']
condensation_removed_first True
condensation_request_flag True
rebuild_replaced_view True
rebuilt_matches_full True

This confirms the PR adds the new public surface and that the cached view updates after direct event appends, applies condensation, records condensation requests, and can be fully rebuilt to match View.from_events(...).

Test 2: Cold-load resume and forked conversations

Step 1 — Baseline:
The same main run above showed no state.view API to inspect after resume/fork.

Step 2 — Apply the PR's changes:
Used the same PR checkout as Test 1.

Step 3 — Exercise persistence and fork paths:
The PR script created a persisted ConversationState, resumed it by ID, appended another event after resume, then created a real Conversation, sent a message, and forked it:

resume_view_event_texts ['persisted']
resume_after_append_texts ['persisted', 'after-resume']
fork_view_event_types ['SystemPromptEvent', 'MessageEvent']
fork_view_message_texts ['fork-source']

This confirms the cold-load/rebuild path produces an immediately usable cached view and that forked conversations expose the copied event history through forked.state.view.

Test 3: Malformed-history recovery rebuilds the cached view before retry

Step 1 — Baseline:
On main, there is no ConversationState.rebuild_view API, so malformed-history recovery cannot refresh a cached ConversationState.view because the cache does not exist yet.

Step 2 — Apply the PR's changes:
Used the same PR checkout as Test 1.

Step 3 — Exercise sync and async agent paths:
Ran custom SDK scripts with an LLM subclass that raises LLMMalformedConversationHistoryError through both completion and responses APIs:

$ OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_malformed_recovery.py > /tmp/qa_malformed_recovery.out 2>&1; grep 'malformed_recovery_case' /tmp/qa_malformed_recovery.out
malformed_recovery_case completion rebuild_calls 1 condensation_requests 1
malformed_recovery_case responses rebuild_calls 1 condensation_requests 1

$ OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_malformed_recovery_async.py > /tmp/qa_malformed_recovery_async.out 2>&1; grep 'malformed_recovery_async_case' /tmp/qa_malformed_recovery_async.out
malformed_recovery_async_case completion rebuild_calls 1 condensation_requests 1
malformed_recovery_async_case responses rebuild_calls 1 condensation_requests 1

This confirms both Agent.step() and Agent.astep() rebuild the cached view once and then emit the condensation retry request when malformed history is reported.

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

all-hands-bot

🟢 Taste Rating: Good taste — the pull-based watermark cache keeps the design simple, and the current fixes address the prior cache consistency concerns I checked.

[RISK ASSESSMENT] 🟡 MEDIUM — this touches SDK conversation View/condenser history semantics and may affect future LLM message construction/eval behavior once wired into the hot path. I’m leaving a COMMENT rather than approval so a human maintainer can decide after appropriate eval validation.

VERDICT: No code changes requested from this pass.

This review was generated by an AI agent (OpenHands) on behalf of the user.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26306623251

github-actions Bot added the release-note-required PR requires explicit release-note coverage for behavioral or default changes label May 20, 2026

csmith49 mentioned this pull request May 20, 2026

Cache View on ConversationState, update incrementally #3324

Closed

all-hands-bot reviewed May 20, 2026

View reviewed changes

csmith49 mentioned this pull request May 20, 2026

Consume cached View in prepare_llm_messages #3328

Draft

Merge branch 'main' into lazy-view-watermark

cb9d2fd

csmith49 requested a review from all-hands-bot May 20, 2026 23:20

all-hands-bot reviewed May 20, 2026

View reviewed changes

Comment thread openhands-sdk/openhands/sdk/conversation/state.py

Comment thread openhands-sdk/openhands/sdk/conversation/state.py Outdated

Comment thread tests/sdk/conversation/test_state_view_cache.py

all-hands-bot reviewed May 20, 2026

View reviewed changes

Comment thread openhands-sdk/openhands/sdk/conversation/state.py Outdated

Comment thread openhands-sdk/openhands/sdk/conversation/state.py Outdated

github-actions Bot removed the release-note-required PR requires explicit release-note coverage for behavioral or default changes label May 21, 2026

Merge branch 'main' into lazy-view-watermark

01dd45a

csmith49 requested a review from all-hands-bot May 21, 2026 14:46

all-hands-bot reviewed May 21, 2026

View reviewed changes

Comment thread openhands-sdk/openhands/sdk/conversation/state.py Outdated

Comment thread tests/sdk/agent/test_agent_context_window_condensation.py Outdated

all-hands-bot reviewed May 21, 2026

View reviewed changes

Comment thread openhands-sdk/openhands/sdk/conversation/state.py

csmith49 and others added 2 commits May 21, 2026 10:23

Merge branch 'main' into lazy-view-watermark

5a96980

csmith49 requested a review from all-hands-bot May 21, 2026 20:59

all-hands-bot reviewed May 21, 2026

View reviewed changes

Comment thread openhands-sdk/openhands/sdk/conversation/state.py

all-hands-bot reviewed May 21, 2026

View reviewed changes

Comment thread openhands-sdk/openhands/sdk/conversation/state.py Outdated

Comment thread tests/sdk/agent/test_agent_context_window_condensation.py

csmith49 requested a review from all-hands-bot May 21, 2026 21:27

all-hands-bot reviewed May 21, 2026

View reviewed changes

Comment thread openhands-sdk/openhands/sdk/conversation/state.py Outdated

all-hands-bot reviewed May 22, 2026

View reviewed changes

Comment thread openhands-sdk/openhands/sdk/conversation/state.py Outdated

csmith49 and others added 2 commits May 22, 2026 10:32

Merge branch 'main' into lazy-view-watermark

bea43ba

csmith49 requested a review from all-hands-bot May 22, 2026 17:34

all-hands-bot reviewed May 22, 2026

View reviewed changes

Comment thread openhands-sdk/openhands/sdk/conversation/state.py

all-hands-bot reviewed May 22, 2026

View reviewed changes

Comment thread openhands-sdk/openhands/sdk/conversation/state.py

csmith49 requested a review from all-hands-bot May 22, 2026 18:25

all-hands-bot reviewed May 22, 2026

View reviewed changes

Comment thread openhands-sdk/openhands/sdk/conversation/state.py

Comment thread openhands-sdk/openhands/sdk/conversation/state.py Outdated

Comment thread tests/sdk/conversation/test_state_view_cache.py

all-hands-bot reviewed May 22, 2026

View reviewed changes

csmith49 requested a review from all-hands-bot May 22, 2026 18:47

all-hands-bot reviewed May 22, 2026

View reviewed changes

Comment thread openhands-sdk/openhands/sdk/conversation/state.py

Comment thread tests/sdk/agent/test_agent_context_window_condensation.py

Comment thread tests/sdk/agent/test_agent_context_window_condensation.py

Log warning on self-healing view rebuild fallback

86d75df

Without this, the bare except silently swallows the root cause, making unexpected append_event failures undiagnosable in production. Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot reviewed May 22, 2026

View reviewed changes

csmith49 requested a review from all-hands-bot May 22, 2026 19:02

all-hands-bot reviewed May 22, 2026

View reviewed changes

csmith49 added 2 commits May 22, 2026 13:54

Merge branch 'main' into lazy-view-watermark

6f2ac44

Merge branch 'main' into lazy-view-watermark

4e78239

Conversation

csmith49 commented May 20, 2026 • edited by github-actions Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

The key difference: pull instead of push

Comparison

Changes

openhands-sdk/openhands/sdk/conversation/state.py

openhands-sdk/openhands/sdk/agent/agent.py

openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py

openhands-sdk/openhands/sdk/context/condenser/base.py

Files NOT changed (unlike #3324)

Tests

Backward compatibility

Follow-up

Uh oh!

github-actions Bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Python API breakage checks — ✅ PASSED

Uh oh!

github-actions Bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

REST API breakage checks (OpenAPI) — ✅ PASSED

Uh oh!

github-actions Bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

✅ QA Report: PASS

Does this PR achieve its stated goal?

Test 1: ConversationState.view cache, condensation, resume, and rebuild

Test 2: Conversation fork gets a rebuilt cached view

Test 3: Malformed-history recovery rebuilds a stale cached view

Issues Found

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

✅ QA Report: PASS

Does this PR achieve its stated goal?

Test 1: Establish baseline without the PR

Issues Found

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

⚠️ QA Report: PASS WITH ISSUES

Does this PR achieve its stated goal?

Test 1 — Base branch establishes the old behavior

Test 2 — Conversation fork rebuilds the copied view

Test 3 — Malformed-history recovery rebuilds before condensation retry

Issues Found

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

✅ QA Report: PASS

Does this PR achieve its stated goal?

Test 1: Lazy ConversationState.view cache, condensation, rebuild, and resume

Test 2: Malformed-history recovery rebuilds the cached view

Test 3: LocalConversation fork rebuild path

Issues Found

csmith49 commented May 20, 2026 •

edited by github-actions Bot

Loading

`openhands-sdk/openhands/sdk/conversation/state.py`

`openhands-sdk/openhands/sdk/agent/agent.py`

`openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py`

`openhands-sdk/openhands/sdk/context/condenser/base.py`

github-actions Bot commented May 20, 2026 •

edited

Loading

github-actions Bot commented May 20, 2026 •

edited

Loading

github-actions Bot commented May 20, 2026 •

edited

Loading

Test 1: `ConversationState.view` cache, condensation, resume, and rebuild

Test 1: Lazy `ConversationState.view` cache, condensation, rebuild, and resume

Test 1: `ConversationState.view` cache, condensation, rebuild, and persistence resume

Test 1: `ConversationState.view` lazy cache, condensation, rebuild, resume, and fork