harden stream fallbacks, namespace action logs by run, and add playbook failure messages #52
Merged
📝 Walkthrough
This pull request introduces run-aware action logging by threading a …
Sequence Diagrams
sequenceDiagram
participant Agent as Agent Session
participant Router as ActionRouter
participant Logger as persist_action_log
participant Disk as Local Disk
Agent->>Router: execute_action(run_id="run-123")
Router->>Router: _record_action(entry, run_id)
Router->>Logger: persist_action_log(entry, run_id="run-123")
Logger->>Logger: sanitize_run_id("run-123")
Logger->>Disk: os.makedirs("run-123/", exist_ok=True)
Logger->>Disk: write(run-123/action_N.json)
Logger-->>Router: path to persisted file
Router-->>Agent: ActionResult
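The logging flow in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the names `persist_action_log` and `sanitize_run_id` come from the diagram, but their signatures, the `base_dir` parameter, the sanitization rule, the file-index scheme, and the `resolve_run_id` helper are all assumptions.

```python
import json
import os
import re
import tempfile
import uuid
from typing import Optional


def resolve_run_id(run_id: Optional[str]) -> str:
    """Hypothetical helper: fall back to a random UUID when no run ID is given."""
    return run_id or uuid.uuid4().hex


def sanitize_run_id(run_id: str) -> str:
    """Assumed rule: keep only characters that are safe in a directory name."""
    cleaned = re.sub(r"[^A-Za-z0-9_.-]", "_", run_id).strip(".")
    return cleaned or "unknown-run"


def persist_action_log(entry: dict, run_id: str, base_dir: str) -> str:
    """Write one entry to base_dir/<run_id>/action_N.json and return the path."""
    run_dir = os.path.join(base_dir, sanitize_run_id(run_id))
    os.makedirs(run_dir, exist_ok=True)
    # N is the count of existing action files in this run's directory.
    # This is not safe for concurrent writers within the same run.
    index = len([n for n in os.listdir(run_dir) if n.startswith("action_")])
    path = os.path.join(run_dir, f"action_{index}.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(entry, fh)
    return path


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    print(persist_action_log({"type": "click"}, resolve_run_id("run-123"), base))
    print(persist_action_log({"type": "type"}, resolve_run_id(None), base))
```

Because each run writes into its own directory, two concurrent runs never share an `action_N.json` counter, which is the interleaving problem the per-run namespacing is meant to avoid.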
sequenceDiagram
participant Client as HTTP Client
participant Service as RecordingService
participant Upstream as Upstream Stream
participant Storage as Persisted Storage
Client->>Service: get_trace(run_id)
Service->>Upstream: stream upstream
Upstream->>Service: yield chunk 1
Service->>Service: bytes_sent += len(chunk)
Upstream-->>Service: HTTPError before chunk 2
alt bytes_sent == 0
Service->>Storage: load persisted trace.zip
Storage-->>Service: bytes(trace.zip)
Service-->>Client: persisted bytes
else bytes_sent > 0
Service-->>Client: partial upstream bytes only
end
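The fallback rule in the diagram above (use the persisted trace only when no live bytes reached the client) might look something like this as a generator. The function name `stream_with_fallback`, the `UpstreamError` exception, and the `load_persisted` callback are hypothetical stand-ins; the actual service and its HTTP client are not shown in this PR summary.

```python
from typing import Callable, Iterable, Iterator


class UpstreamError(Exception):
    """Hypothetical stand-in for whatever error the upstream client raises."""


def stream_with_fallback(
    upstream: Iterable[bytes],
    load_persisted: Callable[[], bytes],
) -> Iterator[bytes]:
    """Proxy upstream chunks; fall back to persisted bytes only if none were sent."""
    bytes_sent = 0
    try:
        for chunk in upstream:
            bytes_sent += len(chunk)
            yield chunk
    except UpstreamError:
        if bytes_sent == 0:
            # Nothing reached the client yet, so the persisted copy can be
            # served without mixing two byte streams.
            yield load_persisted()
        # Otherwise stop here: the client keeps only the partial live bytes.


def failing_immediately() -> Iterator[bytes]:
    raise UpstreamError("connection dropped before the first chunk")
    yield b""  # unreachable; makes this function a generator


def failing_after_one() -> Iterator[bytes]:
    yield b"live"
    raise UpstreamError("connection dropped before chunk 2")


if __name__ == "__main__":
    print(list(stream_with_fallback(failing_immediately(), lambda: b"persisted")))
    print(list(stream_with_fallback(failing_after_one(), lambda: b"persisted")))
```

Appending the persisted file after partial live bytes would produce a response that is neither a valid live stream nor a valid archive, which is why the sketch ends the stream instead.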
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~30 minutes
This was referenced Apr 7, 2026
Summary
Action log isolation: persist_action_log now accepts a run_id and writes logs into per-run subdirectories, preventing interleaved logs across concurrent runs. ActionRouter generates a fallback UUID when no run ID is provided.
Finalizer lifecycle split: RunFinalizer.persist() is split into publish_status() (in-sandbox status API) and persist() (volume commit). persist() now runs after cleanup(), so recordings are finalized before the volume is committed. Each phase has independent error handling.
Stream/recording fallback hardening: The live trace proxy now tracks bytes sent and falls back to the persisted trace file only when zero live bytes were delivered, which prevents corrupt mixed streams. Similarly, RunService.stream_run marks the run inactive on upstream failure and replays persisted events when no live events were yielded.
Playbook failure_message field: Steps can now declare a custom failure_message that surfaces as the user-facing error instead of raw exception text. The message supports {param} placeholder substitution. LLM recovery failures now combine the original step error with the recovery error and preserve token accounting.
Bug fix: PlaybookRunner now appends the failed step result to step_results before returning an abort, so callers always see what failed.
Guardrail setting: Add stuck_revisit_gap to GuardrailSettings for tuning revisit detection.
Test plan
test_bridge_execution.py: covers execute_dom_action unknown action rejection and SequenceExecutor happy/failure paths
test_bridge_router.py: covers DOM filtering, action log namespacing by run ID, and captcha message prepending
test_recording_service.py: covers persisted trace fallback (zero bytes) vs. partial live stream (no fallback)
test_run_service.py: stream proxy marks run inactive and replays persisted events on connection failure
test_session_finalizer.py: asserts correct publish → cleanup → persist call ordering
test_playbooks.py: failure_message schema, param binding, executor verification error, LLM recovery failure surfacing, abort includes failed step
test_actionlog.py: per-run-id file namespacing
test_streaming.py: stuck_revisit_gap model field and removed redundant timeout args
Summary by CodeRabbit
Release Notes
New Features
stuck_revisit_gap to customize revisit behavior (range: 1–50, default: 5)
Bug Fixes
Tests