Skip to content

fix(workflow-runtime): tighten PTY chrome scrubbing, document idle override, tame stale-state warning#930

Merged
khaliqgant merged 4 commits into
mainfrom
fix/workflow-runtime-game-lessons
May 20, 2026
Merged

fix(workflow-runtime): tighten PTY chrome scrubbing, document idle override, tame stale-state warning#930
khaliqgant merged 4 commits into
mainfrom
fix/workflow-runtime-game-lessons

Conversation

@khaliqgant
Copy link
Copy Markdown
Member

Summary

Three quality-of-life fixes to the workflow runtime, surfaced by running a multi-CLI workflow end-to-end with interactive PTY agents. All non-breaking; pre-existing test failures in verification-traceback.test.ts / verification-custom.test.ts are unrelated (also failing on origin/main).

1. Strengthen scrubForChannel for Claude Code TUI noise

Interactive-agent step Output: blocks were unreadable. The captured PTY stream included Claude Code's bottom status bar (Opus 4.7 (1M context) ctx:5% $1.45), vim-style mode indicators (--INSERT--⏵⏵bypasspermissionson), no-whitespace UI hint variants (pasteagaintoexpand), and ambient thinking-status fragments (↓ 13 tokens · thinking with high effort, Crunched for 32s). None were caught by the existing regexes.

Added: claudeFooterRe, vimModeRe, thinkingStatusRe; relaxed uiHintRe to match the no-whitespace TUI variants. New scrub-pty-chrome.test.ts covers the patterns from a real capture and guards against over-stripping (prose that mentions model names is preserved).

2. Document idleThresholdSecs as the per-agent idle escape hatch

The option is already plumbed end-to-end (AgentOptionsconstraints → broker spawn arg) — 0 already disables idle detection — but the JSDoc was one line and didn't say so. Users hitting "agent exited before the awaited event" had no obvious way to fix it. Expanded the doc with when-to-override / when-not-to-override guidance, pointing at the updated writing-agent-relay-workflows skill.

3. Stop warning about .agent-relay/ on every ephemeral run

The broker warned stale .agent-relay/ directory found whenever an ephemeral run found a .agent-relay/ in cwd. But the SDK workflow runner always writes .agent-relay/step-outputs/ and .agent-relay/team/worker-logs/ regardless of broker mode — those are durable artifacts, not broker state. The warning fired on virtually every workflow run as a false positive. Narrowed the check to look for .agent-relay/state.json specifically, which is the only file that indicates actual prior broker state worth warning about.

Test plan

  • npm run check in packages/sdk — clean
  • cargo check in crates/broker — clean
  • New scrub-pty-chrome.test.ts — 6/6 pass
  • Pre-existing failing tests reproduced on bare origin/main (unrelated to this change)
  • Reviewer: stress-test the regex on a real interactive run and confirm the new regression suite hasn't over-stripped anything signal-bearing

Companion skill PR

AgentWorkforce/skills #56 — documents the idleThresholdSecs: 0 pattern and the new "per-turn interactive spawn" coordination shape that the JSDoc points to.

🤖 Generated with Claude Code

…erride, tame stale-state warning

Three quality-of-life fixes surfaced by running a multi-CLI workflow
end-to-end with interactive PTY agents:

1. **Strengthen scrubForChannel for Claude Code TUI noise.**
   Interactive-agent step "Output:" blocks were unreadable — the captured
   PTY stream included Claude Code's bottom status bar
   ("Opus 4.7 (1M context) ctx:5% $1.45"), vim-style mode indicators
   ("--INSERT--⏵⏵bypasspermissionson"), no-whitespace UI hints
   ("pasteagaintoexpand"), and ambient thinking-status fragments
   ("↓ 13 tokens · thinking with high effort", "Crunched for 32s").
   None of these were caught by the existing regexes.

   Added: claudeFooterRe, vimModeRe, thinkingStatusRe; relaxed uiHintRe
   to match the no-whitespace TUI rendering variants. New regression test
   (scrub-pty-chrome.test.ts) covers the patterns from a real capture and
   guards against over-stripping (prose that mentions model names is
   preserved).

2. **Document idleThresholdSecs as the per-agent idle escape hatch.**
   The option is already plumbed end-to-end (AgentOptions → constraints
   → broker spawn arg) but the JSDoc was one line and didn't mention that
   `0` disables idle detection — so users hitting "agent exited before
   the awaited event" had no idea how to fix it. Expanded the doc with
   when-to-override / when-not-to-override guidance pointing at the
   skill.

3. **Stop warning about .agent-relay/ on every ephemeral run.**
   The broker warned "stale .agent-relay/ directory found" whenever an
   ephemeral run found a .agent-relay/ dir in cwd. But the SDK workflow
   runner *always* writes .agent-relay/step-outputs/ and .agent-relay/
   team/worker-logs/ regardless of broker mode — those are durable
   artifacts, not broker state. The warning fired on virtually every
   workflow run as a false positive. Narrowed the check to look for
   .agent-relay/state.json specifically, which is the only file that
   indicates actual prior broker state worth warning about.

All three are non-breaking. Pre-existing test failures in
verification-traceback.test.ts and verification-custom.test.ts are
unrelated and present on origin/main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@khaliqgant khaliqgant requested a review from willwashburn as a code owner May 20, 2026 10:33
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 20, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 841262ee-ef16-4266-a2f0-7dbc81770c37

📥 Commits

Reviewing files that changed from the base of the PR and between 87cda72 and f78f390.

📒 Files selected for processing (5)
  • .trajectories/index.json
  • packages/sdk/src/workflows/__tests__/channel-messenger.test.ts
  • packages/sdk/src/workflows/__tests__/scrub-pty-chrome.test.ts
  • packages/sdk/src/workflows/channel-messenger.ts
  • packages/sdk/src/workflows/runner.ts
✅ Files skipped from review due to trivial changes (1)
  • .trajectories/index.json

📝 Walkthrough

Walkthrough

Refines broker init stale-state detection, centralizes and hardens PTY/TUI output scrubbing in channel-messenger, delegates runner sanitization to that module, reformats reliability/spawn expressions, adds comprehensive PTY scrubbing tests, expands a builder JSDoc, and updates trajectories metadata.

Changes

SDK Workflow Reliability & PTY Output Handling

Layer / File(s) Summary
PTY/TUI chrome scrubbing expansion
packages/sdk/src/workflows/channel-messenger.ts
Adds malformed PTY-frame detection/cleanup, broadens regexes for VIM/Claude/footer/thinking-status, uses cleaned lines for JSON-depth and meaningful output assembly.
Runner reliability, spawn formatting & delegation
packages/sdk/src/workflows/runner.ts
Imports and delegates scrubForChannel to channel-messenger; parenthesizes reliability-related repairRetries expressions; reformats multiline permission/resolve calls and spawnAndWait handlers without behavioral change.
Scrubbing regression tests and formatStepOutput tests
packages/sdk/src/workflows/__tests__/scrub-pty-chrome.test.ts, packages/sdk/src/workflows/__tests__/channel-messenger.test.ts
New Vitest tests covering PTY/TUI chrome removal scenarios, malformed-frame stripping, secret redaction, preservation of workflow control signals, and a guard against over-aggressive scrubbing; minor test-data formatting refactor.
Agent idle threshold documentation
packages/sdk/src/workflows/builder.ts
Expanded JSDoc for AgentOptions.idleThresholdSecs describing default behavior, disabling with 0, and override guidance.

Broker Initialization State Detection

Layer / File(s) Summary
Targeted stale-state file detection
crates/broker/src/runtime/init.rs
Replaces broad .agent-relay/ existence warning with a read-and-filter for state-*.json files and emits warnings only when such files are present, listing count and paths.

Trajectories metadata

Layer / File(s) Summary
Trajectories index update
.trajectories/index.json
Updates lastUpdated timestamp and appends a completed trajectory record traj_af7iew24eiip.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • willwashburn

Poem

🐰 Hops through tty-crumbs and clears the mess,

trims footers, vim signs, and spinner stress.
Keeps the transcript honest, secrets hid tight,
returns clean lines so workflows sleep at night. 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the three main changes: tightening PTY scrubbing, documenting idle override, and reducing false-positive warnings.
Description check ✅ Passed The description covers all required template sections with detailed context, comprehensive test plan, and specific implementation details for each of the three fixes.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/workflow-runtime-game-lessons

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 potential issue.

View 5 additional findings in Devin Review.

Open in Devin Review

Comment thread crates/broker/src/runtime/init.rs Outdated
// on virtually every workflow run — a noisy false positive. The
// discriminator is `state.json`, which only the broker writes and only
// in persist mode.
let stale_state = runtime_cwd.join(".agent-relay").join("state.json");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Stale state check looks for state.json but persist mode creates state-{name}.json

The new stale-state detection checks for {cwd}/.agent-relay/state.json (init.rs:44), but ensure_runtime_paths in crates/broker/src/runtime/paths.rs:140,176,187 always creates state-{safe_name}.json (e.g., state-my-project.json) in persist mode. The only code that creates a bare state.json is ensure_ephemeral_paths (paths.rs:64), but that writes to a temp directory, not .agent-relay/. As a result, this stale-state warning will never fire for any persist-mode run, silently defeating the intent of the check.

Filename used in persist mode vs stale check

Persist mode (paths.rs:187):

state: root.join(format!("state-{safe_name}.json")),

Stale check (init.rs:44):

let stale_state = runtime_cwd.join(".agent-relay").join("state.json");
Prompt for agents
The stale state check hardcodes `state.json` but `ensure_runtime_paths` (paths.rs:140,176,187) creates `state-{safe_name}.json` where safe_name is derived from the broker name. This means the stale check never matches files left by a persist-mode broker.

To fix this, the check should look for any file matching the `state-*.json` glob pattern inside `.agent-relay/`, rather than a single hardcoded name. For example, use `std::fs::read_dir` on the `.agent-relay/` directory and check for any entry whose name starts with `state-` and ends with `.json`. Alternatively, just check for `.agent-relay/state-{safe_name}.json` using the same `resolved_name` variable that is already computed above (after sanitizing it the same way `ensure_runtime_paths` does).

The suggested-removal message should also be updated to reference the correct filename(s).
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in e2ceccf. ensure_runtime_paths writes state-{safe_name}.json (not bare state.json), so my original check would never have fired. Replaced the single-name check with a glob of .agent-relay/state-*.json so it catches state files from any broker name, and the warning now lists every match.

Proactive Runtime Bot and others added 2 commits May 20, 2026 12:41
PR feedback from Devin Review on #930: the prior commit checked for
.agent-relay/state.json, but ensure_runtime_paths in runtime/paths.rs
writes state files as state-{safe_name}.json (where safe_name is the
sanitized broker name) — only ensure_ephemeral_paths writes a bare
state.json, and that's to a temp dir, never to .agent-relay/.

As a result the stale-state warning would never fire for any persist-
mode run, silently defeating the intent of the check.

Fix: glob .agent-relay/ for any state-*.json entry instead of looking
for a single hardcoded name. Surface every match so the user can see
exactly what's stale regardless of which broker name produced it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/sdk/src/workflows/__tests__/scrub-pty-chrome.test.ts`:
- Around line 23-35: The test currently only ensures tokens are removed but may
allow footer line fragments to remain; update the test around scrub to assert
the entire footer lines are absent by checking that the exact footer strings are
not present as whole lines (e.g., use expect(out.split('\n')).not.toContain(...)
or a multiline regex like not.toMatch(/^[^\n]*Opus.*ctx.*\$\d+\.\d+$/m)) so the
full footer lines produced by Claude are stripped; reference the scrub helper
(and scrubForChannel if used elsewhere) when locating the logic to validate.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: d077cd2e-9de4-4794-9ff6-92dc68afced2

📥 Commits

Reviewing files that changed from the base of the PR and between f5f62e0 and 87cda72.

📒 Files selected for processing (4)
  • crates/broker/src/runtime/init.rs
  • packages/sdk/src/workflows/__tests__/scrub-pty-chrome.test.ts
  • packages/sdk/src/workflows/builder.ts
  • packages/sdk/src/workflows/runner.ts

Comment on lines +23 to +35
it('strips the Claude Code bottom status bar (model + ctx% + cost)', () => {
const input = [
'real content line',
'workflows git:(main) Opus 4.7 (1M context) ctx:5% $1.45',
'Opus4.7(1Mcontext) ctx:6% $1.54',
'another real line',
].join('\n');
const out = scrub(input);
expect(out).toContain('real content line');
expect(out).toContain('another real line');
expect(out).not.toMatch(/ctx\s*:\s*\d+%/);
expect(out).not.toMatch(/\$\d+\.\d+/);
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assert full footer-line removal, not just token removal.

This test can still pass if scrubForChannel leaves non-token footer fragments behind. Add exact-line absence checks to guard against partial stripping regressions.

Suggested test tightening
   const out = scrub(input);
   expect(out).toContain('real content line');
   expect(out).toContain('another real line');
+  expect(out).not.toContain('workflows git:(main) Opus 4.7 (1M context) ctx:5% $1.45');
+  expect(out).not.toContain('Opus4.7(1Mcontext) ctx:6% $1.54');
   expect(out).not.toMatch(/ctx\s*:\s*\d+%/);
   expect(out).not.toMatch(/\$\d+\.\d+/);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/sdk/src/workflows/__tests__/scrub-pty-chrome.test.ts` around lines
23 - 35, The test currently only ensures tokens are removed but may allow footer
line fragments to remain; update the test around scrub to assert the entire
footer lines are absent by checking that the exact footer strings are not
present as whole lines (e.g., use expect(out.split('\n')).not.toContain(...) or
a multiline regex like not.toMatch(/^[^\n]*Opus.*ctx.*\$\d+\.\d+$/m)) so the
full footer lines produced by Claude are stripped; reference the scrub helper
(and scrubForChannel if used elsewhere) when locating the logic to validate.

runner.ts: packages/sdk/src/workflows/runner.ts
runner.ts: packages/sdk/src/workflows/runner.ts
channel-messenger.ts: packages/sdk/src/workflows/channel-messenger.ts

Co-Authored-By: My Senior Dev <dev@myseniordev.com>
@khaliqgant khaliqgant merged commit 0ec31de into main May 20, 2026
44 checks passed
@khaliqgant khaliqgant deleted the fix/workflow-runtime-game-lessons branch May 20, 2026 12:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant