Apply browser-leak fix to claude_code_harness + pi_harness (extends PR #14)#15

Open
Alezander9 wants to merge 1 commit into codex/bu-bench-framework-reverification from fix-cch-pi-harness-symmetry

@Alezander9
Member

Summary

Extends PR #14's framework verifier review fixes. The cubic-ai review caught the browser-leak class in codex_harness (fixed in commit e3a3ebf), but claude_code_harness and pi_harness share the same start_remote_daemon API and the same execute() shape. In both, _start_browser is called before the env/cmd build and before asyncio.create_subprocess_exec, while _stop_browser only runs in an inner try/finally that begins after the subprocess is spawned. A failure in the env build, system_prompt.read_text(), _build_*_cmd, or create_subprocess_exec therefore leaks the provisioned cloud browser session, which is exactly the bug cubic flagged on codex_harness.

This PR applies the same try/except _stop_browser wrap to cch and pi_harness. After this, all three browser-harness frameworks (cch, pi_harness, codex_harness) have parallel structure: three _stop_browser(browser_name, bu_name) call sites each — env-build early-exit, subprocess-spawn early-exit, normal finally.
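The three call sites described above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the helper bodies, `build_env_and_cmd`, the `fail_build` flag, and the `stopped` list are hypothetical stand-ins, and the real `_start_browser` / `_stop_browser` may well be async or take different arguments.

```python
import asyncio
import os
import sys

# Hypothetical stand-ins for the harness helpers (assumed, not the real code).
stopped = []  # records each teardown so the behaviour can be observed


def _start_browser(browser_name, bu_name):
    pass  # the real helper would provision a cloud browser session


def _stop_browser(browser_name, bu_name):
    stopped.append((browser_name, bu_name))


def build_env_and_cmd(fail=False):
    # Stand-in for env build, system prompt read, and _build_*_cmd.
    if fail:
        raise RuntimeError("env build failed")
    return dict(os.environ), [sys.executable, "-c", "print('ok')"]


async def execute(browser_name, bu_name, fail_build=False):
    _start_browser(browser_name, bu_name)

    # Call site 1: env/cmd build fails before any subprocess exists.
    # BaseException is this sketch's choice so that cancellation is covered too.
    try:
        env, cmd = build_env_and_cmd(fail=fail_build)
    except BaseException:
        _stop_browser(browser_name, bu_name)
        raise

    # Call site 2: spawn (or stderr task creation) fails part-way.
    proc = None
    try:
        proc = await asyncio.create_subprocess_exec(
            *cmd,
            env=env,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        stderr_task = asyncio.create_task(proc.stderr.read())
    except BaseException:
        if proc is not None and proc.returncode is None:
            proc.kill()  # do not leave a partially started process behind
            await proc.wait()
        _stop_browser(browser_name, bu_name)
        raise

    # Call site 3: normal path, teardown always runs in finally.
    try:
        out = await proc.stdout.read()
        await proc.wait()
        await stderr_task
        return out.decode().strip()
    finally:
        _stop_browser(browser_name, bu_name)
```

In this sketch exactly one `_stop_browser` call runs per `execute()` invocation, whichever stage fails, because the two early-exit handlers re-raise before control can reach the final try/finally.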

This was discovered while porting PR #14's fixes to the internal benchmark-x-laminar repo (browser-use/benchmark-x-laminar#24). The audit there ran an AST comparison of every execute() body and found these two were the only ones still vulnerable.

Targeting

This PR targets codex/bu-bench-framework-reverification (PR #14's head branch) so the fix merges into the existing review thread rather than landing as a separate PR after #14 merges.

Validation

  • python3 -m compileall -q frameworks/claude_code_harness/run_task.py frameworks/pi_harness/run_task.py runs clean
  • Diff structure mirrors the codex_harness fix in e3a3ebf (just adapted for cch's / pi_harness's slightly different env dict shape)
  • 2 files changed, +74 / -46

Related

The cubic-ai review caught this leak class in codex_harness (commit
e3a3ebf), but claude_code_harness and pi_harness share the same
start_remote_daemon API and the same execute() shape: _start_browser
is called before env/cmd build and before asyncio.create_subprocess_exec,
but _stop_browser only runs in an inner try/finally that begins AFTER
subprocess spawn. A failure in env build, system_prompt read,
_build_*_cmd, or create_subprocess_exec leaks the provisioned cloud
browser.

Wraps env+cmd build in try/except _stop_browser, and the subprocess
spawn + stderr_task creation in a separate try/except that also
kills any partially-started proc before tearing down the browser.
Same pattern as the codex_harness fix; symmetric now for all three
browser-harness frameworks (cch, pi_harness, codex_harness).

@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 2 files
