
feat(harness): relocate to data dir, split runtime/scratch, archive on upgrade#38

Merged
Alezander9 merged 9 commits into main from
feat/harness-data-dir-relocation-v2
May 6, 2026


@Alezander9

Summary

Relocates the vendored harness from ~/.cache/bcode/harness/ to <dataDir>/harness/, splits sock from screenshots so screenshots land in a clean persistent path, and snapshots the harness on every binary upgrade so the agent can read past versions when migrating its helpers. Also tightens up the browser_execute tool description.

Five focused commits on top of clean main (post upstream-v1.14.39 sync). Zero Red-zone changes - the harness Python diff that this work needed shipped via browser-use/browser-harness PR #318 and was synced into this repo before this PR was rebased.

Commits

| SHA | Subject | Why |
| --- | --- | --- |
| 72f58df8 | browser_execute: defer workflow guidance to vendored SKILL.md | Tool description was hard-wrapped, broken in traces, and duplicated SKILL.md. Replaced with four concise paragraphs that say only what the agent needs. |
| 531cced2 | harness: relocate to data dir, add build-hash extraction sentinel | The harness contains agent-edited files; it is data, not cache. Move to <dataDir>/harness/ so a ~/.cache wipe no longer destroys agent self-improvements. Add SHA-256 sentinel at .bcode-build so warm launches skip extraction (one stat). |
| c706f37f | harness: split BH_RUNTIME_DIR (sock) from BH_TMP_DIR (screenshots) | Adopts the upstream BH_RUNTIME_DIR/BH_TMP_DIR split. Sock stays short under /tmp/bcode/<sid>/ (AF_UNIX 104-byte budget); screenshots live deep under <dataDir>/sessions/<sid>/. |
| a22a4c58 | harness: snapshot to harness-archive/ on binary upgrade | When the sentinel mismatches, copy the active tree to <dataDir>/harness-archive/<old-buildHash>/ (excludes .venv/, __pycache__/). Agent reads this read-only history to migrate its own helpers across upgrades. |
| ec490cad | browser_execute: resolve harness path in tool description | Description referenced packages/bcode-browser/harness/SKILL.md by source-tree path, which does not exist on disk in compiled mode. Templated to {{HARNESS_DIR}} + substitute the resolved path at make-time. |

Why this matters

Before this PR, the harness lived in ~/.cache/, which:

  • gets nuked by routine cache wipes, taking the agent's edits to agent_helpers.py with it
  • forced screenshots into the same path-length-budgeted dir as the AF_UNIX sock, so screenshots ended up in awkward locations
  • left no trail across binary upgrades - the agent had no way to see what its previous version of a helper looked like

After:

  • harness is data, lives where data lives, survives cache wipes
  • sock and screenshots have their own dirs sized for their respective constraints
  • every binary upgrade leaves a read-only snapshot the agent can diff against
  • the entire agent-workspace/ subtree (helpers + agent-authored skills) persists across upgrades, not just agent_helpers.py

Modification zones

  • Green: packages/bcode-browser/src/{harness.ts,browser-execute.ts}, packages/bcode-browser/script/embed-harness.ts.
  • Yellow: packages/opencode/src/agent/agent.ts (permission whitelist for new paths), packages/opencode/src/tool/browser-execute.{ts,txt} (Level-2 adapter, schema/context translation only).
  • Red: none. The harness Python edits that this work originally needed went through browser-use/browser-harness PR #318 and arrived via the standard sync path (merge 3d7f38ff2).

Tests

  • bun run typecheck (filtered): 5/5 packages clean.
  • pytest tests/ in packages/bcode-browser/harness: 93/93 passing.
  • Manual smoke test pending on the user's Mac - harness extraction + sock budget + first-call SKILL.md read all need a real run.

Alezander9 added 5 commits May 6, 2026 15:29
The tool description was hard-wrapped at ~80 columns mid-paragraph,
showing up in traces with broken sentences, and duplicated content from
the vendored harness's SKILL.md. Replace with four concise paragraphs
that say only what the agent needs to decide whether to use the tool,
and require reading SKILL.md before the first call. helpers.py is
optional reference for exact signatures.

The vendored harness contains agent-edited files (agent_helpers.py and,
later, domain-skills), which are data, not cache. Move the extraction
target from ~/.cache/bcode/harness/ to <Global.Path.data>/harness/
(~/.local/share/bcode/harness/ on Linux/Mac) so a ~/.cache wipe no
longer destroys agent self-improvements.

Add a content-hash sentinel at <harness>/.bcode-build that records the
embed bundle that produced the on-disk tree. Warm launches stat the
sentinel and skip extraction; binary upgrades trigger a fresh extract
that overwrites every embed file except anything under agent-workspace/
(the Green-zone subtree: agent_helpers.py and any agent-authored files
like domain-skills/<host>/*.md persist across upgrades).
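
A minimal sketch of the sentinel check described above, under stated assumptions: the embed bundle is modeled as a plain path-to-contents map, `EmbedBundle`, `buildHash`, and `extractIfStale` are hypothetical names, and files under agent-workspace/ are seeded once and then left alone. It reads the sentinel instead of only stat-ing it, so the actual harness.ts may well differ.

```typescript
import * as crypto from "node:crypto";
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical model of the embedded bundle: relative path -> file contents.
type EmbedBundle = Record<string, string>;

// Hash paths and contents in sorted order so the same bundle always
// produces the same build hash.
export function buildHash(bundle: EmbedBundle): string {
  const hash = crypto.createHash("sha256");
  for (const [rel, content] of Object.entries(bundle).sort(([a], [b]) => (a < b ? -1 : 1))) {
    hash.update(rel).update("\0").update(content).update("\0");
  }
  return hash.digest("hex");
}

// Returns true when extraction ran, false when the warm-launch check skipped it.
export function extractIfStale(harnessDir: string, bundle: EmbedBundle): boolean {
  const sentinel = path.join(harnessDir, ".bcode-build");
  const want = buildHash(bundle);
  let have: string | undefined;
  try {
    have = fs.readFileSync(sentinel, "utf8").trim();
  } catch {
    have = undefined; // no sentinel yet: cold launch
  }
  if (have === want) return false;

  for (const [rel, content] of Object.entries(bundle)) {
    const dest = path.join(harnessDir, rel);
    // Assumption: agent-workspace/ files are seeded once, never overwritten.
    if (rel.startsWith("agent-workspace/") && fs.existsSync(dest)) continue;
    fs.mkdirSync(path.dirname(dest), { recursive: true });
    fs.writeFileSync(dest, content);
  }
  fs.writeFileSync(sentinel, want); // written last, after every file landed
  return true;
}

// Usage: the second call with the same bundle is the warm-launch path.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "bcode-harness-"));
const demoBundle: EmbedBundle = {
  "SKILL.md": "v1",
  "agent-workspace/agent_helpers.py": "# seed",
};
console.log(extractIfStale(demoDir, demoBundle)); // true
console.log(extractIfStale(demoDir, demoBundle)); // false
```

Writing the sentinel last means an interrupted extraction leaves the old (or no) hash behind, so the next launch simply extracts again.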

Resolve the harness eagerly at BrowserExecute.make() time rather than
lazily on first browser_execute call, so SKILL.md is on disk by the
time the agent reads the tool description (which now requires reading
SKILL.md before the first call). Migration: pre-existing harness at
~/.cache/bcode/harness/ is renamed to the new location on first launch
(EXDEV fallback to recursive cp+rm), preserving the agent-workspace tree.
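
The rename-with-fallback migration could look roughly like this. `migrateLegacyHarness` is a hypothetical name and the skip-if-target-exists policy is an assumption; only the rename, the EXDEV check, and the recursive copy plus remove come from the commit message.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Move a legacy harness tree to its new home. Returns true if a move happened.
export function migrateLegacyHarness(oldDir: string, newDir: string): boolean {
  // Assumption: nothing to do if there is no legacy tree, or the new one already exists.
  if (!fs.existsSync(oldDir) || fs.existsSync(newDir)) return false;
  fs.mkdirSync(path.dirname(newDir), { recursive: true });
  try {
    fs.renameSync(oldDir, newDir); // cheap when both paths share a filesystem
  } catch (err) {
    // rename(2) fails with EXDEV across filesystems; anything else is a real error.
    if ((err as NodeJS.ErrnoException).code !== "EXDEV") throw err;
    fs.cpSync(oldDir, newDir, { recursive: true });
    fs.rmSync(oldDir, { recursive: true, force: true });
  }
  return true;
}

// Usage with throwaway directories standing in for ~/.cache and the data dir.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "bcode-migrate-"));
const legacy = path.join(root, "cache", "bcode", "harness");
const target = path.join(root, "data", "bcode", "harness");
fs.mkdirSync(path.join(legacy, "agent-workspace"), { recursive: true });
fs.writeFileSync(path.join(legacy, "agent-workspace", "agent_helpers.py"), "# agent edits\n");
console.log(migrateLegacyHarness(legacy, target)); // true
```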

Pass dataDir as a parameter to resolveHarnessDir/make() so
@browser-use/bcode-browser stays decoupled from @opencode-ai/core; the
opencode adapter supplies Global.Path.data. Agent permission whitelist
updated to use the new path via the exported Harness.harnessDir helper.

Adopts the upstream BH_RUNTIME_DIR/BH_TMP_DIR split (browser-harness
PR #318): the harness now keeps sock/port/pid in BH_RUNTIME_DIR and
log/screenshots/debug overlays in BH_TMP_DIR. Wire that through bcode:

  bhScratchDir   <dataDir>/sessions/<sid>/   persistent, deep path OK
  bhRuntimeDir   /tmp/bcode/<sid>/ (POSIX) or os.tmpdir()/bcode/<sid>/
                                             volatile, AF_UNIX-budget short

Splits ExecuteContext.bhTmpDir into bhScratchDir + bhRuntimeDir; adds
sessionRuntimeDir() alongside sessionScratchDir(). The opencode adapter
passes Global.Path.data for scratch and the platform tmpdir for runtime.
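
To make the path budget concrete, here is a rough sketch of the two directory helpers. The function names come from the commit message, but their signatures, the `bh.sock` file name, and `fitsSockBudget` are assumptions for illustration. The 104-byte figure is the budget quoted in this PR; the check below also reserves one byte for the trailing NUL.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Budget quoted in this PR for an AF_UNIX socket path.
const SOCK_PATH_BUDGET = 104;

// Persistent per-session dir for logs and screenshots; a deep path is fine here.
export function sessionScratchDir(dataDir: string, sid: string): string {
  return path.join(dataDir, "sessions", sid);
}

// Volatile per-session dir for sock/port/pid; kept short on purpose.
export function sessionRuntimeDir(sid: string): string {
  const base = os.platform() === "win32" ? os.tmpdir() : "/tmp";
  return path.join(base, "bcode", sid);
}

// Hypothetical guard: does the socket path (plus NUL terminator) fit the budget?
export function fitsSockBudget(sockPath: string): boolean {
  return Buffer.byteLength(sockPath, "utf8") + 1 <= SOCK_PATH_BUDGET;
}

// Usage: the runtime path is what the socket budget applies to.
const sock = path.join(sessionRuntimeDir("ses_123"), "bh.sock");
console.log(sock, fitsSockBudget(sock));
```
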
When the build-hash sentinel mismatches (= bcode binary upgrade), copy
the active <dataDir>/harness/ tree to
<dataDir>/harness-archive/<old-buildHash>/ before re-extracting. The
agent uses this read-only history when migrating its own helpers across
upgrades — e.g. checking how a helper signature changed between
versions, or recovering an interaction-skill it had locally edited.

Excludes .venv/ and __pycache__/ from the snapshot (regenerable +
bulky). Idempotent: if the archive subdir already exists, skip the copy
(handles concurrent first-callers).
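
A sketch of the snapshot step, assuming Node's `fs.cpSync` with a `filter` callback. `archiveHarness` is a hypothetical name, and the existence check is the simple skip described above rather than a lock.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Regenerable and bulky directories that never go into a snapshot.
const EXCLUDED = new Set([".venv", "__pycache__"]);

// Copy <dataDir>/harness to <dataDir>/harness-archive/<oldBuildHash>.
// Returns false when that snapshot already exists.
export function archiveHarness(dataDir: string, oldBuildHash: string): boolean {
  const src = path.join(dataDir, "harness");
  const dest = path.join(dataDir, "harness-archive", oldBuildHash);
  if (fs.existsSync(dest)) return false; // someone already archived this build
  fs.mkdirSync(path.dirname(dest), { recursive: true });
  fs.cpSync(src, dest, {
    recursive: true,
    // Returning false for a directory skips it and everything below it.
    filter: (from) => !EXCLUDED.has(path.basename(from)),
  });
  return true;
}

// Usage against a throwaway data dir.
const dataDir = fs.mkdtempSync(path.join(os.tmpdir(), "bcode-archive-"));
fs.mkdirSync(path.join(dataDir, "harness", ".venv"), { recursive: true });
fs.mkdirSync(path.join(dataDir, "harness", "agent-workspace"), { recursive: true });
fs.writeFileSync(path.join(dataDir, "harness", ".venv", "pyvenv.cfg"), "");
fs.writeFileSync(path.join(dataDir, "harness", "agent-workspace", "agent_helpers.py"), "# v1");
console.log(archiveHarness(dataDir, "abc123")); // true
console.log(archiveHarness(dataDir, "abc123")); // false
```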

Whitelist <dataDir>/harness-archive/* in the agent permission ruleset
so reads/edits don't prompt, symmetric with the active harness whitelist.

The description references SKILL.md and helpers.py by source-tree path,
which only resolves correctly in dev mode. In compiled binaries the
harness extracts to <dataDir>/harness/ and those source paths don't
exist on disk, so the agent's first read fails.

Template the path: replace packages/bcode-browser/harness/ with
{{HARNESS_DIR}} in the description, and substitute the resolved path
(dev or compiled) in the opencode adapter at make-time. Eager harness
extraction (already in the data-dir relocation commit) guarantees the
files exist before the agent reads the description.
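
The substitution itself can be very small. A sketch, assuming a hypothetical `renderDescription` helper that fails loudly on an unknown placeholder (the real adapter may simply do a string replace):

```typescript
// Replace {{NAME}} placeholders with resolved values.
// Throws on a placeholder that has no value, so a typo in the
// template cannot silently reach the agent.
export function renderDescription(
  template: string,
  vars: Record<string, string>,
): string {
  return template.replace(/\{\{([A-Z_]+)\}\}/g, (match: string, key: string) => {
    const value = vars[key];
    if (value === undefined) {
      throw new Error(`unresolved placeholder ${match}`);
    }
    return value;
  });
}

// Usage: the path below is an example value, not a real install location.
const rendered = renderDescription(
  "Read {{HARNESS_DIR}}/SKILL.md before the first call.",
  { HARNESS_DIR: "/home/me/.local/share/bcode/harness" },
);
console.log(rendered);
// Read /home/me/.local/share/bcode/harness/SKILL.md before the first call.
```
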

@cubic-dev-ai cubic-dev-ai Bot left a comment


2 issues found across 6 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/bcode-browser/src/harness.ts">

<violation number="1" location="packages/bcode-browser/src/harness.ts:140">
P2: `resolveHarnessDir(dataDir)` caches extraction with a single global promise, so subsequent calls with a different `dataDir` can return the wrong harness path. Cache this per `dataDir` (or enforce a single immutable `dataDir`) to avoid cross-directory contamination.</violation>
</file>

<file name="packages/opencode/src/agent/agent.ts">

<violation number="1" location="packages/opencode/src/agent/agent.ts:104">
P2: `harness-archive` is whitelisted as fully allowed external directory access, which permits accidental edits/deletes of snapshots that are intended to be read-only.</violation>
</file>


Alezander9 added 2 commits May 6, 2026 15:47
Tells the agent up front: edit only under agent-workspace/ (those
edits persist), everything else gets overwritten on upgrade, and
previous trees are kept at <harness-archive>/<old-build>/ for
reference. One short paragraph; the description still defers all
workflow guidance to SKILL.md.

Templates {{HARNESS_ARCHIVE_DIR}} alongside {{HARNESS_DIR}}; both
substituted in the opencode adapter from BrowserExecute.make().

resolveHarnessDir cached extraction with a single module-level promise
keyed by nothing, so a second call with a different dataDir returned
the first call's path. In production opencode passes a singleton
Global.Path.data so this never bit, but tests and any future
multi-instance scenario would silently get cross-dataDir contamination.
Switch to a Map<dataDir, Promise<path>> — same dataDir still
deduplicates, distinct dataDirs each get their own extraction.
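
In sketch form, the per-dataDir cache looks like this. `resolveOnce` and the injected `extract` callback are hypothetical stand-ins for the real resolveHarnessDir internals.

```typescript
// One in-flight (or settled) extraction per dataDir.
const extractionCache = new Map<string, Promise<string>>();

// Same dataDir -> same promise (deduplicated);
// different dataDir -> its own extraction.
export function resolveOnce(
  dataDir: string,
  extract: (dataDir: string) => Promise<string>,
): Promise<string> {
  let pending = extractionCache.get(dataDir);
  if (!pending) {
    pending = extract(dataDir);
    extractionCache.set(dataDir, pending);
  }
  return pending;
}

// Usage: a fake extractor that records how often it runs.
let extractCalls = 0;
const fakeExtract = async (dir: string): Promise<string> => {
  extractCalls += 1;
  return `${dir}/harness`;
};
const a1 = resolveOnce("/data/a", fakeExtract);
const a2 = resolveOnce("/data/a", fakeExtract); // reuses a1
const b1 = resolveOnce("/data/b", fakeExtract); // separate extraction
console.log(a1 === a2, a1 === b1, extractCalls); // true false 2
```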

harness-archive/ was whitelisted in external_directory:allow, which
let edit/write/apply_patch silently mutate snapshots that are intended
to be read-only history. Keep the dir-level whitelist (so reads stay
silent — the agent is supposed to browse the archive when migrating
helpers across upgrades), but add an edit:deny rule keyed on
'*/harness-archive/*'. The leading * absorbs the worktree-relative
prefix that edit/write/apply_patch produce; the dir name is the anchor.
All three edit-class tools route through permission='edit' so one rule
covers them. Bash-level mutations (rm -rf) are still possible, but the
agent has no prompt-driven path to them and the user can deny bash
explicitly via config if desired.
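
To illustrate why the leading `*` matters, here is a tiny wildcard matcher. It assumes `*` matches any run of characters, including `/`; that is an assumption about the permission matcher, not a copy of it, and `wildcardMatch` is a hypothetical name. The example paths are made up.

```typescript
// Minimal wildcard match: '*' matches any run of characters (assumed to
// include '/'); every other character is matched literally.
export function wildcardMatch(pattern: string, input: string): boolean {
  const escaped = pattern
    .replace(/[.+^${}()|[\]\\?]/g, "\\$&") // escape regex metacharacters
    .replace(/\*/g, ".*"); // then expand the wildcard
  return new RegExp(`^${escaped}$`).test(input);
}

const rule = "*/harness-archive/*";

// A worktree-relative path into the archive: the leading * absorbs the prefix.
const archived = "../../.local/share/bcode/harness-archive/abc123/helpers.py";
// A path into the active harness: no 'harness-archive' segment, so no match.
const active = "../../.local/share/bcode/harness/agent-workspace/agent_helpers.py";

console.log(wildcardMatch(rule, archived)); // true
console.log(wildcardMatch(rule, active)); // false
```
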

@cubic-dev-ai cubic-dev-ai Bot left a comment


1 issue found across 2 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/bcode-browser/src/harness.ts">

<violation number="1" location="packages/bcode-browser/src/harness.ts:147">
P2: Avoid caching failed extraction promises permanently; clear the cache entry on rejection so later calls can retry.</violation>
</file>


Alezander9 added 2 commits May 6, 2026 16:01
Per-dataDir cache from previous commit retained rejected promises forever,
so a transient failure (disk full mid-extract, ephemeral file lock,
network blip in a sub-call) would poison resolveHarnessDir for the rest
of the process — only a restart could recover.

Attach a sibling .catch that evicts the cache entry on rejection. The
returned promise still rejects to the original caller and any concurrent
waiters; only the cache slot is freed so the next call retries fresh.

Caching a rejected promise meant a single transient extraction failure
(FS hiccup, partial write, race during install) bricked every later
resolveHarnessDir call until process restart. Attach a .catch that
deletes the entry, gated by '=== fresh' so a retry that started
between failure and handler doesn't get evicted.
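
A sketch of the eviction with the identity guard, under the same assumptions as before: `resolveWithRetry` and the injected `extract` callback are hypothetical, and only the sibling `.catch` plus the `=== fresh` check come from the commit messages.

```typescript
const retryCache = new Map<string, Promise<string>>();

// Cache extraction per key, but never keep a rejected promise around.
export function resolveWithRetry(
  key: string,
  extract: (key: string) => Promise<string>,
): Promise<string> {
  const cached = retryCache.get(key);
  if (cached) return cached;

  const fresh = extract(key);
  retryCache.set(key, fresh);

  // Sibling handler: callers still see the rejection on `fresh` itself.
  // Only evict if the slot still holds this exact promise, so a retry
  // that already replaced it is left alone.
  fresh.catch(() => {
    if (retryCache.get(key) === fresh) retryCache.delete(key);
  });
  return fresh;
}

// Usage: fails once, then succeeds on the retry.
let attempts = 0;
const flakyExtract = async (dir: string): Promise<string> => {
  attempts += 1;
  if (attempts === 1) throw new Error("disk full mid-extract");
  return `${dir}/harness`;
};

export async function demo(): Promise<string> {
  try {
    await resolveWithRetry("/data", flakyExtract);
  } catch {
    // first attempt failed; the cache slot has been freed by now
  }
  return resolveWithRetry("/data", flakyExtract); // retries instead of replaying the failure
}
```

The eviction handler is attached before any caller awaits the promise, so by the time a caller observes the rejection the slot is already free.
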
@Alezander9 Alezander9 merged commit 1b55206 into main May 6, 2026
3 checks passed