Skip to content

Harness-config decoupling: top-level harnesses/ bundles + opt-in install#161

Open
scion-gteam[bot] wants to merge 25 commits into
mainfrom
scion/harness-refactor
Open

Harness-config decoupling: top-level harnesses/ bundles + opt-in install#161
scion-gteam[bot] wants to merge 25 commits into
mainfrom
scion/harness-refactor

Conversation

@scion-gteam
Copy link
Copy Markdown

@scion-gteam scion-gteam Bot commented Jun 6, 2026

Summary

Decouples harness configs from the scion binary. Moves OpenCode/Codex out of Go embeds into a new top-level harnesses/ directory of plain on-disk bundles, ports the Antigravity harness, shrinks the default-install set to {claude, gemini}, and removes the compiled-in OpenCode/Codex Go implementations entirely. OpenCode, Codex, and Antigravity become opt-in bundles installed via scion harness-config install harnesses/<name>.

Implements the plan in .design/harness-config-decoupling.md (decisions locked 2026-06-06).

Phases

  • A — Establish harnesses/ + relocate OpenCode/Codex. New harnesses/{opencode,codex}/ with explicit home/** layout, co-located Dockerfile + per-bundle cloudbuild.yaml, per-bundle READMEs, and a golden install/layout safety-net test (TestBundleInstall_OpenCode/Codex) proving byte-parity with the embeds before they were removed.
  • B — Port Antigravity. harnesses/antigravity/ ported from ptone/scion-antigravity. No schema changes needed — the existing HarnessConfigEntry already supports the mcp block, oauth-token, vertex-ai: {}, and dialect. Image build repointed to the in-repo scion-base chain.
  • C — Shrink default-install set. harness.All(){gemini, claude}; call sites audited; 4-harness assertions updated; hub pull-images fallback shrunk to the default set.
  • D — Drop the Go. Removed pkg/harness/{opencode,codex} (embeds + impls), codex_config.go, and the parity tests; pruned the New()/newBuiltin() cases. Excluded Dockerfile/cloudbuild.yaml/README.md/.gitkeep from ComputeHarnessConfigRevision. Repointed harness-config reset to a clear "reinstall from bundle" error for opt-in harnesses. Removed image-build/{opencode,codex}/ and repointed the build DAG + cloudbuild-harnesses.yaml at the bundle dirs.
  • E — Discoverability & docs. New harnesses/README.md index; updated image-build/README.md, top-level README.md, and design-doc cross-refs.
  • F — Migration for existing installs. Confirmed init/upgrade is additive (existing configs untouched). Closed the legacy-provisioner.type: builtin gap: resolve.go now warns when a legacy on-disk opencode/codex config has no container-script provisioner, and scion harness-config upgrade auto-activates container-script when a provision.py is present (else warns to reinstall). Added a migration guide.

Testing

  • go build ./... clean; go vet clean.
  • New tests: bundle install/layout (opencode, codex, antigravity), revision-hash exclusions, bundle-reset error, additive-only seeding, legacy-builtin resolve warning, and upgrade auto-activation.
  • Zero regressions: a full-repo base-vs-branch failing-test diff shows an identical set of failures on both (all pre-existing, environment-dependent: hub/fixturegen/project-discovery tests that need container runtime / network / hub infra unavailable in CI sandbox). No new failures introduced by this branch.

Notes

  • scion harness-config list --available is deferred as a documented follow-up.
  • Out of scope (per plan): migrating Claude/Gemini to container-script bundles; splitting bundles into separate repos.

🤖 Generated with Claude Code

ptone and others added 25 commits June 6, 2026 20:40
Plan to move harness configs out of the default-install set into a new
top-level harnesses/ bundle directory. Refactors OpenCode and Codex into
on-disk bundles, ports the Antigravity harness config, and drops the Go
embeds + built-in implementations in favor of opt-in
'scion harness-config install'.

Decisions locked with ptone:
- harnesses/ at repo root
- default-install set shrinks to {claude, gemini}
- drop Go entirely (embeds + codex.go/opencode.go/codex_config.go)
- co-locate Dockerfile + cloudbuild.yaml in each bundle
- keep first-party bundles in-repo
… subdirs

Phase A.1 of the harness-config decoupling refactor: establish the new
top-level harnesses/ directory that will hold harness bundles as plain
on-disk artifacts (replacing the Go embeds in a later phase).
…layout

Phase A.2 of the harness-config decoupling refactor. Copies (not moves)
the embedded harness-config files from pkg/harness/{opencode,codex}/embeds/
into the new harnesses/ directory using the explicit home/** layout that
replaces the implicit mapEmbedFileToHomePath placement:

OpenCode:
  - config.yaml, provision.py at bundle root
  - opencode.json → home/.config/opencode/opencode.json

Codex:
  - config.yaml, provision.py at bundle root
  - config.toml → home/.codex/config.toml
  - scion_notify.sh → home/.codex/scion_notify.sh
  - bashrc → home/.bashrc

The Go embeds under pkg/harness/*/embeds/ are intentionally preserved
for Phase D — the tree must keep compiling with both copies.
Phase A.3 of the harness-config decoupling refactor (Decision 4). Copies
the Dockerfiles from image-build/{opencode,codex}/ into each bundle and
adds per-bundle cloudbuild.yaml files extracted from the centralized
image-build/cloudbuild-harnesses.yaml, threading BASE_IMAGE from
scion-base. The centralized image-build/ dirs are NOT deleted yet (Phase D).
Phase A.4 of the harness-config decoupling refactor. Adds
TestBundleInstall_OpenCode and TestBundleInstall_Codex which serve as the
safety net replacing the parity oracle before Phase D removes the Go
implementations:

- Validates config.yaml via LoadHarnessConfigDir
- Simulates opt-in install via util.CopyDir (same as installLocally)
- Asserts the home/ file layout matches what SeedHarnessConfig produces
  from the Go embeds (golden path + content parity)
- Verifies Provision stages the expected bundle files (provision.py,
  config.yaml, manifest.json, scion_harness.py, hook wrapper)
Phase A.5 of the harness-config decoupling refactor. Each README
documents the bundle purpose, install command, supported auth modes,
bundle layout, and image build instructions.
…hase B.1)

Copy the antigravity harness config from github.com/ptone/scion-antigravity
into harnesses/antigravity/ matching the Phase A bundle layout:

- config.yaml, provision.py, dialect.yaml at bundle root
- skills/.gitkeep (empty skills directory)
- Dockerfile and cloudbuild.yaml at bundle root (co-located per Decision 4)
- home/.gitkeep (provision.py generates home files at runtime)

The antigravity harness uses OAuth via gnome-keyring for authentication
and supports oauth-token and vertex-ai auth types.
…ase B.2)

Add a comprehensive test that validates the antigravity bundle through
the in-repo loader/validator:

- LoadHarnessConfigDir accepts the antigravity config.yaml
- Schema validates all fields including mcp block, oauth-token and
  vertex-ai auth types (the latter with empty {} body)
- CopyDir install simulation succeeds
- Provision stages expected files including dialect.yaml

No schema changes were needed — the existing HarnessConfigEntry struct
and settings-v1.schema.json already support all fields used by the
antigravity config: mcp (HarnessMCPConfig), auth.types with oauth-token
and vertex-ai, dialect (map[string]interface{}), and all provisioner
fields.
Replace hardcoded external registry references
(us-docker.pkg.dev/ptone-misc/scion-alt/) in the antigravity
Dockerfile and cloudbuild.yaml with the in-repo $_REGISTRY/$_TAG
substitution pattern matching opencode/codex bundles.

Image chain: core-base -> scion-base -> scion-antigravity

The keyring packages (gnome-keyring, libsecret, dbus-x11) needed by
antigravity are already present in core-base, so no additional apt
installs are needed in the antigravity Dockerfile.

Also adds copyright headers and buildx multi-arch support consistent
with the other harness bundles.
Add README covering purpose, install command, auth modes (oauth-token
and vertex-ai), bundle layout, and image build chain
(core-base -> scion-base -> scion-antigravity).

Mirrors the opencode/codex README format.
Phase C.1 of harness-config decoupling: the default-install set
returned by harness.All() now contains only GeminiCLI and ClaudeCode.
OpenCode and Codex become opt-in bundles installed via
`scion harness-config install`. The OpenCode/Codex Go types remain
in place (Phase D removes them).
Phase C.3: update tests that asserted the old 4-harness default:
- harness_test.go TestAll_ReturnsBuiltins: len==4 -> len==2, drop
  opencode/codex assertions
- mock_harness_test.go GetMockHarnesses: return only gemini+claude
  mocks (matches the new default-install set)
- init_test.go and init_project_test.go: per-harness template
  absence loops narrowed to {gemini, claude}

Tests that use opencode/codex in non-default-set contexts (e.g.
harness_config_test.go config resolution, parity tests) are left
unchanged — they test individual harness resolution, not the default
set.
Phase C.4: the maintenance pull-images executor's fallback harness
list was hardcoded to all 4 harnesses. With opencode/codex now
opt-in, the fallback should only pull default-install images.
When harnesses are explicitly configured (e.g. via settings or
maintenance config), the configured list is used instead.

Web UI fallback lists (agent-create.ts, project-settings.ts) and
auth.go harness name matching intentionally left unchanged — they
list all known/installable harnesses for UI selection and auth
resolution, not the default-install set.
… D.1)

Remove the compiled-in Go harness implementations for OpenCode and Codex.
These harnesses are now opt-in bundles installed from harnesses/{opencode,codex}/
via `scion harness-config install`, making the Go embeds and built-in types
redundant.

Deleted files:
- pkg/harness/opencode.go, codex.go, codex_config.go (Go implementations)
- pkg/harness/opencode/, codex/ (embed dirs + embeds.go)
- pkg/harness/opencode_test.go, codex_test.go (unit tests)
- pkg/harness/opencode_parity_test.go, codex_parity_test.go (parity tests)
- pkg/config/harness_config_external_test.go (all tests used deleted types)
- pkg/agent/opencode_provision_test.go (provisioned via Go embeds)

Edited:
- harness.go New(): removed opencode/codex cases (fall through to Generic)
- resolve.go newBuiltin(): removed opencode/codex cases
- resolve.go: added slog.Warn hint when opencode/codex resolve without an
  installed bundle, guiding users to `scion harness-config install`
- common_test.go, auth_config_test.go, capabilities_test.go, harness_test.go:
  removed opencode/codex entries from test tables

Golden fixture approach for bundle_install_test.go section 5:
Dropped the embed-vs-bundle cross-check entirely rather than creating golden
fixtures. The harnesses/ bundle dirs are the sole source of truth now (Phase A.4
already proved byte-equality). Sections 1-4 (load, install, layout, root files)
and section 6 (provisioning) provide sufficient regression coverage. The
cross-check was a transitional Phase A.4 safety net that served its purpose.

NOT touched (per spec): pkg/sciontool/hooks/dialects/codex.go (CodexDialect),
pkg/harness/auth.go:229 opencode/codex opt-in names, mock harness stubs,
GetHarnessEmbedsFS interface method, SeedHarnessConfig.
ComputeHarnessConfigRevision now skips files whose basename is
Dockerfile, cloudbuild.yaml, README.md, or .gitkeep when walking a
harness-config directory. These non-runtime files (co-located in
bundles since Phase A) should not perturb the revision hash — a
Dockerfile change does not invalidate a provisioned agent's config.

Adds TestComputeHarnessConfigRevision_SkipsNonRuntimeFiles asserting
that adding these files does not change the revision while a
config.yaml change does.

Inspected cmd/harness_config_install.go: install uses util.CopyDir to
copy the whole bundle directory — no separate seed/install allowlist
exists that would need updating. The non-runtime files get copied into
~/.scion/harness-configs/<name>/ but are harmlessly ignored at
provision time and now also ignored by the revision hash.
`scion harness-config reset` now checks whether the resolved harness
has embedded defaults (via GetHarnessEmbedsFS basePath). For opt-in
harnesses like opencode/codex (which resolve to Generic with no
embeds after Phase D.1), reset returns a descriptive error directing
the user to reinstall from the bundle directory instead of silently
seeding nothing.

Reset continues to work normally for built-in embed-backed harnesses
(claude, gemini).

Adds TestHarnessConfigReset_BundleHarnessReturnsError verifying the
error message for opt-in harnesses.
Removes image-build/opencode/Dockerfile and image-build/codex/Dockerfile
(now duplicates of harnesses/{opencode,codex}/Dockerfile added in Phase A).

Repoints build wiring to use the bundle locations:
- targets.sh step_dockerfile: scion-opencode/scion-codex now point to
  ${REPO_ROOT}/harnesses/{opencode,codex}/Dockerfile
- targets.sh step_context_dir: scion-opencode/scion-codex now point to
  ${REPO_ROOT}/harnesses/{opencode,codex}
- cloudbuild-harnesses.yaml: build-scion-opencode dir repointed from
  image-build/opencode to harnesses/opencode; build-scion-codex from
  image-build/codex to harnesses/codex

Target names, DAG groupings, build-args, and parent references are
unchanged — the centralized build still builds these images, just from
the bundle location. The -f Dockerfile reference in cloudbuild remains
valid since it is relative to the dir field.
…E.3)

- Create harnesses/README.md indexing opt-in bundles with install commands
- Update image-build/README.md hierarchy to show bundle-local builds
- Update top-level README.md to reflect {claude,gemini} default set
- Add cross-reference in decoupled-harness-implementation.md
- Mark Phase E complete in harness-config-decoupling.md
- Add clarifying comments on web UI harness fallback lists
- Note `harness-config list --available` as documented follow-up
…configs

Add TestSeedHarnessConfig_AdditiveOnly verifying that seeding the default
harness set (claude, gemini) does not modify or delete a pre-existing
opencode harness-config directory with custom config.yaml, provision.py,
and home directory files. This confirms the Phase F.1 migration safety
contract: existing installed configs are left untouched when the default
set shrinks.
resolve.go: Broaden the legacy warning to fire when an opencode/codex
harness-config dir exists on disk but uses provisioner.type "builtin"
(or missing) instead of "container-script". The warning recommends
`scion harness-config upgrade <name> --activate-script` (if provision.py
exists) or `scion harness-config install harnesses/<name>` (if not).
This fires before the declarative-generic fallback, so resolution still
succeeds — the warning is informational.

harness_config_upgrade.go: Add upgradeLegacyBuiltinConfig to handle
upgrade for configs whose compiled-in Go implementation was removed
(opencode, codex). When provisioner.type is "builtin" or missing and
provision.py exists on disk, auto-activate container-script provisioning
even without --activate-script. When no provision.py exists, add a
warning action telling the user to reinstall from the bundle. Configs
already on container-script are left unchanged.

Tests: Resolve tests for legacy-builtin opencode (falls to generic) and
missing codex (no dir). Upgrade tests for auto-activation with
provision.py, warning without provision.py, missing provisioner field,
container-script no-op, and dry-run correctness.
Document the four migration scenarios in harnesses/README.md:
(1) container-script configs need no action, (2) legacy builtin configs
must upgrade or reinstall, (3) fresh installs need explicit install
command, (4) existing agents are unaffected.
Check error returns from os.WriteFile/MkdirAll in
pkg/config/harness_config_test.go and replace the os.Getenv/Setenv
HOME dance with t.Setenv in cmd/harness_config_test.go (auto-cleanup,
no unchecked error). Resolves the 6 golangci-lint errcheck findings on
PR #161.
Fixes gofmt CI failures introduced by PR GoogleCloudPlatform#332 visibility work.

Co-authored-by: Preston Holmes <ptone@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant