refactor(test): migrate destructive e2e (L4/L5) to local Tart VM #80

Merged: fullstackjam merged 7 commits into main from feat/tart-vm-e2e on May 17, 2026

@fullstackjam
Collaborator

Summary

  • Destructive e2e tests move off GHA macos-latest (a shared, dirty runner) into a local Tart VM on Apple Silicon. L4 + L5 collapse into a single L4 VM e2e tier (make test-vm).
  • No CI gate replaces the destructive jobs — running make test-vm before tagging is convention, not enforced. auto-release.yml keeps auto-tagging fix:-only patch bumps but now opens a release-ready issue for feat: thresholds to nudge a human to run the local VM suite first.
  • The 12 e2e test files run unchanged; MacHost keeps its API. The migration is driver-layer plumbing: scripts/vm/run.sh clones an ephemeral Tart VM from a local base image, rsyncs the working tree, and SSHs in to run make test-vm-inner.

What changed

  • New: scripts/vm/{run.sh, lib.sh, README.md} — Tart driver + one-time setup docs (base image: ghcr.io/cirruslabs/macos-tahoe-base:latest).
  • New Makefile targets: test-vm, test-vm-run TEST=..., plus internal test-vm-inner / test-vm-inner-run invoked over SSH.
  • Removed Makefile targets: test-vm-quick, test-vm-release, test-vm-full, test-destructive, test-smoke, test-smoke-prebuilt.
  • Build tags: e2e,destructive retired; everything is e2e,vm now.
  • testutil/machost.go: requireEphemeralHost gains an OPENBOOT_IN_VM branch (set by run.sh); legacy CI=true / OPENBOOT_E2E_DESTRUCTIVE=1 stay as fallbacks.
  • Removed from CI: the macos-e2e and destructive jobs in test.yml; the Destructive tests step and the smoke-test job in release.yml; and smoke-test.yml in its entirety. The remaining gate-tests job runs Vet + L1 only.
  • auto-release.yml: patch fast lane auto-tags as before; feat threshold opens an issue labeled release-ready with a make test-vm checklist instead of tagging.
  • Docs: CONTRIBUTING.md (Test Layering table + new VM E2E setup section), CLAUDE.md (Commands block), docs/HARNESS.md (table + "intentionally NOT in the harness" entry), AGENTS.md (stale tag list), one ship-pr / bootstrap-feature SKILL.md fix.

Test plan

  • make test-unit (L1) — green
  • go vet ./... — clean
  • go test ./internal/archtest/... — passes (no archtest baseline drift)
  • make test-vm-run TEST=TestVM_Infra end-to-end on local Apple Silicon — VM clones, boots, test passes, VM destroyed (verified during Task 1 implementation)
  • OPENBOOT_VM_KEEP=1 debug knob: VM stays, attach via tart ssh <vm> (uses Tart's admin/admin)
  • Leaked-VM sweep: pre-created openboot-ephemeral-99999 is cleaned by next run.sh startup
  • Host-side go test -tags="e2e,vm" -run TestVM_Infra ./test/e2e/... correctly SKIPs with the new message (gate works)
  • YAML parse + job-graph for both test.yml and release.yml — no dangling needs edges
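
The leaked-VM sweep in the test plan can be pictured as a simple name filter. The sketch below is illustrative only: the real sweep lives in the shell driver, and the `openboot-ephemeral-` prefix is an assumption inferred from the `openboot-ephemeral-99999` example above. `staleVMs` and the sample VM names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// ephemeralPrefix is an assumed naming convention, inferred from the
// "openboot-ephemeral-99999" example; run.sh may use a different scheme.
const ephemeralPrefix = "openboot-ephemeral-"

// staleVMs returns the names that look like leftover ephemeral VMs,
// skipping the VM that belongs to the current run.
func staleVMs(names []string, current string) []string {
	var stale []string
	for _, n := range names {
		if strings.HasPrefix(n, ephemeralPrefix) && n != current {
			stale = append(stale, n)
		}
	}
	return stale
}

func main() {
	// Hypothetical VM listing: a base image, a leaked clone, and the current run.
	names := []string{"macos-tahoe-base", "openboot-ephemeral-99999", "openboot-ephemeral-4242"}
	for _, n := range staleVMs(names, "openboot-ephemeral-4242") {
		fmt.Println("would delete:", n)
	}
}
```

Filtering by prefix keeps the base image and any unrelated VMs out of the sweep, which matters because the deletion step is destructive.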

Notes

  • Requires Apple Silicon + Tart locally for the destructive suite. Intel contributors run L1 only; L4 is pre-release-only and lives on a maintainer's Mac.
  • The implementation deviates from the design in two ways the spec didn't anticipate: an ephemeral ED25519 keypair is injected via tart exec (the base image has no pre-authorized keys; a local 1Password agent floods MaxAuthTries before any password attempt), and Go is installed via mise install go@latest on each cold boot (the base image has mise but not Go). Both are documented in the Task 1 commit message.

scripts/vm/run.sh provisions an ephemeral Tart VM, rsyncs the working
tree in, runs an in-VM make target over SSH, and tears down on EXIT.
New Makefile targets test-vm / test-vm-run / test-vm-inner /
test-vm-inner-run plumb this through.

Implementation notes vs. the original plan:
- The macos-tahoe-base image ships mise but not Go; run.sh installs
  Go via `mise install go@latest` on each fresh clone. This adds ~10s
  on first run but keeps the base image unmodified.
- An ephemeral ED25519 key is generated per run and injected via
  `tart exec` before SSH. This avoids the 1Password SSH agent (or any
  local agent) exhausting MaxAuthTries before a real key is tried.
  lib.sh's ssh_exec now takes the key path as the second argument.
- `OPENBOOT_VM_KEEP=1` debug message updated to print the ephemeral
  SSH key path for attaching to the running VM.
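
To make the MaxAuthTries point concrete, here is a minimal sketch of how an SSH invocation can be pinned to a single key so that no agent identities are offered first. It is written in Go for illustration; the actual helper is the shell function ssh_exec in lib.sh, and the option set below (`IdentitiesOnly`, `IdentityAgent`, the host-key options) is an assumption rather than a copy of it. `sshArgs`, the IP address, and the key path are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// sshArgs builds an argv for a non-interactive SSH call that offers only
// the given key. The flag set is an assumption, not copied from lib.sh.
func sshArgs(ip, keyPath, remoteCmd string) []string {
	return []string{
		"ssh",
		"-i", keyPath,
		// Offer only the key named with -i, never identities from an agent.
		"-o", "IdentitiesOnly=yes",
		"-o", "IdentityAgent=none",
		// Ephemeral VMs get a fresh host key on every clone, so do not pin it.
		"-o", "StrictHostKeyChecking=no",
		"-o", "UserKnownHostsFile=/dev/null",
		// "admin" is the default user mentioned in the test plan.
		"admin@" + ip,
		remoteCmd,
	}
}

func main() {
	// Hypothetical IP and key path, for illustration only.
	args := sshArgs("192.168.64.10", "/tmp/openboot-vm-key", "OPENBOOT_IN_VM=1 make test-vm-inner")
	fmt.Println(strings.Join(args, " "))
}
```

Without options like these, an agent holding several keys can use up the server's allowed authentication attempts before the intended key is tried.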

Old destructive targets (test-vm-quick/release/full, test-destructive,
test-smoke) stay for now — removed in a follow-up after build tags
collapse. See docs/superpowers/specs/2026-05-17-l4-l5-tart-local-design.md.

After Task 1 introduced the Tart VM driver, every destructive test
runs inside an ephemeral VM — there is no longer a meaningful
'destructive vs vm' distinction. Merge into a single e2e,vm tag.

The e2e,destructive build tag is retired and unused after this commit.

scripts/vm/run.sh sets OPENBOOT_IN_VM=1 over SSH when it invokes the
in-VM make target. requireEphemeralHost now accepts that as a more
precise signal than CI=true (which leaks in from any GHA runner, not
just throwaway ones). CI=true and OPENBOOT_E2E_DESTRUCTIVE=1 stay as
fallbacks for ad-hoc/legacy use.
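
The gate described here might look roughly like the following. This is a sketch under stated assumptions, not the code in testutil/machost.go: the function name `ephemeralSignal`, the precedence order, and the exact value compared for `OPENBOOT_IN_VM` are guesses based on the description above.

```go
package main

import (
	"fmt"
	"os"
)

// ephemeralSignal reports which environment signal, if any, marks the host
// as disposable. The precedence is an assumption: the VM flag first, then
// the legacy fallbacks named in the PR.
func ephemeralSignal(getenv func(string) string) (string, bool) {
	switch {
	case getenv("OPENBOOT_IN_VM") == "1":
		return "OPENBOOT_IN_VM", true
	case getenv("CI") == "true":
		return "CI", true
	case getenv("OPENBOOT_E2E_DESTRUCTIVE") == "1":
		return "OPENBOOT_E2E_DESTRUCTIVE", true
	}
	return "", false
}

func main() {
	if name, ok := ephemeralSignal(os.Getenv); ok {
		fmt.Println("destructive tests allowed via", name)
	} else {
		fmt.Println("skip: not an ephemeral host")
	}
}
```

Passing the lookup function as a parameter keeps the check testable without mutating the process environment; inside a real test helper the false branch would call `t.Skip` instead of printing.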

Comment block at top of file rewritten to drop the obsolete 'no Tart
VM, no SSH' description.

Removed from test.yml:
  - macos-e2e job (L4)
  - destructive job (L5)
  - the run_destructive workflow_dispatch input

Removed from release.yml:
  - the 'Destructive tests' step in gate-tests
  - the smoke-test job and its dependent edge from release.needs

Removed entirely:
  - .github/workflows/smoke-test.yml (redundant with release.yml's
    smoke-test, which also goes away here)

Destructive e2e now runs only locally via scripts/vm/run.sh (added in
the previous commits). No CI gate replaces this — running 'make test-vm'
before tagging is a documented expectation, not enforced.

See docs/superpowers/specs/2026-05-17-l4-l5-tart-local-design.md.

Deleted:
  - test-destructive
  - test-smoke / test-smoke-prebuilt
  - test-vm-quick / test-vm-release / test-vm-full
  - the temporary test-vm-OLD-DELETE-ME alias

The new test-vm / test-vm-run / test-vm-inner / test-vm-inner-run
targets (added two commits back) are now the only entrypoints.
Header comment block rewritten.

Patch (fix:-only) bumps continue to auto-tag and dispatch release.yml.
Minor bumps (feat: present) now open a 'release-ready' labeled issue
with a checklist instead of auto-tagging — the human is expected to
run make test-vm locally and then tag manually.

Skipping test-vm is allowed; the issue is a nudge, not a hard gate.
Rationale: feat: changes carry more risk and benefit from the local
Tart VM e2e suite added in earlier commits. fix: patches keep going
through the existing fast lane.
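
The patch-versus-feat split can be sketched as a small classifier over commit subjects. This is an illustration only: the real logic lives in the auto-release.yml workflow, the sketch ignores whatever commit-count thresholds that workflow applies, and `chooseLane`, `isType`, and the lane names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

type lane string

const (
	laneNone  lane = "none"
	lanePatch lane = "auto-tag"
	laneIssue lane = "open-release-ready-issue"
)

// isType reports whether a commit subject starts with the given
// conventional-commit type, with or without a scope ("fix:" or "fix(vm):").
func isType(subject, typ string) bool {
	return strings.HasPrefix(subject, typ+":") || strings.HasPrefix(subject, typ+"(")
}

// chooseLane picks a release lane: any feat commit routes to the
// release-ready issue, otherwise any fix commit takes the patch fast lane.
// How non-fix, non-feat commits are treated is an assumption.
func chooseLane(subjects []string) lane {
	sawFix := false
	for _, s := range subjects {
		if isType(s, "feat") {
			return laneIssue
		}
		if isType(s, "fix") {
			sawFix = true
		}
	}
	if sawFix {
		return lanePatch
	}
	return laneNone
}

func main() {
	fmt.Println(chooseLane([]string{"fix: handle empty config", "docs: update README"}))
	fmt.Println(chooseLane([]string{"fix: typo", "feat(vm): add keep knob"}))
}
```

Checking for feat first means a single feature commit is enough to leave the fast lane, which matches the intent of nudging a human to run make test-vm before a minor release.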

Adds 'issues: write' to the workflow permissions. Header comment
block rewritten.

CONTRIBUTING.md: L4 and L5 rows collapse to a single L4 VM e2e row
('runs inside Tart VM, local only, no CI gate'). New 'VM E2E setup'
section walks through tart pull / tart clone. Rules of thumb updated.

CLAUDE.md: Commands block drops test-vm-release / test-destructive,
adds test-vm with a 'requires Apple Silicon + Tart' note.

docs/HARNESS.md: table rows for L4/L5 merge; auto-release row
reflects patch-vs-feat split; new 'intentionally NOT' entry explains
why there is no CI gate for VM e2e.

@github-actions bot added the tests (Tests only), ci (CI/CD changes), and docs labels on May 17, 2026
@codecov

codecov Bot commented May 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@fullstackjam fullstackjam merged commit 56cc550 into main May 17, 2026
12 checks passed
@fullstackjam fullstackjam deleted the feat/tart-vm-e2e branch May 17, 2026 15:22