feat(boxel-cli): add lint, parse, test validator commands#4881
Lints every lintable (.gts/.gjs/.ts/.js) file in a realm via the realm `_lint` endpoint, or a single file when a realm-relative path is passed. Aggregates per-file violations into a single summary; exits non-zero on any error-severity violation. This is the realm-wide companion to the existing single-file `boxel file lint <path>` command. Closes the first gap from the Phase 1 runbook (CS-11149): the software factory's `runLintInMemory` validator becomes reachable from an interactive Claude Code session via Bash. `software-factory` keeps its in-process `runLintInMemory` for now; both coexist during the migration window. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
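The aggregation and exit-code rule described above can be sketched as follows. This is a minimal illustration, not the command's actual code: the `Violation`, `FileLintResult`, and `LintSummary` shapes and the `summarizeLint` name are assumptions, since the `_lint` response format is not shown in this PR.

```typescript
// Hypothetical shapes: the real `_lint` response format is not shown in this PR.
type Severity = "error" | "warning";

interface Violation {
  rule: string;
  severity: Severity;
  message: string;
}

interface FileLintResult {
  path: string;
  violations: Violation[];
}

interface LintSummary {
  filesChecked: number;
  errorCount: number;
  warningCount: number;
  exitCode: 0 | 1;
}

// Extensions named in the commit message above.
const LINTABLE_EXTENSIONS = [".gts", ".gjs", ".ts", ".js"];

function isLintable(path: string): boolean {
  return LINTABLE_EXTENSIONS.some((ext) => path.endsWith(ext));
}

// Fold per-file violations into one summary; any error-severity
// violation makes the whole run exit non-zero, warnings do not.
function summarizeLint(results: FileLintResult[]): LintSummary {
  let errorCount = 0;
  let warningCount = 0;
  for (const file of results) {
    for (const violation of file.violations) {
      if (violation.severity === "error") errorCount++;
      else warningCount++;
    }
  }
  return {
    filesChecked: results.length,
    errorCount,
    warningCount,
    exitCode: errorCount > 0 ? 1 : 0,
  };
}
```

A caller would pass `summarizeLint(results).exitCode` to `process.exit` once the summary has been printed.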
Lifts the factory's glint runner + JSON document validator from `packages/software-factory/src/parse-execution.ts` into a top-level `boxel parse` command in boxel-cli. Behavior matches the existing factory tool:
- Without a path: discovers every `.gts` / `.gjs` / `.ts` in the realm plus every `.json` file linked as a `Spec.linkedExamples`, runs glint (`ember-tsc`) over the GTS batch in a temp dir with monorepo-aware tsconfig paths, and validates the document structure of each JSON example.
- With a path: parses just that single file (GTS → glint, JSON → document validation).
Path resolution is anchored on this file's `__dirname`, so the command requires the Boxel monorepo layout: `packages/base`, `packages/host`, `packages/boxel-ui`, and `@glint/ember-tsc` (added as a boxel-cli devDependency) must all be resolvable. This is a factory-developer tool, not an end-user CLI feature.
The factory keeps its own copy of `parse-execution.ts` for now; both coexist during the CS-11149 migration window.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lifts the factory's in-memory QUnit runner (`runTestsInMemory`) from
`packages/software-factory/src/test-run-execution.ts` into a
top-level `boxel test` command in boxel-cli. The runner:
- Discovers every `*.test.gts` file in the realm.
- Locates the host app's compiled `dist/` (env override, sibling
packages/host, or the root checkout when in a git worktree).
- Spins up a tiny HTTP server that serves the host's test bundles
+ a synthesized QUnit harness page with live-test enabled.
- Drives a headless Chromium against that page with the realm URL
in the query string; injects the per-realm JWT (if the active
profile has one) via `page.route()` so private realms can be
reached.
- Collects per-test QUnit results via `QUnit.on('testEnd' / 'runEnd')`
hooks and aggregates them into pass/fail/skip counts + per-failure
details.
Unlike the factory's `executeTestRunFromRealm`, this command does
NOT create or update a TestRun card — results are returned in-memory
only. Card persistence is the agent's responsibility in the new
Phase 1 flow.
`@playwright/test` is added as a boxel-cli devDependency. The
`findHostDistPackageDir` discovery helper is inlined from
`@cardstack/realm-test-harness/host-dist` to avoid pulling the
harness in as a dependency. Like `boxel parse`, this is a
monorepo-only command and not usable from the published CLI.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
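The result-collection step above can be illustrated with a small aggregator. This is a sketch under assumptions: the event fields mirror what QUnit passes to `QUnit.on('testEnd', ...)` callbacks as far as they are used here, while `TestSummary`, `FailureDetail`, and `summarizeTestEvents` are hypothetical names, not the runner's real types.

```typescript
// Subset of the data QUnit reports for each finished test (assumed shape).
type TestStatus = "passed" | "failed" | "skipped" | "todo";

interface TestEndEvent {
  fullName: string[];
  status: TestStatus;
  runtime: number;
  errors: { message: string }[];
}

// Hypothetical summary shape for illustration only.
interface FailureDetail {
  test: string;
  messages: string[];
}

interface TestSummary {
  passed: number;
  failed: number;
  skipped: number;
  todo: number;
  failures: FailureDetail[];
}

// Count tests per status and keep the messages of every failed test.
function summarizeTestEvents(events: TestEndEvent[]): TestSummary {
  const summary: TestSummary = { passed: 0, failed: 0, skipped: 0, todo: 0, failures: [] };
  for (const event of events) {
    summary[event.status]++;
    if (event.status === "failed") {
      summary.failures.push({
        test: event.fullName.join(" > "),
        messages: event.errors.map((error) => error.message),
      });
    }
  }
  return summary;
}
```

In the harness page, the events would be gathered by a `testEnd` listener that pushes each event into an array, and the array would be summarized once `runEnd` fires.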
`pnpm --filter @cardstack/boxel-cli build` was failing with `No loader is configured for ".node" files: fsevents.node`. esbuild was trying to bundle Playwright's transitive deps — specifically the native `.node` files that ship with fsevents (macOS file watcher) and playwright itself — and obviously can't inline those. Added Playwright (`@playwright/test`, `playwright`, `playwright-core`) and `fsevents` to the external list. They stay as runtime `require`s; node resolves them from `packages/boxel-cli/node_modules/` when `boxel test` actually runs. Matches `boxel test`'s existing monorepo-only constraint — in published form, those `devDependencies` aren't installed and `boxel test` errors at import-time with a clear message. Bundle grew from 144 KB to 253 KB (still small) because the playwright-free bundle was missing realm-server / runtime-common code; the externals tweak doesn't bundle node_modules wholesale, so the inlined size is now correct. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
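For illustration, the externals change could look roughly like the options object below. It is a sketch, not the contents of `scripts/build.ts`: `BundleOptions` is a structural stand-in for the subset of esbuild's build options used here, and the entry point and output path are assumptions.

```typescript
// Structural stand-in for the handful of esbuild build options used here.
interface BundleOptions {
  entryPoints: string[];
  outfile: string;
  bundle: boolean;
  platform: "node";
  external: string[];
}

// Packages left as runtime `require`s instead of being inlined; the list
// comes from the commit message above.
const PLAYWRIGHT_EXTERNALS = ["@playwright/test", "playwright", "playwright-core", "fsevents"];

const buildOptions: BundleOptions = {
  entryPoints: ["src/index.ts"], // assumed entry point
  outfile: "dist/index.js", // assumed output path
  bundle: true,
  platform: "node",
  // esbuild leaves imports of these packages untouched, so it never has to
  // load their native `.node` binaries.
  external: PLAYWRIGHT_EXTERNALS,
};
```

An object of this shape would be passed to esbuild's `build()` call; marking a package as external also covers its subpath imports.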
`boxel parse` and `boxel test` were both computing the monorepo layout by counting `..` segments up from `__dirname`. That works when the CLI runs from `src/commands/...` (ts-node fallback) but breaks when the same code runs from `dist/index.js` (the bundled form), where `__dirname` is at a different depth.
Replaced the `__dirname` walking with a `findBoxelCliRoot` helper that walks up looking for the `@cardstack/boxel-cli` package.json. Robust against both entry modes (and any future bundling relocations).
Drive-by fixes:
- `parse.ts` was missing `dirname` from its `node:path` imports after the refactor; restored.
- `find-package-root.ts` uses `for (;;)` instead of `while (true)` to keep ESLint's `no-constant-condition` happy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
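A walk-up helper of the kind described here might look like the sketch below. It is not the contents of `find-package-root.ts`: `findPackageRoot`, `readManifestName`, and the injectable `readName` parameter are illustrative assumptions (the injection only exists so the walk can be exercised without touching the disk).

```typescript
import { readFileSync } from "node:fs";
import { dirname, join } from "node:path";

type ManifestNameReader = (dir: string) => string | undefined;

// Reads `name` from `<dir>/package.json`; returns undefined when the file
// is missing or malformed so the caller keeps walking up.
function readManifestName(dir: string): string | undefined {
  try {
    const pkg = JSON.parse(readFileSync(join(dir, "package.json"), "utf8")) as { name?: string };
    return pkg.name;
  } catch {
    return undefined;
  }
}

// Walk up from `startDir` until a package.json with the wanted name is found.
// The starting depth does not matter, so `dist/` and `src/commands/` both work.
function findPackageRoot(
  startDir: string,
  packageName: string,
  readName: ManifestNameReader = readManifestName,
): string {
  let dir = startDir;
  for (;;) {
    if (readName(dir) === packageName) return dir;
    const parent = dirname(dir);
    if (parent === dir) {
      // Reached the filesystem root without a match.
      throw new Error(`Could not find package "${packageName}" above ${startDir}`);
    }
    dir = parent;
  }
}
```

Under these assumptions, `findPackageRoot(__dirname, "@cardstack/boxel-cli")` returns the same directory whether the calling file lives in `dist/` or in `src/commands/`.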
The ts-node fallback (used when `dist/` is missing) was previously
calling `ts-node.register({ transpileOnly: true })` with no project
path. ts-node discovered tsconfig.json from cwd, which worked when
the CLI was invoked from inside the monorepo but failed from any
other tree (e.g. `/var/folders/.../tmp.XXX/`) because no tsconfig
is reachable walking up from there.
Pointed ts-node at boxel-cli's own tsconfig.json explicitly so the
fallback works regardless of caller cwd.
Three fixes from PR review (codex + Copilot):
- Lazy-load `@playwright/test` in `boxel test`. The top-level
`import { chromium }` made `boxel --help` (and every other
subcommand) crash in published installs where the devDependency
isn't present. Moved behind an async loader that fires only when
the test runner actually runs.
- Zero `*.test.gts` files → validator failure, not pass.
Previously `boxel test` returned `status: 'passed'` for a realm
with no tests, which would let the factory agent mark an Issue
done without ever writing one. Now returns `failed` with an
explicit errorMessage.
- Bounded-poll Spec discovery in `boxel parse`. The realm's
search index settles asynchronously, so a `boxel realm push`
immediately followed by `boxel parse` could silently miss the
freshly-pushed `linkedExamples` and pass when it shouldn't.
Wrapped the Spec search in a 30s/250ms retry-with-poll while
the result is ok-but-empty.
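The bounded poll in the third fix can be sketched as a small generic helper. The 30s/250ms defaults come from the commit message; the `SearchResult` shape and the `pollWhileEmpty` name are assumptions for illustration, not the code in `parse.ts`.

```typescript
// Hypothetical result shape: `ok` means the search itself succeeded.
interface SearchResult<T> {
  ok: boolean;
  items: T[];
}

// Re-run `search` while it succeeds with zero items, until the deadline.
// Errors and non-empty results are returned immediately.
async function pollWhileEmpty<T>(
  search: () => Promise<SearchResult<T>>,
  timeoutMs = 30000,
  intervalMs = 250,
): Promise<SearchResult<T>> {
  const deadline = Date.now() + timeoutMs;
  let result = await search();
  while (result.ok && result.items.length === 0 && Date.now() < deadline) {
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    result = await search();
  }
  return result;
}
```

The trade-off is that a realm which legitimately has no Specs pays the full timeout before the empty result is accepted, which is the price of not missing freshly pushed `linkedExamples`.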
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cb7302416f
if (result.testFiles.length === 0) {
  console.log(`${DIM}No .test.gts files found in the realm.${RESET}`);
  return;
Return failure when no realm tests are found
When runTestsForRealm reports a failed/error result with an empty testFiles array (for example, a realm has no *.test.gts files, or discovery fails before any files are found), the non-JSON CLI takes this branch and returns before the final result.status !== 'passed' exit check. That makes boxel test --realm ... exit 0 in the exact no-tests case that the runner marks as a validator failure, so automation can treat an implementation with no tests as passing.
(Claude here, replying on behalf of @jurgenwerk.)
Fixed in fa373ee — boxel test now exits 1 whenever result.status !== 'passed' in the empty-testFiles early-return branch too, so a realm with no *.test.gts is treated as the validator failure that runTestsForRealm already returns.
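One way to avoid this class of bug is to make the exit-code decision in a single place that every reporting branch flows through. The sketch below assumes a simplified `TestRunResult` shape and a hypothetical `reportTestRun` function; it is not the code from `test.ts`.

```typescript
// Simplified, assumed result shape for illustration.
interface TestRunResult {
  status: "passed" | "failed" | "error";
  testFiles: string[];
  errorMessage?: string;
}

// Print a human-readable report and return the process exit code.
// No branch returns early, so the no-tests case cannot skip the status check.
function reportTestRun(
  result: TestRunResult,
  log: (line: string) => void = console.log,
): number {
  if (result.testFiles.length === 0) {
    log("No .test.gts files found in the realm.");
    if (result.errorMessage) log(result.errorMessage);
  } else {
    log(`${result.testFiles.length} test file(s) run: ${result.status}`);
  }
  return result.status === "passed" ? 0 : 1;
}
```

The command's entry point would then call `process.exit(reportTestRun(result))` once, instead of exiting from inside individual branches.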
PR #4881 splits the new validator commands (lint/parse/test) into their own boxel-cli PR. Once that lands and @cardstack/boxel-cli is republished, this branch can stop documenting the development shim (pnpm link --global / shell function / manual ln -sf symlink) and just say "install boxel-cli the normal way".
- Skill setup blocks (bootstrap + scheduling): drop the install recipe entirely. The agent just verifies `boxel --help` lists lint/parse/test and bails out asking the user to run `pnpm i -g @cardstack/boxel-cli` if not.
- Runbook prerequisites: replace the multi-paragraph dev-CLI build + shell-function block with one line: `pnpm i -g @cardstack/boxel-cli`. The host-app build and Playwright Chromium install are grouped under "monorepo-only validators (boxel parse and boxel test) need these extras", with a forward reference to the follow-up tickets for the realm-server `_parse` endpoint and the built-in QUnit harness.
No agent-facing behavior change: the skills still verify `boxel` and still bail out cleanly if it is missing. Only the install advice is simplified.
Pull request overview
This PR extends @cardstack/boxel-cli with three new validator-style commands (lint, parse, test) intended to support the software-factory workflow, plus supporting CLI/root-resolution and build changes to make the commands usable both from the monorepo and via the bundled dist/ entrypoint.
Changes:
- Add `boxel lint`, `boxel parse`, and `boxel test` commands, including JSON output modes for automation.
- Add `findBoxelCliRoot()` to stabilize monorepo-relative path resolution from both `dist/` and `src/` (ts-node fallback) execution modes.
- Update build/runtime plumbing: esbuild externals for Playwright/native deps, and ts-node registration now pins the CLI's tsconfig.
Reviewed changes
Copilot reviewed 7 out of 9 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| pnpm-lock.yaml | Adds lock entries for new CLI dependencies (@glint/ember-tsc, @playwright/test). |
| packages/boxel-cli/src/lib/find-package-root.ts | Introduces findBoxelCliRoot() to locate the CLI package root reliably. |
| packages/boxel-cli/src/commands/lint.ts | Adds whole-realm/single-file linting via the realm _lint endpoint. |
| packages/boxel-cli/src/commands/parse.ts | Adds monorepo-only parsing/typecheck (ember-tsc) + Spec linkedExamples JSON document validation. |
| packages/boxel-cli/src/commands/test.ts | Adds monorepo-only QUnit test runner using Playwright-driven headless Chromium against host dist/. |
| packages/boxel-cli/src/build-program.ts | Registers the new lint, parse, and test commands in the CLI program. |
| packages/boxel-cli/scripts/build.ts | Marks Playwright-related modules as esbuild externals to avoid bundling native/runtime-resolve deps. |
| packages/boxel-cli/package.json | Adds @glint/ember-tsc and @playwright/test dependencies. |
| packages/boxel-cli/bin/boxel.js | Improves ts-node fallback by explicitly setting the CLI tsconfig path. |
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
Comments suppressed due to low confidence (1)
packages/boxel-cli/src/commands/parse.ts:418
- The temp-dir safety check uses `resolved.startsWith(tempDir + '/')`, which is not portable (fails on Windows path separators) and silently skips files that resolve outside the temp dir. This can produce a misleading "passed" result (or drop diagnostics) if any unsafe path slips through. Use a `path.relative(tempDir, resolved)`-based check (or `startsWith(tempDir + path.sep)`) and surface an explicit parse error when a file path is rejected.
for (let file of files) {
let normalized = join(tempDir, file.path);
let resolved = resolve(normalized);
if (!resolved.startsWith(tempDir + '/')) continue;
mkdirSync(dirname(resolved), { recursive: true });
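The `path.relative`-based check suggested in this comment could be sketched as below. `resolveInsideDir` is a hypothetical helper name, and throwing on rejection is one possible way to surface the error; the sketch is not the code that ended up in `parse.ts`.

```typescript
import { isAbsolute, join, relative, resolve, sep } from "node:path";

// Resolve `filePath` under `baseDir` and throw if the result escapes it.
// Uses `path.relative` so the check does not depend on the path separator.
function resolveInsideDir(baseDir: string, filePath: string): string {
  const base = resolve(baseDir);
  const resolved = resolve(join(base, filePath));
  const rel = relative(base, resolved);
  const escapes =
    rel === "" || // the directory itself, not a file inside it
    rel === ".." ||
    rel.startsWith(".." + sep) ||
    isAbsolute(rel); // e.g. a different drive on Windows
  if (escapes) {
    throw new Error(`Path "${filePath}" resolves outside ${base}`);
  }
  return resolved;
}
```

A file whose name merely begins with two dots, such as `..notes.ts`, is still accepted because only a `..` path segment is treated as an escape.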
if (result.testFiles.length === 0) {
  console.log(`${DIM}No .test.gts files found in the realm.${RESET}`);
  return;
(Claude here, replying on behalf of @jurgenwerk.)
Fixed in fa373ee — added the if (result.status !== 'passed') process.exit(1) check before the early-return so the no-tests case (which runTestsForRealm already returns as status: 'failed') exits non-zero.
if (options?.path) {
  let path = options.path;
  if (PARSEABLE_GTS_EXTENSIONS.some((ext) => path.endsWith(ext))) {
    gtsFiles = [path];
  } else if (path.endsWith(PARSEABLE_JSON_EXTENSION)) {
(Claude here, replying on behalf of @jurgenwerk.)
Fixed in fa373ee — added a validateRealmRelativePath helper (mirrors packages/software-factory/src/realm-relative-path.ts) that rejects URL schemes, leading /, backslashes, percent-encoded traversal, and .. segments. It runs before the extension check on the public entry point, and runGlintCheck (the other site at line 414) now also validates each path and throws on tempDir-escapes instead of silently continue-ing.
if (options?.path) {
  let path = options.path;
  if (!LINTABLE_EXTENSIONS.some((ext) => path.endsWith(ext))) {
    return emptyErrorResult(
      `Path "${path}" is not lintable — must end with one of ${LINTABLE_EXTENSIONS.join(', ')}`,
    );
(Claude here, replying on behalf of @jurgenwerk.)
Fixed in fa373ee — same validateRealmRelativePath helper as in parse.ts. boxel lint now rejects URL schemes, leading /, backslashes, percent-encoded traversal, and .. traversal segments before the extension check.
Host Test Results: 1 files, 1 suites, 1h 30m 57s ⏱️. Results for commit 7b8859e.
Realm Server Test Results: 1 files ±0, 1 suites +1, 8m 49s ⏱️ +8m 49s. Results for commit 7b8859e. ± Comparison against earlier commit 4f5f81e.
Vendors a `validateRealmRelativePath` helper that rejects paths with URL schemes, leading `/`, backslashes, percent-encoded escapes, and `..` traversal segments. The validator commands previously accepted anything ending in the right extension, so `Cards/../foo.ts` or an absolute URL would reach realm-server URL handling with whatever normalization the layer chose. Mirrors the equivalent gate in `packages/software-factory/src/realm-relative-path.ts`. `boxel test` now also exits non-zero when `runTestsForRealm` returns `status: 'failed'` with an empty `testFiles` array — previously the no-tests early-return swallowed the validator failure. `runGlintCheck` no longer silently continues on a `..`-style path that resolves outside its temp dir; it throws instead. Addresses review comments on #4881. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
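The rejection rules listed in this commit message can be illustrated with a small checker. This is a sketch, not the vendored helper: the function name, the return convention (an error string or `null`), and the exact set of percent-encoded sequences treated as escapes are assumptions.

```typescript
// Returns an error message for an unsafe path, or null when the path is
// an acceptable realm-relative path. Hypothetical stand-in for the helper.
function checkRealmRelativePath(path: string): string | null {
  if (path.length === 0) return "path is empty";
  // URL schemes such as `https:` or `file:`.
  if (/^[a-zA-Z][a-zA-Z0-9+.-]*:/.test(path)) return "path must not include a URL scheme";
  if (path.startsWith("/")) return "path must not start with '/'";
  if (path.indexOf("\\") !== -1) return "path must not contain backslashes";
  // Assumed set: encoded '.', '/' and '\' that could hide a traversal.
  if (/%(2e|2f|5c)/i.test(path)) return "path must not contain percent-encoded escapes";
  if (path.split("/").some((segment) => segment === "..")) {
    return "path must not contain '..' segments";
  }
  return null;
}
```

Running a check like this before the extension test means inputs such as `Cards/../foo.ts` are rejected by the CLI itself rather than being normalized by whichever layer handles the URL later.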
The published package always ships `dist/index.js`, so end users never hit the fallback. Monorepo devs who want to run unbuilt source can use `pnpm --filter @cardstack/boxel-cli start` (`ts-node --transpileOnly` on `src/index.ts`), which is what the build/start scripts already use. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This reverts commit b60e7a7.
Drops the `project:` tsconfig path added in 3d6d8e7 — back to the version on main. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the vendor-neutral skills that drive the software factory from
inside an interactive Claude Code session, plus the runbook that
documents how to use them:
- `.agents/skills/software-factory-bootstrap` (rewritten)
- `.agents/skills/software-factory-operations` (rewritten)
- `.agents/skills/software-factory-scheduling` (new)
- `.agents/skills-sdk/{bootstrap,operations}` — verbatim pre-rewrite
snapshots for the existing SDK orchestrator path
- `docs/runbook.md` — single-prompt end-to-end flow
- `src/factory-skill-loader.ts` — load from `.agents/skills-sdk/` so
the orchestrator and interactive flow don't fight
- `.claude/CLAUDE.md` — documents the dual skill directories
- `.gitignore` — ignore the `factory-test-*/` mktemp workspaces
The boxel-cli validator commands (`lint`, `parse`, `test`) the skills
call live on the base branch (PR #4881).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…Code
Adds the vendor-neutral skills that drive the software factory from
inside an interactive Claude Code session, plus the runbook that
documents how to use them:
- `.agents/skills/software-factory-bootstrap` (rewritten)
- `.agents/skills/software-factory-operations` (rewritten)
- `.agents/skills/software-factory-scheduling` (new)
- `.agents/skills-sdk/{bootstrap,operations}` — verbatim pre-rewrite
snapshots for the existing SDK orchestrator path
- `docs/runbook.md` — single-prompt end-to-end flow
- `src/factory-skill-loader.ts` — load from `.agents/skills-sdk/` so
the orchestrator and interactive flow don't fight
- `.claude/CLAUDE.md` — documents the dual skill directories
- `.gitignore` — ignore the `factory-test-*/` mktemp workspaces
The boxel-cli validator commands (`lint`, `parse`, `test`) the skills
call are added in #4881. This PR depends on that one merging first.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
I think another, deeper caveat we've chatted about is that the dist the test validation uses is a test build, not a production build, so QUnit and the test helpers are bundled into the dist and are statically available to the QUnit index.html page (which I believe Puppeteer is hosting). Do we have a ticket to deal with packaging this into the CLI?
What's in this PR
Three new validator commands on `boxel-cli`, plus supporting infrastructure:
- `boxel lint [path] --realm <url>`: ESLint + Prettier with `@cardstack/boxel` rules via the realm's existing `_lint` endpoint. Whole-realm or single-file. Works against any realm, with no monorepo dependency.
- `boxel parse [path] --realm <url>`: glint (`ember-tsc`) type-check over `.gts`/`.gjs`/`.ts` files plus JSON document validation for Spec `linkedExamples`. Monorepo-only (see caveat below).
- `boxel test --realm <url>`: runs the realm's QUnit test suite by driving headless Chromium against the host app's compiled `dist/`. Monorepo-only (see caveat below).

Supporting changes:
- `bin/boxel.js` ts-node fallback now passes an explicit tsconfig path so it works from any cwd (not just inside the monorepo).
- esbuild externals for Playwright and the native `.node` files that can't be bundled.
- `findBoxelCliRoot` helper resolves monorepo paths from both the bundled `dist/` and the ts-node entry mode.
- `@playwright/test` is lazy-loaded so `boxel --help` works in published installs without the devDependency.

Why split out from CS-11149
The interactive Claude Code factory flow (PR #4843) needs these three commands available as ordinary CLI tools. Once this lands and `@cardstack/boxel-cli` is republished to npm, the factory PR can install boxel-cli the normal way (`pnpm i -g @cardstack/boxel-cli`) and drop all the dev-symlink / shell-function hacks it currently documents. Splitting the validator commands out makes both PRs easier to review and gives us a clean release point for the CLI.

Monorepo-only caveat (temporary)
`boxel parse` and `boxel test` work today only when run from inside the boxel monorepo:
- `boxel parse` runs `ember-tsc` locally with tsconfig paths pointing at `packages/base`, `packages/host`, `packages/boxel-ui`. Without those, glint can't resolve `https://cardstack.com/base/*` imports.
- `boxel test` discovers the host app's compiled `dist/` and drives a headless Chromium against it. Without the monorepo, there is no host dist to test against.
Both commands fail loudly with clear messages when invoked outside the monorepo, so they're safe to ship.
The proper fix is server-side: realm-server endpoints that do the work and stream results back, the same way `_lint` already works. Two follow-up tickets are tracked separately for those endpoints:
- `_parse` endpoint (mirror of `_lint`, with `ember-tsc` running in the server's environment). When it lands, `boxel parse` becomes a thin client and works everywhere.
- Built-in QUnit harness, so that `boxel test` doesn't need Playwright + Chromium + the host dist locally.

Test plan
- `pnpm --filter @cardstack/boxel-cli lint`: passes
- `pnpm --filter @cardstack/boxel-cli lint:types`: passes
- `pnpm --filter @cardstack/boxel-cli build`: succeeds; `dist/index.js` produced
- `boxel --help` lists `lint`, `parse`, `test`
- `boxel lint --realm http://localhost:4201/<your-realm>/` against a known-clean realm reports no errors
- `boxel parse --realm <url>` and `boxel test --realm <url>` run end-to-end inside the monorepo (requires `mise run dev-all`, `pnpm --filter @cardstack/host build`, `npx playwright install chromium`)

🤖 Generated with Claude Code