
feat(boxel-cli): add lint, parse, test validator commands #4881

Merged: jurgenwerk merged 11 commits into main from cs-11149a-boxel-cli-validators on May 19, 2026


@jurgenwerk
Contributor

What's in this PR

Three new validator commands on boxel-cli, plus supporting infrastructure:

  • boxel lint [path] --realm <url> — ESLint + Prettier with @cardstack/boxel rules via the realm's existing _lint endpoint. Whole-realm or single-file. Works against any realm — no monorepo dependency.
  • boxel parse [path] --realm <url> — glint (ember-tsc) type-check over .gts / .gjs / .ts files plus JSON document validation for Spec linkedExamples. Monorepo-only (see caveat below).
  • boxel test --realm <url> — runs the realm's QUnit test suite by driving headless Chromium against the host app's compiled dist/. Monorepo-only (see caveat below).

Supporting changes:

  • bin/boxel.js ts-node fallback now passes an explicit tsconfig path so it works from any cwd (not just inside the monorepo).
  • esbuild build script externals for Playwright + fsevents — they ship native .node files that can't be bundled.
  • findBoxelCliRoot helper resolves monorepo paths from both the bundled dist/ and the ts-node entry mode.
  • Playwright is lazy-loaded inside the test runner so boxel --help works in published installs without the devDependency.
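
The lazy-loading point above can be illustrated with a minimal sketch. It assumes a hypothetical `loadChromium` helper with an injectable importer (for testing); the PR's actual loader may be shaped differently, and the error text is invented for illustration.

```typescript
// Minimal structural types so the sketch compiles without Playwright installed.
type ChromiumLauncher = {
  launch(options?: { headless?: boolean }): Promise<unknown>;
};
type PlaywrightModule = { chromium: ChromiumLauncher };

// Non-literal specifier: the import stays dynamic and is only resolved at call time.
const PLAYWRIGHT_MODULE: string = '@playwright/test';

// Hypothetical loader: nothing is imported until the test runner asks for it,
// so `boxel --help` never touches the devDependency.
async function loadChromium(
  importer: () => Promise<PlaywrightModule> = () => import(PLAYWRIGHT_MODULE),
): Promise<ChromiumLauncher> {
  try {
    let { chromium } = await importer();
    return chromium;
  } catch {
    // Assumed wording; the real CLI's message is not shown in this PR.
    throw new Error(
      'boxel test needs @playwright/test, which is only installed inside the monorepo',
    );
  }
}
```

With this shape, only the code path that actually launches a browser pays the import cost or sees the failure.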

Why split out from CS-11149

The interactive Claude Code factory flow (PR #4843) needs these three commands available as ordinary CLI tools. Once this lands and @cardstack/boxel-cli is republished to npm, the factory PR can install boxel-cli the normal way (pnpm i -g @cardstack/boxel-cli) and drop all the dev-symlink / shell-function hacks it currently documents. Splitting the validator commands out makes both PRs easier to review and gives us a clean release point for the CLI.

Monorepo-only caveat (temporary)

boxel parse and boxel test work today only when run from inside the boxel monorepo:

  • boxel parse runs ember-tsc locally with tsconfig paths pointing at packages/base, packages/host, packages/boxel-ui. Without those, glint can't resolve https://cardstack.com/base/* imports.
  • boxel test discovers the host app's compiled dist/ and drives a headless Chromium against it. Without the monorepo, no host dist to test against.

Both commands fail loudly with clear messages when invoked outside the monorepo, so they're safe to ship.

The proper fix is server-side: realm-server endpoints that do the work and stream results back, the same way _lint already works. Two follow-up tickets are tracked separately for those endpoints:

  • Realm-server _parse endpoint (mirror of _lint, with ember-tsc running in the server's environment). When it lands, boxel parse becomes a thin client and works everywhere.
  • Built-in QUnit test harness on realm-server — surfaces the same test-run capability as a server endpoint so boxel test doesn't need Playwright + Chromium + the host dist locally.

Test plan

  • pnpm --filter @cardstack/boxel-cli lint — passes
  • pnpm --filter @cardstack/boxel-cli lint:types — passes
  • pnpm --filter @cardstack/boxel-cli build — succeeds; dist/index.js produced
  • boxel --help lists lint, parse, test
  • boxel lint --realm http://localhost:4201/<your-realm>/ against a known-clean realm reports no errors
  • boxel parse --realm <url> and boxel test --realm <url> run end-to-end inside the monorepo (requires mise run dev-all, pnpm --filter @cardstack/host build, npx playwright install chromium)

🤖 Generated with Claude Code

jurgenwerk and others added 7 commits May 19, 2026 09:09
Lints every lintable (.gts/.gjs/.ts/.js) file in a realm via the realm
`_lint` endpoint, or a single file when a realm-relative path is
passed. Aggregates per-file violations into a single summary; exits
non-zero on any error-severity violation.
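
As a rough sketch of the aggregation described above, assuming hypothetical `Violation` and `FileLintResult` shapes (the real response format of the `_lint` endpoint is not shown in this PR):

```typescript
// Hypothetical shapes for illustration only.
type Violation = { rule: string; severity: 'error' | 'warning'; message: string };
type FileLintResult = { path: string; violations: Violation[] };

// Extensions taken from the commit message above.
const LINTABLE = ['.gts', '.gjs', '.ts', '.js'];

function isLintable(path: string): boolean {
  return LINTABLE.some((ext) => path.endsWith(ext));
}

// Fold per-file results into one summary; any error-severity violation
// makes the exit code non-zero, warnings alone do not.
function summarizeLint(files: FileLintResult[]) {
  let errorCount = 0;
  let warningCount = 0;
  for (let file of files) {
    for (let violation of file.violations) {
      if (violation.severity === 'error') errorCount++;
      else warningCount++;
    }
  }
  return {
    filesChecked: files.length,
    errorCount,
    warningCount,
    exitCode: errorCount > 0 ? 1 : 0,
  };
}
```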

This is the realm-wide companion to the existing single-file
`boxel file lint <path>` command. Closes the first gap from the
Phase 1 runbook (CS-11149): the software factory's
`runLintInMemory` validator becomes reachable from an interactive
Claude Code session via Bash.

`software-factory` keeps its in-process `runLintInMemory` for now;
both coexist during the migration window.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lifts the factory's glint runner + JSON document validator from
`packages/software-factory/src/parse-execution.ts` into a top-level
`boxel parse` command in boxel-cli. Behavior matches the existing
factory tool:

- Without a path: discovers every `.gts` / `.gjs` / `.ts` in the
  realm plus every `.json` file linked as a `Spec.linkedExamples`,
  runs glint (`ember-tsc`) over the GTS batch in a temp dir with
  monorepo-aware tsconfig paths, and validates the document
  structure of each JSON example.
- With a path: parses just that single file (GTS → glint, JSON →
  document validation).

Path resolution is anchored on this file's `__dirname`, so the
command requires the Boxel monorepo layout — `packages/base`,
`packages/host`, `packages/boxel-ui`, and `@glint/ember-tsc` (added
as a boxel-cli devDependency) must all be resolvable. This is a
factory-developer tool, not an end-user CLI feature.

The factory keeps its own copy of `parse-execution.ts` for now;
both coexist during the CS-11149 migration window.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lifts the factory's in-memory QUnit runner (`runTestsInMemory`) from
`packages/software-factory/src/test-run-execution.ts` into a
top-level `boxel test` command in boxel-cli. The runner:

- Discovers every `*.test.gts` file in the realm.
- Locates the host app's compiled `dist/` (env override, sibling
  packages/host, or the root checkout when in a git worktree).
- Spins up a tiny HTTP server that serves the host's test bundles
  + a synthesized QUnit harness page with live-test enabled.
- Drives a headless Chromium against that page with the realm URL
  in the query string; injects the per-realm JWT (if the active
  profile has one) via `page.route()` so private realms can be
  reached.
- Collects per-test QUnit results via `QUnit.on('testEnd' / 'runEnd')`
  hooks and aggregates them into pass/fail/skip counts + per-failure
  details.

Unlike the factory's `executeTestRunFromRealm`, this command does
NOT create or update a TestRun card — results are returned in-memory
only. Card persistence is the agent's responsibility in the new
Phase 1 flow.

`@playwright/test` is added as a boxel-cli devDependency. The
`findHostDistPackageDir` discovery helper is inlined from
`@cardstack/realm-test-harness/host-dist` to avoid pulling the
harness in as a dependency. Like `boxel parse`, this is a
monorepo-only command and not usable from the published CLI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`pnpm --filter @cardstack/boxel-cli build` was failing with
`No loader is configured for ".node" files: fsevents.node`.
esbuild was trying to bundle Playwright's transitive deps —
specifically the native `.node` files that ship with fsevents
(macOS file watcher) and playwright itself — and obviously can't
inline those.

Added Playwright (`@playwright/test`, `playwright`,
`playwright-core`) and `fsevents` to the external list. They
stay as runtime `require`s; node resolves them from
`packages/boxel-cli/node_modules/` when `boxel test` actually
runs. Matches `boxel test`'s existing monorepo-only constraint
— in published form, those `devDependencies` aren't installed
and `boxel test` errors at import-time with a clear message.
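
For illustration, the externals change might correspond to an options object like the one below, which would be passed to esbuild's `build()`. The entry point and output path are assumptions based on paths mentioned elsewhere in this PR; the actual `scripts/build.ts` is not shown here.

```typescript
// Sketch of esbuild build options; only the `external` list comes from the commit message.
const buildOptions = {
  entryPoints: ['src/index.ts'], // assumed entry point
  outfile: 'dist/index.js', // assumed output path
  bundle: true,
  platform: 'node' as const,
  // Left as runtime requires: these packages ship native .node files esbuild cannot inline.
  external: ['@playwright/test', 'playwright', 'playwright-core', 'fsevents'],
};
```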

Bundle grew from 144 KB to 253 KB (still small) because the
playwright-free bundle was missing realm-server / runtime-common
code; the externals tweak doesn't bundle node_modules wholesale,
so the inlined size is now correct.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`boxel parse` and `boxel test` were both computing the monorepo
layout by counting `..` segments up from `__dirname`. That works
when the CLI runs from `src/commands/...` (ts-node fallback) but
breaks when the same code runs from `dist/index.js` (the bundled
form) — `__dirname` is at a different depth.

Replaced the `__dirname` walking with a `findBoxelCliRoot` helper
that walks up looking for the `@cardstack/boxel-cli` package.json.
Robust against both entry modes (and any future bundling
relocations).

Drive-by fixes:
- `parse.ts` was missing `dirname` from its `node:path` imports
  after the refactor; restored.
- `find-package-root.ts` uses `for (;;)` instead of `while (true)`
  to keep ESLint's `no-constant-condition` happy.
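
A minimal sketch of the walk-up approach, assuming a hypothetical `findPackageRoot` function with an injectable manifest reader; the real `findBoxelCliRoot` may differ in signature and error handling.

```typescript
import { existsSync, readFileSync } from 'node:fs';
import { dirname, join } from 'node:path';

// Reads the "name" field of dir/package.json, or undefined when there is none.
function readPackageName(dir: string): string | undefined {
  let manifest = join(dir, 'package.json');
  if (!existsSync(manifest)) return undefined;
  let parsed = JSON.parse(readFileSync(manifest, 'utf8')) as { name?: string };
  return parsed.name;
}

// Walks up from startDir until a package.json with the wanted name is found.
// Works the same from src/commands/ and from dist/ because depth is never assumed.
function findPackageRoot(
  startDir: string,
  packageName: string,
  readName: (dir: string) => string | undefined = readPackageName,
): string {
  let dir = startDir;
  for (;;) {
    if (readName(dir) === packageName) return dir;
    let parent = dirname(dir);
    if (parent === dir) {
      // Reached the filesystem root without a match.
      throw new Error(`Could not find ${packageName} above ${startDir}`);
    }
    dir = parent;
  }
}
```

A caller would typically pass `__dirname` as `startDir`, so the lookup is independent of the caller's cwd.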

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The ts-node fallback (used when `dist/` is missing) was previously
calling `ts-node.register({ transpileOnly: true })` with no project
path. ts-node discovered tsconfig.json from cwd, which worked when
the CLI was invoked from inside the monorepo but failed from any
other tree (e.g. `/var/folders/.../tmp.XXX/`) because no tsconfig
is reachable walking up from there.

Pointed ts-node at boxel-cli's own tsconfig.json explicitly so the
fallback works regardless of caller cwd.

Three fixes from PR review (codex + Copilot):

- Lazy-load `@playwright/test` in `boxel test`. The top-level
  `import { chromium }` made `boxel --help` (and every other
  subcommand) crash in published installs where the devDependency
  isn't present. Moved behind an async loader that fires only when
  the test runner actually runs.

- Zero `*.test.gts` files → validator failure, not pass.
  Previously `boxel test` returned `status: 'passed'` for a realm
  with no tests, which would let the factory agent mark an Issue
  done without ever writing one. Now returns `failed` with an
  explicit errorMessage.

- Bounded-poll Spec discovery in `boxel parse`. The realm's
  search index settles asynchronously, so a `boxel realm push`
  immediately followed by `boxel parse` could silently miss the
  freshly-pushed `linkedExamples` and pass when it shouldn't.
  Wrapped the Spec search in a 30s/250ms retry-with-poll while
  the result is ok-but-empty.
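
The bounded poll in the last item can be sketched as follows. `pollUntilNonEmpty` and its option names are hypothetical; only the 30s budget and 250ms interval come from the commit message. Errors from the search propagate immediately, so only ok-but-empty results are retried.

```typescript
// Re-runs `search` while it succeeds with an empty result, up to a time budget.
async function pollUntilNonEmpty<T>(
  search: () => Promise<T[]>,
  { timeoutMs = 30000, intervalMs = 250 }: { timeoutMs?: number; intervalMs?: number } = {},
): Promise<T[]> {
  let deadline = Date.now() + timeoutMs;
  for (;;) {
    let results = await search();
    // Stop on the first non-empty result, or give up once the budget is spent.
    if (results.length > 0 || Date.now() >= deadline) return results;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

One consequence of this design is that a realm that genuinely has no Spec examples pays the full timeout before the command proceeds.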

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cb7302416f


Comment on lines +692 to +694
if (result.testFiles.length === 0) {
  console.log(`${DIM}No .test.gts files found in the realm.${RESET}`);
  return;

P1 Badge Return failure when no realm tests are found

When runTestsForRealm reports a failed/error result with an empty testFiles array (for example, a realm has no *.test.gts files, or discovery fails before any files are found), the non-JSON CLI takes this branch and returns before the final result.status !== 'passed' exit check. That makes boxel test --realm ... exit 0 in the exact no-tests case that the runner marks as a validator failure, so automation can treat an implementation with no tests as passing.


Contributor Author


(Claude here, replying on behalf of @jurgenwerk.)

Fixed in fa373ee: boxel test now exits 1 whenever result.status !== 'passed', including in the empty-testFiles early-return branch, so a realm with no *.test.gts files is treated as the validator failure that runTestsForRealm already returns.
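
A minimal sketch of the fixed control flow, assuming a hypothetical `TestRunResult` shape and a `reportAndGetExitCode` helper that returns the exit code instead of calling `process.exit` directly:

```typescript
// Hypothetical result shape; field names follow the review comment above.
type TestRunResult = {
  status: 'passed' | 'failed' | 'error';
  testFiles: string[];
  errorMessage?: string;
};

function reportAndGetExitCode(
  result: TestRunResult,
  log: (line: string) => void = console.log,
): number {
  // Computed once, before any early return, so no branch can skip it.
  let exitCode = result.status === 'passed' ? 0 : 1;
  if (result.testFiles.length === 0) {
    log(result.errorMessage ?? 'No .test.gts files found in the realm.');
    return exitCode;
  }
  log(`${result.testFiles.length} test file(s): ${result.status}`);
  return exitCode;
}
```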

jurgenwerk added a commit that referenced this pull request May 19, 2026
PR #4881 splits the new validator commands (lint/parse/test) into
their own boxel-cli PR. Once that lands and @cardstack/boxel-cli
is republished, this branch can stop documenting the development
shim (pnpm link --global / shell function / manual ln -sf
symlink) and just say "install boxel-cli the normal way".

- Skill setup blocks (bootstrap + scheduling): drop the install
  recipe entirely. The agent just verifies `boxel --help` lists
  lint/parse/test and bails out asking the user to run
  `pnpm i -g @cardstack/boxel-cli` if not.
- Runbook prerequisites: replace the multi-paragraph dev-CLI
  build + shell-function block with one line — `pnpm i -g
  @cardstack/boxel-cli`. The host-app build and Playwright
  Chromium install are grouped under "monorepo-only validators
  (boxel parse and boxel test) need these extras", with a forward
  reference to the follow-up tickets for the realm-server
  `_parse` endpoint and the built-in QUnit harness.

No agent-facing behavior change — the skills still verify, still
bail out cleanly if `boxel` is missing. Just the install advice
got straight.
@jurgenwerk jurgenwerk requested a review from Copilot May 19, 2026 07:20
Contributor

Copilot AI left a comment


Pull request overview

This PR extends @cardstack/boxel-cli with three new validator-style commands (lint, parse, test) intended to support the software-factory workflow, plus supporting CLI/root-resolution and build changes to make the commands usable both from the monorepo and via the bundled dist/ entrypoint.

Changes:

  • Add boxel lint, boxel parse, and boxel test commands, including JSON output modes for automation.
  • Add findBoxelCliRoot() to stabilize monorepo-relative path resolution from both dist/ and src/ (ts-node fallback) execution modes.
  • Update build/runtime plumbing: esbuild externals for Playwright/native deps, and ts-node registration now pins the CLI’s tsconfig.

Reviewed changes

Copilot reviewed 7 out of 9 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| pnpm-lock.yaml | Adds lock entries for new CLI dependencies (@glint/ember-tsc, @playwright/test). |
| packages/boxel-cli/src/lib/find-package-root.ts | Introduces findBoxelCliRoot() to locate the CLI package root reliably. |
| packages/boxel-cli/src/commands/lint.ts | Adds whole-realm/single-file linting via the realm _lint endpoint. |
| packages/boxel-cli/src/commands/parse.ts | Adds monorepo-only parsing/typecheck (ember-tsc) + Spec linkedExamples JSON document validation. |
| packages/boxel-cli/src/commands/test.ts | Adds monorepo-only QUnit test runner using Playwright-driven headless Chromium against host dist/. |
| packages/boxel-cli/src/build-program.ts | Registers the new lint, parse, and test commands in the CLI program. |
| packages/boxel-cli/scripts/build.ts | Marks Playwright-related modules as esbuild externals to avoid bundling native/runtime-resolve deps. |
| packages/boxel-cli/package.json | Adds @glint/ember-tsc and @playwright/test dependencies. |
| packages/boxel-cli/bin/boxel.js | Improves ts-node fallback by explicitly setting the CLI tsconfig path. |
Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported
Comments suppressed due to low confidence (1)

packages/boxel-cli/src/commands/parse.ts:418

  • The temp-dir safety check uses resolved.startsWith(tempDir + '/'), which is not portable (fails on Windows path separators) and silently skips files that resolve outside the temp dir. This can produce a misleading “passed” result (or drop diagnostics) if any unsafe path slips through. Use a path.relative(tempDir, resolved)-based check (or startsWith(tempDir + path.sep)) and surface an explicit parse error when a file path is rejected.
    for (let file of files) {
      let normalized = join(tempDir, file.path);
      let resolved = resolve(normalized);
      if (!resolved.startsWith(tempDir + '/')) continue;
      mkdirSync(dirname(resolved), { recursive: true });
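
The `path.relative`-based check suggested above could look roughly like this. `isInsideDir` and `assertInsideDir` are hypothetical names, not the helpers the PR ended up with.

```typescript
import { isAbsolute, relative, resolve, sep } from 'node:path';

// True when `candidate`, resolved against baseDir, lands strictly inside baseDir.
function isInsideDir(baseDir: string, candidate: string): boolean {
  let base = resolve(baseDir);
  let rel = relative(base, resolve(base, candidate));
  if (rel === '') return false; // the directory itself, not a file inside it
  if (rel === '..' || rel.startsWith('..' + sep)) return false; // climbs out
  return !isAbsolute(rel); // e.g. a different drive on Windows
}

// Surfaces an explicit error instead of silently skipping the file.
function assertInsideDir(baseDir: string, candidate: string): void {
  if (!isInsideDir(baseDir, candidate)) {
    throw new Error(`Path "${candidate}" resolves outside ${baseDir}`);
  }
}
```

Comparing against `'..' + sep` rather than a bare `'..'` prefix keeps legitimate names such as `..notes/file.ts` from being rejected.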



if (result.testFiles.length === 0) {
  console.log(`${DIM}No .test.gts files found in the realm.${RESET}`);
  return;
Contributor Author


(Claude here, replying on behalf of @jurgenwerk.)

Fixed in fa373ee — added the if (result.status !== 'passed') process.exit(1) check before the early-return so the no-tests case (which runTestsForRealm already returns as status: 'failed') exits non-zero.

Comment on lines +142 to +146
if (options?.path) {
  let path = options.path;
  if (PARSEABLE_GTS_EXTENSIONS.some((ext) => path.endsWith(ext))) {
    gtsFiles = [path];
  } else if (path.endsWith(PARSEABLE_JSON_EXTENSION)) {
Contributor Author


(Claude here, replying on behalf of @jurgenwerk.)

Fixed in fa373ee — added a validateRealmRelativePath helper (mirrors packages/software-factory/src/realm-relative-path.ts) that rejects URL schemes, leading /, backslashes, percent-encoded traversal, and .. segments. It runs before the extension check on the public entry point, and runGlintCheck (the other site at line 414) now also validates each path and throws on tempDir-escapes instead of silently continue-ing.

Comment on lines +63 to +68
if (options?.path) {
  let path = options.path;
  if (!LINTABLE_EXTENSIONS.some((ext) => path.endsWith(ext))) {
    return emptyErrorResult(
      `Path "${path}" is not lintable — must end with one of ${LINTABLE_EXTENSIONS.join(', ')}`,
    );
Contributor Author


(Claude here, replying on behalf of @jurgenwerk.)

Fixed in fa373ee — same validateRealmRelativePath helper as in parse.ts. boxel lint now rejects URL schemes, leading /, backslashes, percent-encoded traversal, and .. traversal segments before the extension check.

@github-actions
Contributor

github-actions Bot commented May 19, 2026

Host Test Results

    1 files      1 suites   1h 30m 57s ⏱️
2 665 tests 2 650 ✅ 15 💤 0 ❌
2 684 runs  2 669 ✅ 15 💤 0 ❌

Results for commit 7b8859e.

Realm Server Test Results

    1 files  ±    0      1 suites  +1   8m 49s ⏱️ + 8m 49s
1 408 tests +1 408  1 408 ✅ +1 408  0 💤 ±0  0 ❌ ±0 
1 495 runs  +1 495  1 495 ✅ +1 495  0 💤 ±0  0 ❌ ±0 

Results for commit 7b8859e. ± Comparison against earlier commit 4f5f81e.

Vendors a `validateRealmRelativePath` helper that rejects paths with
URL schemes, leading `/`, backslashes, percent-encoded escapes, and
`..` traversal segments. The validator commands previously accepted
anything ending in the right extension, so `Cards/../foo.ts` or an
absolute URL would reach realm-server URL handling with whatever
normalization the layer chose. Mirrors the equivalent gate in
`packages/software-factory/src/realm-relative-path.ts`.
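
A sketch of what such a gate might look like, under the assumption that it returns an error message or `undefined`. `checkRealmRelativePath` is a hypothetical name, and rejecting every percent-encoded byte is a simplification; the vendored helper may be narrower.

```typescript
// Returns a reason the path is rejected, or undefined when it looks realm-relative.
function checkRealmRelativePath(path: string): string | undefined {
  if (path.length === 0) return 'path is empty';
  // A scheme like "https:" or "file:" means this is a URL, not a relative path.
  if (/^[a-zA-Z][a-zA-Z0-9+.-]*:/.test(path)) return 'URL schemes are not allowed';
  if (path.startsWith('/')) return 'a leading "/" is not allowed';
  if (path.includes('\\')) return 'backslashes are not allowed';
  // Assumption: any %XX escape is refused, which also covers encoded "..".
  if (/%[0-9a-fA-F]{2}/.test(path)) return 'percent-encoded characters are not allowed';
  if (path.split('/').includes('..')) return '".." segments are not allowed';
  return undefined;
}
```

Running the gate before the extension check means a path like `Cards/../foo.ts` is rejected even though it ends in a lintable extension.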

`boxel test` now also exits non-zero when `runTestsForRealm` returns
`status: 'failed'` with an empty `testFiles` array — previously the
no-tests early-return swallowed the validator failure.

`runGlintCheck` no longer silently continues on a `..`-style path
that resolves outside its temp dir; it throws instead.

Addresses review comments on #4881.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jurgenwerk jurgenwerk changed the title boxel-cli: add lint, parse, test validator commands feat: boxel-cli — add lint, parse, test validator commands May 19, 2026
jurgenwerk and others added 3 commits May 19, 2026 09:54
The published package always ships `dist/index.js`, so end users never
hit the fallback. Monorepo devs who want to run unbuilt source can use
`pnpm --filter @cardstack/boxel-cli start` (`ts-node --transpileOnly`
on `src/index.ts`), which is what the build/start scripts already use.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drops the `project:` tsconfig path added in 3d6d8e7 — back to the
version on main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jurgenwerk jurgenwerk changed the title feat: boxel-cli — add lint, parse, test validator commands boxel-cli — add lint, parse, test validator commands May 19, 2026
@jurgenwerk jurgenwerk changed the title boxel-cli — add lint, parse, test validator commands feat(boxel-cli): add lint, parse, test validator commands May 19, 2026
@jurgenwerk jurgenwerk requested a review from a team May 19, 2026 08:19
jurgenwerk added a commit that referenced this pull request May 19, 2026
Adds the vendor-neutral skills that drive the software factory from
inside an interactive Claude Code session, plus the runbook that
documents how to use them:

- `.agents/skills/software-factory-bootstrap` (rewritten)
- `.agents/skills/software-factory-operations` (rewritten)
- `.agents/skills/software-factory-scheduling` (new)
- `.agents/skills-sdk/{bootstrap,operations}` — verbatim pre-rewrite
  snapshots for the existing SDK orchestrator path
- `docs/runbook.md` — single-prompt end-to-end flow
- `src/factory-skill-loader.ts` — load from `.agents/skills-sdk/` so
  the orchestrator and interactive flow don't fight
- `.claude/CLAUDE.md` — documents the dual skill directories
- `.gitignore` — ignore the `factory-test-*/` mktemp workspaces

The boxel-cli validator commands (`lint`, `parse`, `test`) the skills
call live on the base branch (PR #4881).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jurgenwerk added a commit that referenced this pull request May 19, 2026
…Code

Adds the vendor-neutral skills that drive the software factory from
inside an interactive Claude Code session, plus the runbook that
documents how to use them:

- `.agents/skills/software-factory-bootstrap` (rewritten)
- `.agents/skills/software-factory-operations` (rewritten)
- `.agents/skills/software-factory-scheduling` (new)
- `.agents/skills-sdk/{bootstrap,operations}` — verbatim pre-rewrite
  snapshots for the existing SDK orchestrator path
- `docs/runbook.md` — single-prompt end-to-end flow
- `src/factory-skill-loader.ts` — load from `.agents/skills-sdk/` so
  the orchestrator and interactive flow don't fight
- `.claude/CLAUDE.md` — documents the dual skill directories
- `.gitignore` — ignore the `factory-test-*/` mktemp workspaces

The boxel-cli validator commands (`lint`, `parse`, `test`) the skills
call are added in #4881. This PR depends on that one merging first.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@habdelra
Contributor

I think another, deeper caveat that we've chatted about is that the dist the test validation uses is a test build and not a production build, so that QUnit and the test helpers are bundled into the dist and are statically available to the QUnit index.html page (which I believe puppeteer is hosting). Do we have a ticket to deal with packaging this into the CLI?

@jurgenwerk
Contributor Author

jurgenwerk commented May 19, 2026

@habdelra yes, we do - https://linear.app/cardstack/issue/CS-11164/bundle-a-self-contained-qunit-test-harness-into-boxel-cli

@jurgenwerk jurgenwerk merged commit d7877af into main May 19, 2026
79 of 80 checks passed

3 participants