Software factory: relocate skills + cleanup + rewrite for current factory tools #4756
CS-10666 (under CS-10613).

`packages/boxel-cli/.agents/skills/boxel-api/SKILL.md` is the new canonical home for Boxel platform API knowledge. It covers:

- `boxel search` / `client.search()`: federated search across one or more realms, with the full query syntax (`type`, `eq`, `contains`, `range`, `every`/`any`/`not`, `sort`, `page`, CodeRef matching, common mistakes).
- `boxel realm create` / `client.createRealm()`: provisioning a new realm, including the `waitForReady` default that polls `/_readiness-check` for you.
- `client.waitForReady()`: standalone readiness polling.
- A "when to use what" decision matrix mapping common goals to the right CLI command or `BoxelCLIClient` method.
- Boundary statements pointing at sibling skills (`boxel-development`, `boxel-file-structure`, `boxel-sync`, `boxel-command`) so this skill doesn't try to cover everything.

Auth is deliberately not documented. boxel-cli owns auth internally: consumers don't see JWTs, and `BoxelCLIClient` handles tokens, refresh, and 401 retries through `ProfileManager`. The skill only tells consumers "use `BoxelCLIClient`; don't roll your own `fetch`." This diverges from CS-10666's "covers auth model" acceptance line, because the auth machinery is an implementation detail, not API surface.

Retired `packages/software-factory/.agents/skills/boxel-development/references/dev-realm-search.md`; its substantive query-syntax content moved into the new skill. Removed it from `ALWAYS_LOAD_REFERENCES` and `REFERENCE_KEYWORD_MAP` in `factory-skill-loader.ts` so the loader no longer tries to read a file that doesn't exist.

`pnpm test:node`: 345/345 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
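
For orientation, a query combining several of the operators named in this commit message might look like the sketch below. Only the operator names come from the text above; the field names (`status`, `title`, `priority`), the module URL, the nesting, and the `Query` type are assumptions for illustration. The `boxel-api` skill remains the authoritative description of the syntax.

```typescript
// Hypothetical query shape: operator names are taken from the commit
// message, everything else (fields, module URL, nesting) is illustrative.
type CodeRef = { module: string; name: string };

interface Query {
  filter?: Record<string, unknown>;
  sort?: Array<{ by: string; on?: CodeRef; direction?: "asc" | "desc" }>;
  page?: { number: number; size: number };
}

// Assumed CodeRef form: module URL plus exported name as separate keys.
const ticketRef: CodeRef = {
  module: "https://example.test/my-realm/ticket",
  name: "Ticket",
};

const searchQuery: Query = {
  filter: {
    on: ticketRef,
    // `every` requires all nested filters to match.
    every: [
      { eq: { status: "open" } },
      { contains: { title: "loader" } },
      { not: { eq: { priority: "low" } } },
    ],
  },
  sort: [{ by: "title", on: ticketRef, direction: "asc" }],
  page: { number: 0, size: 10 },
};

// Serialize once so the same JSON text can be reused wherever a query is needed.
const searchQueryJson: string = JSON.stringify(searchQuery);
console.log(searchQueryJson);
```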
…ev refs

Picks up where the earlier `boxel-api` skill commit left off: CS-10613's remaining mechanical / content work.

Relocations (git mv preserves history):

- `boxel-development`, `boxel-file-structure`: into monorepo root `.agents/skills/`. These describe Boxel card development idioms in general; they aren't software-factory-specific. The factory's `factory-skill-loader.ts` already walks the monorepo root as a fallback dir, so it still picks them up.
- `boxel-sync`, `boxel-track`, `boxel-watch`, `boxel-repair`, `boxel-restore`, `boxel-setup`: into `packages/boxel-cli/.agents/skills/`. These document interactive CLI workflows for humans using Claude Code on a synced workspace. They aren't loaded by the factory agent (the `CLI_ONLY_SKILLS` filter remains in place), so moving them out of `packages/software-factory/.agents/skills/` puts them next to the code that implements those commands.

Rewrites:

- `software-factory-operations/SKILL.md`: dropped the dual "Claude backend vs OpenRouter backend" branches. OpenRouter no longer exposes the old factory tools (read_file / write_file / search_realm / fetch_transpiled_module / run_command); both backends now use native fs (`Read`/`Write`/`Edit`/`Glob`/`Grep`) plus the `boxel` CLI through `Bash`. The realm-side reads section now points at the `boxel-api` skill for the full search query syntax and at the `boxel-command` skill for prerendered host commands.
- `dev-qunit-testing.md`: swap the lone `read_file` reference for native `Read` + `Glob`.
- `dev-spec-usage.md`: swap `create_catalog_spec` + `write_file` for the live `get_card_schema` introspection + native `Write` flow. Added the required-shape JSON example, the dotted `linkedExamples.0` key form (the indexer rejects the array form), and the explicit warning to never call `run_instantiate` on the Spec file itself. Its module lives in the base realm and the prerender enforces same-origin module loads, so the call always fails; this is a trap factory runs have walked into.

No code changes: the loader's existing fallback chain (primary: `packages/software-factory/.agents/skills/`, fallback: `MONOREPO_ROOT/.agents/skills/`) resolves moved skills correctly. Tests in `factory-skill-loader.test.ts` (43 cases) still pass; they construct synthetic skill dirs and don't depend on the real layout.

Closes CS-10666; advances CS-10613's content-rewrite phase.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
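
To make the dotted-key point from the `dev-spec-usage.md` rewrite concrete, the sketch below expands a list of example URLs into `linkedExamples.N` relationship entries. The `links.self` nesting and the example paths are assumptions modeled on JSON:API-style relationships, and `toDottedRelationships` is a hypothetical helper, not factory code.

```typescript
type Relationship = { links: { self: string } };

// Hypothetical helper: emit one "<field>.<index>" key per linked card
// instead of a single "<field>" key that holds an array.
function toDottedRelationships(
  field: string,
  urls: string[],
): Record<string, Relationship> {
  const relationships: Record<string, Relationship> = {};
  urls.forEach((url, index) => {
    relationships[`${field}.${index}`] = { links: { self: url } };
  });
  return relationships;
}

// Example paths are made up for illustration.
const specRelationships = toDottedRelationships("linkedExamples", [
  "../Ticket/example-1",
  "../Ticket/example-2",
]);
console.log(JSON.stringify(specRelationships, null, 2));
```

Generating the keys from an index keeps the numbering contiguous, which is easy to get wrong when the entries are written by hand.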
…ons skill

Prettier interpreted `+ \`Write\`` at the start of a continuation line as a list bullet and reflowed the paragraph into a broken pseudo-list. Reworded to avoid the leading `+` token.
2d3eec3 to 90a3145
Host Test Results: 1 files, 1 suites, 1h 49m 20s ⏱️. Results for commit 90a3145.
Pull request overview
This PR reorganizes and updates the “skills” documentation used by the software-factory and boxel-cli agents so that guidance matches the current tool surfaces (native fs tools + boxel CLI via Bash) and the current ownership boundaries (API knowledge in boxel-cli).
Changes:
- Removed the factory's always-loaded realm-search reference (`dev-realm-search.md`) and rewired operational docs to prefer `boxel` CLI usage.
- Rewrote `software-factory-operations` to describe workspace-native fs usage, realm-side reads via `boxel`, and the current validator tool behavior.
- Added/relocated multiple skills into monorepo root `.agents/skills/` and `packages/boxel-cli/.agents/skills/`, plus new `boxel-api` and `boxel-command` skills.
Reviewed changes
Copilot reviewed 7 out of 33 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| packages/software-factory/src/factory-skill-loader.ts | Stops always-loading the realm-search reference and adjusts keyword mapping accordingly. |
| packages/software-factory/.agents/skills/software-factory-operations/SKILL.md | Updates factory operational guidance to native fs + boxel CLI workflows and documents validators. |
| packages/software-factory/.agents/skills/boxel-development/references/dev-realm-search.md | Removes the legacy realm-search reference document. |
| packages/boxel-cli/.agents/skills/boxel-watch/SKILL.md | Adds a boxel watch workflow skill for human CLI sessions. |
| packages/boxel-cli/.agents/skills/boxel-track/SKILL.md | Adds a boxel track workflow skill for human CLI sessions. |
| packages/boxel-cli/.agents/skills/boxel-sync/SKILL.md | Adds a boxel sync workflow skill for human CLI sessions. |
| packages/boxel-cli/.agents/skills/boxel-setup/SKILL.md | Adds onboarding/setup guidance for boxel-cli profiles and first sync. |
| packages/boxel-cli/.agents/skills/boxel-restore/SKILL.md | Adds restore workflow guidance (history restore + follow-up sync). |
| packages/boxel-cli/.agents/skills/boxel-repair/SKILL.md | Adds realm metadata repair workflow guidance. |
| packages/boxel-cli/.agents/skills/boxel-command/SKILL.md | Documents boxel run-command and BoxelCLIClient.runCommand() usage and failure modes. |
| packages/boxel-cli/.agents/skills/boxel-api/SKILL.md | Introduces a canonical API skill: federated search, realm creation, readiness, and client usage. |
| .agents/skills/boxel-file-structure/SKILL.md | Adds consolidated guidance on workspace file structure, adoptsFrom paths, and relationship encoding. |
| .agents/skills/boxel-development/SKILL.md | Adds the top-level Boxel development skill and reference-loading guidance. |
| .agents/skills/boxel-development/references/dev-theme-design-system.md | Adds detailed theming/token guidance and CSS safety rules. |
| .agents/skills/boxel-development/references/dev-template-patterns.md | Adds strict-mode template patterns and common pitfalls. |
| .agents/skills/boxel-development/references/dev-technical-rules.md | Adds core technical rules (contains vs linksTo, glint constraints, etc.). |
| .agents/skills/boxel-development/references/dev-styling-design.md | Adds CSS safety + styling philosophy guidance. |
| .agents/skills/boxel-development/references/dev-spec-usage.md | Updates Spec guidance to get_card_schema + native Write, with linkedExamples encoding notes. |
| .agents/skills/boxel-development/references/dev-replicate-ai.md | Adds Replicate API integration reference patterns. |
| .agents/skills/boxel-development/references/dev-qunit-testing.md | Updates QUnit/TestRun guidance to native fs tools + boxel search. |
| .agents/skills/boxel-development/references/dev-quick-reference.md | Adds a concise “approved imports / patterns” quick reference. |
| .agents/skills/boxel-development/references/dev-query-systems.md | Adds query construction guidance for .gts usage (including the on rule). |
| .agents/skills/boxel-development/references/dev-fitted-formats.md | Adds fitted-format strategy and container-query skeleton guidance. |
| .agents/skills/boxel-development/references/dev-file-editing.md | Adds SEARCH/REPLACE + edit-tracking conventions for .gts editing workflows. |
| .agents/skills/boxel-development/references/dev-file-def.md | Adds FileDef usage, hierarchy, and import guidance. |
| .agents/skills/boxel-development/references/dev-external-libraries.md | Adds patterns for loading/using third-party libraries in Boxel code. |
| .agents/skills/boxel-development/references/dev-enumerations.md | Adds enumField patterns, import rules, and helper usage guidance. |
| .agents/skills/boxel-development/references/dev-delegated-rendering.md | Adds delegated rendering guidance and common UI patterns. |
| .agents/skills/boxel-development/references/dev-defensive-programming.md | Adds defensive coding patterns for Boxel templates/components. |
| .agents/skills/boxel-development/references/dev-data-management.md | Adds file organization + JSON instance patterns and relationship link rules. |
| .agents/skills/boxel-development/references/dev-core-patterns.md | Adds core CardDef/FieldDef implementation patterns and anti-patterns. |
| .agents/skills/boxel-development/references/dev-core-concept.md | Adds foundational Boxel concepts and decision trees. |
| .agents/skills/boxel-development/references/dev-command-development.md | Adds command-development guidance for host command patterns and IO via commands. |
Comments suppressed due to low confidence (1)
.agents/skills/boxel-development/references/dev-spec-usage.md:5
- This line says Specs adopt from `https://cardstack.com/base/spec#Spec`, but the example (and `get_card_schema`) use `meta.adoptsFrom.module: "https://cardstack.com/base/spec"` with `name: "Spec"`. Using a fragment here is inconsistent and could cause copy/paste errors; consider changing it to the module+name form used elsewhere.
> const ALWAYS_LOAD_REFERENCES: readonly string[] = [
>   'dev-core-concept.md',
>   'dev-technical-rules.md',
>   'dev-quick-reference.md',
>   'dev-realm-search.md',
>   'dev-qunit-testing.md',
Good catch — addressed in 4a6257f. The loader now has packages/boxel-cli/.agents/skills/ in its fallback chain, and boxel-api + boxel-command are auto-loaded by DefaultSkillResolver so the factory agent's prompt always includes the search query syntax + host-command failure modes inline.
> Single-quote the entire JSON object so the shell does not expand or
> split it; keep keys and string values double-quoted inside. Pipe
> through `jq` to project. **For the full query syntax (filter / eq /
> contains / range / every / any / not / sort / page, CodeRef matching,
> common mistakes) see the `boxel-api` skill.**
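
The quoting rule in the excerpt above can be illustrated with a small helper. This is a minimal sketch assuming a POSIX shell; `shellSingleQuote` is a hypothetical name, not part of `boxel-cli`, and the query fields are made up.

```typescript
// Hypothetical helper, not part of boxel-cli: wrap a string in single
// quotes for a POSIX shell so nothing inside is expanded or split.
function shellSingleQuote(value: string): string {
  // A single quote cannot appear inside a single-quoted string, so close
  // the quoted run, emit an escaped quote, and reopen: ' becomes '\''
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// JSON.stringify emits double-quoted keys and string values, which is the
// inner quoting the excerpt asks for.
const shellQueryJson = JSON.stringify({
  filter: { contains: { title: "$HOME isn't expanded" } },
});

const quotedArgument = shellSingleQuote(shellQueryJson);
console.log(quotedArgument);
```

Building the argument this way matters mostly when a value itself contains an apostrophe; for literal queries typed by hand, plain single quotes around the JSON are enough.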
Addressed in 4a6257f. The factory loader now searches packages/boxel-cli/.agents/skills/, and boxel-api is auto-loaded for every implementation issue — so the cross-reference resolves to actual content in the agent's prompt now.
| boxel run-command <command-specifier> --realm <target-realm-url> --input '<json>' --json | ||
| ``` | ||
| Most agent tasks won't need this — the validators below already wrap | ||
| the common host commands. See the `boxel-command` skill for the | ||
| programmatic surface and failure modes. |
Addressed in 4a6257f along with the boxel-api fix — boxel-command is now auto-loaded too, so the cross-reference works.
| `Bash` is also available for read-only `boxel` CLI commands | ||
| (`boxel status`, `boxel history`, `boxel search`, `boxel read-transpiled`, | ||
| `boxel run-command`) — see the **Realm-side reads** section below. |
Fixed in 4a6257f. Reworded the section to separate the read-only inspection commands (status / history / search / read-transpiled) from run-command, with a note that run-command's safety is "as safe as the named command" since the specifier governs what it actually does.
> ## Self-Validation (optional, no side effects)
>
> All five validators are factory tools, safe to call repeatedly mid-turn;
> none of them write a realm artifact. The orchestrator still runs the full
> validation pipeline (which persists the durable `TestRun` / `LintResult`
Fixed in 4a6257f. Retitled the section to "in-memory results" and added an explicit "Side effect to know about" paragraph: the realm-touching validators (run_evaluate / run_instantiate / run_tests) sync the workspace to the realm before invoking the prerenderer, while run_lint / run_parse run entirely in-process.
Copilot review surfaced five issues:

1. Factory skill loader couldn't find `boxel-api` / `boxel-command` because `packages/boxel-cli/.agents/skills/` wasn't in its search path, so the cross-references from `software-factory-operations` to those skills were dead pointers in the factory agent's prompt.

   Fix: add `packages/boxel-cli/.agents/skills/` to the loader's fallback chain (between the package primary dir and the monorepo root). Also auto-load `boxel-api` and `boxel-command` from `DefaultSkillResolver`: the agent always needs the realm-search query syntax and host-command failure modes, so both belong in the always-loaded set.

   Side effect: `CLI_ONLY_SKILLS` is now removed. It was a defensive filter for the old layout where every skill lived in the factory's own `.agents/skills/` and could be picked up by accident. After the relocations the CLI skills live in `packages/boxel-cli/` and are never auto-loaded by the resolver; a knowledge-article author can explicitly opt in via a `skill:boxel-sync` tag, which is the deliberate path.

   Tests updated: the four "excludes CLI-only skills" cases became two cases. One verifies that free-text keyword matching does NOT pull in CLI skills (still true via the resolver's hard-coded auto-load set); the other verifies that a knowledge-tag opt-in DOES include them (new behavior).

2. `software-factory-operations/SKILL.md` listed `boxel run-command` under "read-only `boxel` CLI commands." It dispatches to arbitrary host commands and isn't read-only in general. Rewrote the section to separate read-only inspection commands from `run-command` and to note that its safety is "as safe as the named command."

3. Same SKILL.md called the Self-Validation section "no side effects," but `run_evaluate` / `run_instantiate` / `run_tests` sync the workspace to the realm before invoking the prerenderer. Renamed the section to "in-memory results" and called out the realm push as an explicit side effect, while clarifying that `run_lint` / `run_parse` do run entirely in-process.

4. `dev-spec-usage.md` opened by saying Specs adopt from `https://cardstack.com/base/spec#Spec` (fragment form), then the JSON example used `module: "https://cardstack.com/base/spec", name: "Spec"`. Reworded the prose to match the module+name shape used in the example so a hurried reader doesn't copy `spec#Spec` into a CodeRef.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
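
The lookup order described in item 1 can be sketched as a first-match walk over an ordered directory list. This is a simplified illustration, not the code in `factory-skill-loader.ts`: `resolveSkillDir` and the injected `exists` predicate are hypothetical, and `/repo` stands in for the monorepo root.

```typescript
// Ordered search path from item 1: package primary dir, then boxel-cli,
// then the monorepo root. "/repo" is a placeholder for the real root.
const SEARCH_DIRS: readonly string[] = [
  "/repo/packages/software-factory/.agents/skills",
  "/repo/packages/boxel-cli/.agents/skills",
  "/repo/.agents/skills",
];

// Hypothetical resolver: returns the first directory that holds a SKILL.md
// for the named skill, or undefined when no directory in the chain has it.
function resolveSkillDir(
  skillName: string,
  searchDirs: readonly string[],
  exists: (path: string) => boolean,
): string | undefined {
  for (const dir of searchDirs) {
    const candidate = `${dir}/${skillName}`;
    if (exists(`${candidate}/SKILL.md`)) {
      return candidate;
    }
  }
  return undefined;
}

// In-memory stand-in for the filesystem so the sketch runs anywhere.
const knownFiles = new Set<string>([
  "/repo/packages/software-factory/.agents/skills/software-factory-operations/SKILL.md",
  "/repo/packages/boxel-cli/.agents/skills/boxel-api/SKILL.md",
  "/repo/.agents/skills/boxel-development/SKILL.md",
]);
const fileExists = (path: string): boolean => knownFiles.has(path);

console.log(resolveSkillDir("boxel-api", SEARCH_DIRS, fileExists));
console.log(resolveSkillDir("boxel-sync", SEARCH_DIRS, fileExists));
```

Because the walk stops at the first match, a skill present in the package primary dir would shadow a same-named skill further down the chain; whether the real loader behaves that way is not stated in this PR.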
These two API surfaces aren't agent-facing — the software-factory provisions the target realm in factory-target-realm.ts before the agent loop starts, and waitForReady is called from the orchestrator. With boxel-api now auto-loaded into every factory agent prompt, the realm-creation prose was just dead weight in context. Kept federated search (the agent does use boxel search via Bash) and the "when to use what" table. Pointed readers needing createRealm/waitForReady at boxel-cli/src/api.ts or boxel realm create --help.
Six skills moved into `packages/boxel-cli/.agents/skills/` earlier in this PR turned out to describe commands that don't exist in the monorepo's boxel-cli:

- `boxel-sync`: no top-level `boxel sync`; only `boxel realm sync`
- `boxel-track`: no `boxel track`
- `boxel-watch`: no top-level `boxel watch`; only `boxel realm watch`
- `boxel-restore`: no `boxel restore`; closest is `boxel realm history`
- `boxel-repair`: no `boxel repair-realm` / `boxel repair-realms`
- `boxel-setup`: no `boxel setup`; setup happens via `boxel profile add`

These skills came from the standalone `cardstack/boxel-cli` GitHub repo, which has a much richer CLI surface than the monorepo's slimmed-down fork. The actual `boxel` here is six commands: `profile`, `file`, `realm`, `run-command`, `search`, `read-transpiled`. Anyone reading those skills and trying to run the documented commands would get "unknown command", which makes them worse than no docs.

Deleted the six skill directories. `boxel-api` and `boxel-command` stay; they describe `boxel search` and `boxel run-command`, which actually exist.

Loader cleanup that fell out:

- `SKILL_PRIORITY` no longer references the deleted skills.
- Removed the explanatory comment fragment about knowledge-tag opt-in for CLI skills, since CLI skills are gone.
- The "knowledge article tag opt-in" test now uses hypothetical names (`custom-extension`, `another-domain-skill`) instead of the deleted `boxel-sync` / `boxel-repair`. The test still verifies the resolver honors knowledge tags; it just doesn't pretend any specific CLI skill is a valid target anymore.

Tests: 40/40 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
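
The resolver behavior that the "knowledge article tag opt-in" test exercises can be sketched roughly as follows. The names here (`resolveSkills`, `AUTO_LOAD`, the `skill:` prefix parsing) are illustrative assumptions based on the commit messages, not the actual `DefaultSkillResolver` implementation, and the auto-load list is only the subset named in this PR.

```typescript
// Assumed subset of the always-loaded set, per the commit messages in this PR.
const AUTO_LOAD: readonly string[] = ["boxel-api", "boxel-command"];
const TAG_PREFIX = "skill:";

// Hypothetical resolver: auto-loaded skills first, then any skill a
// knowledge article opts into via a "skill:<name>" tag. Tags without the
// prefix are ignored, so unrelated labels never pull a skill in.
function resolveSkills(knowledgeTags: readonly string[]): string[] {
  const optedIn = knowledgeTags
    .filter((tag) => tag.startsWith(TAG_PREFIX))
    .map((tag) => tag.slice(TAG_PREFIX.length))
    .filter((name) => name.length > 0);
  // A Set preserves insertion order and drops duplicates.
  return Array.from(new Set<string>([...AUTO_LOAD, ...optedIn]));
}

console.log(resolveSkills(["skill:custom-extension", "topic:search"]));
console.log(resolveSkills([]));
```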
`packages/software-factory/.claude/CLAUDE.md` (783 lines) and `packages/software-factory/AGENTS.md` (224 lines) both described the standalone `cardstack/boxel-cli` GitHub repo's rich CLI surface: `boxel sync`, `boxel track`, `boxel watch`, `boxel restore`, `boxel repair-realm`, `boxel skills`, `boxel share`, `boxel gather`, `boxel realms`, `boxel stop`, `boxel edit`, top-level `boxel list` / `boxel history` / `boxel status`. None of those exist in this monorepo's slimmed-down `boxel-cli`. The AGENTS.md also still referred to a "dark-factory" / "guidance-tasks" workspace setup and a "one hour to produce demo" priority that predates the current factory architecture.

Rewrote both as thin, accurate pointers (~40 lines each):

- Available commands (`pnpm factory:go`, `pnpm test:node`, `pnpm lint`).
- Pointers to README.md for architecture.
- Description of the three-directory skill loader chain.
- The architectural boundary: boxel-cli owns the entire Boxel API surface; the factory imports `BoxelCLIClient` and never calls fetch() against a realm directly.
- Key source-file map: entrypoint, issue loop, workspace-fs, agent backend, tool builder.

Cleanup that fell out:

- `scripts/smoke-tests/factory-skill-smoke.ts`: replaced the `boxel-sync` / `boxel-track` / `boxel-watch` / `boxel-restore` / `boxel-repair` / `boxel-setup` list with the actual currently-loadable skills (`boxel-api`, `boxel-command`).
- `tests/factory-skill-loader.test.ts` (budget tests): synthetic "lower-priority" fixture renamed from `'boxel-sync'` to `'low-priority-test-skill'` so it's clear the name is just a test placeholder, not a real skill.
- `tests/factory-agent-claude-code.test.ts` (registry-leak test): synthetic `'boxel-sync'` registered-tool example renamed to `'sample-registered-tool'` for the same reason.

`packages/software-factory/docs/phase-1-plan.md` and `phase-2-plan.md` still mention the deleted skills. They were left alone because they're archived planning docs (snapshots of intended state at a point in time, not authoritative guidance). `tests/factory-tool-registry.test.ts` mentions `boxel-sync` in its retired-tools list, which is a correct historical record of CS-10883's retirements.

Tests: 40/40 skill-loader, 84/84 across the three affected suites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tested this using |
At a glance: skill moves

Factory skill loader now walks three directories (was two):

1. `packages/software-factory/.agents/skills/` (package primary dir)
2. `packages/boxel-cli/.agents/skills/`
3. Monorepo root `.agents/skills/`

`boxel-api` and `boxel-command` are auto-loaded for every implementation issue, so the realm-search query syntax and host-command failure modes are inline in the agent's prompt.

Why these skills existed in the first place

Worth saying up front, since the diff looks weird otherwise:

- `boxel-cli` was its own repo (github.com/cardstack/boxel-cli) with a full interactive workspace-management CLI for humans: `sync`, `track`, `watch`, `restore`, `repair-realm`, `setup`, `share`, `gather`, etc. The factory's skill directory had `.md` files describing all of those commands, designed to teach Claude Code users running in a synced workspace.
- The monorepo's slimmed-down fork exposes `BoxelCLIClient` programmatically (`client.sync()`, `client.search()`, `client.pull()`) and a handful of CLI subcommands (`boxel search`, `boxel run-command`, `boxel read-transpiled`). The interactive human-facing commands didn't make the cut.
- The legacy skill files stayed in `packages/software-factory/.agents/skills/` because the factory's loader was wired to find them there, even though the underlying commands didn't exist in this codebase. The `CLAUDE.md` for `packages/software-factory` even still pointed at "Official repo: https://github.com/cardstack/boxel-cli" as the source of truth.

CS-10613 was the ticket asking "where should each skill live?" This PR ended up answering the deeper question: "should each skill exist at all?"
Detail — what changed file by file
Created (in `packages/boxel-cli/.agents/skills/`):

- `boxel-api`: federated search query syntax (filter / eq / contains / range / every / any / not / sort / page, CodeRef matching, common mistakes). Trimmed to focus on search; realm-creation and readiness are orchestrator concerns, not agent-facing.
- `boxel-command`: `boxel run-command` / `client.runCommand()` for the prerendered-host-command flow and its three failure modes.

Relocated (`git mv`, history preserved):

- `boxel-development`, `boxel-file-structure` → monorepo root `.agents/skills/`. These describe Boxel card development idioms in general; not software-factory-specific.

Deleted:

- `boxel-sync`, `boxel-track`, `boxel-watch`, `boxel-restore`, `boxel-repair`, `boxel-setup`: described commands not in this codebase (see archeology above). No callers, no agent use, no point.
- `dev-realm-search.md`: content consolidated into `boxel-api`.

Rewrote:

- `software-factory-operations/SKILL.md`: dropped the dual "Claude backend vs OpenRouter backend" tool branches. Both backends now use the same surface (native fs + `boxel` CLI via Bash). Realm-side reads point at the new `boxel-api` and `boxel-command` skills.
- `dev-qunit-testing.md`: swap the lone `read_file` mention for native `Read` + `Glob`.
- `dev-spec-usage.md`: swap `create_catalog_spec` + `write_file` for the live `get_card_schema` introspection + native `Write` flow. Added the required-shape JSON example, the dotted `linkedExamples.0` key form (the indexer rejects the array form), and the explicit "don't `run_instantiate` on the Spec file itself" warning (the prerender enforces same-origin module loads, and Spec lives in the base realm).
- `packages/software-factory/AGENTS.md` (224 → 40 lines): was describing the legacy standalone-repo CLI plus a "dark-factory" / "guidance-tasks" workspace setup that predates the current factory architecture. Replaced with a thin, accurate pointer.
- `packages/software-factory/.claude/CLAUDE.md` (783 → 50 lines): same problem, same fix.

Loader updates (`factory-skill-loader.ts`):

- Added `packages/boxel-cli/.agents/skills/` to the fallback chain so the relocated skills are findable.
- `boxel-api` and `boxel-command` are auto-loaded for every implementation issue.
- Removed `CLI_ONLY_SKILLS`: it was a defensive filter for the old layout, moot after the relocations.

What's NOT in it

- No `.claude/skills` symlink churn. Existing symlinks already point at `.agents/skills/`.
- No realm-creation or readiness coverage in `boxel-api`. Those run in `factory-target-realm.ts` before the agent loop starts, so they are not agent-facing. Pointed at `boxel-cli/src/api.ts` / `boxel realm create --help` for callers who actually need those APIs.

Test plan

- `tests/factory-skill-loader.test.ts`: 40/40 pass.
- `tests/factory-context-builder.test.ts`: 25/25 pass.
- `tests/factory-agent-claude-code.test.ts` + `tests/factory-tool-registry.test.ts`: 84/84 across affected suites.
- `pnpm factory:go --agent openrouter` against a fresh target realm: bootstrap completed cleanly (4 tool calls, agent fetched 3 schemas via `get_card_schema` and used native `write`/`bash`, no retired tools in sight). The implementation issue stopped at an OpenRouter billing error, which is external to this change.

Tickets
🤖 Generated with Claude Code