software-factory: retire structured update tools — agent writes raw JSON:API#4652
software-factory: retire structured update tools — agent writes raw JSON:API#4652jurgenwerk wants to merge 3 commits intomainfrom
Conversation
…SON:API Drop the five structured wrapper tools that used to construct tracker-card JSON for the agent: `update_project`, `update_issue`, `create_knowledge`, `create_catalog_spec`, `add_comment`. CS-10883 always called for retiring these once the skill teaches the JSON:API + adoptsFrom shape directly; the partial PR (#4633) kept them while the skill rewrite caught up. This finishes that work. The agent now writes those `.json` files via native `Write` (Claude backend) or `write_file` (OpenRouter backend) — same workspace fs surface it already uses for `.gts` definitions and content-card instances. The schemas the wrappers used to embed in tool parameters are documented in the `software-factory-bootstrap` skill (envelope, attribute tables per type, relationship shapes); the tracker module URL is plumbed through `AgentContext.darkfactoryModuleUrl` into the system prompt so the agent puts the right `meta.adoptsFrom.module` on each tracker card. Issue invariants the wrappers used to enforce silently — `description` immutability, restricted status transitions (`blocked` / `backlog` only) — are now spelled out in the bootstrap skill as rules the agent must follow itself. The validators (parse / instantiate / test) catch malformed JSON; the immutability/transition rules don't have a structural backstop yet, so they're prompt-enforced. Wiring side effects: - `factory-tool-builder.ts` loses ~330 lines (the five `build*Tool` functions, the `readPatchDocument` helper, `cardTypeSchemas` config, per-type schema resolution). - `factory-issue-loop-wiring.ts` no longer fetches card-type schemas via `loadDarkFactorySchemas` at startup (one less prerender-pool roundtrip per factory boot, ~5 fewer seconds of warm-up). - `factory-tool-builder.test.ts` drops the schemas-and-document-assembly module and the structured-tool path-arg tests (~800 lines). - `factory-tool-executor.spec.ts` shrinks to its remaining unregistered-tool sanity check (~390 lines). End-to-end run on a fresh realm: `outcome=all_issues_done` over 2 cycles, all 5 validations green, 11/11 QUnit tests passing. Bootstrap iteration emits Project + 3 Knowledge Articles + Issue via `Write` directly with the right `adoptsFrom`; sticky-note iteration writes the card def, test, 3 instances, and Spec the same way. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…edTools The `canUseTool` hook from PR #4633 was dead code: empirical testing (scripts/canusetool-repro.ts) shows the SDK auto-approves any tool in `allowedTools` and never invokes the hook. Because Write/Edit/etc were in allowedTools under `permissionMode: 'dontAsk'`, the path-scoping safety net never fired and the model could write to absolute paths outside the factory workspace. Fix: - permissionMode: 'dontAsk' → 'default' (only mode that actually invokes canUseTool for tools outside allowedTools) - Remove NATIVE_FS_TOOLS from allowedTools (they stay in `tools` so they remain available, just no longer auto-approved) - MCP factory tools stay in allowedTools — auto-approve is fine for our own in-process implementations Verification: - scripts/canusetool-fix-verify.ts: hook fires for every Write call; in-workspace Write succeeds, absolute out-of-workspace Write is rejected - New unit test locks in both structural conditions (native fs out of allowedTools + permissionMode='default') so any future regression on either knob fails at unit-test time Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fca56b85b7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // immutability, and so on. CS-10883 retired all five; the agent now | ||
| // writes those `.json` files directly via `Write` (Claude) / | ||
| // `write_file` (OpenRouter). The shapes and invariants are taught in |
There was a problem hiding this comment.
Update Claude guidance when retiring tracker wrapper tools
Retiring the five tracker wrapper tools here leaves the Claude backend with conflicting instructions: buildSystemPrompt() still says “do NOT use Write or Edit for tracker-schema cards” and claims dedicated tools like update_issue/create_catalog_spec exist (packages/software-factory/src/factory-agent/claude-code.ts:430-436). After this change those tools are no longer registered, so the model is told both that it must not write tracker JSON directly and that it must not call non-existent tools, which can block bootstrap/issue updates in real runs.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in c0dfb5a.
The Claude-specific tool addendum in claude-code.ts:367-440 was stale — it still carried a "Critical rule — do NOT use Write or Edit for tracker-schema cards" + a list of retired tools (update_project / update_issue / etc.). Dropped the whole block; the addendum now explicitly says use native fs for every workspace file, including Projects/*.json, Issues/*.json, Knowledge Articles/*.json, Spec/*.json — pointing at the bootstrap and operations skills for the JSON:API shapes and Issue invariants. Verified with the 6 ClaudeCodeFactoryAgent tests + a real factory:go --agent claude end-to-end run that completed outcome=all_issues_done writing all tracker cards via native Write.
| "body": "...comment text, markdown allowed...", | ||
| "author": "factory-agent", | ||
| "createdAt": "<ISO timestamp>" | ||
| } |
There was a problem hiding this comment.
Use the schema’s
datetime field for appended comments
The new comment-shape example instructs agents to append createdAt, but the actual Comment card schema uses datetime (packages/software-factory/realm/darkfactory.gts:245-250). With this guidance, agent-written issue comments will miss the expected timestamp field, so comment metadata won’t match the tracker schema and UI/date rendering logic.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in c0dfb5a.
Confirmed against realm/darkfactory.gts:245-249 — the Comment field schema is { body, author, datetime }, no createdAt. Updated the operations skill to use datetime and added a note calling out that the Issue card itself has separate createdAt / updatedAt attributes (which the bootstrap skill correctly references) — the two are not the same field.
There was a problem hiding this comment.
Pull request overview
This PR completes the software-factory structured-tool retirement by removing the tracker-card wrapper tools and teaching the agent to write raw JSON:API tracker cards (Project / Issue / KnowledgeArticle / Spec) directly via the workspace filesystem, with the live darkfactoryModuleUrl surfaced in prompts.
Changes:
- Removed the five structured tracker-card tools from
buildFactoryToolsand deleted the associated schema/document assembly logic. - Plumbed
darkfactoryModuleUrlthrough the issue loop into the system prompt so the agent can setmeta.adoptsFrom.modulecorrectly. - Slimmed and updated test coverage to focus on the remaining tool surface and the
canUseToolgating behavior.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/software-factory/tests/factory-tool-executor.spec.ts | Removes integration coverage for retired structured tools; keeps unregistered-tool guard test. |
| packages/software-factory/tests/factory-tool-builder.test.ts | Drops schema/tool tests for retired tools; adds regression assertion that retired tool names are absent. |
| packages/software-factory/tests/factory-agent-claude-code.test.ts | Updates Claude agent options expectations for allowedTools/permissionMode and adds regression guard for CS-11033. |
| packages/software-factory/src/issue-loop.ts | Adds darkfactoryModuleUrl to loop/context builder plumbing. |
| packages/software-factory/src/factory-tool-builder.ts | Removes structured tracker update tool builders and schema-based parameter logic. |
| packages/software-factory/src/factory-prompt-loader.ts | Passes darkfactoryModuleUrl into the system prompt template. |
| packages/software-factory/src/factory-issue-loop-wiring.ts | Removes schema preloading step; forwards darkfactoryModuleUrl into the issue loop config. |
| packages/software-factory/src/factory-context-builder.ts | Adds optional darkfactoryModuleUrl to AgentContext construction. |
| packages/software-factory/src/factory-agent/types.ts | Extends AgentContext with darkfactoryModuleUrl. |
| packages/software-factory/src/factory-agent/openrouter.ts | Includes darkfactoryModuleUrl in system prompt template variables. |
| packages/software-factory/src/factory-agent/claude-code.ts | Changes allowedTools/permissionMode to ensure canUseTool gates native fs tools; includes darkfactoryModuleUrl in prompt vars. |
| packages/software-factory/scripts/smoke-tests/factory-tools-smoke.ts | Updates smoke test tool-builder config after removing structured-tool requirements. |
| packages/software-factory/scripts/canusetool-repro.ts | Adds a repro harness documenting canUseTool behavior across permission modes. |
| packages/software-factory/scripts/canusetool-fix-verify.ts | Adds a faster verification harness for the CS-11033 gating fix. |
| packages/software-factory/prompts/system.md | Adds explicit tracker schema module URL guidance for meta.adoptsFrom.module. |
| packages/software-factory/.agents/skills/software-factory-operations/SKILL.md | Rewrites tracker-card workflow docs to “write JSON directly”; documents update/merge flow and spec shape. |
| packages/software-factory/.agents/skills/software-factory-bootstrap/SKILL.md | Rewrites bootstrap docs to teach JSON:API envelope + adoptsFrom details and issue invariants. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| { | ||
| "body": "...comment text, markdown allowed...", | ||
| "author": "factory-agent", | ||
| "createdAt": "<ISO timestamp>" |
There was a problem hiding this comment.
Fixed in c0dfb5a (same fix as the Codex P2 comment above) — the Comment field schema in darkfactory.gts:249 is datetime, not createdAt. Operations skill updated to use datetime, with a note distinguishing it from the Issue card's own createdAt / updatedAt attributes (those are correct in the bootstrap skill).
| # Tracker schema module URL | ||
|
|
||
| When you create a Project, Issue, or KnowledgeArticle JSON file, set | ||
| `data.meta.adoptsFrom.module` to **`{{darkfactoryModuleUrl}}`** — this | ||
| is the live module URL for the tracker schema in this run. The exact | ||
| `adoptsFrom.name` per file (`Project` / `Issue` / `KnowledgeArticle`) | ||
| and the JSON:API document shape are documented in the | ||
| software-factory-bootstrap and software-factory-operations skills. | ||
|
|
||
| Catalog Spec cards adopt from `https://cardstack.com/base/spec` / | ||
| `Spec` instead, regardless of the target realm. |
There was a problem hiding this comment.
Fixed in c0dfb5a — both halves:
-
add_commentreference removed:prompts/system.md:21-22no longer points at the retired tool. The new wording directs the agent to append a new entry to the issue'sattributes.comments[]array (Read the issue, push a Comment, Write the document back), with a pointer to the operations skill for the comment shape. -
Empty
darkfactoryModuleUrlnow fails fast: addedrequireDarkfactoryModuleUrl(context)infactory-prompt-loader.tsthat throws if the field is missing or empty. Wired it intoassembleSystemPrompt,claude-code.ts:339, andopenrouter.ts:334— replacing the silent?? ''fallback. In production the value is always set byfactory-issue-loop-wiring.tsviainferDarkfactoryModuleUrl(targetRealmUrl); a missing value indicates a plumbing bug, so failing the system-prompt render is the correct response. Test fixtures updated to populate the field.
| darkfactoryModuleUrl: context.darkfactoryModuleUrl ?? '', | ||
| skills, | ||
| }); | ||
|
|
There was a problem hiding this comment.
Fixed in c0dfb5a — same fix as the Codex P1 comment above. The Claude-specific addendum in claude-code.ts no longer says "do NOT use Write or Edit for tracker-schema cards" or references the retired structured tools. It now explicitly tells the agent to use native fs for every workspace file including Projects/*.json / Issues/*.json / Knowledge Articles/*.json / Spec/*.json, with pointers to the bootstrap and operations skills for the JSON:API shapes and Issue invariants the agent must enforce itself.
| return loader.load('system', { | ||
| targetRealmUrl: context.targetRealmUrl, | ||
| darkfactoryModuleUrl: context.darkfactoryModuleUrl ?? '', | ||
| skills, | ||
| }); |
There was a problem hiding this comment.
Fixed in c0dfb5a — same fix as the prompt-section comment above. The silent ?? '' fallback is gone. Replaced by requireDarkfactoryModuleUrl(context) in factory-prompt-loader.ts which throws if the field is missing or empty. All three call sites (assembleSystemPrompt, claude-code.ts, openrouter.ts) now go through the helper, so a missing value fails the prompt render with a clear error instead of producing tracker cards with empty adoptsFrom.module. Test fixtures populate the field with a stable URL so the wired guard doesn't break unit tests.
Codex + Copilot reviews flagged three real bugs introduced by the Tier 1 retirement: 1. Stale Claude-side prompt addendum in claude-code.ts still said "do NOT use Write or Edit for tracker-schema cards" and listed retired tools (update_project / update_issue / create_knowledge / create_catalog_spec / add_comment). Drop the entire structured-tool block; rewrite the addendum to direct the agent toward native fs for every workspace file (including tracker JSON). 2. prompts/system.md Rules section still pointed at the retired add_comment tool. Replace with the new comments[]-array workflow. 3. Comment shape doc in software-factory-operations skill used `createdAt`, but the Comment field schema in darkfactory.gts is `datetime`. Following that guidance produced invalid comment objects. Fix to `datetime` and call out the contrast with the Issue card's own createdAt/updatedAt fields. Plus a fail-fast for `darkfactoryModuleUrl`: the prior `?? ''` fallback silently produced bad `meta.adoptsFrom.module` values when the field was missing. Add `requireDarkfactoryModuleUrl()` helper that throws, and use it in claude-code.ts, openrouter.ts, and assembleSystemPrompt. Test fixtures updated to populate the field. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lukemelia
left a comment
There was a problem hiding this comment.
Is there / should there be a tool to get the schema for a card def type?
Finishes the CS-10883 retirement that was deferred from PR #4633. The five wrapper tools that used to construct tracker-card JSON for the agent —
update_project,update_issue,create_knowledge,create_catalog_spec,add_comment— are removed. The agent now writes those.jsonfiles via nativeWrite(Claude) orwrite_file(OpenRouter) the same way it already writes.gtsdefinitions and content-card instances. One workspace fs surface, no special cases for tracker cards.What changed
The five tools are gone
buildFactoryToolsno longer registers them;factory-tool-builder.tsloses ~330 lines (the fivebuild*Toolfunctions, thereadPatchDocumenthelper, thecardTypeSchemasconfig, per-type schema resolution, the now-orphanedbuildCardDocumentimport).Schemas live in the skill, not in tool params
software-factory-bootstrap/SKILL.mdis rewritten to:data/type: "card"/attributes/relationships/meta.adoptsFrom)Project,Issue,KnowledgeArticle)descriptionis immutable; status may only transition toblockedorbacklog; the orchestrator ownsdone/in_progresssoftware-factory-operations/SKILL.mdadds matching guidance for the implementation phase: how to update tracker cards (Read → patch → Write back), how to append comments to an existing Issue, and the catalog-Spec shape (which adopts fromhttps://cardstack.com/base/spec, not the tracker module).Live tracker module URL flows into the prompt
AgentContextgains adarkfactoryModuleUrl?field. The wiring already computes it (it's<origin>/software-factory/darkfactory) and now plumbs it throughContextBuilder.buildForIssueinto the system prompt template viaprompts/system.md. The agent reads the URL from there and pastes it verbatim intometa.adoptsFrom.module.Wiring slimmed
factory-issue-loop-wiring.tsdrops theloadDarkFactorySchemasstep at startup (one less prerender-pool roundtrip per factory boot, ~5 fewer seconds of cold-start). ThecardTypeSchemasfield onToolBuilderConfigis gone.Tests
factory-tool-builder.test.ts: drops the schemas-and-document-assembly module, theadd_commentmodule, and the structured-tool path-arg tests. ~800 lines lighter. Adds anotOk(toolNames.includes(retired))assertion so the regression doesn't silently come back.factory-tool-executor.spec.ts: the Playwright spec was almost entirely test cases for the structured tools. Trimmed to its remaining unregistered-tool sanity check. ~390 lines lighter.Verification
End-to-end factory run on a fresh realm (
my-test-tier-1-1-new):all_issues_doneWritewith the rightadoptsFrom. No structured tools invoked.pnpm lint(lint:js + lint:format + lint:types) clean. Unit suite green.Tool surface after this PR
Claude backend MCP catalog: 7 tools (5 validators + 2 signals).
OpenRouter backend FactoryTool[]: read_file / write_file / edit_file (already present), search_realm, fetch_transpiled_module, run_command, 5 validators, 2 signals.
Known issue (separate ticket)
CS-11033 — the
canUseToolworkspace-scoping hook from PR #4633 is not actually enforced underpermissionMode: 'dontAsk'+ tools inallowedTools. The agent canWriteto absolute paths outside the factory workspace and the SDK accepts it. Surfaced by this PR's E2E run when the model invented a/Users/jurgen/code/factory-tier-1/...path. The agent self-corrected afterrun_parsecouldn't find the file, but that's accidental recovery, not enforcement. Tracking the fix separately.Test plan
pnpm factory:go --agent claude --brief-url … --target-realm-url … --debugcompletes withall_issues_donepnpm factory:go --agent openrouter --brief-url … --target-realm-url … --debugcompletes withall_issues_doneWrite(Claude) /write_file(OpenRouter) withdata.meta.adoptsFrom.modulematching the value named in the system prompthttps://cardstack.com/base/spec/Spec, withlinkedExamplespointing at sample instancesupdate_project,update_issue,create_knowledge,create_catalog_spec, oradd_commentin either run🤖 Generated with Claude Code