redesign dev.kit: context.yaml pipeline, AGENTS.md with rules + manifests, promptfoo eval #11
- Replace kubernetes-infra/worker-automation/infra-config archetypes with infra-pipeline (requires workflow:github + deploy:terraform) and workflow-repo
- Separate Terraform from K8s signals so terraform-only repos get the right fit
- Drop legacy shell archetype detection entirely; library-cli catch-all replaces it
- Remove lists:/scalars: wrappers from archetype-signals, detection-signals, context-config; all config is now flat under the config: key
- Add dev_kit_yaml_config_list/scalar for 2-level awk reads matching the new structure
- Remove deploy.yml from repo_root_files (weak root marker, kept in priority_paths)
- Fix unquoted YAML glob patterns in detection-signals (*.sh, *.bats, etc.)
- Simplify test suite to 3 groups (core, archetypes, install) with direct symlink setup instead of tar/install; all 13 core tests pass
- Add repo-scaffold.yaml with infra-pipeline scaffold replacing kubernetes-infra
- Clean up deleted commands (action, explore), promptfoo harness, legacy test files

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
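The flat `config:` layout and the two-level awk read mentioned in this commit can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the `dev_kit_yaml_config_list` implementation: the function name `sketch_yaml_config_list`, the two-space indentation, and the sample catalog shown in the comment are all hypothetical.

```shell
# Assumed catalog shape (hypothetical sample, not copied from the repo):
#   kind: detectionSignals
#   version: 1
#   config:
#     prune_dirs:
#       - .git
#       - node_modules
#     test_globs:
#       - "*.bats"

sketch_yaml_config_list() {
  # $1 = YAML file, $2 = key directly under the top-level `config:` mapping
  awk -v key="$2" '
    /^config:[[:space:]]*$/ { in_config = 1; in_key = 0; next }  # level 1: enter config
    /^[^[:space:]#]/        { in_config = 0; in_key = 0; next }  # any other top-level key
    !in_config              { next }
    /^  [^[:space:]-][^:]*:/ {                                   # level 2: a key under config
      name = $0
      sub(/^  /, "", name)
      sub(/:.*$/, "", name)
      in_key = (name == key)
      next
    }
    in_key && /^    - / {                                        # list item of the wanted key
      item = substr($0, 7)
      # Strip surrounding double quotes (needed for quoted globs such as "*.bats").
      if (item ~ /^".*"$/) item = substr(item, 2, length(item) - 2)
      print item
    }
  ' "$1"
}
```

Called as `sketch_yaml_config_list catalog.yaml prune_dirs`, the sketch prints one list item per line. It only understands block-style lists with double-quoted or bare items, which is the reason quoting glob patterns matters for a line-oriented reader like this.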
- Fix --scaffold flag being captured as repo_dir before arg-parse loop
- Defer manifest write in text mode (output appears in ~5s vs 51s)
- Add file-based process cache for archetypes + factor status; survives subshell boundaries where global-var caches cannot (51s → 9s total)
- Print title immediately before any computation in both home and repo
- Show scaffold preview in dev.kit repo learn mode (what --scaffold would do)
- workflow_contract read_repo now emits refs[] array instead of command CSV
- dependencies factor: not_applicable for shell repos with no manifest files
- Remove deploy.yml/deploy.yaml from repo_root_files detection markers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
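Why a file-backed cache survives subshell boundaries while a global variable does not can be illustrated with a small sketch. Everything here is an assumption for illustration: the names `sketch_cache_set`, `sketch_cache_get`, and `SKETCH_CACHE_FILE`, the tab-separated line format, and the "last entry wins" read rule are not taken from the dev.kit source.

```shell
# One cache file per top-level process. `$$` keeps the parent's PID inside
# subshells, so every $(...) call resolves to the same file.
SKETCH_CACHE_FILE="${SKETCH_CACHE_FILE:-${TMPDIR:-/tmp}/sketch-cache.$$}"

sketch_cache_set() {
  # $1 = key, $2 = single-line value. Appends "key<TAB>value".
  printf '%s\t%s\n' "$1" "$2" >> "$SKETCH_CACHE_FILE"
}

sketch_cache_get() {
  # $1 = key. Prints the most recently written value; returns 1 when absent.
  [ -f "$SKETCH_CACHE_FILE" ] || return 1
  awk -F '\t' -v key="$1" '
    $1 == key { value = substr($0, length(key) + 2); found = 1 }  # keep overwriting: last match wins
    END { if (!found) exit 1; print value }
  ' "$SKETCH_CACHE_FILE"
}
```

A variable assigned inside `$(some_function)` disappears when that subshell exits, whereas a line appended to the file is still there for the next call. The sketch handles single-line values only and never compacts the file; a real implementation would also need cleanup when the process exits.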
- Add plain-English `description` to each archetype in archetype-rules.yaml
- Surface `archetype_description` in manifest and use it in AGENTS.md instead of the internal archetype key
- Add `message` field to manifest factors when status is missing/partial, drawn from audit-rules.yaml
- Redesign AGENTS.md: plain description, Gaps section with actionable messages, Workflow contract steps
- Simplify AGENTS.md header note from "do not edit manually" to "Run dev.kit repo to refresh"
- Remove TODO.md / todo.md from priority_paths; ephemeral working notes are not agent context
- Filter AGENTS.md and CLAUDE.md from AGENTS.md "Start here" and Workflow refs (tool-specific, self-referential)
- Remove pre-existing TODO.md from repo

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR refactors dev.kit into a phase-based workflow (env → repo → agent), enriches repo context output (archetypes, factors, gap messages), and improves UX/performance via progressive output and caching.
Changes:
- Replaces legacy `audit`/`bridge`/`status` command surfaces with new `home` output plus `repo`, `agent`, and `learn` flows backed by JSON templates and config catalogs.
- Adds config-driven detection/archetype catalogs and repo scaffolding/manifest generation (including WordPress fixture coverage).
- Introduces performance improvements (per-process cache, env context cache) and updates docs/tests accordingly.
Reviewed changes
Copilot reviewed 61 out of 69 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/suite.sh | Reworks integration suite into grouped runs; adds repo/agent expectations. |
| tests/run.sh | Removes worker-based test runner script. |
| tests/local-udx.sh | Adds local UDX repo scanning contract checks (JSON). |
| tests/fixtures/wordpress-repo/wp-content/mu-plugins/bootstrap.php | Adds WordPress fixture signal. |
| tests/fixtures/wordpress-repo/wp-config.php | Adds WordPress fixture root config. |
| tests/fixtures/wordpress-repo/package.json | Adds WordPress fixture Node manifest. |
| tests/fixtures/wordpress-repo/README.md | Adds WordPress fixture docs signal. |
| tests/fixtures/wordpress-repo/.rabbit/infra_configs/development/develop-alex/aws-cloudfront-distribution.yaml | Adds infra config fixture signal. |
| tests/fixtures/wordpress-repo/.rabbit/infra_configs/development/aws-cloudfront-distribution.yaml | Adds infra config fixture signal. |
| tests/fixtures/wordpress-repo/.github/workflows/deploy.yml | Adds workflow fixture signal. |
| tests/fixtures/codex-home/sessions/2026/04/02/rollout-2026-04-02T20-54-19-019d4f54-eddc-7350-a757-3bb578d24f99.jsonl | Adds learn-session fixture input. |
| src/templates/repo.json | Adds JSON contract template for dev.kit repo. |
| src/templates/learn.json | Adds JSON contract template for dev.kit learn. |
| src/templates/bridge.json.tmpl | Removes legacy bridge JSON template. |
| src/templates/audit.json.tmpl | Removes legacy audit JSON template. |
| src/templates/agent.json | Adds JSON contract template for dev.kit agent. |
| src/configs/repo-scaffold.yaml | Introduces repo scaffold catalog (baseline/archetype/factor). |
| src/configs/learning-workflows.yaml | Introduces learn workflow catalog and session pattern rules. |
| src/configs/knowledge-base.yaml | Introduces knowledge base hierarchy/source config. |
| src/configs/development-workflows.yaml | Introduces development workflow contract catalog. |
| src/configs/development-practices.yaml | Introduces practice messages catalog. |
| src/configs/detection-signals.yaml | Restructures detection signals into config keys; adds prune dirs. |
| src/configs/context-config.yaml | Adds priority refs, repo markers, and note paths config. |
| src/configs/archetype-signals.yaml | Adds archetype-specific signal lists (WP/K8s/Terraform). |
| src/configs/archetype-rules.yaml | Adds archetype rules + human descriptions. |
| lib/modules/utils.sh | Adds per-process cache + YAML helpers + safer JSON array escaping. |
| lib/modules/rule_catalog.sh | Removes separate rule catalog reader module. |
| lib/modules/repo_workflows.sh | Adds workflow contract + entrypoints JSON. |
| lib/modules/repo_signals.sh | Refactors detection catalog reads + repo root/marker logic + find pruning. |
| lib/modules/repo_scaffold.sh | Adds manifest writer + gaps JSON + scaffold apply. |
| lib/modules/repo_reports.sh | Enriches factor JSON with rule messages; adds agent contract output. |
| lib/modules/repo_navigation.sh | Adds workflow/dependency navigation helpers (not wired into bootstrap list). |
| lib/modules/repo_inspector.sh | Removes compatibility shim. |
| lib/modules/repo_factors.sh | Adds factor status caching; adjusts dependencies applicability and doc partial logic. |
| lib/modules/repo_archetypes.sh | Adds facet-based archetype detection driven by catalogs and caching. |
| lib/modules/output.sh | Adds output formatting helpers for progressive UX. |
| lib/modules/local_env.sh | Adds env tool inventory caching + capability derivation. |
| lib/modules/learning_sources.sh | Adds session discovery + rule/flow scoring for learn command. |
| lib/modules/detection_catalog.sh | Removes legacy detection catalog reader module. |
| lib/modules/config_catalog.sh | Adds unified config catalog reader (archetypes/context/rules/workflows). |
| lib/modules/bootstrap.sh | Defines module list and public commands; updates command file resolution. |
| lib/commands/status.sh | Removes legacy status command. |
| lib/commands/repo.sh | Adds repo learn/check/scaffold modes; manifest + AGENTS generation. |
| lib/commands/learn.sh | Adds learn command output (text + JSON). |
| lib/commands/bridge.sh | Removes legacy bridge command. |
| lib/commands/audit.sh | Removes legacy audit command. |
| lib/commands/agent.sh | Adds agent command reading manifest and generating AGENTS.md. |
| docs/workflow.md | Updates docs to phase-based workflow model. |
| docs/todo.md | Removes legacy TODO doc. |
| docs/pull-requests.md | Removes legacy PR guidance doc. |
| docs/overview.md | Adds new high-level product overview. |
| docs/engineering-guide.md | Removes legacy engineering guide doc. |
| docs/development.md | Removes legacy development doc. |
| docs/detection-facets.md | Adds facet detection documentation. |
| docs/commands.md | Adds new command reference. |
| docs/architecture.md | Adds architecture/config/module map documentation. |
| context7.json | Removes Context7 config file. |
| bin/scripts/uninstall.sh | Uses output helpers for formatted uninstall output. |
| bin/scripts/install.sh | Uses output helpers for formatted install output. |
| bin/dev-kit | Adds home command behavior + progressive output + module path loading + proc cache cleanup. |
| bin/completions/dev.kit.bash | Adds completion handling for action --json. |
| assets/dev-kit-bridge.svg | Removes legacy diagram asset. |
| assets/compliance-improve.svg | Removes legacy diagram asset. |
| README.md | Rewrites README around env/repo/agent model and updated docs links. |
| Makefile | Adds local/worker test targets. |
| .gitignore | Ignores repo manifest and agent artifacts. |
| .githooks/pre-push | Disables pre-push hook. |
Comments suppressed due to low confidence (6)
lib/modules/utils.sh:1
`dev_kit_cache_set` appends new values but `dev_kit_cache_get` returns the first matching key (due to `exit`), so repeated sets for the same key will return stale data. Consider either (a) making `cache_get` return the last match, or (b) making `cache_set` rewrite the cache file while removing older entries for that key (atomic temp-file + mv) to keep correctness and avoid unbounded growth.
lib/modules/repo_signals.sh:1
The repo-root cache compares `DEV_KIT_REPO_ROOT_CACHE_INPUT` against the normalized `repo_dir` (absolute path), but stores `"$input_dir"` when caching. If the caller passes a relative path or a different-but-equivalent path, the cache will never hit. Store the same normalized path you compare against (e.g., the resolved `repo_dir` used for the search) to make caching effective and predictable.
lib/modules/repo_workflows.sh:1
The JSON emitted for `refs_json` does not escape values; any ref containing quotes, backslashes, or other JSON-sensitive characters will produce invalid JSON. Prefer building the JSON array via existing helpers that JSON-escape (e.g., produce newline-delimited refs then pipe through `dev_kit_lines_to_json_array`), or apply escaping per element before printing.
lib/modules/repo_scaffold.sh:1
The scaffold planner does not use `repo-scaffold.yaml` or the `archetype` argument, so `--scaffold` currently only plans `docs/` (and any explicitly passed extras) rather than the baseline/archetype/factor-driven structure described in config and the PR description. To make `--scaffold` functional, build the plan from `src/configs/repo-scaffold.yaml` (baseline + archetype + factor-driven items) and include both directory and file creation actions.
lib/modules/repo_scaffold.sh:1
Factor IDs are hardcoded here (and similarly in `lib/commands/repo.sh`). This risks drift if factors are renamed/added elsewhere. Prefer iterating a single canonical source (e.g., `dev_kit_repo_factor_ids`) to keep scaffold gaps and CLI output aligned without duplicating lists.
lib/modules/repo_navigation.sh:1
`repo_navigation.sh` defines `dev_kit_cache_get`/`dev_kit_cache_put` with different semantics/signatures than the per-process cache helpers added in `lib/modules/utils.sh`. If `repo_navigation.sh` is ever sourced alongside utils, these names will collide and break callers. Renaming these to something module-scoped (e.g., `dev_kit_mem_cache_get/put` or `dev_kit_repo_nav_cache_*`) would avoid hard-to-debug runtime overrides.
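The escaping concern raised for `refs_json` can be sketched as a newline-delimited-to-JSON-array filter. This is an illustrative stand-in, not the repo's `dev_kit_lines_to_json_array`: the name `sketch_lines_to_json_array` is hypothetical, and the sketch escapes only backslashes and double quotes (tabs and other control characters are assumed not to occur in refs).

```shell
sketch_lines_to_json_array() {
  # stdin: one value per line; stdout: a single-line JSON array of strings.
  # Backslashes are doubled first so the quote escapes added next are not re-escaped.
  sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' |
    awk '
      BEGIN { printf "[" }
      { printf "%s\"%s\"", (NR > 1 ? "," : ""), $0 }   # comma before every item but the first
      END { print "]" }
    '
}
```

For example, `printf '%s\n' docs/overview.md README.md | sketch_lines_to_json_array` yields a two-element array, and empty input yields `[]`. Escaping in one place keeps every caller that emits refs consistent.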
- Typed session refs (claude:<uuid>, codex:<uuid>) unify discovery across sources
- Claude sessions discovered from ~/.claude/projects/<project-id>/<uuid>.jsonl
- Codex sessions filtered by cwd match via session_meta; Claude sessions matched by project directory name
- --sources flag and DEV_KIT_LEARN_SOURCES env var select which sources to include
- Incremental filtering via find -newer on .dev-kit/learn-last-run
- Artifact written to .dev-kit/lessons-<repo>-<date>.md with session prompts, URLs, and matched patterns
- 15 learn tests covering per-source filtering, multi-source merge, artifact content, incremental mode, and no-sessions fallback

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
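The incremental filtering described in this commit relies on comparing file modification times against a stamp file. A minimal sketch of that idea follows; the function names `sketch_new_sessions` and `sketch_mark_run` are hypothetical, and the assumption that sessions are `*.jsonl` files under a single directory is a simplification of the discovery rules listed above.

```shell
sketch_new_sessions() {
  # $1 = directory holding session transcripts, $2 = stamp file from the last run
  if [ -f "$2" ]; then
    # Only files modified after the stamp file was last written.
    find "$1" -type f -name '*.jsonl' -newer "$2"
  else
    # First run: no stamp yet, so every session counts as new.
    find "$1" -type f -name '*.jsonl'
  fi
}

sketch_mark_run() {
  # $1 = stamp file. Rewriting it updates its mtime, which is what -newer compares against.
  mkdir -p "$(dirname "$1")" && date +%s > "$1"
}
```

Note that `find -newer` looks at the stamp file's modification time, not at its contents; the epoch value written here is only informational.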
…b templates
Replace .dev-kit/manifest.json with .rabbit/context.yaml as the canonical repo-to-agent handoff artifact. AGENTS.md now inlines context directly from context.yaml via awk instead of reading JSON manifests.
Add incremental lessons merging: new sessions are appended to prior workflow rules, operational references, and ready templates. Claude history.jsonl is now the preferred prompt source, with fallback to raw transcript parsing.
Add GitHub issue templates (bug, feature, infra), a PR template, and companion config catalogs (github-issues.yaml, github-prs.yaml) that encode label taxonomy, template contracts, bot reviewer guidance, and post-merge workflow.
Fix unbound manifest_path variable in repo.sh scaffold mode.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…te docs
Output improvements:
- Add spinner for long operations (context.yaml write, artifact save)
- Add status indicators (✓/◦/✗/·) for factor rows in repo output
- Add [next] navigation section to all commands (repo, agent, learn)
Agent auto-context (lesson: self-sufficient tooling):
- dev.kit agent now auto-generates .rabbit/context.yaml if missing
- No manual dev.kit repo step required, which eliminates friction
- Error only on generation failure, not on missing context
AGENTS.md redesign (lesson: no drift, no scanning):
- Add Rules section: anti-drift, anti-scanning constraints at top
- Add Config manifests section: traceable YAML dependencies with kind labels
- Show full workflow with operational notes (was truncated to 5 steps)
- Add instructional text for refs, workflow, and lessons sections
- Config manifests detected from src/configs/, .github/workflows/, and root
Context.yaml additions:
- New manifests section listing all YAML config files with kind labels
- Detected from src/configs/*.yaml, .github/workflows/*.yml, and root configs
Docs and README updated:
- README describes three-layer model (repo → dev env → UDX ecosystem)
- All docs updated: .dev-kit/manifest.json → .rabbit/context.yaml
- Agent docs reflect auto-generation and AGENTS.md redesign
- Learn docs reflect Claude + Codex multi-source support

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AGENTS.md is a comprehensive agent guide auto-generated by dev.kit agent. It belongs in the repo so agents have context immediately on clone — no manual setup step required. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Generated by test runs after removing AGENTS.md from gitignore. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Compares agent accuracy with vs without AGENTS.md on 8 test cases:
- verify command knowledge
- priority refs awareness
- filesystem scanning avoidance
- gap identification
- verify-before-commit workflow
- baseline functionality
- file targeting accuracy
- context confusion check
Results: 100% pass with dev.kit context, 75% without (expected: the baseline can't know gaps and resorts to scanning).
Run: make eval
View: make eval-view

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Config manifests now show one-line descriptions in context.yaml and AGENTS.md so agents know exactly what each YAML controls before reading it:
  src/configs/development-workflows.yaml: git workflow steps, PR process, and operational notes
  src/configs/github-prs.yaml: PR templates, bot reviewers, and post-merge checklist
Scaffold is now fully config-driven from repo-scaffold.yaml:
- Baseline dirs + files (docs, .gitignore, .github/PULL_REQUEST_TEMPLATE.md)
- Archetype-specific structure (shell-cli gets bin, lib/commands, etc.)
- Factor-driven files (missing config → .env.example, missing runtime → Dockerfile)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- set +e inside subshell prevents bash errexit from killing spinner
- write to /dev/tty instead of stdout so spinner shows even when stdout is redirected (e.g. inside context_yaml_write)
- check stderr for TTY detection (more reliable than stdout)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Copilot reviewed 79 out of 87 changed files in this pull request and generated 12 comments.
Every command now shows a Braille spinner on stderr during its heavy computation so the terminal is never blank:
  dev.kit: detecting repo (archetype + profile)
  dev.kit repo: analyzing repo, writing context
  dev.kit agent: generating repo context, writing agents.md
  dev.kit learn: scanning agent sessions, writing lessons artifact
Spinner writes to stderr so it works even when stdout is redirected (context.yaml write, JSON capture, test pipes).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
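A spinner that stays out of captured stdout can be sketched as below. This is one possible shape under stated assumptions, not the dev.kit implementation: the name `sketch_with_spinner` and its variables are hypothetical, and `sleep 0.1` assumes a `sleep` that accepts fractional seconds (GNU and BSD do; strict POSIX does not require it).

```shell
sketch_with_spinner() {
  # $1 = label shown next to the spinner; remaining args = command to run.
  sketch_label=$1
  shift
  if [ ! -t 2 ]; then
    "$@"            # stderr is not a terminal: run the command without animation
    return
  fi
  (
    set +e          # a failing printf or sleep must not end the animation loop
    while :; do
      for frame in ⠋ ⠙ ⠹ ⠸ ⠼ ⠴ ⠦ ⠧ ⠇ ⠏; do
        printf '\r%s %s' "$frame" "$sketch_label" >&2   # frames go to stderr only
        sleep 0.1
      done
    done
  ) &
  sketch_spinner_pid=$!
  sketch_status=0
  "$@" || sketch_status=$?                    # keep the command's exit status
  kill "$sketch_spinner_pid" 2>/dev/null
  wait "$sketch_spinner_pid" 2>/dev/null || true
  printf '\r\033[K' >&2                       # clear the spinner line
  return "$sketch_status"
}
```

Because frames are written to stderr, something like `json=$(sketch_with_spinner "analyzing repo" some_command)` captures only the command's stdout. Checking `-t 2` rather than `-t 1` keeps the animation alive in exactly that redirected-stdout case.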
Pull request overview
Copilot reviewed 79 out of 87 changed files in this pull request and generated 1 comment.
- Fix docs/architecture.md: update .dev-kit/manifest.json refs to .rabbit/context.yaml
- Fix repo-scaffold.yaml: remove stale manifest.path block
- Fix local-udx.sh: update explore/action to repo/agent with correct JSON contracts
- Fix repo_workflows.sh: properly JSON-escape refs in workflow JSON output
- Fix utils.sh cache: return last match on get, encode multi-line values
- Sanitize fixture AGENTS.md: replace absolute local paths with GitHub raw URLs
- Remove fixture context.yaml from git tracking (regenerated at runtime)
- Remove pre-push hook (CI handles validation)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Copilot reviewed 76 out of 84 changed files in this pull request and generated 7 comments.
@@ -0,0 +1 @@
1776175365
`learn-last-run` looks like ephemeral state (an epoch timestamp) that will change on every `dev.kit learn` run and will cause constant working-tree churn. Consider treating this as a purely local artifact (gitignore it and/or store it under `$DEV_KIT_HOME`) rather than committing it in-repo.
Suggested change: remove the line `1776175365`.
prune_dirs:
  - .git
  - wp-admin
  - wp-content
  - wp-includes
  - node_modules
  - .next
  - vendor
  - dist
  - .venv
  - venv
  - .terraform
  - dist
  - build
  - coverage
  - .next
  - .nuxt
  - .turbo
prune_dirs contains duplicates (e.g., dist and .next appear twice). This doesn’t change behavior but adds noise and longer find expressions. Consider deduplicating the list to keep the config easier to maintain.
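Spotting and removing duplicate list entries like these can be done with standard text tools. The helper names `sketch_duplicates` and `sketch_dedupe` below are hypothetical; they operate on a plain one-item-per-line list rather than on the YAML file itself.

```shell
sketch_duplicates() {
  # stdin: one entry per line; stdout: each entry that occurs more than once.
  # LC_ALL=C keeps the sort order byte-wise and therefore stable across locales.
  LC_ALL=C sort | uniq -d
}

sketch_dedupe() {
  # stdin: one entry per line; stdout: first occurrence of each entry, original order kept.
  awk '!seen[$0]++'
}
```

Running the list above through `sketch_duplicates` would report `.next` and `dist`. Preserving first-occurrence order in `sketch_dedupe` keeps a config diff limited to the removed lines.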
| `learning-workflows.yaml` | learningWorkflows | Agent session flow patterns and routing rules |
| `repo-scaffold.yaml` | repoScaffold | Baseline dirs/files and per-archetype scaffold definitions |
All catalogs use the same schema: `kind`, `version`, `config`. Parsed by `lib/modules/config_catalog.sh` via `dev_kit_catalog_value()`.
This doc references dev_kit_catalog_value(), but that function does not exist in lib/modules/config_catalog.sh (parsing is done via functions like dev_kit_yaml_config_list, dev_kit_yaml_mapping_scalar, etc.). Please update the documentation to reference the actual parsing entrypoints so readers can trace the code correctly.
Suggested change:
Current: All catalogs use the same schema: `kind`, `version`, `config`. Parsed by `lib/modules/config_catalog.sh` via `dev_kit_catalog_value()`.
Suggested: All catalogs use the same schema: `kind`, `version`, `config`. Parsed by `lib/modules/config_catalog.sh` through the YAML helper entrypoints such as `dev_kit_yaml_config_list` and `dev_kit_yaml_mapping_scalar`.
@@ -0,0 +1,10 @@
{"timestamp":"2026-04-02T17:54:23.549Z","type":"session_meta","payload":{"id":"019d4f54-eddc-7350-a757-3bb578d24f99","timestamp":"2026-04-02T17:54:19.235Z","cwd":"/Users/jonyfq/git/udx/worker-tooling","originator":"codex-tui","cli_version":"0.118.0","source":"cli","model_provider":"openai"}}
This fixture JSONL embeds a developer-local absolute path in payload.cwd (/Users/...). That leaks local filesystem structure into the repo and makes fixtures less portable. Please replace it with a neutral placeholder path (or omit the field) while keeping the fixture semantics intact.
verify: make test
build: make build
run: make run
This README example output includes make build / make run. Because dev.kit discovers build/run commands by regex-scanning markdown, these example strings can get mis-detected as the repo’s real entrypoints (and they currently show up in generated context for this repo). Consider changing the example to avoid matching real command patterns (or adjust detection to require an explicit commands contract) so docs don’t create false positives.
Suggested change:
Current:
verify: make test
build: make build
run: make run
Suggested:
verify: <project verify command>
build: <project build command>
run: <project run command>
build: make build
run: make run
AGENTS.md advertises make build and make run, but this repository’s Makefile does not define build or run targets (only test, test-docker, eval, etc.). This appears to be a false positive coming from markdown example parsing, and it will mislead agents/humans following the contract. Either add real build/run targets or adjust command detection/docs so the generated contract only lists executable commands.
Suggested change: remove these lines.
build: make build
run: make run
build: make build
run: make run
The generated .rabbit/context.yaml lists build: make build and run: make run, but this repo’s Makefile does not have build or run targets. Since this file is the canonical handoff artifact, the command contract needs to be accurate; otherwise downstream dev.kit agent output and automation will drift. Fix by adding those targets or tightening detection so example docs don’t get promoted into commands.
Suggested change: remove these lines.
build: make build
run: make run
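One way to tighten detection, as these comments suggest, is to keep a detected `make <target>` command only when the Makefile actually defines that target. The sketch below is an assumption about how such a check could look, not the project's detection code: `sketch_make_target_exists` and `sketch_filter_make_commands` are hypothetical names, target names are assumed to contain no regex metacharacters, and rules that declare several targets on one line are not handled.

```shell
sketch_make_target_exists() {
  # $1 = Makefile path, $2 = target name.
  # Matches "target:" at line start but not "target := value" variable assignments.
  grep -Eq "^$2[[:space:]]*:([^=]|\$)" "$1"
}

sketch_filter_make_commands() {
  # $1 = Makefile path; stdin: lines such as "build: make build".
  # stdout: only the lines whose make target is defined in the Makefile.
  while IFS= read -r line; do
    target=${line##*make }          # text after the last "make "
    if sketch_make_target_exists "$1" "$target"; then
      printf '%s\n' "$line"
    fi
  done
}
```

With a Makefile that defines `test` and `eval` but no `build` or `run`, piping the three detected commands through `sketch_filter_make_commands Makefile` would leave only `verify: make test`. Commands that are not `make` invocations are dropped by this sketch, so a real filter would need a pass-through rule for them.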
Summary
… `.dev-kit/manifest.json` to `.rabbit/context.yaml` as the canonical repo-to-agent handoff
… `[next]` navigation to all commands

Command surface
`dev.kit`
`dev.kit repo` … `.rabbit/context.yaml`
`dev.kit agent`
`dev.kit learn`
All commands support `--json`. All show spinners during slow phases.

Workflow
`dev.kit agent` auto-generates context; no manual steps required.

AGENTS.md redesign
Now a comprehensive agent guide with:

Scaffold (config-driven)
`dev.kit repo --scaffold` now reads from `repo-scaffold.yaml`:

promptfoo eval
`make eval` runs 8 test cases comparing agent accuracy with vs without AGENTS.md:
… `find .` and `tree`, can't identify structural gaps)

UX
`[next]` navigation section on every command output

Test plan
… `make test`)
`dev.kit repo --scaffold` creates baseline + archetype + factor files