feat(b2): profiles starter + profile validator CLI #2
- CLAUDE.md: add Fase N+7 (Context gate) to the branch flow table as the last phase of the previous branch.
- AGENTS.md: add Context gate as non-negotiable rule #1 and as step 3 of the "continúa" autonomous execution flow.
- HANDOFF.md §3: rename to "Decisión /clear vs /compact vs sesión nueva (Fase N+7 Context gate)", add a pre-Fase-1 checklist, and add a §6b carry-over to propagate the rule to templates/*.hbs in C1.
- .claude/rules/docs.md: add a traceability checkbox to the docs-sync list (the first kickoff commit references the resume prompt when the branch was started post-/compact or post-/clear).

Establishes the context-management decision as the final phase of the previous branch, enforcing explicit evaluation of continue | /compact | /clear | new session before Fase -1.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
## Kickoff

**Scope**: Create 3 canonical profiles (nextjs-app, agent-sdk, cli-tool) + profile validator CLI + fixtures + unit/integration tests + CI step. This establishes the canonical starting points of the questionnaire: each profile pre-bakes the typical decisions for a stack, covers every non-user-specific `required` field (domain.type, stack.language, testing.unit_framework), and leaves open only the 3 user-specific ones (identity.name, identity.description, identity.owner).

**Files to create**:
- questionnaire/profiles/nextjs-app.yaml, agent-sdk.yaml, cli-tool.yaml
- tools/lib/profile-validator.ts (+ .test.ts)
- tools/validate-profile.ts (+ .test.ts)
- tools/__fixtures__/profiles/valid/<3 canonical>.yaml (literal copy by value; consolidate in B3 if the runner reveals a better mechanism)
- tools/__fixtures__/profiles/invalid/*.yaml (3-4 negatives: unknown path, type mismatch, enum-out-of-values, pattern violation)

**Files to modify**:
- .github/workflows/ci.yml → validate-profiles step (ubuntu+macos matrix, node 20, actions pinned by SHA, reuses the B1 toolchain)
- docs/ARCHITECTURE.md → §2 Profiles: canonical shape + Profile validator subsection with issue kinds
- .claude/rules/generator.md → Profiles block (location + shape + CLI)
- ROADMAP.md → carries over the B1 drift (✅ PR #1) + B2 in progress → ✅ on close
- HANDOFF.md → §1 snapshot + §9 next branch B3 + §10 B2 status
- MASTER_PLAN.md § Rama B2 → ✅ on close

**Canonical profile shape**:

    version: "0.1.0"
    profile:
      name: <string>
      description: <string>
    answers:
      "<path.dotted>": <value>

Dotted keys map 1:1 to field.path in the schema. This makes key-by-key overrides easy in the runner (B3). The nested alternative was rejected because it couples tightly to field renames.
**Profile validator issue kinds (B2)**:
- answer-unknown-path
- answer-type-mismatch
- answer-value-not-in-enum
- answer-array-item-type-mismatch
- answer-constraint-violation (pattern / minLength / maxLength / min / max / minItems / maxItems)

**Known gap (explicit user decision)**: answer-value-not-in-array-allowlist is NOT implemented in B2. ArrayField.values exists in tools/lib/meta-schema.ts:43, and questionnaire/schema.yaml:95-100 uses the capability (integrations.mcps with allowlist ["mempalace","notebooklm"]). Instance-level validation is deferred. If ArrayField.values is formally introduced in a later branch or before B2 closes, add the corresponding check to the profile validator.

**Principle**: profiles are PARTIAL. They do not have to cover every `required` field of the final project_profile. The validator only verifies that the declared paths exist in the schema and that their values respect the field's constraints. User-specific fields stay out of profiles by design.

**Not included in B2 (comes later)**:
- Interactive execution of the questionnaire (B3 runner).
- Merging profile + CLI overrides (B3).
- Actual file generation (C1+).
- Resolution of `when:` to decide conditional requiredness (B3).

**Discarded alternatives**:
- (B) Extend tools/validate-questionnaire.ts with a --profile flag: couples responsibilities (meta-schema vs instance).
- (C) Tests only, no CLI: blocks the CI step and future uses from the pre-PR gate (D4).

**Risks**:
- Data duplication between questionnaire/profiles/ and tools/__fixtures__/profiles/valid/. Mitigation: literal copy; consolidate if B3 asks for it. Controlled scope (~150 lines of YAML).
- The array allowlist gap documented above; it does not block the MVP.

**Test plan**:
- Unit (tools/lib/profile-validator.test.ts): each issue kind + 3 valid canonical profiles + partial profile OK (without user-specific fields) + profile with undeclared extra fields → issue.
- Integration (tools/validate-profile.test.ts): CLI exit 0 on the canonical profiles; exit 1 on the negatives with the issue kind on stderr; exit 2 on a missing file or unreadable YAML.
- CI: the validate-profiles step runs the CLI on the 3 profiles in the ubuntu+macos matrix.
- Coverage: the current thresholds (90/85/90/90) must keep passing.

**Docs plan** (Fase N+3):
- docs/ARCHITECTURE.md §2 Profiles → shape + Profile validator subsection with issue kinds + note on the deferred gap.
- .claude/rules/generator.md → Profiles block.
- ROADMAP drift (B1 ✅ PR #1) + B2 ✅ + Fase B progress.
- HANDOFF snapshot + next branch B3 + B2 status.
- MASTER_PLAN § Rama B2 → ✅.

**Fase N+7 traceability** (applied per AGENTS.md rule #1 and the checkbox in .claude/rules/docs.md): this branch was started post-/compact with focus="B1 merged + Fase -1 B2 draft + ROADMAP drift + sistematización Fase N+7 aplicada". Files re-read post-compact to resume Fase -1: MASTER_PLAN.md § Rama B2 (L67-71), docs/ARCHITECTURE.md §2 Profiles (L54-60) + §Schema DSL (L62-89), .claude/rules/generator.md (whole file), questionnaire/schema.yaml, questionnaire/questions.yaml, tools/lib/meta-schema.ts, HANDOFF.md §3 + §6b. Decisions preserved from before the compact: alternative (A) CLI validator, answers-dotted shape, coverage denominator = required-fields-no-user-specific, Fase N+7 systematization applied in CLAUDE/AGENTS/HANDOFF/rules (previous commit c9e3de5 on this same branch).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
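The kickoff argues that flat dotted keys make key-by-key overrides simple. As a rough illustration only (merging is explicitly out of scope for B2, and `mergeAnswers` plus the sample values are hypothetical, not code from this PR), a flat map can be overridden with a plain object spread, with no recursive merge needed:

```typescript
// Hypothetical sketch: flat dotted keys let overrides win key by key.
type Answers = Record<string, unknown>;

// Later sources win; keys the override does not mention keep the profile's value.
function mergeAnswers(profileAnswers: Answers, overrides: Answers): Answers {
  return { ...profileAnswers, ...overrides };
}

// Sample values are made up for illustration.
const merged = mergeAnswers(
  { "stack.language": "typescript", "testing.unit_framework": "vitest" },
  { "testing.unit_framework": "jest" },
);

console.log(merged);
```

With a nested shape, the same override would need a deep merge that knows the field hierarchy, which is the coupling the kickoff wants to avoid.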
Introduces ProfileFile zod schema (strict) with shape {version, profile:
{name,description}, answers:{path.dotted:value}} and the validateProfile()
function that walks answers, looks up each path in the meta-schema, and
emits ProfileIssue[] covering:
- answer-unknown-path
- answer-type-mismatch (scalar type disagreement with field.type)
- answer-value-not-in-enum
- answer-array-item-type-mismatch
- answer-constraint-violation (pattern / minLength / maxLength /
min / max / minItems / maxItems)
Profiles are treated as partial by design: fields not mentioned in
answers are not flagged, and the user-specific required fields
(identity.name, identity.description, identity.owner) are expected to
be missing from profiles.
Tests (TDD, 21 cases): canonical shape + each issue kind + multi-issue
aggregation + partial-profile acceptance. Reuses the existing
meta-schema parser; no new deps.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
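To make the walk described in this commit concrete, here is a minimal sketch of a validator over dotted answer paths. It is an assumption-laden illustration: the `Field` model, `validateAnswers`, and the sample schema are hypothetical simplifications and do not reproduce tools/lib/profile-validator.ts or the zod-based meta-schema. Only the five issue-kind names come from the commit message.

```typescript
// Simplified, hypothetical field model (not the repo's actual meta-schema types).
type Field =
  | { type: "string"; pattern?: string; minLength?: number; maxLength?: number }
  | { type: "number"; min?: number; max?: number }
  | { type: "boolean" }
  | { type: "enum"; values: string[] }
  | { type: "array"; items: "string" | "number"; minItems?: number; maxItems?: number };

type IssueKind =
  | "answer-unknown-path"
  | "answer-type-mismatch"
  | "answer-value-not-in-enum"
  | "answer-array-item-type-mismatch"
  | "answer-constraint-violation";

interface Issue { kind: IssueKind; path: string; detail: string }

function validateAnswers(fields: Record<string, Field>, answers: Record<string, unknown>): Issue[] {
  const issues: Issue[] = [];
  const add = (kind: IssueKind, path: string, detail: string): void => {
    issues.push({ kind, path, detail });
  };

  for (const [path, value] of Object.entries(answers)) {
    // Paths absent from the schema are reported; paths absent from answers are not (profiles are partial).
    if (!Object.prototype.hasOwnProperty.call(fields, path)) {
      add("answer-unknown-path", path, "path is not declared in the schema");
      continue;
    }
    const field = fields[path] as Field;
    switch (field.type) {
      case "string":
        if (typeof value !== "string") {
          add("answer-type-mismatch", path, `expected string, got ${typeof value}`);
        } else {
          if (field.pattern !== undefined && !new RegExp(field.pattern).test(value))
            add("answer-constraint-violation", path, `does not match /${field.pattern}/`);
          if (field.minLength !== undefined && value.length < field.minLength)
            add("answer-constraint-violation", path, `shorter than ${field.minLength}`);
          if (field.maxLength !== undefined && value.length > field.maxLength)
            add("answer-constraint-violation", path, `longer than ${field.maxLength}`);
        }
        break;
      case "number":
        if (typeof value !== "number") {
          add("answer-type-mismatch", path, `expected number, got ${typeof value}`);
        } else {
          if (field.min !== undefined && value < field.min)
            add("answer-constraint-violation", path, `below min ${field.min}`);
          if (field.max !== undefined && value > field.max)
            add("answer-constraint-violation", path, `above max ${field.max}`);
        }
        break;
      case "boolean":
        if (typeof value !== "boolean")
          add("answer-type-mismatch", path, `expected boolean, got ${typeof value}`);
        break;
      case "enum":
        // In this sketch, non-string values also land here rather than in type-mismatch.
        if (typeof value !== "string" || field.values.indexOf(value) === -1)
          add("answer-value-not-in-enum", path, `allowed: ${field.values.join(", ")}`);
        break;
      case "array":
        if (!Array.isArray(value)) {
          add("answer-type-mismatch", path, "expected array");
          break;
        }
        for (let i = 0; i < value.length; i++) {
          if (typeof value[i] !== field.items)
            add("answer-array-item-type-mismatch", path, `item ${i} is not a ${field.items}`);
        }
        if (field.minItems !== undefined && value.length < field.minItems)
          add("answer-constraint-violation", path, `fewer than ${field.minItems} items`);
        if (field.maxItems !== undefined && value.length > field.maxItems)
          add("answer-constraint-violation", path, `more than ${field.maxItems} items`);
        break;
    }
  }
  return issues;
}

// Hypothetical schema fragment and answers, for illustration only.
const demoFields: Record<string, Field> = {
  "stack.language": { type: "enum", values: ["typescript", "python"] },
  "identity.name": { type: "string", pattern: "^[a-z][a-z0-9-]*$" },
};
const demoIssues = validateAnswers(demoFields, { "stack.language": "cobol", "stack.unknown": true });
console.log(demoIssues.map((issue) => `${issue.kind} @ ${issue.path}`));
```

Note that `identity.name` is declared but unanswered and produces no issue, which matches the partial-by-design principle.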
CLI entry point `npx tsx tools/validate-profile.ts <profile.yaml> [--schema questionnaire/schema.yaml]` with exit codes mirroring the questionnaire validator: 0 on OK, 1 on semantic issues, 2 on missing file / unreadable YAML / missing profile arg.

formatReport() emits a human-readable block with schema + profile paths, status, and one line per issue (kind, path, detail). Stdout for diagnostics so CI captures them; stderr reserved for CLI usage errors.

Integration tests (15 cases) cover: valid canonical profiles (3 × exit 0), each invalid fixture (4 × exit 1 with matching issue kind in stdout), missing file and missing arg (2 × exit 2), plus the unit-level coverage of formatReport and validateProfileFile.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
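The exit-code contract above can be sketched as a small pure mapping. This is a hedged illustration: `ValidationOutcome`, `exitCodeFor`, and `renderReport` are hypothetical names, and the report layout is a guess at "one line per issue (kind, path, detail)", not the actual output of formatReport().

```typescript
// Hypothetical outcome model; the real CLI's internal types are not shown in this PR text.
interface ReportIssue { kind: string; path: string; detail: string }

type ValidationOutcome =
  | { status: "ok" }
  | { status: "issues"; issues: ReportIssue[] }
  | { status: "io-error"; message: string };

// 0 = OK, 1 = semantic issues, 2 = missing file / unreadable YAML / bad usage.
function exitCodeFor(outcome: ValidationOutcome): 0 | 1 | 2 {
  switch (outcome.status) {
    case "ok":
      return 0;
    case "issues":
      return 1;
    case "io-error":
      return 2;
  }
}

// Assumed layout: header lines, a status line, then one line per issue.
function renderReport(schemaPath: string, profilePath: string, outcome: ValidationOutcome): string {
  const lines = [`schema: ${schemaPath}`, `profile: ${profilePath}`];
  if (outcome.status === "ok") {
    lines.push("status: OK");
  } else if (outcome.status === "issues") {
    lines.push(`status: FAIL (${outcome.issues.length} issue(s))`);
    for (const issue of outcome.issues) {
      lines.push(`  - [${issue.kind}] ${issue.path}: ${issue.detail}`);
    }
  } else {
    lines.push(`status: ERROR (${outcome.message})`);
  }
  return lines.join("\n");
}

const failing: ValidationOutcome = {
  status: "issues",
  issues: [{ kind: "answer-unknown-path", path: "stack.unknown", detail: "path is not declared in the schema" }],
};
console.log(renderReport("questionnaire/schema.yaml", "profile.yaml", failing));
console.log("exit code:", exitCodeFor(failing));
```

Keeping the mapping pure makes it testable without spawning a process; the process-level tests then only need to confirm the wiring.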
Canonical profiles (questionnaire/profiles/*.yaml):
- nextjs-app: web-app + TS + postgres + vitest + playwright + sentry + changesets + team (12 answers).
- agent-sdk: agent-sdk + python + pytest + opentelemetry + manual release + opus as default model (11 answers).
- cli-tool: cli + TS + vitest + semantic-release + solo (11 answers).

Each profile omits the 3 user-specific required fields (identity.name, identity.description, identity.owner) by design. Coverage of the remaining required fields (domain.type, stack.language, testing.unit_framework) is 100%; overall answers land at ~55-65% of the 18 schema fields per profile (MASTER_PLAN target ~60%).

Valid fixtures (tools/__fixtures__/profiles/valid/*.yaml) are literal duplicates of the canonical profiles, kept in the tools scope since the generator does not exist yet (Fase B3+). Consolidation with the generator-side fixtures is deferred until B3 reveals a better mechanism (e.g., loader or symlink).

Invalid fixtures (tools/__fixtures__/profiles/invalid/*.yaml), one per issue kind exercised by the CLI integration tests:
- unknown-path.yaml → answer-unknown-path
- type-mismatch.yaml → answer-type-mismatch
- enum-out-of-values.yaml → answer-value-not-in-enum
- pattern-violation.yaml → answer-constraint-violation

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Runs `npm run validate:profiles` (new script) after validate:questionnaire on the ubuntu+macos × node 20 matrix. Invokes the CLI on each of the 3 canonical profiles; any non-zero exit fails the job.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- ROADMAP: carries over the B1 drift (🔄 open → ✅ PR #1) and adds B2 (🔄 open) with a list of deliverables + the known gap. Fase B marked "in progress (B2)".
- HANDOFF §1: snapshot points to B2 in progress + next branch B3.
- HANDOFF §9 rewritten: next branch = B3, with a minimal reading list that includes profile-validator and the Fase N+7 checkbox as the first pre-flight item.
- HANDOFF §10 replaced: B2 status (closing) with deliverables, the systematization meta commit, 106 tests, coverage 95.97%, known gap.
- MASTER_PLAN: Rama B1 marked ✅ PR #1. Rama B2 marked ✅ with: adjustment vs the original plan (fixtures in tools/, not generator/), known gap, updated exit criteria.
- docs/ARCHITECTURE.md §2 Profiles: adds the canonical shape + partiality principle + the 5 profile validator issue kinds + gap + CLI command + CI integration.
- .claude/rules/generator.md: new "Profiles" block with location, shape, partiality, validator, fixtures, and the steps for adding a new profile.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- extract tools/lib/read-yaml.ts (readAndParseYaml + errorMessage) shared by validate-profile and validate-questionnaire (2 call sites, meets pattern-before-abstraction threshold)
- inline 3 single-call helpers in profile-validator.ts (checkStringConstraints, checkArrayItems, constraintViolation), so switch branches become self-contained and readable
- collapse 3 duplicate canonical CLI smoke tests into one parameterised it.each: same coverage, less scaffolding

Net -45 LOC. Tests 106 green, coverage 95.92% lines / 89.91% branches (above thresholds 90/85/90/90). Typecheck clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pull request overview
Introduces a “profiles” layer for the questionnaire system, including canonical starter profiles and a CLI validator to ensure profiles stay consistent with questionnaire/schema.yaml.
Changes:
- Added `tools/lib/profile-validator.ts` (Zod profile parser + schema-based answer validation) and `tools/validate-profile.ts` CLI with exit codes and reporting.
- Added canonical profiles in `questionnaire/profiles/` plus fixtures and tests for validator + CLI.
- Extracted shared YAML read/parse helper to `tools/lib/read-yaml.ts` and wired CI/package scripts to validate profiles.
Reviewed changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tools/validate-questionnaire.ts | Reuses shared YAML read/parse helper to dedupe CLI I/O logic. |
| tools/validate-profile.ts | New CLI entrypoint to validate a profile YAML against the schema and emit issues. |
| tools/validate-profile.test.ts | Adds unit + CLI integration coverage for the validate-profile tool. |
| tools/lib/read-yaml.ts | New shared YAML reader/parser utility with consistent error formatting. |
| tools/lib/profile-validator.ts | New profile parser + validator emitting typed issue kinds. |
| tools/lib/profile-validator.test.ts | Unit tests covering parsing and validator issue emission/aggregation. |
| tools/__fixtures__/profiles/valid/nextjs-app.yaml | Valid profile fixture mirroring the canonical nextjs-app profile. |
| tools/__fixtures__/profiles/valid/cli-tool.yaml | Valid profile fixture mirroring the canonical cli-tool profile. |
| tools/__fixtures__/profiles/valid/agent-sdk.yaml | Valid profile fixture mirroring the canonical agent-sdk profile. |
| tools/__fixtures__/profiles/invalid/unknown-path.yaml | Invalid fixture to trigger answer-unknown-path. |
| tools/__fixtures__/profiles/invalid/type-mismatch.yaml | Invalid fixture to trigger answer-type-mismatch. |
| tools/__fixtures__/profiles/invalid/pattern-violation.yaml | Invalid fixture to trigger answer-constraint-violation (pattern). |
| tools/__fixtures__/profiles/invalid/enum-out-of-values.yaml | Invalid fixture to trigger answer-value-not-in-enum. |
| questionnaire/profiles/nextjs-app.yaml | Adds canonical Next.js app starter profile (partial by design). |
| questionnaire/profiles/cli-tool.yaml | Adds canonical CLI tool starter profile (partial by design). |
| questionnaire/profiles/agent-sdk.yaml | Adds canonical agent SDK starter profile (partial by design). |
| package.json | Adds validate:profiles script to validate all canonical profiles. |
| docs/ARCHITECTURE.md | Documents profile shape, partiality, validator semantics, and CLI usage. |
| ROADMAP.md | Updates phase/progress tracking to reflect B2 work. |
| MASTER_PLAN.md | Updates B1 completion and B2 scope/acceptance criteria to include validator/CI. |
| HANDOFF.md | Updates current phase and adds/propagates the “Context gate” process details. |
| CLAUDE.md | Adds “Fase N+7 Context gate” to the documented lifecycle. |
| AGENTS.md | Updates non-negotiable rules and “continúa/siguiente” flow to include context gate. |
| .github/workflows/ci.yml | Adds CI step to run validate:profiles. |
| .claude/rules/generator.md | Documents profile location/shape/partiality and validation/fixture expectations. |
| .claude/rules/docs.md | Adds context-traceability checklist item for branches started via compact/clear. |
    it("exits 1 for type-mismatch fixture", () => {
      const r = runCli(["tools/__fixtures__/profiles/invalid/type-mismatch.yaml"]);
      expect(r.code).toBe(1);
      expect(r.stdout).toMatch(/answer-type-mismatch/);
    }, 30000);

    it("exits 1 for enum-out-of-values fixture", () => {
      const r = runCli(["tools/__fixtures__/profiles/invalid/enum-out-of-values.yaml"]);
      expect(r.code).toBe(1);
      expect(r.stdout).toMatch(/answer-value-not-in-enum/);
    }, 30000);

    it("exits 1 for pattern-violation fixture", () => {
      const r = runCli(["tools/__fixtures__/profiles/invalid/pattern-violation.yaml"]);
      expect(r.code).toBe(1);
      expect(r.stdout).toMatch(/answer-constraint-violation/);
    }, 30000);

    if (field.pattern !== undefined && !new RegExp(field.pattern).test(value)) {
      issues.push(violation(path, `value '${value}' does not match pattern /${field.pattern}/`));

Good close for B2. The foundation is solid and I see nothing blocking the merge. A single non-blocking note for B3 or a small follow-up:
I wouldn't request changes for this; I'd just leave it noted, to keep the error taxonomy as clean as possible when B3 arrives.
- add invalid fixture array-item-type-mismatch.yaml + CLI test (Copilot: CLI coverage gap for answer-array-item-type-mismatch)
- validate that pattern is a compilable regex at meta-schema parse time (Copilot: new RegExp(field.pattern) could throw; now a clear schema-scoped error via zod .refine, exit 2 instead of uncaught)
- document the deferred B2 gap: enum fields emit value-not-in-enum instead of type-mismatch when given array/object (per user PR comment; non-blocking, noted for B3)

107 tests green, coverage still above thresholds, typecheck clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
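The regex fix above moves the failure from validation time to schema-parse time. A dependency-free sketch of that idea follows; the commit says the real check is wired through a zod `.refine`, whereas `isCompilableRegex` and `checkFieldPattern` below are hypothetical helpers written only to show the principle.

```typescript
// Returns true when the source compiles; `new RegExp` throws a SyntaxError otherwise.
function isCompilableRegex(source: string): boolean {
  try {
    new RegExp(source);
    return true;
  } catch {
    return false;
  }
}

type PatternCheck = { ok: true } | { ok: false; error: string };

// Hypothetical schema-level check: report a schema-scoped error instead of letting
// a later `new RegExp(field.pattern)` throw while validating a profile.
function checkFieldPattern(fieldPath: string, pattern: string): PatternCheck {
  if (isCompilableRegex(pattern)) return { ok: true };
  return { ok: false, error: `schema field '${fieldPath}': pattern is not a valid regular expression` };
}

console.log(checkFieldPattern("identity.name", "^[a-z][a-z0-9-]*$"));
console.log(checkFieldPattern("identity.name", "[a-z"));
```

Rejecting the schema up front means profile validation can call `new RegExp` without a try/catch of its own.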
Addressed review feedback in 95515d3:

- FIX #1 (tools/validate-profile.test.ts:100): added fixture
- FIX #2 (tools/lib/profile-validator.ts:74): pushed the compilable-regex check one level up to tools/lib/meta-schema.ts via
- Deferred (user issue comment on enum type-mismatch taxonomy): per javiAI's own comment ("I wouldn't request changes for this; I'd just leave it noted"), documented as a known gap in MASTER_PLAN.md §B2 for B3.

107 tests green, typecheck clean.
Commit 1 of feat/b3-generator-runner. Bundles two meta changes ahead of the TDD implementation of the runner:

1. Context-gate hardening (inherited from a previous session):
   - AGENTS.md rule #1: Claude presents the 4 options and WAITS for the user's explicit choice. It never decides on its own.
   - AGENTS.md §3 step 3 "continúa": same hardening (stop + wait), aligned with MEMORY.md feedback_context_gate.
   - HANDOFF.md §3 checklist: present + wait explicitly before emitting the resume prompt or proceeding to Fase -1.
2. Docs sync ahead of B3 (Fase N+3 applied in commit 1):
   - ROADMAP.md: Fase B moves from B2 in progress to B3 in progress; B2 marked completed (PR #2) with 2 documented gaps; B3 opened with scope and adjustment (token-budget deferred).
   - HANDOFF.md §1: B3 in progress, B2 closed at f361c19, next is C1.
   - HANDOFF.md §9: the next branch becomes C1 (renderers core docs).
   - HANDOFF.md §10: replaces "Estado B2 cerrando" with "Estado B3 en curso", with the approved Fase -1 decisions + planned files + inherited gaps.
   - MASTER_PLAN.md §B3: explicit note on deferring token-budget.ts, the re-export from tools/lib/, the rejected --out and --dry-run flags, and the user-specific exit-code semantics.

NOT a functional part of the runner. Implementation starts in commit 2 with strict TDD (red tests first, per CLAUDE.md rule #3 and AGENTS.md rule #4).

Fase -1 traceability: explicitly approved by the user in this session after the presentation of scope + ambiguities + alternatives + test plan + docs plan. Marker created at .claude/branch-approvals/feat_b3-generator-runner.approved.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* chore(meta): pre-kickoff B3 — context-gate hardening + docs sync

* test(b3): red tests for runner — fixtures + loader + validators + CLI

TDD step 1 (per CLAUDE.md regla #3 + .claude/rules/tests.md): failing tests written before implementation.
All three test suites fail to load because profile-loader.ts / validators.ts / run.ts don't exist yet (commit 3 will turn them green).

Fixtures (generator/__fixtures__/profiles/):
- valid-partial/profile.yaml: all non-user-specific required present, 3 user-specific missing. Expects exit 0 + warning.
- missing-required/profile.yaml: omits domain.type. Expects exit 1 (completeness error).
- invalid-value/profile.yaml: stack.language out of enum. Expects exit 1 (profile-validator issue answer-value-not-in-enum).

Test files:
- generator/lib/profile-loader.test.ts (5 tests): happy, missing file, malformed YAML, invalid shape (missing profile key), strict rejection of unknown top-level key.
- generator/lib/validators.test.ts (5 tests for completenessCheck): only user-specific missing → 3 warnings; all present → clean; 1 required missing → 1 error + 3 warnings; 2 required missing → 2 errors; required with default value → satisfied (uses a synthetic schema to isolate default semantics from canonical).
- generator/run.test.ts (18 tests split in three describes):
  * runValidation (unit): 5 tests covering 0/1/2 exit codes across fixtures + missing file + malformed YAML.
  * formatReport: 4 tests covering OK / WARN / FAIL rendering and required-missing + enum issue lines.
  * CLI integration (spawnSync): 9 tests covering valid, --validate-only, missing-required, invalid-value, rejection of --out and --dry-run with exact deferral message ("flag --X not supported in B3; planned for C1"), missing --profile, missing file, unknown flag.

Decisions locked in tests:
- loadProfile return shape: { ok: true, profile } | { ok: false, error }.
- completenessCheck return shape: { errors[], warnings[] }.
- USER_SPECIFIC_PATHS exported from validators.ts for reuse + test assertion.
- runValidation takes only profilePath (schema hard-coded inside).
- formatReport takes (result, profilePath); no schema param needed.

Vitest output (expected): 3 failed suites, all "Failed to load url ./<module>.ts.
Does the file exist?", the classic TDD red state.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(b3): generator runner — profile loader + completeness + CLI

Minimal implementation that turns the previous commit green (135/135 tests in the project; 28/28 in generator/). The B3 runner closes the loop profile YAML → zod-validated → completeness check → exit 0/1/2. No renderers yet (they arrive in C*).

Files:
- generator/lib/schema.ts: pure re-export of parseSchemaFile / parseProfileFile / validateProfile + the types ProfileFile / ProfileIssue / ProfileIssueKind / SchemaFile from tools/lib/. 3rd application of pattern-before-abstraction (the 2nd was tools/lib/read-yaml.ts in B2). No duplicated logic.
- generator/lib/profile-loader.ts: loadProfile(path) returns a discriminated union { ok: true, profile } | { ok: false, error }. Reuses readAndParseYaml + errorMessage from tools/lib/read-yaml.ts.
- generator/lib/validators.ts: completenessCheck(schema, profile) returns { errors, warnings }. It scans the schema's required fields; if the path is absent from the profile AND the schema declares no default, it emits an error or a warning depending on USER_SPECIFIC_PATHS (identity.name / description / owner → warning; everything else → error). The constant is exported so the tests can assert the exact list without duplicating it.
- generator/run.ts: CLI entrypoint:
  * strict parseArgs with --profile (required), --validate-only, and --out / --dry-run declared but explicitly rejected with the exact message "flag --X not supported in B3; planned for C1". Avoids a false sense of functionality.
  * Schema hard-coded to questionnaire/schema.yaml (no flag).
  * runValidation(profilePath) + formatReport(result, profilePath) exported for unit tests; main() carries /* v8 ignore */ to exclude parseArgs + exit paths from coverage (same pattern as tools/validate-profile.ts).
  * Exit semantics: profile OK with no blockers → 0 (warnings allowed); issues or completeness errors → 1; I/O, broken YAML, invalid args, or a deferred flag → 2.
Profile shape invalid (top-level strict) maps to exit 1 because it is a content error, not an I/O error.

Local verification:
- tsc --noEmit: clean.
- vitest run: 135 tests in 13 suites (28 new in generator/).
- vitest run --coverage: lines 95.36%, functions 98.52%, branches 88.83%, statements 95.36%. All above the project threshold (90/90/85/90).
- npm run validate:profiles: OK x3 (no B2 regression).
- npm run validate:questionnaire: OK (no B1 regression).
- Smoke E2E: tsx generator/run.ts --profile nextjs-app.yaml → exit 0 + 3 user-specific warnings. --out rejected with the exact expected message, exit 2.

Pending in the following commits (docs-sync Fase N+3 + a possible package.json script validate:generator):
- Update docs/ARCHITECTURE.md §3 with the real shape (exported signatures, final exit codes) if it differs from the current snippet.
- Add a "Deferrals" section to .claude/rules/generator.md (token-budget.ts) + a "Reuso desde tools/lib" section (3rd application of pattern-before-abstraction).
- npm run validate:generator script: a minimal invocation that runs the runner on the 3 canonical profiles to detect regressions in future CI.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore(b3): docs-sync + CI smoke + rules/generator deferrals

Docs-sync Fase N+3 + CI hardening before the pre-PR gate.

CI smoke (no runner regression):
- package.json: new `validate:generator` script that runs generator/run.ts on the 3 canonical profiles (nextjs-app, agent-sdk, cli-tool). Exit 0 expected, with user-specific warnings.
- .github/workflows/ci.yml: new step "Validate generator (smoke — 3 canonical profiles)" between "Validate profiles" and "Test (with coverage)". Detects integration regressions before the unit tests run.

Docs ARCHITECTURE.md §3 "Generador":
- Replaces the aspirational pre-B3 snippet (which imported renderers + fs-writer that do not exist yet) with the real delivered signatures: RunResult shape, runValidation, formatReport.
- Documents exit codes 0/1/2 + the B3 deferrals (token-budget, --schema, --out, --dry-run).
- Adds relative links to the 4 new files (generator/run.ts, lib/schema.ts, lib/profile-loader.ts, lib/validators.ts).

Rules .claude/rules/generator.md:
- New section "Runner (entregado en B3)": integration fixtures, exit-code semantics, deferred flags, CI smoke.
- New section "Deferrals (B3)": token-budget.ts, --schema, --out/--dry-run, each with an explicit reason so future branches can decide when to reintroduce them.
- New section "Reuso desde tools/lib/ (pattern-before-abstraction, 3ª aplicación)": no-duplication rule + history of the 3 applications (B1 condition-parser, B2 read-yaml, B3 schema re-export). Sets the threshold for forking when a 4th application with generator-only logic appears.

Verification:
- tsc --noEmit: OK.
- vitest run: 135/135 tests.
- npm run validate:generator: 3 x "status: OK" with the expected user-specific warnings.
- npm run validate:profiles: 3 x OK (no B2 regression).
- npm run validate:questionnaire: OK (no B1 regression).

Post-commit: branch ready for /pos:pre-commit-review (manual equivalent: code-reviewer subagent over the full branch diff) and the manual pre-PR gate. ROADMAP + HANDOFF + MASTER_PLAN were already synced in commit 1 of the branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(b3): address pre-commit review — tighter flag guard + pinned enum test

Two findings from the manual code-reviewer pass (equivalent of /pos:pre-commit-review) on feat/b3-generator-runner.

1. generator/run.ts line 129: changed --dry-run guard from truthy check to `values["dry-run"] !== undefined` for consistency with the --out guard above.
Node's parseArgs with type:"boolean" does not emit false natively (only true on presence or undefined on absence), so the observable behavior is unchanged in practice, but the defensive shape now matches --out and survives any future parseArgs or caller variant that could produce an explicit false.
2. generator/run.test.ts: the "exit 1 value-not-in-enum" test used `r.issues.some(...)` as its only assertion, which would still pass if a regression also populated r.errors or changed r.issues length. Tightened to pin the exact shape: issues length 1, kind + path on issues[0], errors empty, warnings equal to the 3 user-specific paths. Matches the assertion style already used in the sibling "missing-required" test and in validators.test.ts.

Reviewer's third finding (validate:generator CI step semantically redundant with Test-with-coverage) was considered and kept as-is. Reason documented in .claude/rules/generator.md § Runner: the smoke step catches broken tsx invocation / main() wiring before unit tests run, and the 3-profile loop adds ~3s to CI for earlier signal. Not inertia: deliberate design.

Verification:
- tsc --noEmit: OK.
- vitest run generator/: 28/28.
- No production behavior change for --dry-run in common usage (parseArgs produces true or undefined, not false).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(b3): address PR #3 review — docs stdout correction

Copilot flagged a mismatch between docs and runner behavior: 2 doc sites claimed user-specific warnings land in stderr, while the CLI prints the full formatReport (including warnings) to stdout and the integration tests assert on stdout. Align docs with implementation; do not change the CLI/tests contract.

- MASTER_PLAN.md §B3 exit-codes line: "stderr" → "stdout (dentro del reporte)".
- generator/__fixtures__/profiles/valid-partial/profile.yaml:5 header: same correction.

Typecheck + runner tests (18/18) still pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Javier <javier.abril@glassnode.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
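The completeness rule described in the feat(b3) commit above (required + absent + no default → error, downgraded to a warning for the three user-specific paths) can be sketched as follows. This is an illustration under assumptions: `SchemaField` and `completenessSketch` are hypothetical, the real `completenessCheck(schema, profile)` takes the parsed schema and profile objects, and the sample default value is invented.

```typescript
// The three user-specific paths named in the commit messages.
const USER_SPECIFIC_PATHS = ["identity.name", "identity.description", "identity.owner"] as const;

// Hypothetical flattened view of a schema field.
interface SchemaField { path: string; required: boolean; default?: unknown }

function completenessSketch(
  fields: SchemaField[],
  answers: Record<string, unknown>,
): { errors: string[]; warnings: string[] } {
  const errors: string[] = [];
  const warnings: string[] = [];
  const userSpecific = new Set<string>(USER_SPECIFIC_PATHS);

  for (const field of fields) {
    if (!field.required) continue;
    const answered = Object.prototype.hasOwnProperty.call(answers, field.path);
    // A declared default satisfies the requirement even without an answer.
    if (answered || field.default !== undefined) continue;
    (userSpecific.has(field.path) ? warnings : errors).push(field.path);
  }
  return { errors, warnings };
}

// Sample schema; the default on stack.language is made up for illustration.
const sketchFields: SchemaField[] = [
  { path: "identity.name", required: true },
  { path: "identity.description", required: true },
  { path: "identity.owner", required: true },
  { path: "domain.type", required: true },
  { path: "stack.language", required: true, default: "typescript" },
  { path: "testing.unit_framework", required: true },
];
const sketchResult = completenessSketch(sketchFields, { "testing.unit_framework": "vitest" });
// Warnings do not block: only errors push the exit code to 1.
const sketchExit = sketchResult.errors.length > 0 ? 1 : 0;
console.log(sketchResult, "exit:", sketchExit);
```

Here `domain.type` is the only blocking gap, so the sketch yields one error, three warnings, and exit 1.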
Closes the 3 points raised in PR #11 review (Copilot + human feedback).

1) Hook `tool_input` validation (BLOCKER):
- `tool_input is None` → `{}` (pass-through).
- `tool_input` no-dict (list/string/etc.) → `deny` exit 2 with `decisionReason`.

Previously `payload.get("tool_input") or {}` + `.get()` would raise `AttributeError` on list/string payloads, crashing the hook instead of responding with a controlled contract. +3 in-process tests (null, list, string) cover the new branches.

2) Docs alignment with actual safe-fail policy:
- `docs/ARCHITECTURE.md §7` and `.claude/rules/hooks.md`: explicit that a malformed payload (empty stdin, invalid JSON, top-level no-dict, `tool_input` no-dict for a Bash call) maps to `deny` exit 2 + `decisionReason`, NOT pass-through. This matches the actual hook behavior and establishes a canonical policy for D2..D6 hooks.

3) CI coverage (IMPORTANT):
- New `python` job in `.github/workflows/ci.yml`: matrix `ubuntu-latest` + `macos-latest` × Python `3.10` + `3.11`, running `pytest hooks/tests -q --cov=hooks --cov-report=term-missing`. Pins `actions/setup-python@v5.6.0` by SHA per `.claude/rules/ci-cd.md §Reglas duras #2`. D1 now enforced by real CI, not "passed locally".

Totals: 60 tests (was 57), 99% coverage maintained (same single uncovered line `sys.exit(main())` under `__main__`).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* test(d1): failing tests for pre-branch-gate + test-env bootstrap
Kickoff block (Fase 0), branch feat/d1-hook-pre-branch-gate:

Scope (Fase -1 closed):
- hooks/pre-branch-gate.py (implementation in the next commit): PreToolUse(Bash) hook that blocks
  `git checkout -b`, `git switch -c`, `git worktree add -b` without the marker
  `.claude/branch-approvals/<slug-sanitized>.approved`.
- pytest test pair for the hook (this commit, RED).
- Minimal test-env bootstrap: local `.venv` + `requirements-dev.txt`
  (pytest + pytest-cov). No ruff, no selftest, no abstracted hooks/_lib/.

Fase -1 decisions closed (vs MASTER_PLAN §D1):
1. Scope: covers checkout -b, switch -c, worktree add -b. Excludes
   `git branch <slug>` (creates a ref without starting work).
2. No bypass env var. Legitimate bypass = create the marker explicitly.
3. Double log: `.claude/logs/pre-branch-gate.jsonl` +
   `.claude/logs/phase-gates.jsonl` (branch_creation event).
4. Parsing with shlex.split (robust to quoting). Supports global options
   before the subcommand.
5. Block message: exact marker path + suggested `touch` command
   + textual reference to `MASTER_PLAN.md`. No parsing of the plan.
6. Silent pass-through: zero noise except on branch creation.
7. No shared `hooks/_lib/` (CLAUDE.md rule 7: ≥2 repetitions before
   abstracting; D1 is the first).

Tests added (intentionally RED):
- hooks/tests/test_pre_branch_gate.py
  · branch creation detection
  · silent pass-through
  · slug sanitization
  · double log allow/deny
  · robustness against invalid stdin/commands
- Fixtures: 6 JSON files in hooks/tests/fixtures/payloads/.

Env bootstrap:
- requirements-dev.txt: pytest>=7, pytest-cov>=4
- .gitignore: /.venv/, __pycache__/, *.pyc, .pytest_cache/
- local run: `.venv/bin/pytest hooks/tests/`

Planned next commits:
- feat(d1): implement hook + chmod +x
- docs(d1): docs-sync
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* feat(d1): implement pre-branch-gate hook + in-process coverage tests
Implementation:
- hooks/pre-branch-gate.py (executable, 4.6K, stdlib-only):
· Detects `git checkout -b`, `git switch -c`, `git worktree add -b` via
shlex tokenisation. Handles git global options pre-subcommand
(`git -c k=v ...`, `git --git-dir=X ...`, `git -C /p ...`).
· `extract_branch_slug()` returns None for non-branch commands
(`git status`, `git branch <x>`, `git worktree list`, etc.).
· On branch-creation command: sanitizes slug (`/` → `_`) and checks
`.claude/branch-approvals/<sanitized>.approved`.
· Marker present → allow (silent, exit 0) + append allow event to both
logs.
· Marker absent → deny (exit 2) with `decisionReason` containing:
exact marker path, suggested `touch` command, textual reference to
MASTER_PLAN.md, and the blocked command.
· Pass-through silent on all non-branch Bash, all non-Bash tools,
missing/empty fields, and shlex-unparseable commands.
· Malformed JSON stdin → deny (exit 2).
· Double logging: `.claude/logs/pre-branch-gate.jsonl` +
`.claude/logs/phase-gates.jsonl` (event: branch_creation).
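The detection flow described in this list can be sketched roughly as follows. This is a minimal illustration and not the hook's actual code: `GLOBAL_OPTS_WITH_ARG` is an assumed subset of git's global options that take a separate argument, and `marker_path` is a hypothetical helper name; only `extract_branch_slug`, `sanitize_slug` and `_flag_value` are names taken from the commit messages.

```python
import shlex
from pathlib import Path
from typing import List, Optional

# Assumption: a subset of git global options that consume the next token.
GLOBAL_OPTS_WITH_ARG = {"-C", "-c", "--git-dir", "--work-tree", "--namespace"}


def _flag_value(args: List[str], flag: str) -> Optional[str]:
    """Return the token that follows `flag`, or None if absent."""
    if flag in args:
        i = args.index(flag)
        if i + 1 < len(args):
            return args[i + 1]
    return None


def extract_branch_slug(command: str) -> Optional[str]:
    """Return the new branch name for branch-creating commands, else None."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return None  # unparseable quoting: treated as a non-branch command
    if not tokens or tokens[0] != "git":
        return None
    i = 1
    # Skip global options placed before the subcommand.
    while i < len(tokens) and tokens[i].startswith("-"):
        i += 2 if tokens[i] in GLOBAL_OPTS_WITH_ARG else 1
    if i >= len(tokens):
        return None
    sub, args = tokens[i], tokens[i + 1:]
    if sub == "checkout":
        return _flag_value(args, "-b")
    if sub == "switch":
        return _flag_value(args, "-c")
    if sub == "worktree" and args[:1] == ["add"]:
        return _flag_value(args[1:], "-b")
    return None


def sanitize_slug(slug: str) -> str:
    """Map `/` to `_` so the slug is usable as a file name."""
    return slug.replace("/", "_")


def marker_path(slug: str) -> Path:
    """Hypothetical helper: location of the approval marker for a slug."""
    return Path(".claude/branch-approvals") / f"{sanitize_slug(slug)}.approved"
```

A caller would allow the command when `marker_path(slug).exists()` and deny with exit 2 otherwise; commands for which `extract_branch_slug` returns None pass through silently.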
Test suite (55 passing, 99% coverage on pre-branch-gate.py):
- 23 subprocess integration tests (pre-existing, from commit 1).
- 32 in-process unit tests added for coverage visibility, covering:
· sanitize_slug: 3 cases.
· extract_branch_slug: 20 branches (all subcommand/flag/global-opt
combinations + negative cases).
· build_deny_reason: shape assertions.
· main() direct calls with monkeypatched chdir + stdin: 8 paths
(malformed, non-dict, non-bash, missing input, empty command,
non-branch, branch-with-marker, branch-without-marker).
- The single uncovered line (175) is `sys.exit(main())` under
`if __name__ == "__main__":` — not reachable from in-process tests
and intrinsic to script entry.
Run locally:
.venv/bin/pytest hooks/tests/ --cov=hooks --cov-report=term-missing
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs(d1): docs-sync - mark D1 ✅ + record adjustments vs plan
- ROADMAP.md:
· Fase D row: ⏳ pending → ⏳ partial (D1 ✅).
· feat/d1-hook-pre-branch-gate row: ⏳ → ✅ (PR pending).
· New "Progreso Fase D" section with deliverables + adjustments vs the
original plan (scope widened to worktree add -b, no bypass env var,
no hooks/_lib/, test-env bootstrap, in-process tests for
coverage).
- HANDOFF.md:
· § Snapshot: current phase C5 → D1 closed. Next D1 → D2.
· § Gotchas: pre-branch-gate.py "aún no existe" → live since D1;
the remaining hooks (session-start, pre-write-guard, post-action,
pre-compact, stop-policy-check) are still absent, as tolerated
stubs.
· § Próxima rama: rewritten D1 → D2 with scope + minimal reading.
· § Estado C5 → Estado D1: summary of the deliverable + "what D1 does NOT
do" + notes for D2 (consolidated hook pattern, signal to
extract hooks/_lib/ in D2 once it becomes the 2nd repetition).
- MASTER_PLAN.md § Rama D1:
· Status: ✅ COMPLETADA (PR pending).
· Delivered scope (actual detail) + adjustments vs original plan
(scope, parsing, logging, decision reason, deferrals).
- docs/ARCHITECTURE.md § 7 Capa 1:
· Reference to hooks/pre-branch-gate.py as the canonical implementation
of the enforcer-hook pattern (shebang + stdlib-only + silent
pass-through + shlex parsing + double log shape).
- .claude/rules/hooks.md:
· New "Primer hook entregado" sub-section with the consolidated
structure: silent pass-through, shlex, local sanitization
(no helper yet), constructive decisionReason, double log,
test pattern (subprocess integration + in-process unit via
importlib.util, needed because of the hyphen in the file name).
Local tests: .venv/bin/pytest hooks/tests/ --cov=hooks
→ 55 passed, 99% coverage on hooks/pre-branch-gate.py.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
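The in-process test pattern mentioned above relies on `importlib.util` because a file named `pre-branch-gate.py` cannot be imported with a plain `import` statement. A minimal sketch of that loading step, where `load_hook_module` and the default module name are illustrative and not taken from the repository:

```python
import importlib.util
from pathlib import Path
from types import ModuleType


def load_hook_module(path: Path, name: str = "pre_branch_gate") -> ModuleType:
    """Load a Python file whose name is not a valid identifier."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {path}")
    module = importlib.util.module_from_spec(spec)
    # Runs the module body with __name__ == name, so an
    # `if __name__ == "__main__":` block is not executed.
    spec.loader.exec_module(module)
    return module
```

This is also consistent with the `sys.exit(main())` line staying uncovered: the module is executed under a different `__name__`, so the entry-point guard never fires in-process.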
* refactor(d1): simplify hook + close review gap on git global opts
- _flag_value(): extract shared -b/-c lookup (was duplicated across checkout/switch/worktree).
- log(): collapse dual-log (hook-scoped + phase-gates) into a single local helper in main().
- GIT_GLOBAL_OPTS_WITH_ARG: add --exec-path and --upload-pack (pre-commit-review gap: space-form of these options previously consumed the subcommand as the argument, causing a detection miss on `git --exec-path /x checkout -b slug`).
- Tests: +2 cases covering the new global opts (space-form), 57 passed, 99% coverage (line 166 `sys.exit(main())` only miss; __main__-gated, intrinsic).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(d1): review follow-up — tool_input guard, docs alignment, CI job
Closes the 3 points raised in PR #11 review (Copilot + human feedback).
1) Hook `tool_input` validation (BLOCKER):
- `tool_input is None` → `{}` (pass-through).
- `tool_input` no-dict (list/string/etc.) → `deny` exit 2 with `decisionReason`.
Previously `payload.get("tool_input") or {}` + `.get()` would raise `AttributeError`
on list/string payloads, crashing the hook instead of responding with a controlled
contract. +3 in-process tests (null, list, string) cover the new branches.
2) Docs alignment with actual safe-fail policy:
- `docs/ARCHITECTURE.md §7` and `.claude/rules/hooks.md`: explicit that a malformed
payload (empty stdin, invalid JSON, top-level no-dict, `tool_input` no-dict for a
Bash call) maps to `deny` exit 2 + `decisionReason`, NOT pass-through. This matches
the actual hook behavior and establishes a canonical policy for D2..D6 hooks.
3) CI coverage (IMPORTANT):
- New `python` job in `.github/workflows/ci.yml` — matrix `ubuntu-latest` +
`macos-latest` × Python `3.10` + `3.11`, running `pytest hooks/tests -q
--cov=hooks --cov-report=term-missing`. Pins `actions/setup-python@v5.6.0` by SHA
per `.claude/rules/ci-cd.md §Reglas duras #2`. D1 now enforced by real CI, not
"passed locally".
Totals: 60 tests (was 57), 99% coverage maintained (same single uncovered line
`sys.exit(main())` under `__main__`).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
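The safe-fail contract from this review follow-up could look roughly like the sketch below. It is a simplified illustration: `read_tool_input` is a hypothetical name, the reason strings are invented, and the real hook reportedly applies the non-dict `tool_input` rule only for Bash calls, which is omitted here.

```python
import json
from typing import Optional, Tuple


def read_tool_input(raw: str) -> Tuple[Optional[dict], Optional[str]]:
    """Return (tool_input, deny_reason); exactly one of them is None."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        # Covers both empty stdin and invalid JSON.
        return None, "malformed JSON on stdin"
    if not isinstance(payload, dict):
        return None, "top-level payload is not an object"
    tool_input = payload.get("tool_input")
    if tool_input is None:
        return {}, None  # missing or null: pass-through with an empty dict
    if not isinstance(tool_input, dict):
        # Previously `payload.get("tool_input") or {}` let a list/string
        # through and the later `.get()` raised AttributeError.
        return None, "tool_input is not an object"
    return tool_input, None
```

A caller would turn any non-None reason into a `deny` with exit 2 and carry the text in `decisionReason`, instead of letting the hook crash.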
* Update ROADMAP.md
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* docs(d1): close count drift + CI contract gap flagged in review
Follow-up to the second Copilot review pass on PR #11. All local repo
edits — no hook/test code changes.
1) Count drift (Copilot: ROADMAP.md:225 + HANDOFF.md:141):
- ROADMAP.md: remove hardcoded "55 tests en 8 clases" for the D1 test
suite. Continues the pattern set by the user's own UI edit in ecdcbbc
(removed "32 unit tests" from line 237) — docs describe the suite by
shape, not by a brittle number that drifts with every new test.
- HANDOFF.md §10: reflect the actual safe-fail contract (malformed
stdin → deny, non-dict tool_input → deny) instead of the outdated
"stdin vacío/malformado → exit 2 sin crash pero no loggea" bullet,
and drop the "55 tests pytest" number.
2) CI contract gap (Copilot: ci.yml:74 — mypy/ruff declared in
policy.yaml:68-74 + ci-cd.md:21-24 but not in the workflow):
- policy.yaml.pre_push: inline comment clarifies that `command_meta`
declares the aspirational contract; actual enforcement lands
incrementally in CI + pre-pr-gate.py (Fase D4). Lists which checks
are live today (tsc, vitest, pytest hooks) and which are deferred
(mypy hooks, eslint, prettier, ruff) so the doc no longer reads as
a broken promise.
- .claude/rules/ci-cd.md §Workflows obligatorios §1: split into
"Aterrizado" vs "Diferidos a rama dedicada", matching the actual
state, plus an invariant that future branches adding a check must
also move the bullet. Keeps the rule honest and makes drift
explicit going forward.
Scope preserved: no mypy/ruff added to CI (D1 Fase -1 explicitly
excluded them). The fix is docs/contract-alignment, not tooling
expansion.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Javier <javier.abril@glassnode.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* test(d4): kickoff - failing suite for pre-pr-gate hook (docs-sync enforcement on gh pr create)
## Kickoff - D4 (feat/d4-hook-pre-pr-gate)
Context: continuation after the D3 merge (9aed1ee on main) and PR #14 docs (fx Knowledge Plane. Fase -1 executed and approved in the same session (v1 rejected for inflated scope; v2 trimmed and approved).
Closed decisions: only as trigger; docs-sync as the only real enforcement; skills_required + ci_dry_run_required + invariants_check as non-blocking advisory scaffold; no pyyaml; no D3 hardcode→policy migration; pre-write-guard.py untouched; merge-base HEAD main as baseline; git unavailable / base unresolved → pass-through with an explicit advisory log (not silent); empty diff → deny exit 2 with a message distinct from docs-sync.
### Scope
- New blocker hook (shape D1, not D2). PreToolUse(Bash) matcher only. deferred.
- Real docs-sync enforcement (blocker exit 2) with rules hardcoded in the hook (textual mirror of ; the migration to policy-driven is handled in a dedicated policy-loader branch together with D3's hardcoded paths):
  * baseline: ROADMAP.md + HANDOFF.md in the diff.
  * conditional: generator/** | hooks/** → docs/ARCHITECTURE.md; skills/** → .claude/rules/skills-map.md; .claude/patterns/** → docs/ARCHITECTURE.md.
- Non-blocking advisory scaffold (logged, does not deny). Can be activated without a shape change once the dedicated branches provide the substrate:
  * skills_required → skills not yet landed (Fase E*).
  * ci_dry_run_required → ci_dry_run deferred to dedicated rama.
  * invariants_check → invariants directory empty - deferred.
- Pass-through + advisory log (not silent) on: main / master / detached HEAD; git unavailable / not a repo; merge-base HEAD main does not resolve.
- Empty diff → deny exit 2 with reason empty PR (does NOT mention docs-sync).
- Double log: + (event ) on real decisions (allow/deny). Advisory skip only in the hook log, NOT in phase-gates. Silent pass-through (no-match) without log (same pattern as D1/D3).
- Canonical D1 blocker safe-fail: empty stdin / malformed JSON / top-level non-dict / tool_input non-dict → deny exit 2. command non-string or empty / shlex unparsable → pass-through.
### Files to create in the branch
- hooks/pre-pr-gate.py (Fase 2, GREEN; not in this commit).
- hooks/tests/test_pre_pr_gate.py (this commit, RED).
- hooks/tests/fixtures/payloads/gh_pr_create.json (this commit).
- hooks/tests/fixtures/payloads/gh_pr_create_draft.json (this commit).
- hooks/tests/fixtures/payloads/gh_pr_list.json (this commit). Reuses git_status.json and non_bash.json inherited from D1/D2.
### Files explicitly NOT touched (documented deferrals)
- hooks/pre-write-guard.py: no migration of hardcoded paths to policy.yaml.
- policy.yaml: no new key; stays declarative and unparsed.
- hooks/_lib/: no policy.py; zero new helpers.
- requirements-dev.txt: no pyyaml (explicit D4 scope blocker).
### Risks
- Tests with real-git subprocess setup (git init + config + commits) are heavier than D3's (D3 did not need git). Mitigation: fixture encapsulates the setup; helpers / keep the tests readable.
- Detached HEAD returns HEAD from . Treated as an explicit skip (not gated, no implicit deny).
- requires a local main. If main was deleted → advisory skip (explicitly tested).
### Test plan (Fase 1, this commit)
- TestMatcherDetection (11): gh pr create + variants (title/draft/body/base) → gate; gh pr list/view, gh issue create, git status, git push, non-Bash → pass-through.
- TestBranchSkip (3): main, master, detached → advisory skip + no phase log.
- TestGitUnavailable (1): cwd without a git repo → advisory skip.
- TestMergeBaseUnresolved (1): main deleted → advisory skip with reason merge-base.
- TestEmptyDiff (2): empty PR → deny + message without docs-sync; reason includes base.
- TestDocsSyncBaseline (4): ROADMAP / HANDOFF / both missing → deny; both present without conditional → allow.
- TestDocsSyncConditional (9): generator/hooks/skills/patterns triggers; multi-conditional dedup of ARCHITECTURE.md; tests/** outside the conditional.
- TestDecisionReason (3): reason mentions CLAUDE.md + docs-sync; triggering paths listed; cap at 3 with a "more" indicator.
- TestAdvisoryLogs (4): deny / allow / empty-diff → 3 deferred entries; skip → 0 deferred entries.
- TestLogging (8): double log only on real decisions; skip only in the hook log; no-match / non-Bash / gh pr list → zero log; entry shape.
- TestRobustness (11): canonical D1 blocker safe-fail.
- TestIsGhPrCreateUnit (14): matcher classifier in-process.
- TestCheckDocsSyncUnit (13): docs-sync classifier in-process.
- TestMainInProcess (12): coverage of paths that subprocess does not measure.
Target: 96 tests. Coverage ≥90% on pre-pr-gate.py, ≥90% combined on hooks/. D1 (60) + D2 (66) + D3 (83) = 209 tests intact. Expected total: 305.
RED state now: 47 failed + 10 passed + 39 skipped. The 39 skipped are @needs_hook tests (in-process) that activate once the module exists. The 10 passed are false positives: returns exit 2, which coincides with the deny exit 2 expectation of the gated tests; once the implementation lands, those 10 must keep passing with the correct deny from real logic, and the 47 failed must turn into passes.
### Docs plan (Fase N+3)
- ROADMAP.md: D4 row ✅.
- HANDOFF.md: §1 current phase updated (carries obsolete text from after the D3 merge, PR #13); §9 next branch → D5; §10 renamed Estado D4 with a summary.
- MASTER_PLAN.md § Rama D4: Status ✅ + adjustments vs original plan (v2 scope cut: hardcoded rules, only gh pr create, advisory skills/CI/invariants, D3 migration deferred to the policy-loader branch).
- docs/ARCHITECTURE.md §7: pre-pr-gate as the 4th canonical hook in Capa 1.
- .claude/rules/hooks.md: section "Cuarto hook entregado - pre-pr-gate (D4)".
- policy.yaml: not touched in D4 (declarative contract without a real parsing enforcer; handled in a dedicated branch).
- pre-write-guard.py: not touched in D4.
Context traceability: session started from main after the D3 merge, PR #13 (and PR #14 docs Knowledge Plane). Neither /clear nor /compact was used in this session, so there is no resume prompt to reference.
Marker: .claude/branch-approvals/feat_d4-hook-pre-pr-gate.approved (gitignored by design, same as D1/D2/D3).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
EOF )
* docs(d4): restore kickoff commit message context (e73416b)
The kickoff commit e73416b ended up with a damaged message: the inline backticks inside the HEREDOC (cat <<'EOF' ... EOF) were interpreted by the outer $(...) as command substitution and replaced with an empty string. The committed code (4 files, 903 insertions) is correct; only the inline-code spans of the message were lost.
Restored here, without rewriting history, are the textual references left blank in e73416b (backticks are avoided throughout this commit):
Closed decisions
----------------
- Single hook trigger: "gh pr create" (not gh issue, not git push).
- Explicit branch skip: main / master / detached-HEAD.
- The hooks/pre-write-guard.py hook (D3) is not touched in D4.
Scope
-----
- New blocker hook: hooks/pre-pr-gate.py (shape D1, not D2). PreToolUse(Bash) matcher on command == "gh pr create" + flags.
- Textual mirror (hardcoded in the hook) of:
  policy.yaml -> lifecycle.pre_pr.docs_sync_baseline
  policy.yaml -> lifecycle.pre_pr.docs_sync_conditional
  Migration to a declarative parser deferred to a dedicated policy-loader branch.
- Double log:
  .claude/logs/pre-pr-gate.jsonl (hook-specific shape)
  .claude/logs/phase-gates.jsonl (event "pre_pr")
Risks
-----
- Tests with real-git subprocess setup. Helpers _git and _gh_pr_create_payload encapsulate init + commits.
- Detached HEAD returns the literal HEAD from git rev-parse --abbrev-ref HEAD -> treated as advisory skip, not as gated.
- Baseline resolution requires a local main: git merge-base HEAD main -> if main was deleted, advisory skip with reason "merge-base".
Context traceability
--------------------
- Session started without /clear from main, after the merge of PR #13 (D3) and PR #14 (docs Knowledge Plane).
- Marker: .claude/branch-approvals/feat_d4-hook-pre-pr-gate.approved (gitignored by design, same as D1/D2/D3).
Empty follow-up commit (--allow-empty); zero code changes.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* feat(d4): impl hooks/pre-pr-gate.py - docs-sync enforcer on gh pr create
Fase 2 (GREEN). Blocker hook (shape D1) that enforces CLAUDE.md rule #2 inside the branch, on the "gh pr create" trigger.
Behavior
--------
- Matcher: shlex.split(command); gate only when tokens[:3] == ["gh","pr","create"] (covers --draft / --title / --body / --base). Everything else (gh pr list, gh pr view, gh pr edit, gh issue create, git push, git status, non-Bash) → silent pass-through, zero log.
- Advisory skip (pass-through + entry in the hook log, NOT the phase log):
  * branch main / master / detached HEAD
  * git unavailable (cwd is not a repo)
  * merge-base HEAD main not resolvable (main deleted locally)
- Empty diff (HEAD == base) → deny exit 2 with a dedicated "empty PR" reason that does NOT mention docs-sync / ROADMAP / HANDOFF.
- Docs-sync check (hardcoded rules, mirror of policy.yaml.lifecycle.pre_pr.docs_sync_*):
  baseline   : ROADMAP.md + HANDOFF.md always.
  conditional: generator/** → docs/ARCHITECTURE.md
               hooks/** (not tests/) → docs/ARCHITECTURE.md
               skills/** → .claude/rules/skills-map.md
               .claude/patterns/** → docs/ARCHITECTURE.md
  Missing docs → deny exit 2. Dedupe: ARCHITECTURE.md appears only once even when several prefixes require it. Triggering paths are capped at 3 per doc in the reason, with the suffix "... (+N more)" when >3.
- Canonical D1 blocker safe-fail: empty stdin / invalid JSON / top-level non-dict / tool_input non-dict → deny exit 2. Command missing / non-string / empty / shlex unparsable → pass-through 0.
- Double log on decisions (allow/deny):
  .claude/logs/pre-pr-gate.jsonl {ts, hook, command, decision, reason}
  .claude/logs/phase-gates.jsonl {ts, event:"pre_pr", decision}
  + 3 status:"deferred" entries in the hook log for each real decision (skills_required, ci_dry_run_required, invariants_check). These are NOT emitted on skip or pass-through; the test requires them only for gated decisions.
Simplify pass (N+1)
-------------------
- Docstring: 10 → 6 lines (external references redundant with the kickoff).
- _conditional_triggers docstring: removed (private, self-explaining name).
- main(): missing, _triggers → missing, _ (unused var without an alias).
Tests
-----
- 96/96 green in hooks/tests/test_pre_pr_gate.py (47 previously failing now pass, 10 previous false positives confirmed as real, 39 @needs_hook unblocked now that the module is available).
- Full hooks/ suite: 317 passed (D1+D2+D3+D4). Zero regressions.
- Coverage on pre-pr-gate.py: 93% (the 10 uncovered lines are the except FileNotFoundError/SubprocessError of _run_git and an out.strip()=="" branch; above the 90% target).
Explicit deferrals documented in Fase -1 (not touched in D4)
------------------------------------------------------------
- Migration of hardcoded rules → policy.yaml parser (dedicated branch).
- Migration of D3 paths (pre-write-guard) → policy-driven (same branch).
- Matcher for git push --force (not gated by D4).
- pre-write-guard.py intact (zero edits).
- policy.yaml intact (zero edits).
- requirements-dev.txt intact (no pyyaml).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* docs(d4): docs-sync inside the branch - ROADMAP + HANDOFF + MASTER_PLAN + ARCHITECTURE + rules/hooks
Syncs the 5 canonical docs with the final state of branch D4 (CLAUDE.md rule #2, Fase N+3). The hook this same branch introduced (hooks/pre-pr-gate.py) requires this sync before allowing gh pr create: dogfooding of the blocker itself.
- ROADMAP.md: D4 row moves to closed with the deliverables listed (96 tests, 93% coverage, double log, deferred advisory scaffold). Fase D keeps D5..D6 as next in the queue.
- HANDOFF.md: section 1 reflects D4 closed in the branch; section 7 adds the live pre-pr-gate to the gotchas list; section 9 points the next branch to D5; section 10 summarizes the closing numbers.
- MASTER_PLAN.md section Rama D4: delivered scope + adjustments vs original plan (hardcoded rules as a mirror of policy.yaml, docs-sync as the only enforcement, advisory skip, dedicated reason for empty diff, canonical D1 safe-fail, reuse of _lib without new helpers, 3 simplify cuts).
- docs/ARCHITECTURE.md section 7: third blocker application canonicalized alongside D1 and D3; shlex matcher, docs-sync baseline + conditional, advisory deferred, advisory skip, empty-diff dedicated reason.
- .claude/rules/hooks.md: section "Cuarto hook entregado" with the contract, docs-sync rules in tables + reuse of _lib + 96 tests / 93% cov.
No code changes in this commit. The hook will keep approving its own PR because ROADMAP.md and HANDOFF.md appear in the main..HEAD diff together with docs/ARCHITECTURE.md, required by the hooks/ paths touched in Fase 2.
* fix(d4): review PR#15 - distinguish empty-diff from diff-unavailable + rename docs key
Addresses the 5 inline comments of the PR#15 review plus the user's 3 explicit items. Triage: 5 FIX (1 BLOCKER + 4 trivial/docs), 0 SKIP, 0 DISCUSS.
BLOCKER (code): hooks/pre-pr-gate.py
- diff_files returned [] both when the diff was empty and when the git subprocess failed (timeout, FileNotFoundError, returncode != 0). In main that was treated as an empty PR and emitted deny: a false deny on transient git failures.
- Change: diff_files now returns list[str] | None. None = git unavailable (advisory skip with status: skipped, reason: git diff unavailable). [] = genuinely empty diff (deny with the dedicated empty PR reason). No external call sites of diff_files.
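The tri-state return of `diff_files` can be sketched as follows. This is an illustration under assumptions and not the hook's code: the signature, the 10-second timeout and the `classify` helper with its labels are hypothetical; only the `None` vs `[]` distinction comes from the commit message.

```python
import subprocess
from typing import List, Optional


def diff_files(base: str, cwd: str = ".") -> Optional[List[str]]:
    """Changed paths between `base` and HEAD, or None if git could not answer."""
    try:
        proc = subprocess.run(
            ["git", "diff", "--name-only", base, "HEAD"],
            cwd=cwd,
            capture_output=True,
            text=True,
            timeout=10,  # assumed value
        )
    except (OSError, subprocess.SubprocessError):
        # Missing git binary, missing cwd, or timeout.
        return None
    if proc.returncode != 0:
        return None
    return [line for line in proc.stdout.splitlines() if line.strip()]


def classify(files: Optional[List[str]]) -> str:
    """Hypothetical mapping from the tri-state result to a decision."""
    if files is None:
        return "skip"        # advisory skip: git diff unavailable
    if not files:
        return "deny_empty"  # genuinely empty PR
    return "check_docs_sync"
```

Keeping `None` distinct from `[]` is what prevents a transient git failure from being reported as an empty PR.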
Docs/naming:
- policy.yaml.lifecycle.pre_pr exposes the key as docs_sync_required, not docs_sync_baseline. Three references aligned:
  - MASTER_PLAN.md section Rama D4, hardcoded-rules bullet.
  - .claude/rules/hooks.md section Cuarto hook, hardcoded-rules bullet.
  - docs/ARCHITECTURE.md section 7, docs-sync bullet + description of the actual git command (merge-base HEAD main + diff --name-only base HEAD, not diff main..HEAD).
Deliberate hooks/tests/ divergence (docs-only, no logic change):
- The hook's CONDITIONAL_RULES excludes hooks/tests/, while policy.yaml lists hooks/** uniformly. The hook has the correct logic (editing tests does not alter the architecture); the policy is laxer. Recorded as an explicit D4 decision in:
  - hooks/pre-pr-gate.py (comment above CONDITIONAL_RULES).
  - MASTER_PLAN.md section Rama D4 (new divergence bullet).
  - .claude/rules/hooks.md section Cuarto hook (new bullet).
  - docs/ARCHITECTURE.md section 7 (new bullet).
  Hook ↔ policy convergence is deferred to the policy-loader branch, where the loader will decide whether it represents granular exclusions or the policy becomes specific.
Tests: 322 passed (317 pre-fix + 5 new in TestDiffUnavailable). Coverage on hooks/pre-pr-gate.py rises to 94% (+1% vs the D4 baseline). TestDiffUnavailable includes unit tests of diff_files with monkeypatch + in-process main tests verifying that the skip emits neither a phase-gate entry nor a false empty-PR deny.
* docs(d4): align wording - docs_sync_required + git merge-base phrasing + 322 tests
Post-review sweep requested by reviewer:
- Strip ambiguous `docs_sync_*` wildcard; use exact `policy.yaml.lifecycle.pre_pr.docs_sync_required` + `docs_sync_conditional` everywhere.
- Replace `git diff main..HEAD` references (wrong for D4 impl) with `git merge-base HEAD main` + `git diff --name-only <base> HEAD`. Session-start (D2) references to `main..HEAD` preserved (hook literally uses `{base}..HEAD` there).
- Make `hooks/tests/` deliberate-divergence note explicit in MASTER_PLAN, rules/hooks.md, docs/ARCHITECTURE.md, ROADMAP.
- Bump test counts to 322 / 101 and coverage ≥94% after adding TestDiffUnavailable.
No logic change; docstring + docs only.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Javier <javier.abril@glassnode.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
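As a rough illustration of the D4 matcher and docs-sync rules described in these commits, the sketch below mirrors the stated baseline and conditional tables. The function names `missing_docs` and `format_triggers` and the prefix-based matching are assumptions; the hook's real `CONDITIONAL_RULES` structure is not shown in the source.

```python
import shlex
from typing import Dict, List

BASELINE = ("ROADMAP.md", "HANDOFF.md")
# (path prefix, required doc), in the order listed in the commit message.
CONDITIONAL = (
    ("generator/", "docs/ARCHITECTURE.md"),
    ("hooks/", "docs/ARCHITECTURE.md"),
    ("skills/", ".claude/rules/skills-map.md"),
    (".claude/patterns/", "docs/ARCHITECTURE.md"),
)


def is_gh_pr_create(command: str) -> bool:
    """Gate only commands whose first three tokens are gh pr create."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # unparseable: pass-through
    return tokens[:3] == ["gh", "pr", "create"]


def missing_docs(changed: List[str]) -> Dict[str, List[str]]:
    """Map each missing doc to the paths that made it required."""
    changed_set = set(changed)
    missing: Dict[str, List[str]] = {}
    for doc in BASELINE:
        if doc not in changed_set:
            missing[doc] = []  # baseline docs are always required
    for prefix, doc in CONDITIONAL:
        triggers = [
            p for p in changed
            if p.startswith(prefix) and not p.startswith("hooks/tests/")
        ]
        if triggers and doc not in changed_set:
            # setdefault keeps one entry per doc (dedupe across prefixes).
            missing.setdefault(doc, []).extend(triggers)
    return missing


def format_triggers(paths: List[str], cap: int = 3) -> str:
    """Cap the listed paths and append the overflow suffix."""
    shown = ", ".join(paths[:cap])
    extra = len(paths) - cap
    return f"{shown} ... (+{extra} more)" if extra > 0 else shown
```

An empty result from `missing_docs` corresponds to allow; anything else would be rendered into the deny reason.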
…ection) (#16) * test(d5): kickoff — failing suite for post-action hook (PostToolUse compound trigger) ## Kickoff — D5 (feat/d5-hook-post-action-compound) Contexto: continuación post-merge D4 (992137f en main). Fase -1 ejecutada y aprobada en esta sesión (v1 entregada con B=gh-pr-merge incluido; v2 recortada para eliminar B tras confirmar que tool_response.exit_code no está garantizado en PostToolUse(Bash) por la doc oficial de Claude Code). Decisiones cerradas: - Matchers finales: A (git merge) + C (git pull sin --rebase). B excluido. - Hardcode 3ª aplicación (policy-loader queda diferido post-D5/D6). - Mirror literal de policy.yaml.lifecycle.post_merge.skills_conditional[0].trigger (TRIGGER_GLOBS + SKIP_IF_ONLY_GLOBS + MIN_FILES_CHANGED=2). - PostToolUse non-blocking. Exit 0 siempre. Nunca permissionDecision. - Sin skill dispatch real (E3a futura); sólo additionalContext + advisory log. - Coverage ≥90% lines / ≥85% branches sobre post-action.py. - Web UI merge queda fuera (no observable vía Bash); el pull del usuario lo captura cuando el código aterriza local. ### Scope Nuevo hook PostToolUse(Bash): hooks/post-action.py. Shape emparentado con D1 blocker (shlex, double log, importlib in-process) pero NO blocker — nunca deniega. Estrategia de detección jerárquica: - Tier 1 (command match, shlex-parsed): * A = tokens[:2] == ["git","merge"] y tokens[2:3] ∉ {--abort, --quit, --continue, --skip}. * C = tokens[:2] == ["git","pull"] y "--rebase"/"-r" ausente. * Todo lo demás → pass-through silencioso (cero log, cero stdout). - Tier 2 (post-hoc reflog determinista): * git reflog HEAD -1 --format=%gs. * A exige prefijo "merge ". * C exige prefijo "pull:" o "pull " (sin "--rebase"). * Fallo → status "tier2_unconfirmed" (log advisory; phase-gates intacto). - Derivación touched_paths: * git diff --name-only HEAD@{1} HEAD → list[str] | None. * None → status "diff_unavailable" (log advisory; phase-gates intacto). 
* [] → status "confirmed_no_triggers" (ambos logs; sin additionalContext). - Mirror hardcoded (policy.yaml L105-120): * TRIGGER_GLOBS: generator/lib/** | generator/renderers/** | hooks/** | skills/** | templates/**/*.hbs. * SKIP_IF_ONLY_GLOBS: docs/** | *.md | .claude/patterns/**. * MIN_FILES_CHANGED: 2. * Match se emite sólo si: len ≥ 2 AND NOT all-skip_if_only AND al menos 1 path matchea un TRIGGER_GLOBS entry. - Emisión additionalContext (4 condiciones simultáneas): 1. Tier 1 match. 2. Tier 2 confirmado. 3. touched_paths no None y len ≥ MIN_FILES_CHANGED (=2). 4. match_triggers() devuelve lista no vacía. Mensaje: triggers matcheados + touched paths (cap 3 con "... (+N more)") + sugerencia literal "/pos:compound". NUNCA intenta dispatch. - Double log (espejo D1/D3/D4): * .claude/logs/post-action.jsonl (shape propio por status). * .claude/logs/phase-gates.jsonl evento "post_merge" SÓLO en confirmed_triggers_matched + confirmed_no_triggers (Tier 2 ok). tier2_unconfirmed y diff_unavailable van sólo al hook log. - Safe-fail PostToolUse (no es blocker D1 canonical): stdin vacío / JSON inválido / top-level no-dict / tool_input no-dict / command no-string / shlex error → exit 0, sin log, sin stdout. - Reuso hooks/_lib/: append_jsonl + now_iso. Sin helpers nuevos (regla #7 — añadir sólo si ≥2 hooks consumen el nuevo helper). ### Archivos Este commit (kickoff RED): - hooks/tests/test_post_action.py — 111 tests (38 failed + 73 skipped en RED; los 38 subprocess fallan porque el hook no existe, los 73 in-process @needs_hook skippean hasta que el módulo se pueda importar). 
- hooks/tests/fixtures/payloads/git_merge.json
- hooks/tests/fixtures/payloads/git_merge_no_ff.json
- hooks/tests/fixtures/payloads/git_merge_abort.json
- hooks/tests/fixtures/payloads/git_pull.json
- hooks/tests/fixtures/payloads/git_pull_rebase.json
- hooks/tests/fixtures/payloads/gh_pr_merge.json
- hooks/tests/fixtures/payloads/git_rebase.json

Fase 2 (GREEN implementation):
- hooks/post-action.py: the hook (classify_command, reflog_message, reflog_confirms, touched_paths, match_triggers, emit helpers, main).

Fase N+3 (docs-sync):
- ROADMAP.md: row D5 ✅.
- HANDOFF.md: §1 current phase → D5 closed; §9 next branch → D6; §10 renamed "Estado D5".
- MASTER_PLAN.md § Rama D5: status ✅ + adjustments vs the original plan (B out, matchers A+C confirmed, tiered emission).
- docs/ARCHITECTURE.md §7: post-action.py as the 5th canonical hook in Layer 1 (first PostToolUse; a variant of the blocker shape: not a blocker, exit 0).
- .claude/rules/hooks.md: section "Quinto hook entregado — post-action".
- policy.yaml: untouched (hardcoded mirror; the section already exists).

### Tests (matrix pinned by the suite)

- TestMatcherDetection (21 cases): Tier 1 for A/C + exclusions (abort, quit, continue, skip, rebase, rebase shorthand, gh pr merge, cherry-pick, rebase, status, push, empty strings, shlex-unparsable).
- TestTier2Reflog (15 cases): reflog_message on real repos after_merge / after_ff_merge / after_pull / clean_repo / non-repo; reflog_confirms truth table by kind × message.
- TestTouchedPaths (5 cases): git diff HEAD@{1} HEAD on each repo type + the edge case with no previous reflog entry.
- TestPolicyConstants (3 cases): literal verification of the mirror.
- TestMatchTriggers (15 cases): min_files, skip_if_only semantics (all vs any), policy-driven order, dedupe, templates with/without subdir.
- TestIntegrationMergeTriggersMatch (4 cases): end-to-end real merge.
- TestIntegrationPullTriggersMatch (3 cases): end-to-end real pull (upstream/src/local topology).
- TestIntegrationMergeFF (1 case): an ff-merge also emits.
- TestIntegrationTier2Unconfirmed (2 cases): command vs reflog mismatch.
- TestIntegrationConfirmedNoTriggers (2 cases): docs-only merge + single-file merge (min_files).
- TestIntegrationDiffUnavailable (1 case, delegates to TestMainInProcess).
- TestNonMatcherPassthrough (6 cases): gh pr merge, git rebase, pull --rebase, merge --abort, git status, non-Bash tool.
- TestSafeFail (10 cases): empty / malformed JSON / top-level list or string / missing tool_name / non-Bash / non-dict tool_input / non-string or empty command / shlex error.
- TestAdditionalContextShape (5 cases): content emitted on stdout.
- TestMainInProcess (13 cases): fine-grained coverage of main() via monkeypatch (includes a forced diff_unavailable).
- TestLogShape (3 cases): jsonl shape per status.
- TestIdempotence (2 cases): 2 runs → 2 entries, both emit context.

Total: 111 tests. Initial RED state: 38 failed (subprocess) + 73 skipped (in-process @needs_hook; they are un-skipped once post-action.py can be imported via importlib).

### Docs plan (Fase N+3)

- ROADMAP.md: row D5 ✅ (Fase D closed after merge: 5/5 hooks).
- HANDOFF.md: §1 current phase, §9 next branch → D6, §10 Estado D5.
- MASTER_PLAN.md § Rama D5: Status ✅ + adjustments (B out, hierarchical matchers, tiered emission).
- docs/ARCHITECTURE.md §7: 5th Layer 1 block (post-action). First documented PostToolUse non-blocking variant.
- .claude/rules/hooks.md: "Quinto hook entregado — post-action (D5)".
- policy.yaml: untouched (section L105-120 already exists and is mirrored).

Context traceability: session started from main after the D4 merge (PR #15). Neither /clear nor /compact was used, so there is no resume prompt to reference.

Marker: .claude/branch-approvals/feat_d5-hook-post-action-compound.approved (gitignored by design, same as D1/D2/D3/D4).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(d5): impl hooks/post-action.py - PostToolUse compound trigger (GREEN)

Fifth hook of the pos plugin.
First application of the PostToolUse non-blocking pattern: always exit 0, never emits permissionDecision.

Hierarchical detection:
- Tier 1: shlex-parse of the Bash command. Matcher A = `git merge <ref>` (excludes --abort/--quit/--continue/--skip). Matcher C = `git pull` (excludes --rebase/-r).
- Tier 2: post-hoc confirmation via `git reflog HEAD -1 --format=%gs`. A expects the prefix "merge "; C expects "pull:" or "pull " (and not "pull --rebase"). This avoids firing on `git merge --abort`, or when the pull was really a rebase even though the shell command did not flag it.

When both tiers confirm, the hook derives the touched paths via `git diff --name-only HEAD@{1} HEAD` and runs fnmatch against a literal mirror of policy.yaml.lifecycle.post_merge.skills_conditional[0].trigger: TRIGGER_GLOBS (generator/lib, generator/renderers, hooks, skills, templates/**/*.hbs), SKIP_IF_ONLY_GLOBS (docs/**, *.md, .claude/patterns/**), MIN_FILES_CHANGED=2. On a match it emits additionalContext suggesting `/pos:compound`. It never dispatches the skill.

Canonical double log (D1..D4 shape): post-action.jsonl + phase-gates.jsonl (event `post_merge`). Four distinct statuses: tier2_unconfirmed and diff_unavailable log only to the hook log; confirmed_no_triggers and confirmed_triggers_matched log to both.

Reuses `_lib/jsonl.append_jsonl` and `_lib/time.now_iso`. Hardcoded mirror of policy.yaml (CLAUDE.md regla #7: the two repetitions D4+D5 meet the precondition for a policy-loader in a dedicated branch).

Coverage: 97% lines on hooks/post-action.py (target ≥90%). Global hooks/** suite: 432 passed (D1+D2+D3+D4+D5, 110 new). Closes the D5 kickoff.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(d5): sync ROADMAP/HANDOFF/MASTER_PLAN/ARCHITECTURE/rules/hooks.md

Docs-sync inside the D5 branch (Fase N+3, CLAUDE.md regla #2).
- ROADMAP: D5 marked in the table + entry feat/d5-hook-post-action-compound with full deliverables, the contract of 4 distinct statuses, and adjustments vs the original plan.
- HANDOFF: section 1 snapshot (D5 closed, next is D6); section 7 gotchas adds the post-action bullet (PostToolUse non-blocking, tiers, 4 statuses, advisory-only); section 9 next branch D6 with updated minimal reading; section 10 renamed to "Estado D5 (cerrada en rama)" with an actionable summary.
- MASTER_PLAN section Rama D5 expanded: status closed, context to read, key decisions (hierarchical detection, gh pr merge discarded, advisory-only, second policy.yaml repetition), contract per status, adjustments, exit criterion met.
- docs/ARCHITECTURE section 7: Layer 1 goes from "two canonical variants" to "three variants" (adds PostToolUse non-blocking). New block "Implementacion canonica PostToolUse non-blocking".
- .claude/rules/hooks.md: section "Quinto hook entregado" with the pattern's shape, the full contract, differences vs blocker/informative, and a note on the pre-PR simplify pass.

Tests untouched (docs-only): 432 passed + 1 skipped in hooks/**.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(d5): address PR #16 review - 2 Copilot issues fixed

1. _log_hook / _log_phase now go through _safe_append (try/except OSError). Without the wrapper, a full disk or read-only filesystem raised OSError and broke the "always exit 0" contract of the PostToolUse non-blocking pattern. Direct mirror of hooks/session-start.py::_safe_append (D2). Consistent with the second canonical pattern.
2. match_triggers switches from fnmatch.fnmatch to fnmatch.fnmatchcase. fnmatch.fnmatch applies os.path.normcase, which is case-insensitive on Windows and so introduces cross-OS non-determinism when evaluating TRIGGER_GLOBS / SKIP_IF_ONLY_GLOBS. fnmatchcase removes that dependency.

Tests untouched: 110 passed + 1 skipped. No contract changes in the suite (_safe_append is private; fnmatchcase is a drop-in for the lowercase POSIX paths the tests already use).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Javier <javier.abril@glassnode.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
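The Tier 1 classification and the trigger-matching rule described in the D5 commits can be sketched roughly as follows. This is a minimal illustration, not the code in hooks/post-action.py: the function names `classify_command` and `match_triggers` come from the commit messages, but their signatures, return values ("merge" / "pull" / None, and a list of matched globs) and the `_MERGE_CONTROL_FLAGS` constant are assumptions made here. Tier 2 (reflog confirmation) and logging are omitted.

```python
import shlex
from fnmatch import fnmatchcase

# Constants mirrored from the kickoff commit message.
TRIGGER_GLOBS = (
    "generator/lib/**",
    "generator/renderers/**",
    "hooks/**",
    "skills/**",
    "templates/**/*.hbs",
)
SKIP_IF_ONLY_GLOBS = ("docs/**", "*.md", ".claude/patterns/**")
MIN_FILES_CHANGED = 2

# Hypothetical name; the flag set itself is taken from the commit message.
_MERGE_CONTROL_FLAGS = {"--abort", "--quit", "--continue", "--skip"}


def classify_command(command):
    """Tier 1: return "merge", "pull", or None (assumed return shape)."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unparsable shell input is treated as a silent pass-through.
        return None
    if tokens[:2] == ["git", "merge"]:
        if tokens[2:3] and tokens[2] in _MERGE_CONTROL_FLAGS:
            return None
        return "merge"
    if tokens[:2] == ["git", "pull"]:
        if "--rebase" in tokens or "-r" in tokens:
            return None
        return "pull"
    return None


def match_triggers(paths):
    """Return the TRIGGER_GLOBS entries hit by paths, in policy order."""
    if len(paths) < MIN_FILES_CHANGED:
        return []
    # skip_if_only uses "all" semantics: skip only when every path is skippable.
    if all(any(fnmatchcase(p, g) for g in SKIP_IF_ONLY_GLOBS) for p in paths):
        return []
    # fnmatchcase keeps matching case-sensitive on every OS.
    return [g for g in TRIGGER_GLOBS if any(fnmatchcase(p, g) for p in paths)]
```

Iterating over `TRIGGER_GLOBS` rather than over the paths gives policy-driven order and de-duplication for free, which is one plausible way to satisfy the properties the TestMatchTriggers description lists.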
…#17)

* chore(d5b): kickoff refactor/d5-policy-loader

## Kickoff

**Branch**: refactor/d5-policy-loader
**MASTER_PLAN phase**: D5b, policy-loader (inserted between D5 and D6)
**Type**: refactor (no observable behavior change, except the `hooks/tests/` convergence)

### Scope (Alt γ approved in Fase -1)

Unify the reading of `policy.yaml` across the 3 hooks D3/D4/D5 on a central loader in `hooks/_lib/policy.py`. D4 + D5 provided the 2 repetitions that CLAUDE.md regla #7 requires before abstracting; D6 will be built on the loader from the start.

### Files

New:
- hooks/_lib/policy.py
- hooks/tests/test_lib_policy.py
- hooks/tests/fixtures/policy/{minimal,full,malformed,missing-section}.yaml

Modified:
- hooks/pre-write-guard.py: consumes pre_write_rules()
- hooks/pre-pr-gate.py: consumes docs_sync_rules() + advisory_checks()
- hooks/post-action.py: consumes post_merge_trigger()
- policy.yaml: adds lifecycle.pre_write + an optional `excludes` field in docs_sync_conditional
- requirements-dev.txt: exact pyyaml pin
- ROADMAP.md + HANDOFF.md + MASTER_PLAN.md + docs/ARCHITECTURE.md + .claude/rules/hooks.md

### Fase -1 decisions (frozen)

- Alt γ (migrate all 3 hooks, no scope cut).
- (b.1) strings/globs move to YAML; test-pair derivation stays in code.
- (c.2) failure mode: policy cannot be loaded → pass-through advisory + log, never deny.
- pyyaml exact pin (not a range).
- MASTER_PLAN location: Rama D5b, sub-section of Fase D.
- templates/policy.yaml.hbs is NOT touched → temporary meta↔template drift, documented explicitly in docs/plan/PR. This branch must NOT be read as "the template already reflects the new shape". Convergence is deferred to its own branch, once ≥1 generated project signals that it needs the shape.

### Risks

- pyyaml is the first runtime dependency of the hooks, so this is the first supply-chain change. Mitigated with an exact pin + tests in CI.
- policy.yaml is extended with new fields: consumers outside hooks (if any) must tolerate unknown fields.
  Today skills/audit-session/ and skills/audit-plugin/ do not exist, so there is no real impact.
- The hooks/tests/ divergence is converged via the `excludes` field; an explicit test ensures identical D4 behavior after the migration.

### Test plan

- hooks/tests/test_lib_policy.py (~40-60 tests): happy path, missing sections, invalid YAML, missing file, shape validation, in-process cache, `excludes`.
- Regression: the 432 D1..D5 tests run identically, with zero changes to the observable contract. A cross-hook test verifies outputs against the repo's real `policy.yaml`.
- Failure mode (c.2): corrupt policy → pass-through + log `policy_unavailable`.
- Coverage: _lib/policy.py ≥90% lines / ≥85% branches; global hooks/** without regression.

### Docs plan

- ROADMAP.md: row D5b + entry "Progreso Fase D".
- HANDOFF.md §1, §7, §9, §10: remove the note saying policy.yaml is declared but not enforced; point to D6 as built on the loader from the start.
- MASTER_PLAN.md: new section "Rama D5b — policy-loader" with Ajustes.
- docs/ARCHITECTURE.md §7: sub-section "Loader declarativo".
- .claude/rules/hooks.md: section "Policy loader" + adjustment in D3/D4/D5 (remove "hardcoded", point to the loader). pyyaml dependency note under "Runtime".
- Meta↔template drift documented explicitly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(d5b): declarative policy-loader + migrate D3/D4/D5 consumers

Closes the CLAUDE.md regla #7 precondition opened by D4 + D5 (two hardcoded mirrors of policy.yaml inside hooks). Adds hooks/_lib/policy.py as the single source of truth and migrates pre-write-guard / pre-pr-gate / post-action to consume it in the same PR.

Shape (Fase -1 decisions):
- (b.1) strings/globs declarative in YAML, derivation in Python keyed by the pattern's `label`. derive_test_pair(rel_path, label) covers two labels: hooks_top_level_py and generator_ts (two YAML entries share the generator_ts label because fnmatch's middle `/` in `**` is literal, not recursive: one entry covers top-level, the other recursive subdirs).
- (c.2) policy.yaml missing/corrupt → loader returns None → consumer hooks degrade to pass-through advisory with a `status: policy_unavailable` log entry. Never deny blindly (avoids bricking the repo on a bad YAML edit). policy.yaml changes: - New lifecycle.pre_write.enforced_patterns (3 entries). - lifecycle.pre_pr.docs_sync_conditional.hooks/** now carries excludes: ["hooks/tests/**"] — closes the deliberate D4 hook↔policy divergence. Dependency: pyyaml==6.0.2 (exact pin). First non-stdlib line in hooks/_lib/; justified in the kickoff commit. Templates intentionally NOT touched in this branch — drift meta-repo ↔ template is documented in the docs-sync commit and in the PR body. Tests: 462 passed + 1 skipped. New hooks/tests/test_lib_policy.py (57 cases); redundant TestIsEnforcedUnit / TestExpectedTestPairUnit / TestPolicyConstants removed (coverage moved into the loader suite). Coverage: _lib/policy.py 97%, pre-write-guard 93%, pre-pr-gate 93%, post-action 94%. Simplify pass: classify() in pre-write-guard now returns `label: str` instead of `(label, match_glob)` — the second element was dead. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs(d5b): sync ROADMAP/HANDOFF/MASTER_PLAN/ARCHITECTURE/hooks.md + drift note Docs-sync for refactor/d5-policy-loader (CLAUDE.md regla #2). Captures: - ROADMAP: new rama row refactor/d5-policy-loader (✅) + Progreso Fase D entry (loader shape, 3 hooks migrated, test counts, coverage). - HANDOFF: snapshot now points at D5b in-flight; new gotchas for loader and drift meta↔template; §11 Estado D5b; §9 Próxima rama updated so D6 starts consuming the loader (no new hardcode permitted). - MASTER_PLAN: new § Rama D5b sub-section under Fase D with scope, decisions, contract, ajustes, drift note and exit criteria. - docs/ARCHITECTURE §7: loader canonicalized as single source of truth for hooks consuming policy.yaml; failure mode (c.2) documented as third safe-fail variant; explicit drift note. 
- .claude/rules/hooks.md: new § Policy loader with consumer contract, failure-mode table, shape, dependency note, fnmatch middle-slash note, loader test summary, drift note. Drift meta-repo ↔ template explicitly documented in all five locations (explicit user request): templates/policy.yaml.hbs, generator/renderers/ policy.ts and snapshots were NOT touched in this branch. Projects generated with `pos` today still emit a policy.yaml with the pre-D5b shape. Reconciliation (template + renderer + snapshots + pyyaml in requirements-dev for generated Python stacks) deferred to a dedicated rama post-D6. This rama must not be read as "the template already reflects the new shape". Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(d5b): address PR #17 review — align cache contract + log status + hygiene Applied after Copilot review surfaced concrete mismatches between docs, code and log shape. Per user direction on the cache contract: option 2 (correct docs to match code) — the in-process cache is small enough and the hooks are ephemeral enough that implementing mtime/size keying would be abstraction ahead of need. Cache contract (6 Copilot comments + user's primary ask): - PR body and 5 docs said "cache keyed by path + mtime + size" with implicit invalidation on edits. Reality: load_policy() keys the cache by absolute path only. Updated hooks/_lib/policy.py docstrings to sharpen the "no implicit invalidation on edits" note, and corrected ROADMAP.md, HANDOFF.md, MASTER_PLAN.md, docs/ARCHITECTURE.md and .claude/rules/hooks.md to match. PR body edited via `gh pr edit`. Log status alignment (2 Copilot comments): - pre-pr-gate.py:_log_skip() was hardcoding status: "skipped" for every skip reason, including the policy-unavailable case — which the loader contract (and pre-write-guard / post-action siblings) emit as status: "policy_unavailable". Added optional `status` kwarg to _log_skip and pass "policy_unavailable" at the one relevant call site. 
Other skip reasons keep the default. - .claude/rules/hooks.md § Policy loader — consumer-contract example updated to reflect the new kwarg shape; aligns with the failure-mode table directly below. _safe_str_list stricter shape (1 Copilot comment): - Was silently dropping non-string entries (`["ROADMAP.md", 123]` → `["ROADMAP.md"]`), producing partial under-enforcement while still treating the policy as valid. Now returns None if any element is not a string — consistent with the "wrong-shape → None" contract the module docstring already claimed. Test-fixture hygiene (3 Copilot comments): - Three autouse `_reset_policy_cache` fixtures (test_pre_write_guard, test_pre_pr_gate, test_post_action) did unconditional sys.path.insert(0, ...) without guard or teardown. Switched to the guarded "insert only if missing + remove in teardown" pattern that test_lib_policy.py already uses. One test adjusted: - test_pre_pr_gate.py::TestGitUnavailable::test_not_a_git_repo_... was implicitly exercising both the no-policy.yaml path and the no-git path simultaneously. It now writes POLICY_YAML_FOR_TESTS so it actually tests what its name claims (git-unavailable path reaching the skip log). Tests: 462 passed + 1 skipped (unchanged). Dogfooding: pre-pr-gate with the updated status field passes this PR through its own gate. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(d5b): address PR #17 review round 2 — wrong-shape guards + (c.2) coverage Second review round surfaced edge cases on the loader's failure-mode contract. All FIX, no SKIPs. 12 new regression tests. Loader wrong-shape guards (3 Copilot comments, high-value): - Each of the three accessors (docs_sync_rules, post_merge_trigger, pre_write_rules) could raise AttributeError if `lifecycle` or the section itself was present but not a mapping (e.g. `lifecycle: not_a_dict` or `lifecycle.pre_pr: 42`). That broke the "never propagate exception" contract. 
Extracted `_lifecycle_section()` helper with isinstance checks; all three accessors now return None on wrong shape. Optional list fields — missing vs wrong-type (1 Copilot comment, medium): - `excludes` / `skip_if_only` / `exclude_globs` previously used `_safe_str_list(...) or []`, which silently coerced wrong-type values (e.g. `excludes: "hooks/tests/**"` as a string) to empty lists — potentially disabling a declared exclusion. Added `_optional_str_list` that distinguishes absent key (`→ []`) from present-but-wrong-shape (`→ None`, signalling the caller to skip the rule/pattern, or to return None for the whole accessor when the field is required-inside- trigger like `skip_if_only`). post-action.py docstring drift (1 Copilot comment): - Docstring still said "hardcoded mirror of policy.yaml" — outdated since D5b kickoff. Rewritten to reference the loader path and document the (c.2) pass-through behavior explicitly. pre-write-guard (c.2) for unknown labels (1 suppressed low-confidence comment): - `derive_test_pair` returning None (policy.yaml label typo or a new `enforced_patterns[*].label` added without a matching code branch in the derivation switch) previously fell through to a deny with an empty expected-path. That violated the (c.2) contract ("never deny blindly on policy issues"). Now treated the same as "policy unavailable": log `status: policy_unavailable` + pass-through exit 0. Preserves the "YAML typo cannot brick the repo" invariant. Tests: 474 passed + 1 skipped (was 462 + 1, +12 new cases): - TestWrongShapeGuards — 7 cases covering non-mapping lifecycle / non- mapping section across the three accessors. - TestOptionalListShape — 4 cases covering wrong-type optional-list on each accessor + a `_safe_str_list` mixed-type propagation test that locks in the strict contract introduced in round 1. 
- TestMainInProcess::test_unknown_label_passes_through_with_policy_unavailable — integration test for pre-write-guard's (c.2) handling of an unknown label injected via policy.yaml. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Javier <javier.abril@glassnode.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…ontract Three substantive fixes on top of Copilot review: 1. Scope skills.jsonl reads by session_id (review concern #1). _extract_invoked_skills(repo_root, session_id) now streams line-by-line and only counts entries whose session_id matches the Stop payload. Entries without session_id, with non-string session_id, or from prior sessions are silently ignored — the log is append-only and accumulates across sessions. Payload Stop without session_id -> safe-fail deny (enforcement cannot scope safely). Tests: new TestSessionScoping class (6 cases incl. 5-session mixed log) + safe-fail cases for missing/empty/non-string session_id. 2. Tri-state skills_allowed_list — stop collapsing absent vs invalid (review concern #2). Added SKILLS_ALLOWED_INVALID sentinel in _lib/policy.py. None = section absent (deferred, prod state), sentinel = present but wrong-shape (misconfigured, observable), () = explicit deny-all, tuple = live enforcement. Stop hook emits status: policy_misconfigured on sentinel with literal reason. A typo in policy.yaml no longer silently turns enforcement off. Tests: new TestMisconfiguredPolicy class + test_three_states_are_all_distinct + test_invalid_sentinel_distinct_from_none in loader suite. 3. Remove exact-string quotes of pre-compact output from docs (review concern #3). HANDOFF/MASTER_PLAN/ARCHITECTURE/hooks.md no longer pin the literal advisory wording — the suite validates shape + presence, not the string. Frees the hook to refine copy without doc drift. Suite: 575 passed + 1 skipped (+20 new tests). No regressions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* chore(d6): kickoff - pre-compact.py + stop-policy-check.py

## Kickoff

### Scope

Sixth + seventh Python hooks. Closes Fase D before opening Fase E.
- hooks/pre-compact.py: PreCompact informative (shape D2)
- hooks/stop-policy-check.py: Stop blocker-scaffold (shape D1, deferred)

D6 consumes hooks/_lib/policy.py from the first commit (the loader is live after D5b). Any new policy hardcode is an explicit regression.

### Files

New:
- hooks/pre-compact.py
- hooks/stop-policy-check.py
- hooks/tests/test_pre_compact.py
- hooks/tests/test_stop_policy_check.py

Modified:
- hooks/_lib/policy.py (+ accessors pre_compact_rules, skills_allowed_list)
- hooks/tests/test_lib_policy.py (+ cases for the new accessors)
- ROADMAP.md HANDOFF.md MASTER_PLAN.md
- docs/ARCHITECTURE.md §7
- .claude/rules/hooks.md

### Fase -1 decisions (approved)

- (A2) pre-compact is INFORMATIVE, not a blocker. Reason: blocking an intentional /compact is destructive; the hook's value is emitting additionalContext with the policy's persist checklist so that the model persists state before the compact.
- (c.3) stop-policy-check is a BLOCKER-SCAFFOLD. Shape D1 (safe-fail deny + double log + permissionDecision available), but ZERO real enforcement today. policy.yaml.skills_allowed does not exist yet: skills_allowed_list() returns None and the hook degrades to log status=deferred. It can be activated without a refactor once E1a lands skills_allowed. Strict framing: "it can block by contract, but today it is in deferred mode except for safe-fail + future-proof tests".
- Both in the same branch. Reuses the loader + shared docs-sync.
- Canonical failure mode (c.2) reapplied: accessor None -> pass-through advisory + log status=policy_unavailable. Never a blind deny.

### Explicit framing (anti-overrepresentation)

In docs + PR body, stop-policy-check.py is NOT presented as useful production enforcement.
It is described as:
- a hook with a ready blocker shape
- deferred mode while skills_allowed does not exist in policy.yaml
- active safe-fail (deny on malformed payload)
- tests cover the future enforcement, so that E1a only has to declare skills_allowed in the policy

### Tests

Strict TDD. Order:
1. red commit: tests that fail because accessors + hooks are missing
2. accessors in _lib/policy.py (accessor tests green)
3. pre-compact.py (pre_compact tests green)
4. stop-policy-check.py (stop tests green)
5. docs-sync
6. simplify
7. review

Target coverage: >=80% lines / >=75% branches per hook, >=90% on accessors. Global hooks/** suite >=500 green tests.

### Docs plan

Within the same PR (docs-sync docs_sync_conditional is active for hooks/**):
- ROADMAP.md: row D6 marked.
- HANDOFF.md: §9 next branch (E1a), §12 D6 status, §7 counter 5->7 hooks.
- MASTER_PLAN.md § Rama D6: close with adjustments vs the original plan.
- docs/ARCHITECTURE.md §7: hook counter + phase-gates events (pre_compact, stop).
- .claude/rules/hooks.md: Sexto hook + Septimo hook + extend Policy loader with pre_compact_rules + skills_allowed_list.

### NOT included

- No real persistence of LLM state (pre-compact emits a prompt; it does not write).
- No active enforcement of skills_allowed (scaffold).
- No changes to templates/policy.yaml.hbs (meta-repo vs template drift documented since D5b; reconciliation branch after D6).
- No skills, no runtime.

* test(d6): red tests - accessors + hooks missing

Failing by design (strict TDD, Fase 1 of the branch):
- _lib.policy.pre_compact_rules() does not exist
- _lib.policy.skills_allowed_list() does not exist
- hooks/pre-compact.py does not exist (collection error)
- hooks/stop-policy-check.py does not exist (collection error)

Pre-impl contract lock-down:
- PreCompactRules frozen dataclass; persist: tuple[str, ...]
- skills_allowed_list: tuple[str, ...]
| None (None=deferred, ()=deny-all)
- Pre-compact hook: shape informative (exit 0 always, no permissionDecision)
- Stop hook: shape blocker-scaffold; c.3 deferred until skills_allowed declared; safe-fail blocker (deny exit 2 on malformed payload)

23 failures in test_lib_policy; 2 collection errors in the hook tests. No test is green pre-impl.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(d6): impl pre-compact + stop-policy-check hooks (+ 2 accessors)

Sixth hook: hooks/pre-compact.py (shape D2 informative):
- Reads lifecycle.pre_compact.persist via pre_compact_rules() and emits hookSpecificOutput.additionalContext as a checklist for the model.
- Always exit 0; never permissionDecision. Never blocks /compact.
- Failure mode (c.2): policy None → minimal additionalContext + status: policy_unavailable in the hook log. Canonical pass-through advisory.
- Double log: pre-compact.jsonl (always) + phase-gates.jsonl (event: pre_compact only on the happy path; policy_unavailable stays only in the hook log).
- Informative safe-fail: malformed payload → additionalContext with "(error reading payload: ...)" + status: payload_error, exit 0.

Seventh hook: hooks/stop-policy-check.py (shape D1 blocker-scaffold):
- Reads skills_allowed_list() + .claude/logs/skills.jsonl.
- c.3 scaffold: skills_allowed absent → status: deferred pass-through; the meta-repo does not declare the field today, so enforcement is DEFERRED in prod. The whole chain exists for when E1a adds the field.
- Activatable: skills_allowed declared → _validate(invoked, allowed), deny exit 2 with the first violator in decisionReason; allow exit 0.
- Failure mode (c.2): policy None → status: policy_unavailable pass-through. Canonical blocker safe-fail: malformed payload → deny exit 2.
- Double log only on real decisions (allow/deny). Deferred and policy_unavailable stay only in the hook log.
- _extract_invoked_skills and _validate are private helpers but testable as units (assertions `sp._extract_invoked_skills(...)` and `sp._validate(...)` in the suite).

Loader (hooks/_lib/policy.py):
- pre_compact_rules(repo_root) → PreCompactRules | None (frozen dataclass with persist: tuple[str, ...]).
- skills_allowed_list(repo_root) → tuple[str, ...] | None (None=deferred absent; ()=explicit deny-all).

555 passed (+1 intentional D5 skip) in hooks/**. No regression.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(d6): sync ROADMAP + HANDOFF + MASTER_PLAN + ARCHITECTURE + hooks.md

Closes the D6 docs-sync (feat/d6-hook-pre-compact-stop). Two deliverables: pre-compact.py (PreCompact informative, shape D2) + stop-policy-check.py (Stop blocker scaffold; NO enforcement in production today: skills_allowed is absent from the meta-repo's policy.yaml → status: deferred, pass-through). The None/() contract is documented as a semantic distinction of the scaffold. Two new accessors in hooks/_lib/policy.py: pre_compact_rules + skills_allowed_list (5 accessors total after D5b+D6). Anti-overrepresentation framing (MASTER_PLAN + ARCHITECTURE + hooks.md): the Stop hook validates its own shape, with no real enforcement until E1a populates skills_allowed. Precondition ready: activation requires no code change once the first /pos:* skill exists.

* refactor(d6): simplify pre-compact - inline _log_hook/_log_phase wrappers

The _log_hook / _log_phase wrappers in pre-compact.py were trivial (3 + 1 call sites). They are now inlined directly as _safe_append(cwd / HOOK_LOG, ...) / _safe_append(cwd / PHASE_LOG, ...): -8 lines of wrappers, same contract. This gains stylistic consistency with stop-policy-check.py (the other D6 hook), which already used the inline shape. No effect on tests or behavior. 555 passed + 1 skipped (no regression).

* fix(d6): address PR #18 review - session scoping + tri-state policy contract

Three substantive fixes on top of Copilot review:

1.
Scope skills.jsonl reads by session_id (review concern #1). _extract_invoked_skills(repo_root, session_id) now streams line-by-line and only counts entries whose session_id matches the Stop payload. Entries without session_id, with non-string session_id, or from prior sessions are silently ignored — the log is append-only and accumulates across sessions. Payload Stop without session_id -> safe-fail deny (enforcement cannot scope safely). Tests: new TestSessionScoping class (6 cases incl. 5-session mixed log) + safe-fail cases for missing/empty/non-string session_id. 2. Tri-state skills_allowed_list — stop collapsing absent vs invalid (review concern #2). Added SKILLS_ALLOWED_INVALID sentinel in _lib/policy.py. None = section absent (deferred, prod state), sentinel = present but wrong-shape (misconfigured, observable), () = explicit deny-all, tuple = live enforcement. Stop hook emits status: policy_misconfigured on sentinel with literal reason. A typo in policy.yaml no longer silently turns enforcement off. Tests: new TestMisconfiguredPolicy class + test_three_states_are_all_distinct + test_invalid_sentinel_distinct_from_none in loader suite. 3. Remove exact-string quotes of pre-compact output from docs (review concern #3). HANDOFF/MASTER_PLAN/ARCHITECTURE/hooks.md no longer pin the literal advisory wording — the suite validates shape + presence, not the string. Frees the hook to refine copy without doc drift. Suite: 575 passed + 1 skipped (+20 new tests). No regressions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs(d6): strip "PR #18" from in-repo cross-refs; keep rationale Copilot flagged 7 bullets across hooks.md / MASTER_PLAN.md / ARCHITECTURE.md that cite "post-review PR #18" inline. For long-lived rules docs the PR number is not a stable rendered identifier (forks/rebases lose it), while the rationale ("post-review") carries the same meaning. Drop the number. No contract change. Tests untouched. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Javier <javier.abril@glassnode.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
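The tri-state `skills_allowed` contract from the D6 review fix can be sketched as below. `SKILLS_ALLOWED_INVALID` and `skills_allowed_list` are names from the commit message, and the status strings are the ones it lists; everything else is an assumption for illustration. In particular, this sketch takes an already-parsed policy mapping (the real accessor is described as taking `repo_root`), and `stop_status` is a hypothetical helper, not a function from the repository.

```python
# Sentinel: skills_allowed is present in the policy but has the wrong shape.
SKILLS_ALLOWED_INVALID = object()


def skills_allowed_list(policy):
    """None = key absent, sentinel = wrong shape, tuple = enforcement list."""
    if not isinstance(policy, dict) or "skills_allowed" not in policy:
        return None
    raw = policy["skills_allowed"]
    if not isinstance(raw, list) or not all(isinstance(s, str) for s in raw):
        return SKILLS_ALLOWED_INVALID
    return tuple(raw)  # () is an explicit deny-all


def stop_status(policy, invoked):
    """Hypothetical helper mapping the accessor result to a hook status."""
    if policy is None:
        return "policy_unavailable"  # failure mode (c.2): pass-through
    allowed = skills_allowed_list(policy)
    if allowed is None:
        return "deferred"
    if allowed is SKILLS_ALLOWED_INVALID:
        return "policy_misconfigured"
    for skill in invoked:
        if skill not in allowed:
            return "deny"  # the first violator decides
    return "allow"
```

Using a module-level `object()` as the sentinel and comparing with `is` keeps the invalid state distinct from both `None` and the empty tuple, so a typo in policy.yaml surfaces as `policy_misconfigured` rather than quietly disabling enforcement.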
…icy.yaml vs .claude/logs/ (#25)

* test(f1): RED - extend ALLOWED_SKILLS 13->14 + behavior tests for audit-session

Fase 0 kickoff + Fase 1 RED-first per CLAUDE.md regla #3 and .claude/rules/tests.md. Plan ratified by the user: decisions A1.a..A6.a + 3 mandatory adjustments.

Scope: skill /pos:audit-session, a read-only advisory main-strict skill that compares 3 surfaces of policy.yaml against the real .claude/logs/:
1. policy.yaml.skills_allowed vs skills.jsonl invocations.
2. policy.yaml.lifecycle.*.hooks_required vs per-hook logs (existence + nonempty for the expected log file).
3. policy.yaml.audit.required_logs vs existence/age/non-empty.

RED state confirmed: 16 expected failures.
- 10 parametrized [audit-session] in TestStructure / TestFrontmatter / TestBody. .claude/skills/audit-session/SKILL.md does not exist.
- 5 TestAuditSessionBehavior:
  * test_body_declares_three_audit_surfaces
  * test_body_declares_advisory_only
  * test_body_declares_main_strict_no_delegation
  * test_body_declares_30day_review_window
  * test_body_declares_prefix_normalization_assumption
- 1 test_real_skills_allowed_populated_by_f1. policy.yaml still declares 13; ALLOWED_SKILLS has already grown to 14.

The behavior tests follow the pattern of TestPatternAuditBehavior (E3a), the closest reference: read-only advisory main-strict. User adjustment 3 applied: the 30-day window test validates the DECLARATION in the body, not the execution of date math.

Renames:
- test_real_skills_allowed_populated_by_e3b -> _by_f1. Tuple 13 -> 14 via the shared ALLOWED_SKILLS.
- test_all_thirteen_e1_e3b_skills_end_to_end -> test_all_fourteen_e1_e3b_f1_skills_end_to_end.

Next, GREEN phase: create SKILL.md + bump policy.yaml.skills_allowed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(f1): GREEN - audit-session skill + bump skills_allowed 13->14

GREEN phase per CLAUDE.md regla #3 and .claude/rules/tests.md.
RED commit (5d6091d ancestor + RED) introduced 16 failures; this commit turns all 16 green without touching any unrelated test. Skill body (.claude/skills/audit-session/SKILL.md ~110 lines): - Frontmatter minimal canonical: name=audit-session, description starts "Use when ...", allowed-tools list of 6 entries (Glob, Grep, Read, Bash(find:*), Bash(wc:*), Bash(.claude/skills/_shared/log-invocation.sh:*)). No Bash(git log:*) per ajuste 2 del usuario. - Read-only advisory main-strict: scope explicito MAY/MUST NOT. - Three audit surfaces declared (Fase -1 decision A1.a): Bucket 1: skills_allowed vs skills.jsonl invocations. Bucket 2: lifecycle.*.hooks_required vs per-hook log files. Bucket 3: audit.required_logs vs file existence/nonempty/mtime. - 30-day review window declared as textual guidance (A2.a + ajuste 3 del usuario): the skill does NOT execute date math, the human applies the lens when reading the report. - Prefix normalization assumption (A3.a): pos:<slug> stripped before cross-comparing with policy.yaml.skills_allowed. - Pre-existing drift expected (A4.a): hooks.jsonl declared in audit.required_logs but no such file exists. Skill reports it as Bucket 3 candidate, does NOT auto-fix. - Report structured by surface (A5.a): three sections + summary line. - audit.session_audit.schedule (e.g. weekly) explicitly NOT enforced (A6.a): documental cadence, no cron/CI hook in F1. - Out of scope: external fork delegation (main-strict by design), cross-session aggregation, date arithmetic, mutating policy or logs. Body satisfies all 5 TestAuditSessionBehavior tests literally: - skills_allowed + lifecycle + hooks_required + required_logs tokens. - "advisory"/"read-only"/"does not modify"/"no modifica" tokens. - No "subagent"/"code-architect"/"agent(" tokens (uses "fork" for external delegation refusal). - "30" + "day"/"review window" tokens. - "pos:" + "normaliz" tokens. policy.yaml: - skills_allowed: 13 -> 14 entries (audit-session appended). 
- Comment line 268 updated: "E3b 13 skills -> F1 14 skills". Test deltas (793 passed + 1 skipped, zero regression): - 10 parametrized [audit-session] in TestStructure / TestFrontmatter / TestBody pass. - 5 TestAuditSessionBehavior pass. - test_real_skills_allowed_populated_by_f1 passes (tuple is now 14). - test_all_fourteen_e1_e3b_f1_skills_end_to_end passes (logger -> Stop hook end-to-end with all 14 skills allowlisted). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs(f1): docs-sync - ROADMAP + HANDOFF + MASTER_PLAN + skills-map Phase N+3 docs-sync per CLAUDE.md regla #2 (docs-sync within branch) and pre-pr-gate.py canonical baseline (ROADMAP + HANDOFF mandatory) plus conditional `.claude/rules/skills-map.md` for `skills/` paths. ROADMAP.md: - Top table: Fase F status `pendiente` -> `1/4 (F1 ok, F2..F4 pending)`. - F1 row: status `pending` -> `done (PR pending)` with concrete scope. - New section "Progreso Fase F" with feat/f1 detail block: entregables, allowed-tools rationale, contract locked by suite, A1.a..A6.a decisions, 3 mandatory user adjustments, criterio salida. HANDOFF.md: - Section 1 snapshot: Rama actual F1 (PR pendiente); next branch F2; F1 entregables one-liner. - Section 9 Proxima rama: F2 feat/f2-agents-subagents with scope (3 subagent definitions, naming-conflict question, agents_allowed evaluation). - New section 19 "Estado F1": full closure block parallel to E3a/E3b, with entregables + contract + 3 mandatory adjustments + YAML gotcha avoided + resultado (793 + 1 skip) + cross-references. MASTER_PLAN.md Rama F1: - Replaced 1-line stub with full closing block: scope concrete (3 surfaces), A1.a..A6.a decisions, 3 mandatory adjustments, contexto a leer, criterio de salida, carry-overs to F2..F4. Branch marker set to "PR pendiente". .claude/rules/skills-map.md: - Audit + Release section: audit-session row populated with concrete contract (3 surfaces + main-strict + 30-day textual guidance + no auto-fix + allowed-tools list). 
Replaces the 1-line stub from F0. Tests: 793 passed + 1 skipped (D5 intentional subprocess-no-cover). Zero regression D1..D6 + E1a..E3b. Behavior contract for audit-session locked across 5 tests. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Javier <javier.abril@glassnode.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
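The Bucket 1 comparison and the `pos:` prefix normalization described above can be illustrated with a minimal sketch. The skill itself is declarative (a SKILL.md body), so this is not its implementation; the record shape (`skill` field) and the names `normalizeSkillName` and `auditSkills` are assumptions made only for illustration.

```typescript
// Hypothetical shape of one skills.jsonl record; the real log schema
// is not shown in the source.
interface SkillInvocation {
  skill: string;
}

// Strip the "pos:" prefix so "pos:audit-session" compares equal to the
// bare slug listed in policy.yaml.skills_allowed.
function normalizeSkillName(name: string): string {
  const prefix = "pos:";
  return name.startsWith(prefix) ? name.slice(prefix.length) : name;
}

// Compare the allowlist against logged invocations (Bucket 1).
// Returns findings only; nothing is mutated or auto-fixed.
function auditSkills(
  allowed: readonly string[],
  jsonl: string,
): { invokedNotAllowed: string[]; allowedNeverInvoked: string[] } {
  const invoked = new Set<string>();
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // tolerate blank lines
    const record = JSON.parse(trimmed) as SkillInvocation;
    invoked.add(normalizeSkillName(record.skill));
  }
  const allowedSet = new Set(allowed);
  return {
    invokedNotAllowed: Array.from(invoked)
      .filter((s) => !allowedSet.has(s))
      .sort(),
    allowedNeverInvoked: allowed.filter((s) => !invoked.has(s)),
  };
}
```

Both result lists are advisory: an allowed skill that was never invoked is a review candidate, not an error.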
…#28)

* test(f4): RED - marketplace.json + release.yml + plugin.json version pin

Phase 0 (kickoff) + Phase 1 (RED tests) for feat/f4-marketplace-public-repo.

Scope (per Phase -1, ratified with 8 user adjustments):

- Land the local marketplace infra + release flow without depending on javiAI/pos-marketplace existing yet (A1.b).
- .claude-plugin/marketplace.json with the official Claude Code schema: top-level {name, owner, plugins}; owner.name; plugin {name, source} with source.{source=github, repo, ref="v"+version}.
- .github/workflows/release.yml with trigger tag:v* and jobs version-match / selftest / build-bundle / publish-release / mirror-marketplace (mirror conditional/skippable until the repo is public).
- Bump plugin.json.version 0.0.1 → 0.1.0 (first public release).

Files in this commit (RED tests, expected failures):

- bin/tests/test_marketplace_json_schema.py (12 tests)
- bin/tests/test_release_workflow_smoke.py (6 tests)
- bin/tests/test_plugin_json_version_bump.py (3 tests)

The tests verify:

- marketplace.json minimal official schema (top-level + owner + plugin)
- plugin name/version/ref sync between marketplace ↔ plugin.json
- release.yml trigger v*, expected jobs, publish-release.needs ⊇ {version-match, selftest, build-bundle}, mirror-marketplace conditional/skippable
- plugin.json.version pin = "0.1.0"

Current RED state: 19 failed + 12 passed (12 = F3 baseline 9 + 3 plugin.json exists/parses). No regression in F3.

Deferred in F4 (CLAUDE.md rule #7):

- audit.yml nightly (no consumer today; own branch post-F4).
- /pos:pr-description, /pos:release skills (no demonstrated repetition).
- CHANGELOG.md enforced (auto-generated from git log between tags).
- refactor/template-policy-d5b-migration (independent drift).
- Phase G (Knowledge Plane).

GREEN impl + docs (RELEASE.md/ARCHITECTURE/ci-cd/MASTER_PLAN/ROADMAP/HANDOFF) land in later commits within this branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(f4): GREEN - marketplace.json + release.yml + plugin.json 0.1.0

Phase 2 (GREEN): flips the 19 RED tests from the previous commit.

.claude-plugin/marketplace.json (NEW):

- Official Claude Code marketplace schema.
- top-level: name="pos-marketplace", owner.name="javiAI", plugins[].
- plugins[0]: name=pos, source={source:github, repo:javiAI/project-operating-system, ref:v0.1.0}, version=0.1.0.
- metadata.{description, version} for humans.

.claude-plugin/plugin.json:

- version 0.0.1 → 0.1.0 (first public release; pre-1.0).
- Single source of truth; the git tag must be v${version}.

.github/workflows/release.yml (NEW):

- Trigger: push tags v*.
- Jobs:
  - version-match: assert plugin.json.version == ${tag#v}.
  - selftest: pytest bin/tests -q (reuses the F3 contract).
  - build-bundle: curated plugin-only tar.gz (.claude-plugin/, .claude/skills/, .claude/rules/, hooks/, agents/, policy.yaml, bin/pos-selftest.sh, bin/_selftest.py, docs/RELEASE.md). Excludes generator/, tools/, templates/, questionnaire/.
  - publish-release: needs [version-match, selftest, build-bundle]; gh release create with the bundle as an asset.
  - mirror-marketplace: conditional via vars.POS_MARKETPLACE_REPO; if empty, it skips without failing the release. Opens a PR against the public repo once configured.
- Actions pinned by SHA (ci-cd.md rule #2).
- permissions.contents=write for gh release create.

Tests post-GREEN: 21 passed (12 marketplace + 6 release.yml + 3 plugin version); full suite 644 passed + 1 skipped (intentional D5 skip, F3). No regression.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(f4): sync - RELEASE runbook + ARCH §13 + ci-cd + ROADMAP/HANDOFF/MASTER_PLAN

Phase N+3: docs-sync within the branch (CLAUDE.md rule #2, docs.md § Docs-sync en cada rama).

docs/RELEASE.md (NEW):

- User-facing runbook for versioning + bundle + flow + recovery.
- Versioning contract: plugin.json.version is the source of truth; tag = v${version}; marketplace.json.source.ref mirrors it.
- Curated plugin-only bundle scope (explicit includes/excludes).
- Flow in 5 steps: bump → tag → workflow → verify → recovery.
- Mirror activation once the public repo exists (3 steps: create repo + gh variable set POS_MARKETPLACE_REPO + gh secret set POS_MARKETPLACE_TOKEN).
- User-facing installation (/plugin marketplace add + /plugin install pos).
- Deferred items enumerated.

docs/ARCHITECTURE.md § 13 (Marketplace + Release flow):

- Rewritten from a 6-line placeholder to a full subsection.
- Manifest, version source of truth, workflow jobs, curated bundle scope, deferral of the public repo, determinism of the flow, user-facing installation, deferrals.

.claude/rules/ci-cd.md:

- release.yml bullet promoted from "Diferidos" to "Aterrizado" (delivered in F4).
- New H3 "### Job release (entregado en F4)" with full scope (5 jobs + curated bundle + source of truth).

ROADMAP.md:

- Table: F4 marked ✅ (PR pending).
- New section § feat/f4-marketplace-public-repo in "Progreso Fase F": scope, deliverables, Phase -1 decisions (A1.b..A8), fixed contract, carry-overs, exit criteria (665 passed + 1 skipped).

HANDOFF.md:

- §1 Snapshot: current branch F4 (delivery + suite update).
- §9 Next branch: Phase F closed; carry-overs (template-policy d5b migration, marketplace activation, deferred skills, audit.yml).
- New §22: F4 status with deliverables + contract + decisions + carry-overs (parallel to §19 F1, §20 F2, §21 F3).

MASTER_PLAN.md § Rama F4:

- Expanded from 3 lines to a full section: delivered scope, delivered files with per-path detail, Phase -1 decisions (A1.b..A8), context read, exit criteria, carry-overs.

Simplify pre-commit:

- Trimmed 3 bullets from "Ajustes durante implementación" (heredoc syntax glitch + rtk wrapper output filter): debug ephemera that belongs in commit history.
- Kept the only persistent gotcha: PyYAML (YAML 1.1) parses `on:` as the Python bool True (a reusable pattern for future workflow YAML tests).

Untouched (per A8, ratified in Phase -1): policy.yaml, hooks/**, .claude/skills/**, agents/**, generator/**, templates/**, .claude/rules/skills-map.md.

Tests: no changes (GREEN already passing with 665 passed + 1 skipped).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(f4): release.yml - version-match gating + mirror idempotency (PR #28 review)

Applies the 4 findings from the Copilot review of PR #28. All 4 are real correctness/idempotency bugs, not style. Value/effort triage: all high/trivial-low → FIX.

version-match gating (findings 1 + 4):

- selftest: needs [version-match].
- build-bundle: needs [version-match].

Previously these ran in parallel with version-match, so CI spent time on mismatched tags and contradicted the documented "strict order". Now: version-match → (selftest + build-bundle) → publish-release → mirror-marketplace.

mirror-marketplace idempotency (findings 2 + 3):

- After `git add marketplace.json`, if `git diff --cached --quiet` reports no changes → exit 0. Previously the no-op `git commit` failed the workflow re-run.
- Before `gh pr create`, run `gh pr list --head $branch --state open`. If an open PR already exists → skip creation with a message. Previously `gh pr create` with an existing PR failed the re-run.

Tests: bin/tests 31/31 green; full explicit run 850 passed + 1 skipped (intentional D5 skip, F3). No regression.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Javier <javier.abril@glassnode.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
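The versioning contract described in these commits (plugin.json.version as the source of truth, tag = "v" + version, and the marketplace entry's source.ref mirroring the tag) can be sketched as a single check. The workflow performs this in CI steps, not in TypeScript; `checkVersionContract` and the interface names are hypothetical, and the manifest shapes include only the fields the commit messages mention.

```typescript
// Only the fields named in the commit messages; the real manifests
// carry more.
interface PluginManifest {
  version: string;
}

interface MarketplaceManifest {
  plugins: {
    name: string;
    source: { source: string; repo: string; ref: string };
  }[];
}

// Returns a list of human-readable problems; an empty list means the
// tag, plugin.json, and marketplace.json agree.
function checkVersionContract(
  tag: string,
  plugin: PluginManifest,
  marketplace: MarketplaceManifest,
  pluginName: string,
): string[] {
  const problems: string[] = [];
  const expectedTag = "v" + plugin.version;

  if (tag !== expectedTag) {
    problems.push(`tag ${tag} does not match plugin.json (${expectedTag})`);
  }

  const entry = marketplace.plugins.find((p) => p.name === pluginName);
  if (entry === undefined) {
    problems.push(`marketplace.json has no plugin named ${pluginName}`);
  } else if (entry.source.ref !== expectedTag) {
    problems.push(
      `marketplace ref ${entry.source.ref} does not match ${expectedTag}`,
    );
  }
  return problems;
}
```

Collecting every mismatch before failing, instead of stopping at the first, is a design choice of this sketch; it makes a bad tag push easier to diagnose in one run.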
## Summary

- 3 canonical profiles (`nextjs-app`, `agent-sdk`, `cli-tool`) in `questionnaire/profiles/`. Each answers ~60% of the schema; 3 user-specific fields are omitted by design (partial profiles).
- `tools/lib/profile-validator.ts` (zod-strict parser + issue emitter) + `tools/validate-profile.ts` CLI (exit 0/1/2).
- Fixtures in `tools/__fixtures__/profiles/{valid,invalid}/` cover all 5 emitted `ProfileIssueKind`s.
- CI step `Validate profiles` + npm script `validate:profiles` run all 3 canonical profiles against the schema on every push.
- `tools/lib/read-yaml.ts` (dedupes validate-profile + validate-questionnaire; meets the 2x pattern threshold).

## Scope decisions (vs MASTER_PLAN)

- Fixtures live in `tools/__fixtures__/profiles/` (not `generator/__fixtures__/profiles/`) because the generator does not exist yet. Consolidation is deferred to B3 if applicable. Documented in `MASTER_PLAN.md` section B2.
- `answer-value-not-in-array-allowlist` is not validated at instance level in this branch. `ArrayField.values` exists in the schema (`integrations.mcps`), but the per-item allowlist check is deferred. Documented in `MASTER_PLAN.md` section B2 and `docs/ARCHITECTURE.md` section Profiles.

## Test plan

- `tools/lib/profile-validator.test.ts`: 21 tests, one per issue kind + multi-issue aggregation + partial-profile acceptance.
- `tools/validate-profile.test.ts`: 14 tests via `spawnSync` on the CLI (exit codes, stderr, formatReport).
- Typecheck (`tsc --noEmit`).
- `validate:profiles` and `validate:questionnaire` both exit 0 against committed profiles.

## Docs-sync

- `ROADMAP.md`, `HANDOFF.md`, `MASTER_PLAN.md` section B2, `docs/ARCHITECTURE.md` section Profiles, `.claude/rules/generator.md` updated in-branch.

## Simplify pass

- Extracted `tools/lib/read-yaml.ts`; inlined 3 single-call helpers in `profile-validator.ts`; collapsed 3 duplicate canonical CLI tests to one `it.each`. Net -45 LOC.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
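To make the validator's contract concrete, here is a minimal sketch of checking a profile's dotted `answers` keys against a schema and emitting one issue per violation. It is not the code in `tools/lib/profile-validator.ts`: the real validator uses zod, while this sketch uses plain TypeScript so that it runs without dependencies. The issue-kind strings, `FieldSpec`, `validateAnswers`, and the contents of `exampleSchema` are assumptions; only the field paths and the four failure categories (unknown path, type mismatch, enum out of values, pattern violation) come from the PR text.

```typescript
// Hypothetical, simplified field descriptions keyed by dotted path.
type FieldSpec =
  | { kind: "string"; pattern?: RegExp }
  | { kind: "enum"; values: readonly string[] }
  | { kind: "boolean" };

// Hypothetical names for four of the issue categories the PR mentions.
type IssueKind =
  | "unknown-path"
  | "type-mismatch"
  | "enum-out-of-values"
  | "pattern-violation";

interface Issue {
  kind: IssueKind;
  path: string;
}

// Validates only the keys present in `answers`. Missing keys are not
// reported, which is what lets partial profiles pass.
function validateAnswers(
  schema: Record<string, FieldSpec>,
  answers: Record<string, unknown>,
): Issue[] {
  const issues: Issue[] = [];
  for (const path of Object.keys(answers)) {
    // Own-property check so inherited keys never count as schema paths.
    const spec = Object.prototype.hasOwnProperty.call(schema, path)
      ? schema[path]
      : undefined;
    if (spec === undefined) {
      issues.push({ kind: "unknown-path", path });
      continue;
    }
    const value = answers[path];
    if (spec.kind === "boolean") {
      if (typeof value !== "boolean") {
        issues.push({ kind: "type-mismatch", path });
      }
    } else if (typeof value !== "string") {
      issues.push({ kind: "type-mismatch", path });
    } else if (spec.kind === "enum" && spec.values.indexOf(value) === -1) {
      issues.push({ kind: "enum-out-of-values", path });
    } else if (
      spec.kind === "string" &&
      spec.pattern !== undefined &&
      !spec.pattern.test(value)
    ) {
      issues.push({ kind: "pattern-violation", path });
    }
  }
  return issues;
}

// Illustrative schema; the enum values and the pattern are invented.
const exampleSchema: Record<string, FieldSpec> = {
  "domain.type": { kind: "enum", values: ["web-app", "cli"] },
  "stack.language": { kind: "string" },
  "identity.name": { kind: "string", pattern: /^[a-z][a-z0-9-]*$/ },
};
```

Aggregating all issues instead of stopping at the first matches the "multi-issue aggregation" behavior listed in the test plan; how issues map to the CLI's exit codes 0/1/2 is not specified in the PR text, so it is left out here.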