feat(b2): profiles starter + profile validator CLI by javiAI · Pull Request #2 · javiAI/project-operating-system

javiAI · 2026-04-20T07:45:23Z

Summary

3 canonical profiles (nextjs-app, agent-sdk, cli-tool) in questionnaire/profiles/. Each answers ~60% of the schema; the 3 user-specific fields are omitted by design (profiles are partial).
New profile validator: tools/lib/profile-validator.ts (zod-strict parser + issue emitter) + tools/validate-profile.ts CLI (exit 0/1/2).
Fixtures in tools/__fixtures__/profiles/{valid,invalid}/ cover all 5 emitted ProfileIssueKinds.
CI step Validate profiles + npm script validate:profiles run all 3 canonical profiles against the schema on every push.
Shared YAML I/O extracted to tools/lib/read-yaml.ts (dedupes validate-profile + validate-questionnaire — meets 2x pattern threshold).

Scope decisions (vs MASTER_PLAN)

Fixtures location: tools/__fixtures__/profiles/ (not generator/__fixtures__/profiles/) because the generator does not exist yet. Consolidation deferred to B3 if applicable. Documented in MASTER_PLAN.md section B2.
Known gap: answer-value-not-in-array-allowlist is not validated at instance level in this branch. ArrayField.values exists in the schema (integrations.mcps) but the per-item allowlist check is deferred. Documented in MASTER_PLAN.md section B2 and docs/ARCHITECTURE.md section Profiles.

Test plan

Unit tests: tools/lib/profile-validator.test.ts — 21 tests, one per issue kind + multi-issue aggregation + partial-profile acceptance.
Integration tests: tools/validate-profile.test.ts — 14 tests via spawnSync on the CLI (exit codes, stderr, formatReport).
Full suite: 106 tests passing.
Coverage: 95.92% lines / 89.91% branches / 100% functions (thresholds 90/85/90/90).
Typecheck: clean (tsc --noEmit).
CI: validate:profiles and validate:questionnaire both exit 0 against committed profiles.

Docs-sync

ROADMAP.md, HANDOFF.md, MASTER_PLAN.md section B2, docs/ARCHITECTURE.md section Profiles, .claude/rules/generator.md updated in-branch.

Simplify pass

Extracted tools/lib/read-yaml.ts; inlined 3 single-call helpers in profile-validator.ts; collapsed 3 duplicate canonical CLI tests to one it.each. Net -45 LOC.

Co-Authored-By: Claude Opus 4.7 noreply@anthropic.com

- CLAUDE.md: add Fase N+7 (Context gate) to the branch flow table as the last phase of the previous branch.
- AGENTS.md: add Context gate as non-negotiable rule #1 and as step 3 of the "continúa" autonomous execution flow.
- HANDOFF.md §3: rename to "Decisión /clear vs /compact vs sesión nueva (Fase N+7 Context gate)" + add a pre-Fase -1 checklist + §6b carry-over to propagate the rule to templates/*.hbs in C1.
- .claude/rules/docs.md: add a traceability checkbox to the docs-sync list (the first kickoff commit references the resume prompt when the branch was started post-/compact or post-/clear).

Establishes the context-management decision as the final phase of the previous branch, enforcing explicit evaluation of continue | /compact | /clear | new session before Fase -1.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

## Kickoff

**Scope**: Create 3 canonical profiles (nextjs-app, agent-sdk, cli-tool) + profile validator CLI + fixtures + unit/integration tests + CI step. Establishes the canonical starting points for the questionnaire: each profile pre-bakes the typical decisions for a stack, covers every non-user-specific `required` field (domain.type, stack.language, testing.unit_framework), and leaves open only the 3 user-specific ones (identity.name, identity.description, identity.owner).

**Files to create**:
- questionnaire/profiles/nextjs-app.yaml, agent-sdk.yaml, cli-tool.yaml
- tools/lib/profile-validator.ts (+ .test.ts)
- tools/validate-profile.ts (+ .test.ts)
- tools/__fixtures__/profiles/valid/<3 canonical>.yaml (literal copy by value; consolidate in B3 if the runner reveals a better mechanism)
- tools/__fixtures__/profiles/invalid/*.yaml (3-4 negatives: unknown path, type mismatch, enum-out-of-values, pattern violation)

**Files to modify**:
- .github/workflows/ci.yml → validate-profiles step (ubuntu+macos matrix, node 20, actions pinned by SHA, reuses the B1 toolchain)
- docs/ARCHITECTURE.md → §2 Profiles: canonical shape + Profile validator subsection with issue kinds
- .claude/rules/generator.md → Profiles block (location + shape + CLI)
- ROADMAP.md → carries over the B1 drift (✅ PR #1) + B2 in progress → ✅ on close
- HANDOFF.md → §1 snapshot + §9 next branch B3 + §10 B2 status
- MASTER_PLAN.md § Rama B2 → ✅ on close

**Canonical profile shape**:

    version: "0.1.0"
    profile:
      name: <string>
      description: <string>
    answers:
      "<path.dotted>": <value>

Dotted keys align 1:1 with field.path in the schema. This makes key-by-key overrides straightforward in the runner (B3). The nested alternative was rejected because it couples profiles tightly to field renames.

**Profile validator issue kinds (B2)**:
- answer-unknown-path
- answer-type-mismatch
- answer-value-not-in-enum
- answer-array-item-type-mismatch
- answer-constraint-violation (pattern / minLength / maxLength / min / max / minItems / maxItems)

**Known gap (explicit user decision)**: answer-value-not-in-array-allowlist is NOT implemented in B2. ArrayField.values exists in tools/lib/meta-schema.ts:43, and questionnaire/schema.yaml:95-100 uses the capability (integrations.mcps with allowlist ["mempalace","notebooklm"]). Instance-level validation is deferred. If ArrayField.values is formally introduced in a later branch or before B2 closes, add the corresponding check to the profile validator.

**Principle**: profiles are PARTIAL. They do not have to cover every `required` field of the final project_profile. The validator only verifies that the declared paths exist in the schema and that their values respect the field's constraints. User-specific fields stay out of profiles by design.

**Not included in B2 (arrives later)**:
- Interactive execution of the questionnaire (B3 runner).
- Merging profile + CLI overrides (B3).
- Actual file generation (C1+).
- Resolution of `when:` to decide conditional requiredness (B3).

**Discarded alternatives**:
- (B) Extend tools/validate-questionnaire.ts with a --profile flag: couples responsibilities (meta-schema vs instance).
- (C) Tests only, no CLI: blocks the CI step and future use from the pre-PR gate (D4).

**Risks**:
- Data duplication between questionnaire/profiles/ and tools/__fixtures__/profiles/valid/. Mitigation: literal copy; consolidate if B3 calls for it. Scope is controlled (~150 lines of YAML).
- The array allowlist gap documented above; it does not block the MVP.

**Test plan**:
- Unit (tools/lib/profile-validator.test.ts): each issue kind + the 3 valid canonical profiles + partial profile OK (no user-specific fields) + profile with extra undeclared fields → issue.
- Integration (tools/validate-profile.test.ts): CLI exit 0 on canonical profiles; exit 1 on negatives with the issue kind in stderr; exit 2 on a missing file or unreadable YAML.
- CI: the validate-profiles step runs the CLI over the 3 profiles on the ubuntu+macos matrix.
- Coverage: the current thresholds (90/85/90/90) must keep passing.

**Docs plan** (Fase N+3):
- docs/ARCHITECTURE.md §2 Profiles → shape + Profile validator subsection with issue kinds + note on the deferred gap.
- .claude/rules/generator.md → Profiles block.
- ROADMAP drift (B1 ✅ PR #1) + B2 ✅ + Fase B progress.
- HANDOFF snapshot + next branch B3 + B2 status.
- MASTER_PLAN § Rama B2 → ✅.

**Fase N+7 traceability** (applied per AGENTS.md rule #1 and the .claude/rules/docs.md checkbox): this branch was started post-/compact with focus="B1 merged + Fase -1 B2 draft + ROADMAP drift + Fase N+7 systematization applied". Files re-read post-compact to resume Fase -1: MASTER_PLAN.md § Rama B2 (L67-71), docs/ARCHITECTURE.md §2 Profiles (L54-60) + §Schema DSL (L62-89), .claude/rules/generator.md (whole file), questionnaire/schema.yaml, questionnaire/questions.yaml, tools/lib/meta-schema.ts, HANDOFF.md §3 + §6b. Decisions preserved from before the compact: alternative (A) CLI validator, dotted answers shape, coverage denominator = non-user-specific required fields, Fase N+7 systematization applied in CLAUDE/AGENTS/HANDOFF/rules (previous commit c9e3de5 on this same branch).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
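The kickoff argues that dotted keys make key-by-key overrides easy for the B3 runner. A minimal sketch of why, assuming a hypothetical `applyOverrides` helper and illustrative answer values; the actual merge logic is deferred to B3 and is not part of this PR:

```typescript
// Hypothetical illustration: none of these names come from the PR.
type AnswerValue = string | number | boolean | string[];
type Answers = Record<string, AnswerValue>;

// With dotted keys, an override replaces exactly one schema path.
// A flat object spread is enough; no deep merge or tree walk is needed.
function applyOverrides(base: Answers, overrides: Answers): Answers {
  return { ...base, ...overrides };
}

// Illustrative values only; the real enum values live in questionnaire/schema.yaml.
const profileAnswers: Answers = {
  "domain.type": "web-app",
  "stack.language": "typescript",
  "testing.unit_framework": "vitest",
};

const merged = applyOverrides(profileAnswers, {
  "testing.unit_framework": "jest",
});

console.log(merged["testing.unit_framework"]); // the overridden value
console.log(merged["domain.type"]); // untouched keys survive
```

With a nested shape the same override would need a recursive merge, and renaming a field would change the tree structure of every profile, which is the coupling the kickoff cites as the reason for rejecting it.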

Introduces the ProfileFile zod schema (strict) with shape {version, profile: {name,description}, answers:{path.dotted:value}} and the validateProfile() function, which walks answers, looks up each path in the meta-schema, and emits ProfileIssue[] covering:
- answer-unknown-path
- answer-type-mismatch (scalar type disagreement with field.type)
- answer-value-not-in-enum
- answer-array-item-type-mismatch
- answer-constraint-violation (pattern / minLength / maxLength / min / max / minItems / maxItems)

Profiles are treated as partial by design: fields not mentioned in answers are not flagged, and the user-specific required fields (identity.name, identity.description, identity.owner) are expected to be missing from profiles.

Tests (TDD, 21 cases): canonical shape + each issue kind + multi-issue aggregation + partial-profile acceptance. Reuses the existing meta-schema parser; no new deps.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
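The walk described in this commit can be sketched roughly as follows. This is a simplified, dependency-free illustration, not the code in tools/lib/profile-validator.ts: the `Field` union, the `validateAnswers` name, and the sample schema are assumptions, and only a subset of constraints is shown.

```typescript
// Simplified stand-ins for the meta-schema field types (hypothetical shapes).
type Field =
  | { type: "string"; pattern?: string }
  | { type: "enum"; values: string[] }
  | { type: "array"; itemType: "string" };

type IssueKind =
  | "answer-unknown-path"
  | "answer-type-mismatch"
  | "answer-value-not-in-enum"
  | "answer-array-item-type-mismatch"
  | "answer-constraint-violation";

interface Issue { kind: IssueKind; path: string; detail: string }

function validateAnswers(
  fields: ReadonlyMap<string, Field>,
  answers: Record<string, unknown>,
): Issue[] {
  const issues: Issue[] = [];
  // Only paths present in `answers` are checked, so partial profiles pass.
  for (const [path, value] of Object.entries(answers)) {
    const field = fields.get(path);
    if (field === undefined) {
      issues.push({ kind: "answer-unknown-path", path, detail: "path not declared in schema" });
      continue;
    }
    switch (field.type) {
      case "string":
        if (typeof value !== "string") {
          issues.push({ kind: "answer-type-mismatch", path, detail: "expected string" });
        } else if (field.pattern !== undefined && !new RegExp(field.pattern).test(value)) {
          issues.push({ kind: "answer-constraint-violation", path, detail: `does not match /${field.pattern}/` });
        }
        break;
      case "enum":
        // Non-string values also land here rather than in type-mismatch.
        if (typeof value !== "string" || !field.values.includes(value)) {
          issues.push({ kind: "answer-value-not-in-enum", path, detail: `allowed: ${field.values.join(", ")}` });
        }
        break;
      case "array":
        if (!Array.isArray(value)) {
          issues.push({ kind: "answer-type-mismatch", path, detail: "expected array" });
        } else if (value.some((item) => typeof item !== field.itemType)) {
          issues.push({ kind: "answer-array-item-type-mismatch", path, detail: `expected ${field.itemType} items` });
        }
        break;
    }
  }
  return issues;
}

// Illustrative schema and profile; the paths reuse names from the PR, the values are made up.
const sampleFields = new Map<string, Field>([
  ["stack.language", { type: "enum", values: ["typescript", "python"] }],
  ["identity.name", { type: "string", pattern: "^[a-z][a-z0-9-]*$" }],
  ["integrations.mcps", { type: "array", itemType: "string" }],
]);

const sampleIssues = validateAnswers(sampleFields, {
  "stack.language": "cobol",
  "stack.runtime": "node",
  "integrations.mcps": ["mempalace", 42],
});
console.log(sampleIssues.map((i) => `${i.kind} @ ${i.path}`));
```

Note that identity.name is declared in the sample schema but absent from the answers, and no issue is raised for it, which is the partial-profile behavior the commit describes.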

CLI entry point `npx tsx tools/validate-profile.ts <profile.yaml> [--schema questionnaire/schema.yaml]` with exit codes mirroring the questionnaire validator: 0 on OK, 1 on semantic issues, 2 on missing file / unreadable YAML / missing profile arg.

formatReport() emits a human-readable block with schema + profile paths, status, and one line per issue (kind, path, detail). Diagnostics go to stdout so CI captures them; stderr is reserved for CLI usage errors.

Integration tests (15 cases) cover: valid canonical profiles (3 × exit 0), each invalid fixture (4 × exit 1 with matching issue kind in stdout), missing file and missing arg (2 × exit 2), plus the unit-level coverage of formatReport and validateProfileFile.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
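The exit-code contract and report layout described above might look roughly like this. It is a sketch under assumptions: `CliOutcome`, `exitCodeFor`, and `renderReport` are hypothetical names, and the real formatReport() signature and exact output format are not shown in this PR page.

```typescript
interface Issue { kind: string; path: string; detail: string }

// Hypothetical outcome type: one variant per exit code in the contract.
type CliOutcome =
  | { status: "ok" }
  | { status: "issues"; issues: Issue[] }
  | { status: "usage-error"; message: string }; // missing file, unreadable YAML, missing arg

function exitCodeFor(outcome: CliOutcome): 0 | 1 | 2 {
  switch (outcome.status) {
    case "ok":
      return 0;
    case "issues":
      return 1;
    case "usage-error":
      return 2;
  }
}

// Stand-in for formatReport(): paths, status, then one line per issue.
function renderReport(schemaPath: string, profilePath: string, issues: Issue[]): string {
  const lines = [
    `schema:  ${schemaPath}`,
    `profile: ${profilePath}`,
    `status:  ${issues.length === 0 ? "OK" : "FAIL"}`,
    ...issues.map((i) => `  - ${i.kind} ${i.path}: ${i.detail}`),
  ];
  return lines.join("\n");
}

const report = renderReport(
  "questionnaire/schema.yaml",
  "tools/__fixtures__/profiles/invalid/unknown-path.yaml",
  [{ kind: "answer-unknown-path", path: "stack.runtime", detail: "path not declared in schema" }],
);
console.log(report);
```

Keeping the outcome as a discriminated union means the compiler flags any new failure category that has no exit code assigned.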

Canonical profiles (questionnaire/profiles/*.yaml):
- nextjs-app: web-app + TS + postgres + vitest + playwright + sentry + changesets + team (12 answers).
- agent-sdk: agent-sdk + python + pytest + opentelemetry + manual release + opus as default model (11 answers).
- cli-tool: cli + TS + vitest + semantic-release + solo (11 answers).

Each profile omits the 3 user-specific required fields (identity.name, identity.description, identity.owner) by design. Coverage of the remaining required fields (domain.type, stack.language, testing.unit_framework) is 100%; overall answers land at ~55-65% of the 18 schema fields per profile (MASTER_PLAN target ~60%).

Valid fixtures (tools/__fixtures__/profiles/valid/*.yaml) are literal duplicates of the canonical profiles, kept in the tools scope since the generator does not exist yet (Fase B3+). Consolidation with the generator-side fixtures is deferred until B3 reveals a better mechanism (e.g., loader or symlink).

Invalid fixtures (tools/__fixtures__/profiles/invalid/*.yaml), one per issue kind exercised by the CLI integration tests:
- unknown-path.yaml → answer-unknown-path
- type-mismatch.yaml → answer-type-mismatch
- enum-out-of-values.yaml → answer-value-not-in-enum
- pattern-violation.yaml → answer-constraint-violation

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Runs `npm run validate:profiles` (new script) after validate:questionnaire on the ubuntu+macos × node 20 matrix. Invokes the CLI on each of the 3 canonical profiles; any non-zero exit fails the job.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- ROADMAP: carries over the B1 drift (🔄 open → ✅ PR #1) and adds B2 (🔄 open) with the list of deliverables + known gap. Fase B marked "in progress (B2)".
- HANDOFF §1: snapshot points to B2 in progress + next branch B3.
- HANDOFF §9 rewritten: next branch = B3, with a minimal reading list that includes profile-validator and the Fase N+7 checkbox as the first pre-flight item.
- HANDOFF §10 replaced: B2 status (closing) with deliverables, the systematization meta commit, 106 tests, coverage 95.97%, known gap.
- MASTER_PLAN: Rama B1 marked ✅ PR #1. Rama B2 marked ✅ with: adjustment vs the original plan (fixtures in tools/, not generator/), known gap, updated exit criterion.
- docs/ARCHITECTURE.md §2 Profiles: adds canonical shape + partiality principle + the 5 issue kinds of the profile validator + known gap + CLI command + CI integration.
- .claude/rules/generator.md: new "Profiles" block with location, shape, partiality, validator, fixtures, and the steps to add a new profile.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- extract tools/lib/read-yaml.ts (readAndParseYaml + errorMessage) shared by validate-profile and validate-questionnaire (2 call sites, meets pattern-before-abstraction threshold)
- inline 3 single-call helpers in profile-validator.ts (checkStringConstraints, checkArrayItems, constraintViolation); switch branches become self-contained and readable
- collapse 3 duplicate canonical CLI smoke tests into one parameterised it.each; same coverage, less scaffolding

Net -45 LOC. Tests 106 green, coverage 95.92% lines / 89.91% branches (above thresholds 90/85/90/90). Typecheck clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot

Pull request overview

Introduces a “profiles” layer for the questionnaire system, including canonical starter profiles and a CLI validator to ensure profiles stay consistent with questionnaire/schema.yaml.

Changes:

Added tools/lib/profile-validator.ts (Zod profile parser + schema-based answer validation) and tools/validate-profile.ts CLI with exit codes and reporting.
Added canonical profiles in questionnaire/profiles/ plus fixtures and tests for validator + CLI.
Extracted shared YAML read/parse helper to tools/lib/read-yaml.ts and wired CI/package scripts to validate profiles.

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 2 comments.


File	Description
tools/validate-questionnaire.ts	Reuses shared YAML read/parse helper to dedupe CLI I/O logic.
tools/validate-profile.ts	New CLI entrypoint to validate a profile YAML against the schema and emit issues.
tools/validate-profile.test.ts	Adds unit + CLI integration coverage for the validate-profile tool.
tools/lib/read-yaml.ts	New shared YAML reader/parser utility with consistent error formatting.
tools/lib/profile-validator.ts	New profile parser + validator emitting typed issue kinds.
tools/lib/profile-validator.test.ts	Unit tests covering parsing and validator issue emission/aggregation.
tools/__fixtures__/profiles/valid/nextjs-app.yaml	Valid profile fixture mirroring the canonical `nextjs-app` profile.
tools/__fixtures__/profiles/valid/cli-tool.yaml	Valid profile fixture mirroring the canonical `cli-tool` profile.
tools/__fixtures__/profiles/valid/agent-sdk.yaml	Valid profile fixture mirroring the canonical `agent-sdk` profile.
tools/__fixtures__/profiles/invalid/unknown-path.yaml	Invalid fixture to trigger `answer-unknown-path`.
tools/__fixtures__/profiles/invalid/type-mismatch.yaml	Invalid fixture to trigger `answer-type-mismatch`.
tools/__fixtures__/profiles/invalid/pattern-violation.yaml	Invalid fixture to trigger `answer-constraint-violation` (pattern).
tools/__fixtures__/profiles/invalid/enum-out-of-values.yaml	Invalid fixture to trigger `answer-value-not-in-enum`.
questionnaire/profiles/nextjs-app.yaml	Adds canonical Next.js app starter profile (partial by design).
questionnaire/profiles/cli-tool.yaml	Adds canonical CLI tool starter profile (partial by design).
questionnaire/profiles/agent-sdk.yaml	Adds canonical agent SDK starter profile (partial by design).
package.json	Adds `validate:profiles` script to validate all canonical profiles.
docs/ARCHITECTURE.md	Documents profile shape, partiality, validator semantics, and CLI usage.
ROADMAP.md	Updates phase/progress tracking to reflect B2 work.
MASTER_PLAN.md	Updates B1 completion and B2 scope/acceptance criteria to include validator/CI.
HANDOFF.md	Updates current phase and adds/propagates the “Context gate” process details.
CLAUDE.md	Adds “Fase N+7 Context gate” to the documented lifecycle.
AGENTS.md	Updates non-negotiable rules and “continúa/siguiente” flow to include context gate.
.github/workflows/ci.yml	Adds CI step to run `validate:profiles`.
.claude/rules/generator.md	Documents profile location/shape/partiality and validation/fixture expectations.
.claude/rules/docs.md	Adds context-traceability checklist item for branches started via compact/clear.


+  it("exits 1 for type-mismatch fixture", () => {
+    const r = runCli(["tools/__fixtures__/profiles/invalid/type-mismatch.yaml"]);
+    expect(r.code).toBe(1);
+    expect(r.stdout).toMatch(/answer-type-mismatch/);
+  }, 30000);
+
+  it("exits 1 for enum-out-of-values fixture", () => {
+    const r = runCli(["tools/__fixtures__/profiles/invalid/enum-out-of-values.yaml"]);
+    expect(r.code).toBe(1);
+    expect(r.stdout).toMatch(/answer-value-not-in-enum/);
+  }, 30000);
+
+  it("exits 1 for pattern-violation fixture", () => {
+    const r = runCli(["tools/__fixtures__/profiles/invalid/pattern-violation.yaml"]);
+    expect(r.code).toBe(1);
+    expect(r.stdout).toMatch(/answer-constraint-violation/);
+  }, 30000);


+      if (field.pattern !== undefined && !new RegExp(field.pattern).test(value)) {
+        issues.push(violation(path, `value '${value}' does not match pattern /${field.pattern}/`));


javiAI · 2026-04-20T07:50:55Z

Good close for B2. The foundation is solid and I see nothing blocking the merge.

One non-blocking note for B3 or a small follow-up:

In profile-validator.ts, enum fields do not distinguish between a type mismatch and a value that is not in the enum. Right now, if someone puts an array or object where the field is an enum, the emitted issue would be answer-value-not-in-enum instead of answer-type-mismatch. That does not break B2, but it makes the reporting slightly less precise for the future runner and for CLI overrides.

I would not request changes for this; I would just leave it noted so the error taxonomy stays as clean as possible when B3 arrives.
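One way the taxonomy could be tightened later is to check the value's runtime type before checking enum membership. The sketch below is a hypothetical helper, not code from profile-validator.ts; `classifyEnumAnswer` and the sample values are assumptions.

```typescript
type EnumIssueKind = "answer-type-mismatch" | "answer-value-not-in-enum";

// Returns null when the value is acceptable for the enum field.
function classifyEnumAnswer(
  value: unknown,
  allowed: readonly string[],
): EnumIssueKind | null {
  // Arrays, objects, numbers, etc. are a type problem, not a membership problem.
  if (typeof value !== "string") return "answer-type-mismatch";
  return allowed.includes(value) ? null : "answer-value-not-in-enum";
}

const languages = ["typescript", "python"]; // illustrative enum values

console.log(classifyEnumAnswer(["typescript"], languages)); // type problem
console.log(classifyEnumAnswer("cobol", languages)); // membership problem
console.log(classifyEnumAnswer("python", languages)); // accepted
```

The ordering matters: testing membership first would classify `["typescript"]` as "not in enum", which is the imprecision the comment points out.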

- add invalid fixture array-item-type-mismatch.yaml + CLI test (Copilot: CLI coverage gap for answer-array-item-type-mismatch)
- validate that pattern is a compilable regex at meta-schema parse time (Copilot: new RegExp(field.pattern) could throw; now a clear schema-scoped error via zod .refine, exit 2 instead of uncaught)
- document deferred B2 gap: enum fields emit value-not-in-enum instead of type-mismatch when given array/object (per user PR comment; non-blocking, noted for B3)

107 tests green, coverage still above thresholds, typecheck clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

javiAI · 2026-04-20T07:56:53Z

Addressed review feedback in 95515d3:

FIX #1 — tools/validate-profile.test.ts:100: added fixture tools/__fixtures__/profiles/invalid/array-item-type-mismatch.yaml + CLI test assertion. All 5 ProfileIssueKinds now covered at CLI level.

FIX #2 — tools/lib/profile-validator.ts:74: pushed the compilable-regex check one level up to tools/lib/meta-schema.ts via z.string().refine(...). An invalid regex in questionnaire/schema.yaml now fails schema parsing with a field-scoped zod error (exit 2), so the validator never sees an uncompilable pattern. Applied to both StringField.pattern and TextQuestion.validation.pattern.

Deferred — user issue comment on enum type-mismatch taxonomy: per javiAI's own comment ("I would not request changes for this; I would just leave it noted"), documented as a known gap in MASTER_PLAN.md §B2 for B3.

107 tests green, typecheck clean.
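FIX #2 moves the regex check to schema-parse time. A dependency-free sketch of the kind of predicate that could back such a check is shown below; `isCompilableRegex` is a hypothetical name, and how it is wired into tools/lib/meta-schema.ts (for example as the callback passed to zod's `.refine`) is an assumption based on the description above.

```typescript
// Returns true when `source` can be compiled by the JavaScript RegExp engine.
// The RegExp constructor throws a SyntaxError on invalid patterns.
function isCompilableRegex(source: string): boolean {
  try {
    new RegExp(source);
    return true;
  } catch {
    return false;
  }
}

console.log(isCompilableRegex("^[a-z][a-z0-9-]*$")); // a valid pattern
console.log(isCompilableRegex("(unclosed")); // an unterminated group
```

Rejecting bad patterns when the schema is parsed means the profile validator can call `new RegExp(field.pattern)` without its own try/catch, because every pattern that reaches it has already compiled once.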


* chore(meta): pre-kickoff B3 - context-gate hardening + docs sync

Commit 1 of feat/b3-generator-runner. Bundles two meta changes ahead of the runner's TDD implementation:

1. Context-gate hardening (inherited from the previous session):
   - AGENTS.md rule #1: Claude presents the 4 options and WAITS for the user's explicit choice. It never decides on its own.
   - AGENTS.md §3 step 3 "continúa": same hardening (stop + wait), aligned with MEMORY.md feedback_context_gate.
   - HANDOFF.md §3 checklist: present + wait explicitly before emitting a resume prompt or proceeding to Fase -1.
2. Docs sync ahead of B3 (Fase N+3 applied in commit 1):
   - ROADMAP.md: Fase B moves from B2 in progress to B3 in progress; B2 marked complete (PR #2) with 2 documented gaps; B3 opened with scope and adjustment (token-budget deferred).
   - HANDOFF.md §1: B3 in progress, B2 closed at f361c19, next C1.
   - HANDOFF.md §9: the next branch becomes C1 (renderers core docs).
   - HANDOFF.md §10: replaces "Estado B2 cerrando" with "Estado B3 en curso", including the approved Fase -1 decisions + planned files + inherited gaps.
   - MASTER_PLAN.md §B3: explicit note on the deferral of token-budget.ts, the re-export from tools/lib/, the rejected --out and --dry-run flags, and the user-specific exit-code semantics.

Not a functional part of the runner. Implementation starts in commit 2 with strict TDD (red tests first, per CLAUDE.md rule #3 and AGENTS.md rule #4).

Fase -1 traceability: explicitly approved by the user in this session after presentation of scope + ambiguities + alternatives + test plan + docs plan. Marker created at .claude/branch-approvals/feat_b3-generator-runner.approved.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(b3): red tests for runner - fixtures + loader + validators + CLI

TDD step 1 (per CLAUDE.md rule #3 + .claude/rules/tests.md): failing tests written before implementation. All three test suites fail to load because profile-loader.ts / validators.ts / run.ts don't exist yet (commit 3 will turn them green).

Fixtures (generator/__fixtures__/profiles/):
- valid-partial/profile.yaml: all non-user-specific required fields present, 3 user-specific missing. Expects exit 0 + warning.
- missing-required/profile.yaml: omits domain.type. Expects exit 1 (completeness error).
- invalid-value/profile.yaml: stack.language out of enum. Expects exit 1 (profile-validator issue answer-value-not-in-enum).

Test files:
- generator/lib/profile-loader.test.ts, 5 tests: happy path, missing file, malformed YAML, invalid shape (missing profile key), strict rejection of unknown top-level key.
- generator/lib/validators.test.ts, 5 tests for completenessCheck: only user-specific missing → 3 warnings; all present → clean; 1 required missing → 1 error + 3 warnings; 2 required missing → 2 errors; required with default value → satisfied (uses a synthetic schema to isolate default semantics from the canonical one).
- generator/run.test.ts, 18 tests split in three describes:
  - runValidation (unit): 5 tests covering 0/1/2 exit codes across fixtures + missing file + malformed YAML.
  - formatReport: 4 tests covering OK / WARN / FAIL rendering and required-missing + enum issue lines.
  - CLI integration (spawnSync): 9 tests covering valid, --validate-only, missing-required, invalid-value, rejection of --out and --dry-run with exact deferral message ("flag --X not supported in B3; planned for C1"), missing --profile, missing file, unknown flag.

Decisions locked in tests:
- loadProfile return shape: { ok: true, profile } | { ok: false, error }.
- completenessCheck return shape: { errors[], warnings[] }.
- USER_SPECIFIC_PATHS exported from validators.ts for reuse + test assertion.
- runValidation takes only profilePath (schema hard-coded inside).
- formatReport takes (result, profilePath); no schema param needed.

Vitest output (expected): 3 failed suites, all "Failed to load url ./<module>.ts. Does the file exist?", the classic TDD red state.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(b3): generator runner - profile loader + completeness + CLI

Minimal implementation that turns the previous commit green (135/135 tests in the project; 28/28 in generator/). The B3 runner closes the loop profile YAML → zod-validated → completeness-check → exit 0/1/2. No renderers yet (they arrive in C*).

Files:
- generator/lib/schema.ts: pure re-export of parseSchemaFile / parseProfileFile / validateProfile + the types ProfileFile / ProfileIssue / ProfileIssueKind / SchemaFile from tools/lib/. 3rd application of pattern-before-abstraction (the 2nd was tools/lib/read-yaml.ts in B2). No duplicated logic.
- generator/lib/profile-loader.ts: loadProfile(path) returns the discriminated union { ok: true, profile } | { ok: false, error }. Reuses readAndParseYaml + errorMessage from tools/lib/read-yaml.ts.
- generator/lib/validators.ts: completenessCheck(schema, profile) returns { errors, warnings }. Scans the schema's required fields; if a path is absent from the profile AND the schema declares no default, it emits an error or a warning according to USER_SPECIFIC_PATHS (identity.name / description / owner → warning; everything else → error). The constant is exported so tests can assert the exact list without duplicating it.
- generator/run.ts: CLI entrypoint:
  - strict parseArgs with --profile (required), --validate-only, and --out and --dry-run declared but explicitly rejected with the exact message "flag --X not supported in B3; planned for C1". Avoids a false sense of functionality.
  - Schema hard-coded to questionnaire/schema.yaml (no flag).
  - runValidation(profilePath) + formatReport(result, profilePath) exported for unit tests; main() carries /* v8 ignore */ to exclude parseArgs + exit paths from coverage (same pattern as tools/validate-profile.ts).
  - Exit semantics: profile ok with no blockers → 0 (warnings allowed); issues or completeness errors → 1; I/O, broken YAML, invalid args, or deferred flag → 2. An invalid profile shape (top-level strict) maps to exit 1 because it is a content error, not an I/O error.

Local verification:
- tsc --noEmit: clean.
- vitest run: 135 tests in 13 suites (28 new in generator/).
- vitest run --coverage: lines 95.36%, functions 98.52%, branches 88.83%, statements 95.36%. All above the project thresholds (90/90/85/90).
- npm run validate:profiles: OK x3 (no B2 regression).
- npm run validate:questionnaire: OK (no B1 regression).
- Smoke E2E: tsx generator/run.ts --profile nextjs-app.yaml → exit 0 + 3 user-specific warnings. --out rejected with the exact expected message, exit 2.

Pending in later commits (docs-sync Fase N+3 + possible package.json script validate:generator):
- Update docs/ARCHITECTURE.md §3 with the real shape (exported signatures, final exit codes) if it differs from the current snippet.
- Add a "Deferrals" section to .claude/rules/generator.md (token-budget.ts) + a "Reuso desde tools/lib" section (3rd application of pattern-before-abstraction).
- Script npm run validate:generator: a minimal invocation that runs the runner over the 3 canonical profiles to catch regressions in future CI.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore(b3): docs-sync + CI smoke + rules/generator deferrals

Docs-sync Fase N+3 + CI hardening before the pre-PR gate.

CI smoke (runner non-regression):
- package.json: new `validate:generator` script that runs generator/run.ts over the 3 canonical profiles (nextjs-app, agent-sdk, cli-tool). Exit 0 expected, with user-specific warnings.
- .github/workflows/ci.yml: new step "Validate generator (smoke — 3 canonical profiles)" between "Validate profiles" and "Test (with coverage)". Catches integration regressions before the unit tests run.

Docs ARCHITECTURE.md §3 "Generador":
- Replaces the aspirational pre-B3 snippet (which imported renderers + fs-writer that do not exist yet) with the real delivered signatures: RunResult shape, runValidation, formatReport.
- Documents exit codes 0/1/2 + B3 deferrals (token-budget, --schema, --out, --dry-run).
- Adds relative links to the 4 new files (generator/run.ts, lib/schema.ts, lib/profile-loader.ts, lib/validators.ts).

Rules .claude/rules/generator.md:
- New section "Runner (entregado en B3)": integration fixtures, exit-code semantics, deferred flags, CI smoke.
- New section "Deferrals (B3)": token-budget.ts, --schema, --out/--dry-run, with an explicit reason for each so that future branches can decide when to reintroduce them.
- New section "Reuso desde tools/lib/ (pattern-before-abstraction, 3ª aplicación)": no-duplication rule + history of the 3 applications (B1 condition-parser, B2 read-yaml, B3 schema re-export). Sets the threshold for forking when a 4th application with generator-only logic appears.

Verification:
- tsc --noEmit: OK.
- vitest run: 135/135 tests.
- npm run validate:generator: 3 x "status: OK" with the expected user-specific warnings.
- npm run validate:profiles: 3 x OK (no B2 regression).
- npm run validate:questionnaire: OK (no B1 regression).

Post-commit: branch ready for /pos:pre-commit-review (manual equivalent: code-reviewer subagent over the branch's full diff) and the manual pre-PR gate. ROADMAP + HANDOFF + MASTER_PLAN were already synced in commit 1 of the branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(b3): address pre-commit review - tighter flag guard + pinned enum test

Two findings from the manual code-reviewer pass (equivalent of /pos:pre-commit-review) on feat/b3-generator-runner.

1. generator/run.ts line 129: changed the --dry-run guard from a truthy check to `values["dry-run"] !== undefined` for consistency with the --out guard above. Node's parseArgs with type:"boolean" does not emit false natively (only true on presence or undefined on absence), so the observable behavior is unchanged in practice, but the defensive shape now matches --out and survives any future parseArgs or caller variant that could produce an explicit false.
2. generator/run.test.ts: the "exit 1 value-not-in-enum" test used `r.issues.some(...)` as its only assertion, which would still pass if a regression also populated r.errors or changed r.issues length. Tightened to pin the exact shape: issues length 1, kind + path on issues[0], errors empty, warnings equal to the 3 user-specific paths. Matches the assertion style already used in the sibling "missing-required" test and in validators.test.ts.

The reviewer's third finding (validate:generator CI step semantically redundant with Test-with-coverage) was considered and kept as-is. The reason is documented in .claude/rules/generator.md § Runner: the smoke step catches broken tsx invocation / main() wiring before unit tests run, and the 3-profile loop adds ~3s to CI for an earlier signal. This is deliberate design, not inertia.

Verification:
- tsc --noEmit: OK.
- vitest run generator/: 28/28.
- No production behavior change for --dry-run in common usage (parseArgs produces true or undefined, not false).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(b3): address PR #3 review - docs stdout correction

Copilot flagged a mismatch between docs and runner behavior: 2 doc sites claimed user-specific warnings land in stderr, while the CLI prints the full formatReport (including warnings) to stdout and the integration tests assert on stdout. Align docs with implementation; do not change the CLI/tests contract.

- MASTER_PLAN.md §B3 exit-codes line: "stderr" → "stdout (dentro del reporte)".
- generator/__fixtures__/profiles/valid-partial/profile.yaml:5 header: same correction.

Typecheck + runner tests (18/18) still pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Javier <javier.abril@glassnode.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>


* test(d1): failing tests for pre-branch-gate + test-env bootstrap

Kickoff block (Phase 0), branch feat/d1-hook-pre-branch-gate:

Scope (Phase -1 closed):
- hooks/pre-branch-gate.py (implementation in the next commit): PreToolUse(Bash) hook that blocks `git checkout -b`, `git switch -c`, `git worktree add -b` when the marker `.claude/branch-approvals/<slug-sanitized>.approved` is missing.
- pytest test pair for the hook (this commit, RED).
- Minimal test-env bootstrap: local `.venv` + `requirements-dev.txt` (pytest + pytest-cov). No ruff, no selftest, no abstracted hooks/_lib/.

Phase -1 decisions closed (vs MASTER_PLAN §D1):
1. Scope: covers checkout -b, switch -c, worktree add -b. Excludes `git branch <slug>` (creates a ref without starting work).
2. No bypass env var. The legitimate bypass is creating the marker explicitly.
3. Double log: `.claude/logs/pre-branch-gate.jsonl` + `.claude/logs/phase-gates.jsonl` (event branch_creation).
4. Parsing with shlex.split (robust to quoting). Supports global options before the subcommand.
5. Block message: exact marker path + suggested `touch` command + textual reference to `MASTER_PLAN.md`. The plan itself is not parsed.
6. Silent pass-through: zero noise except on branch creation.
7. No shared `hooks/_lib/` (CLAUDE.md rule 7: ≥2 repetitions before abstracting; D1 is the first).

Tests added (intentionally RED):
- hooks/tests/test_pre_branch_gate.py
  · branch-creation detection
  · silent pass-through
  · slug sanitization
  · double log on allow/deny
  · robustness against invalid stdin/commands
- Fixtures: 6 JSON files in hooks/tests/fixtures/payloads/.
Env bootstrap:
- requirements-dev.txt: pytest>=7, pytest-cov>=4
- .gitignore: /.venv/, __pycache__/, *.pyc, .pytest_cache/
- local run: `.venv/bin/pytest hooks/tests/`

Planned next commits:
- feat(d1): implement hook + chmod +x
- docs(d1): docs-sync

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(d1): implement pre-branch-gate hook + in-process coverage tests

Implementation:
- hooks/pre-branch-gate.py (executable, 4.6K, stdlib-only):
  · Detects `git checkout -b`, `git switch -c`, `git worktree add -b` via shlex tokenisation. Handles git global options before the subcommand (`git -c k=v ...`, `git --git-dir=X ...`, `git -C /p ...`).
  · `extract_branch_slug()` returns None for non-branch commands (`git status`, `git branch <x>`, `git worktree list`, etc.).
  · On a branch-creation command: sanitizes the slug (`/` → `_`) and checks `.claude/branch-approvals/<sanitized>.approved`.
  · Marker present → allow (silent, exit 0) + append an allow event to both logs.
  · Marker absent → deny (exit 2) with a `decisionReason` containing: exact marker path, suggested `touch` command, textual reference to MASTER_PLAN.md, and the blocked command.
  · Silent pass-through on all non-branch Bash, all non-Bash tools, missing/empty fields, and shlex-unparseable commands.
  · Malformed JSON stdin → deny (exit 2).
  · Double logging: `.claude/logs/pre-branch-gate.jsonl` + `.claude/logs/phase-gates.jsonl` (event: branch_creation).

Test suite (55 passing, 99% coverage on pre-branch-gate.py):
- 23 subprocess integration tests (pre-existing, from commit 1).
- 32 in-process unit tests added for coverage visibility, covering:
  · sanitize_slug: 3 cases.
  · extract_branch_slug: 20 branches (all subcommand/flag/global-opt combinations + negative cases).
  · build_deny_reason: shape assertions.
  · main() direct calls with monkeypatched chdir + stdin: 8 paths (malformed, non-dict, non-bash, missing input, empty command, non-branch, branch-with-marker, branch-without-marker).
- The single uncovered line (175) is `sys.exit(main())` under `if __name__ == "__main__":`, which is not reachable from in-process tests and is intrinsic to script entry.

Run locally: .venv/bin/pytest hooks/tests/ --cov=hooks --cov-report=term-missing

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(d1): docs-sync - mark D1 ✅ + record adjustments vs plan

- ROADMAP.md:
  · Phase D row: ⏳ pending → ⏳ partial (D1 ✅).
  · feat/d1-hook-pre-branch-gate row: ⏳ → ✅ (PR pending).
  · New section "Progreso Fase D" with deliverables + adjustments vs the original plan (scope widened to worktree add -b, no bypass env var, no hooks/_lib/, test-env bootstrap, in-process tests for coverage).
- HANDOFF.md:
  · § Snapshot: current phase C5 → D1 closed. Next D1 → D2.
  · § Gotchas: pre-branch-gate.py "does not exist yet" → live since D1; the remaining hooks (session-start, pre-write-guard, post-action, pre-compact, stop-policy-check) are still absent and tolerated as stubs.
  · § Próxima rama: rewritten D1 → D2 with scope + minimum reading.
  · § Estado C5 → Estado D1: summary of the deliverable + "what D1 does NOT do" + notes for D2 (consolidated hook pattern, signal to extract hooks/_lib/ in D2 when the 2nd repetition appears).
- MASTER_PLAN.md § Rama D1:
  · Status: ✅ COMPLETED (PR pending).
  · Delivered scope (actual detail) + adjustments vs the original plan (scope, parsing, logging, decision reason, deferrals).
- docs/ARCHITECTURE.md § 7 Capa 1:
  · Reference to hooks/pre-branch-gate.py as the canonical implementation of the enforcer-hook pattern (shebang + stdlib-only + silent pass-through + shlex parsing + double log shape).
- .claude/rules/hooks.md:
  · New sub-section "Primer hook entregado" with the consolidated structure: silent pass-through, shlex, local sanitization (no helper yet), constructive decisionReason, double log, test pattern (subprocess integration + in-process unit via importlib.util, needed because of the hyphen in the file name).
Local tests: .venv/bin/pytest hooks/tests/ --cov=hooks → 55 passed, 99% coverage on hooks/pre-branch-gate.py.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(d1): simplify hook + close review gap on git global opts

- _flag_value(): extract shared -b/-c lookup (was duplicated across checkout/switch/worktree).
- log(): collapse dual-log (hook-scoped + phase-gates) into a single local helper in main().
- GIT_GLOBAL_OPTS_WITH_ARG: add --exec-path and --upload-pack (pre-commit-review gap: the space form of these options previously consumed the subcommand as the argument, causing a detection miss on `git --exec-path /x checkout -b slug`).
- Tests: +2 cases covering the new global opts (space form), 57 passed, 99% coverage (line 166 `sys.exit(main())` is the only miss; __main__-gated, intrinsic).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(d1): review follow-up - tool_input guard, docs alignment, CI job

Closes the 3 points raised in PR #11 review (Copilot + human feedback).

1) Hook `tool_input` validation (BLOCKER):
- `tool_input is None` → `{}` (pass-through).
- non-dict `tool_input` (list/string/etc.) → `deny` exit 2 with `decisionReason`. Previously `payload.get("tool_input") or {}` + `.get()` would raise `AttributeError` on list/string payloads, crashing the hook instead of responding with a controlled contract. +3 in-process tests (null, list, string) cover the new branches.

2) Docs alignment with actual safe-fail policy:
- `docs/ARCHITECTURE.md §7` and `.claude/rules/hooks.md`: explicit that a malformed payload (empty stdin, invalid JSON, top-level non-dict, non-dict `tool_input` for a Bash call) maps to `deny` exit 2 + `decisionReason`, NOT pass-through. This matches the actual hook behavior and establishes a canonical policy for D2..D6 hooks.
3) CI coverage (IMPORTANT): - New `python` job in `.github/workflows/ci.yml` — matrix `ubuntu-latest` + `macos-latest` × Python `3.10` + `3.11`, running `pytest hooks/tests -q --cov=hooks --cov-report=term-missing`. Pins `actions/setup-python@v5.6.0` by SHA per `.claude/rules/ci-cd.md §Reglas duras #2`. D1 now enforced by real CI, not "passed locally". Totals: 60 tests (was 57), 99% coverage maintained (same single uncovered line `sys.exit(main())` under `__main__`). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Update ROADMAP.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * docs(d1): close count drift + CI contract gap flagged in review Follow-up to the second Copilot review pass on PR #11. All local repo edits — no hook/test code changes. 1) Count drift (Copilot: ROADMAP.md:225 + HANDOFF.md:141): - ROADMAP.md: remove hardcoded "55 tests en 8 clases" for the D1 test suite. Continues the pattern set by the user's own UI edit in ecdcbbc (removed "32 unit tests" from line 237) — docs describe the suite by shape, not by a brittle number that drifts with every new test. - HANDOFF.md §10: reflect the actual safe-fail contract (malformed stdin → deny, non-dict tool_input → deny) instead of the outdated "stdin vacío/malformado → exit 2 sin crash pero no loggea" bullet, and drop the "55 tests pytest" number. 2) CI contract gap (Copilot: ci.yml:74 — mypy/ruff declared in policy.yaml:68-74 + ci-cd.md:21-24 but not in the workflow): - policy.yaml.pre_push: inline comment clarifies that `command_meta` declares the aspirational contract; actual enforcement lands incrementally in CI + pre-pr-gate.py (Fase D4). Lists which checks are live today (tsc, vitest, pytest hooks) and which are deferred (mypy hooks, eslint, prettier, ruff) so the doc no longer reads as a broken promise. 
- .claude/rules/ci-cd.md §Workflows obligatorios §1: split into "Aterrizado" vs "Diferidos a rama dedicada", matching the actual state, plus an invariant that future branches adding a check must also move the bullet. Keeps the rule honest and makes drift explicit going forward. Scope preserved: no mypy/ruff added to CI (D1 Fase -1 explicitly excluded them). The fix is docs/contract-alignment, not tooling expansion. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Javier <javier.abril@glassnode.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
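The D1 commits above describe the matcher only in prose. As a reading aid, here is a minimal sketch of how that detection could look. It reuses the names mentioned in the commit messages (`sanitize_slug`, `_flag_value`, `extract_branch_slug`, `GIT_GLOBAL_OPTS_WITH_ARG`), but the bodies are assumptions reconstructed from the description, `marker_path` is a hypothetical helper, and the option table is only the subset named in the messages, not necessarily the hook's real one.

```python
import shlex
from pathlib import Path

# Assumption: only the space-form global options named in the commit messages.
GIT_GLOBAL_OPTS_WITH_ARG = {"-c", "-C", "--git-dir", "--exec-path", "--upload-pack"}


def sanitize_slug(slug: str) -> str:
    """Map a branch slug to a marker-safe file name ('/' becomes '_')."""
    return slug.replace("/", "_")


def _flag_value(tokens, flags):
    """Return the token following the first occurrence of any flag, else None."""
    for i, tok in enumerate(tokens):
        if tok in flags and i + 1 < len(tokens):
            return tokens[i + 1]
    return None


def extract_branch_slug(command: str):
    """Return the new branch name for branch-creating git commands, else None."""
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes: treated as "not a branch command"
        return None
    if not tokens or tokens[0] != "git":
        return None
    i = 1
    # Skip global options that precede the subcommand.
    while i < len(tokens) and tokens[i].startswith("-"):
        i += 2 if tokens[i] in GIT_GLOBAL_OPTS_WITH_ARG else 1
    rest = tokens[i:]
    if not rest:
        return None
    sub, args = rest[0], rest[1:]
    if sub == "checkout":
        return _flag_value(args, {"-b"})
    if sub == "switch":
        return _flag_value(args, {"-c"})
    if sub == "worktree" and args[:1] == ["add"]:
        return _flag_value(args[1:], {"-b"})
    return None


def marker_path(slug: str) -> Path:
    """Hypothetical helper: approval marker location for a branch slug."""
    return Path(".claude/branch-approvals") / f"{sanitize_slug(slug)}.approved"
```

In this sketch a caller would deny with exit 2 when `extract_branch_slug` returns a slug and `marker_path(slug)` does not exist, and pass through silently otherwise.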



* test(d4): kickoff - failing suite for pre-pr-gate hook (docs-sync enforcement on gh pr create)

## Kickoff - D4 (feat/d4-hook-pre-pr-gate)

Context: continuation after the D3 merge (9aed1ee on main) and PR #14 docs (fx Knowledge Plane. Phase -1 executed and approved in the same session (v1 rejected for inflated scope; v2 trimmed and approved).

Closed decisions: only as trigger; docs-sync as the only real enforcement; skills_required + ci_dry_run_required + invariants_check as a non-blocking advisory scaffold; no pyyaml; no D3 hardcode→policy migration; pre-write-guard.py untouched; merge-base HEAD main as the baseline; git unavailable / base unresolved → pass-through with an explicit advisory log (not silent); empty diff → deny exit 2 with a message distinct from the docs-sync one.

### Scope
- New blocker hook (shape D1, not D2). PreToolUse(Bash) matcher only. deferred.
- Real docs-sync enforcement (blocker exit 2) with rules hardcoded in the hook (textual mirror of ; the migration to policy-driven rules is handled in a dedicated policy-loader branch together with D3's hardcoded paths):
  * baseline: ROADMAP.md + HANDOFF.md in the diff.
  * conditional: generator/** | hooks/** → docs/ARCHITECTURE.md; skills/** → .claude/rules/skills-map.md; .claude/patterns/** → docs/ARCHITECTURE.md.
- Non-blocking advisory scaffold (logged, does not deny). Can be activated without a shape change once its dedicated branches supply the substrate:
  * skills_required → skills not yet landed (Phase E*).
  * ci_dry_run_required → ci_dry_run deferred to a dedicated branch.
  * invariants_check → invariants directory empty, deferred.
- Pass-through + advisory log (not silent) on: main / master / detached HEAD; git unavailable / not a repo; merge-base HEAD main does not resolve.
- Empty diff → deny exit 2 with reason empty PR (does NOT mention docs-sync).
- Double log: + (event ) on real decisions (allow/deny). Advisory skips go only to the hook log, NOT to phase-gates. Silent pass-through (no match) writes no log (same pattern as D1/D3).
- Canonical D1 blocker safe-fail: empty stdin / malformed JSON / top-level non-dict / non-dict tool_input → deny exit 2. Non-string or empty command / shlex-unparsable → pass-through.

### Files to create in the branch
- hooks/pre-pr-gate.py (Phase 2, GREEN; not in this commit).
- hooks/tests/test_pre_pr_gate.py (this commit, RED).
- hooks/tests/fixtures/payloads/gh_pr_create.json (this commit).
- hooks/tests/fixtures/payloads/gh_pr_create_draft.json (this commit).
- hooks/tests/fixtures/payloads/gh_pr_list.json (this commit).
git_status.json and non_bash.json, inherited from D1/D2, are reused.

### Files explicitly NOT touched (documented deferrals)
- hooks/pre-write-guard.py: no migration of hardcoded paths to policy.yaml.
- policy.yaml: no new key; stays declarative and unparsed.
- hooks/_lib/: no policy.py; zero new helpers.
- requirements-dev.txt: no pyyaml (explicit D4 scope blocker).

### Risks
- Tests with a real-git subprocess setup (git init + config + commits) are heavier than D3's (D3 did not need git). Mitigation: fixture encapsulates the setup; helpers / keep the tests readable.
- Detached HEAD returns HEAD from . It is treated as an explicit skip (not gated, no implicit deny).
- requires a local main to be present. If main was deleted → advisory skip (tested explicitly).

### Test plan (Phase 1, this commit)
- TestMatcherDetection (11): gh pr create + variants (title/draft/body/base) → gate; gh pr list/view, gh issue create, git status, git push, non-Bash → pass-through.
- TestBranchSkip (3): main, master, detached → advisory skip + no phase log.
- TestGitUnavailable (1): cwd without a git repo → advisory skip.
- TestMergeBaseUnresolved (1): main deleted → advisory skip with reason merge-base.
- TestEmptyDiff (2): empty PR → deny + message without docs-sync; reason includes the base.
- TestDocsSyncBaseline (4): ROADMAP / HANDOFF / both missing → deny; both present with no conditional → allow.
- TestDocsSyncConditional (9): generator/hooks/skills/patterns triggers; multi-conditional dedup of ARCHITECTURE.md; tests/** outside the conditional rules.
- TestDecisionReason (3): reason mentions CLAUDE.md + docs-sync; triggering paths listed; cap at 3 with a "more" indicator.
- TestAdvisoryLogs (4): deny / allow / empty-diff → 3 deferred entries; skip → 0 deferred entries.
- TestLogging (8): double log only on real decisions; skip goes only to the hook log; no-match / non-Bash / gh pr list → zero log; entry shape.
- TestRobustness (11): canonical D1 blocker safe-fail.
- TestIsGhPrCreateUnit (14): matcher classifier in-process.
- TestCheckDocsSyncUnit (13): docs-sync classifier in-process.
- TestMainInProcess (12): coverage of paths that subprocess runs do not measure.

Target: 96 tests. Coverage ≥90% on pre-pr-gate.py, ≥90% combined for hooks/. D1 (60) + D2 (66) + D3 (83) = 209 tests intact. Expected total: 305.

RED state now: 47 failed + 10 passed + 39 skipped. The 39 skipped are @needs_hook (in-process) tests that activate once the module exists. The 10 passed are false positives: returns exit 2, which coincides with the deny exit 2 expectation of the gated tests; once the implementation lands, those 10 must keep passing with the correct deny produced by real logic, and the 47 failed must turn into passes.

### Docs plan (Phase N+3)
- ROADMAP.md: D4 row ✅.
- HANDOFF.md: §1 current phase updated (it carries stale text from after the D3 PR #13 merge); §9 next branch → D5; §10 renamed Estado D4 with a summary.
- MASTER_PLAN.md § Rama D4: Status ✅ + adjustments vs the original plan (v2 scope cut: hardcoded rules, gh pr create only, advisory skills/CI/invariants, D3 migration deferred to the policy-loader branch).
- docs/ARCHITECTURE.md §7: pre-pr-gate as the 4th canonical hook in Capa 1.
- .claude/rules/hooks.md: section Cuarto hook entregado - pre-pr-gate (D4).
- policy.yaml: not touched in D4 (declarative contract with no real parsing enforcer; handled in a dedicated branch).
- pre-write-guard.py: not touched in D4.
Context traceability: session started from main after the merge of D3 PR #13 (and PR #14 docs Knowledge Plane). Neither /clear nor /compact was used in this session, so there is no resume prompt to reference. Marker: .claude/branch-approvals/feat_d4-hook-pre-pr-gate.approved (gitignored by design, same as D1/D2/D3).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
EOF )

* docs(d4): restore kickoff commit message context (e73416b)

The kickoff commit e73416b ended up with a damaged message: the inline backticks inside the HEREDOC (cat <<'EOF' ... EOF) were interpreted by the outer $(...) as command substitution and replaced with an empty string. The committed code (4 files, 903 insertions) is correct; only the inline-code spans of the message were lost.

This commit restores, without rewriting history, the textual references that were left blank in e73416b (backticks are avoided throughout this commit):

Closed decisions
----------------
- Single trigger for the hook: "gh pr create" (not gh issue, not git push).
- Explicit branch skip: main / master / detached-HEAD.
- Hook hooks/pre-write-guard.py (D3) is not touched in D4.

Scope
-----
- New blocker hook: hooks/pre-pr-gate.py (shape D1, not D2). PreToolUse(Bash) matcher on command == "gh pr create" + flags.
- Textual mirror (hardcoded in the hook) of:
    policy.yaml -> lifecycle.pre_pr.docs_sync_baseline
    policy.yaml -> lifecycle.pre_pr.docs_sync_conditional
  Migration to a declarative parser is deferred to a dedicated policy-loader branch.
- Double log:
    .claude/logs/pre-pr-gate.jsonl (the hook's own shape)
    .claude/logs/phase-gates.jsonl (event "pre_pr")

Risks
-----
- Tests with a real-git subprocess setup. The helpers _git and _gh_pr_create_payload encapsulate init + commits.
- Detached HEAD returns the literal HEAD from git rev-parse --abbrev-ref HEAD -> treated as an advisory skip, not as gated.
- Baseline resolution requires a local main to be present: git merge-base HEAD main -> if main was deleted, advisory skip with reason "merge-base".
Context traceability
--------------------
- Session started without /clear from main, after the merge of PR #13 (D3) and PR #14 (docs Knowledge Plane).
- Marker: .claude/branch-approvals/feat_d4-hook-pre-pr-gate.approved (gitignored by design, same as D1/D2/D3).

Empty follow-up commit (--allow-empty); zero code changes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(d4): impl hooks/pre-pr-gate.py - docs-sync enforcer on gh pr create

Phase 2 (GREEN). Blocker hook (shape D1) that enforces CLAUDE.md rule #2 inside the branch, on the "gh pr create" trigger.

Behavior
--------
- Matcher: shlex.split(command); gates only when tokens[:3] == ["gh","pr","create"] (covers --draft / --title / --body / --base). Everything else (gh pr list, gh pr view, gh pr edit, gh issue create, git push, git status, non-Bash) → silent pass-through, zero log.
- Advisory skip (pass-through + entry in the hook log, NOT the phase log):
  * branch main / master / detached HEAD
  * git unavailable (cwd is not a repo)
  * merge-base HEAD main not resolvable (main deleted locally)
- Empty diff (HEAD == base) → deny exit 2 with a dedicated "empty PR" reason that does NOT mention docs-sync / ROADMAP / HANDOFF.
- Docs-sync check (hardcoded rules, mirror of policy.yaml.lifecycle.pre_pr.docs_sync_*):
    baseline   : ROADMAP.md + HANDOFF.md always.
    conditional: generator/**          → docs/ARCHITECTURE.md
                 hooks/** (not tests/) → docs/ARCHITECTURE.md
                 skills/**             → .claude/rules/skills-map.md
                 .claude/patterns/**   → docs/ARCHITECTURE.md
  Missing docs → deny exit 2. Dedupe: ARCHITECTURE.md appears only once even when several prefixes require it. Triggering paths are capped at 3 per doc in the reason, with the suffix "... (+N more)" when there are more than 3.
- Canonical D1 blocker safe-fail: empty stdin / invalid JSON / top-level non-dict / non-dict tool_input → deny exit 2. Command missing / non-string / empty / shlex-unparsable → pass-through 0.
- Double log on decisions (allow/deny):
    .claude/logs/pre-pr-gate.jsonl {ts, hook, command, decision, reason}
    .claude/logs/phase-gates.jsonl {ts, event:"pre_pr", decision}
  + 3 entries with status:"deferred" in the hook log for each real decision (skills_required, ci_dry_run_required, invariants_check). These are NOT emitted on skip or pass-through; the tests require them only for gated decisions.

Simplify pass (N+1)
-------------------
- Docstring: 10 → 6 lines (external references redundant with the kickoff).
- _conditional_triggers docstring: removed (private, self-explanatory name).
- main(): missing, _triggers → missing, _ (unused variable left unnamed).

Tests
-----
- 96/96 green in hooks/tests/test_pre_pr_gate.py (47 previously failed now pass, 10 previous false positives confirmed as real, 39 @needs_hook unblocked now that the module is available).
- Full hooks/ suite: 317 passed (D1+D2+D3+D4). Zero regressions.
- Coverage on pre-pr-gate.py: 93% (the 10 uncovered lines are the except FileNotFoundError/SubprocessError in _run_git and one out.strip()=="" branch; above the 90% target).

Explicit deferrals documented in Phase -1 (not touched in D4)
-------------------------------------------------------------
- Migration of hardcoded rules → policy.yaml parser (dedicated branch).
- Migration of D3 paths (pre-write-guard) → policy-driven (same branch).
- Matcher for git push --force (not gated by D4).
- pre-write-guard.py intact (zero edits).
- policy.yaml intact (zero edits).
- requirements-dev.txt intact (no pyyaml).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(d4): in-branch docs-sync - ROADMAP + HANDOFF + MASTER_PLAN + ARCHITECTURE + rules/hooks

Syncs the 5 canonical docs with the final state of the D4 branch (CLAUDE.md rule #2, Phase N+3). The hook introduced by this very branch (hooks/pre-pr-gate.py) requires this sync before allowing gh pr create, so the blocker is dogfooded.
- ROADMAP.md: the D4 row moves to closed with its deliverables listed (96 tests, 93% coverage, double log, advisory scaffold deferred). Phase D keeps D5..D6 next in the queue.
- HANDOFF.md: section 1 reflects D4 closed in-branch; section 7 adds the live pre-pr-gate to the gotchas list; section 9 points the next branch to D5; section 10 summarizes the closing numbers.
- MASTER_PLAN.md section Rama D4: delivered scope + adjustments vs the original plan (hardcoded rules as a mirror of policy.yaml, docs-sync as the only enforcement, advisory skip, dedicated reason for an empty diff, canonical D1 safe-fail, _lib reuse with no new helpers, 3 simplify cuts).
- docs/ARCHITECTURE.md section 7: third blocker application canonicalized alongside D1 and D3; shlex matcher, baseline + conditional docs-sync, deferred advisory, advisory skip, dedicated empty-diff reason.
- .claude/rules/hooks.md: section Cuarto hook entregado with the contract, docs-sync rules in tables + _lib reuse + 96 tests / 93% cov.

No code changes in this commit. The hook will keep approving its own PR because ROADMAP.md and HANDOFF.md appear in the main..HEAD diff together with docs/ARCHITECTURE.md, which is required by the hooks/ paths touched in Phase 2.

* fix(d4): review PR#15 - distinguish empty-diff from diff-unavailable + rename docs key

Addresses the 5 inline comments from the PR#15 review plus the 3 explicit items from the user. Triage: 5 FIX (1 BLOCKER + 4 trivial/docs), 0 SKIP, 0 DISCUSS.

BLOCKER (code): hooks/pre-pr-gate.py
- diff_files returned [] both when the diff was empty and when the git subprocess failed (timeout, FileNotFoundError, returncode != 0). In main that was treated as an empty PR and produced a deny: a false deny on transient git failures.
- Change: diff_files now returns list[str] | None. None = git unavailable (advisory skip with status: skipped, reason: git diff unavailable). [] = genuinely empty diff (deny with the dedicated empty PR reason). There are no call sites of diff_files outside the hook.
Docs/naming:
- policy.yaml.lifecycle.pre_pr exposes the key as docs_sync_required, not docs_sync_baseline. Three references aligned:
  - MASTER_PLAN.md section Rama D4, hardcoded-rules bullet.
  - .claude/rules/hooks.md section Cuarto hook, hardcoded-rules bullet.
  - docs/ARCHITECTURE.md section 7, docs-sync bullet + description of the actual git command (merge-base HEAD main + diff --name-only base HEAD, not diff main..HEAD).

Deliberate hooks/tests/ divergence (docs-only, no logic change):
- The hook's CONDITIONAL_RULES excludes hooks/tests/, while policy.yaml lists hooks/** uniformly. The hook has the correct logic (editing tests does not alter the architecture); the policy is left looser. Recorded as an explicit D4 decision in:
  - hooks/pre-pr-gate.py (comment above CONDITIONAL_RULES).
  - MASTER_PLAN.md section Rama D4 (new divergence bullet).
  - .claude/rules/hooks.md section Cuarto hook (new bullet).
  - docs/ARCHITECTURE.md section 7 (new bullet).
  Hook ↔ policy convergence is deferred to the policy-loader branch, where the loader will decide whether it represents granular exclusions or the policy becomes more specific.

Tests: 322 passed (317 pre-fix + 5 new in TestDiffUnavailable). Coverage on hooks/pre-pr-gate.py rises to 94% (+1% vs the D4 baseline). TestDiffUnavailable includes unit tests of diff_files with monkeypatch + in-process main tests verifying that the skip emits neither a phase-gate entry nor a false empty PR deny.

* docs(d4): align wording - docs_sync_required + git merge-base phrasing + 322 tests

Post-review sweep requested by reviewer:
- Strip ambiguous `docs_sync_*` wildcard; use exact `policy.yaml.lifecycle.pre_pr.docs_sync_required` + `docs_sync_conditional` everywhere.
- Replace `git diff main..HEAD` references (wrong for D4 impl) with `git merge-base HEAD main` + `git diff --name-only <base> HEAD`. Session-start (D2) references to `main..HEAD` preserved (hook literally uses `{base}..HEAD` there).
- Make `hooks/tests/` deliberate-divergence note explicit in MASTER_PLAN, rules/hooks.md, docs/ARCHITECTURE.md, ROADMAP. - Bump test counts to 322 / 101 and coverage ≥94% after adding TestDiffUnavailable. No logic change; docstring + docs only. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Javier <javier.abril@glassnode.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
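The docs-sync rules that the D4 commits describe (baseline docs, conditional docs per path prefix, the `hooks/tests/` exclusion, dedupe of ARCHITECTURE.md, and the 3-path cap in the reason) can be sketched as follows. This is an illustration built from the commit text, not the code in hooks/pre-pr-gate.py: `CONDITIONAL_RULES` is named in the messages, while `BASELINE_DOCS`, `EXCLUDED_PREFIXES`, `check_docs_sync` and `format_triggers` are hypothetical names, and plain prefix matching is assumed in place of real glob handling.

```python
import shlex

BASELINE_DOCS = ("ROADMAP.md", "HANDOFF.md")  # assumed name

# (path prefix, doc that must also appear in the diff)
CONDITIONAL_RULES = (
    ("generator/", "docs/ARCHITECTURE.md"),
    ("hooks/", "docs/ARCHITECTURE.md"),
    ("skills/", ".claude/rules/skills-map.md"),
    (".claude/patterns/", "docs/ARCHITECTURE.md"),
)
EXCLUDED_PREFIXES = ("hooks/tests/",)  # assumed name; editing tests is not architecture


def is_gh_pr_create(command: str) -> bool:
    """Gate only when the first three tokens are gh, pr, create."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False
    return tokens[:3] == ["gh", "pr", "create"]


def check_docs_sync(changed):
    """Return (missing_docs, triggers) for the changed paths of a branch diff."""
    changed_set = set(changed)
    missing = [doc for doc in BASELINE_DOCS if doc not in changed_set]
    triggers = {}  # required doc -> paths that triggered the requirement
    for path in changed:
        if path.startswith(EXCLUDED_PREFIXES):
            continue
        for prefix, doc in CONDITIONAL_RULES:
            if path.startswith(prefix):
                triggers.setdefault(doc, []).append(path)
    for doc in triggers:
        # Dedupe: a doc is reported once even if several prefixes require it.
        if doc not in changed_set and doc not in missing:
            missing.append(doc)
    return missing, triggers


def format_triggers(paths, cap=3):
    """List at most `cap` triggering paths, with a '... (+N more)' suffix."""
    shown = ", ".join(paths[:cap])
    extra = len(paths) - cap
    return shown + (f" ... (+{extra} more)" if extra > 0 else "")
```

Per the PR#15 fix, a caller would first separate "diff unavailable" (`None`, advisory skip) from "diff empty" (`[]`, dedicated empty-PR deny), and only run a check like this on a non-empty list.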

…ection) (#16)

* test(d5): kickoff - failing suite for post-action hook (PostToolUse compound trigger)

## Kickoff - D5 (feat/d5-hook-post-action-compound)

Context: continuation after the D4 merge (992137f on main). Phase -1 executed and approved in this session (v1 delivered with B=gh-pr-merge included; v2 trimmed to remove B after confirming that tool_response.exit_code is not guaranteed in PostToolUse(Bash) by the official Claude Code documentation).

Closed decisions:
- Final matchers: A (git merge) + C (git pull without --rebase). B excluded.
- Hardcoding, 3rd application (policy-loader stays deferred until after D5/D6).
- Literal mirror of policy.yaml.lifecycle.post_merge.skills_conditional[0].trigger (TRIGGER_GLOBS + SKIP_IF_ONLY_GLOBS + MIN_FILES_CHANGED=2).
- PostToolUse is non-blocking. Always exit 0. Never permissionDecision.
- No real skill dispatch (future E3a); only additionalContext + advisory log.
- Coverage ≥90% lines / ≥85% branches on post-action.py.
- Web UI merges are out of scope (not observable via Bash); the user's pull captures them once the code lands locally.

### Scope
New PostToolUse(Bash) hook: hooks/post-action.py. Its shape is related to the D1 blocker (shlex, double log, importlib in-process) but it is NOT a blocker; it never denies. Hierarchical detection strategy:
- Tier 1 (command match, shlex-parsed):
  * A = tokens[:2] == ["git","merge"] and tokens[2:3] ∉ {--abort, --quit, --continue, --skip}.
  * C = tokens[:2] == ["git","pull"] and "--rebase"/"-r" absent.
  * Everything else → silent pass-through (zero log, zero stdout).
- Tier 2 (deterministic post-hoc reflog):
  * git reflog HEAD -1 --format=%gs.
  * A requires the prefix "merge ".
  * C requires the prefix "pull:" or "pull " (without "--rebase").
  * Failure → status "tier2_unconfirmed" (advisory log; phase-gates untouched).
- Derivation of touched_paths:
  * git diff --name-only HEAD@{1} HEAD → list[str] | None.
  * None → status "diff_unavailable" (advisory log; phase-gates untouched).
* [] → status "confirmed_no_triggers" (ambos logs; sin additionalContext). - Mirror hardcoded (policy.yaml L105-120): * TRIGGER_GLOBS: generator/lib/** | generator/renderers/** | hooks/** | skills/** | templates/**/*.hbs. * SKIP_IF_ONLY_GLOBS: docs/** | *.md | .claude/patterns/**. * MIN_FILES_CHANGED: 2. * Match se emite sólo si: len ≥ 2 AND NOT all-skip_if_only AND al menos 1 path matchea un TRIGGER_GLOBS entry. - Emisión additionalContext (4 condiciones simultáneas): 1. Tier 1 match. 2. Tier 2 confirmado. 3. touched_paths no None y len ≥ MIN_FILES_CHANGED (=2). 4. match_triggers() devuelve lista no vacía. Mensaje: triggers matcheados + touched paths (cap 3 con "... (+N more)") + sugerencia literal "/pos:compound". NUNCA intenta dispatch. - Double log (espejo D1/D3/D4): * .claude/logs/post-action.jsonl (shape propio por status). * .claude/logs/phase-gates.jsonl evento "post_merge" SÓLO en confirmed_triggers_matched + confirmed_no_triggers (Tier 2 ok). tier2_unconfirmed y diff_unavailable van sólo al hook log. - Safe-fail PostToolUse (no es blocker D1 canonical): stdin vacío / JSON inválido / top-level no-dict / tool_input no-dict / command no-string / shlex error → exit 0, sin log, sin stdout. - Reuso hooks/_lib/: append_jsonl + now_iso. Sin helpers nuevos (regla #7 — añadir sólo si ≥2 hooks consumen el nuevo helper). ### Archivos Este commit (kickoff RED): - hooks/tests/test_post_action.py — 111 tests (38 failed + 73 skipped en RED; los 38 subprocess fallan porque el hook no existe, los 73 in-process @needs_hook skippean hasta que el módulo se pueda importar). 
- hooks/tests/fixtures/payloads/git_merge.json
- hooks/tests/fixtures/payloads/git_merge_no_ff.json
- hooks/tests/fixtures/payloads/git_merge_abort.json
- hooks/tests/fixtures/payloads/git_pull.json
- hooks/tests/fixtures/payloads/git_pull_rebase.json
- hooks/tests/fixtures/payloads/gh_pr_merge.json
- hooks/tests/fixtures/payloads/git_rebase.json

Fase 2 (GREEN implementation):
- hooks/post-action.py: the hook (classify_command, reflog_message, reflog_confirms, touched_paths, match_triggers, emit helpers, main).

Fase N+3 (docs-sync):
- ROADMAP.md: row D5 ✅.
- HANDOFF.md: §1 current phase → D5 closed; §9 next branch → D6; §10 renamed to Estado D5.
- MASTER_PLAN.md § Rama D5: status ✅ + adjustments vs the original plan (B out, matchers A+C confirmed, tiered emission).
- docs/ARCHITECTURE.md §7: post-action.py as the 5th canonical hook in Capa 1 (first PostToolUse; a variant of the blocker shape: not a blocker, exit 0).
- .claude/rules/hooks.md: section "Quinto hook entregado — post-action".
- policy.yaml: not touched (hardcoded mirror; the section already exists).

### Tests (the matrix the suite pins down)

- TestMatcherDetection (21 cases): Tier 1 for A/C + exclusions (abort, quit, continue, skip, rebase, rebase shorthand, gh pr merge, cherry-pick, rebase, status, push, empty strings, shlex-unparsable input).
- TestTier2Reflog (15 cases): reflog_message on real repos (after_merge / after_ff_merge / after_pull / clean_repo / non-repo); reflog_confirms truth table by kind × message.
- TestTouchedPaths (5 cases): git diff HEAD@{1} HEAD on each repo type + the edge case with no previous reflog entry.
- TestPolicyConstants (3 cases): literal verification of the mirror.
- TestMatchTriggers (15 cases): min_files, skip_if_only semantics (all vs any), policy-driven ordering, dedupe, templates with/without a subdir.
- TestIntegrationMergeTriggersMatch (4 cases): end-to-end real merge.
- TestIntegrationPullTriggersMatch (3 cases): end-to-end real pull (upstream/src/local topology).
- TestIntegrationMergeFF (1 case): an ff-merge also emits.
- TestIntegrationTier2Unconfirmed (2 cases): command vs reflog mismatch.
- TestIntegrationConfirmedNoTriggers (2 cases): docs-only merge + single-file merge (min_files).
- TestIntegrationDiffUnavailable (1 case, delegates to TestMainInProcess).
- TestNonMatcherPassthrough (6 cases): gh pr merge, git rebase, pull --rebase, merge --abort, git status, non-Bash tool.
- TestSafeFail (10 cases): empty / malformed JSON / top-level list or string / missing tool_name / non-Bash / non-dict tool_input / non-string or empty command / shlex error.
- TestAdditionalContextShape (5 cases): content emitted on stdout.
- TestMainInProcess (13 cases): fine-grained coverage of main() via monkeypatch (includes a forced diff_unavailable).
- TestLogShape (3 cases): jsonl shape per status.
- TestIdempotence (2 cases): 2 runs → 2 entries, both emit context.

Total: 111 tests. Initial RED state: 38 failed (subprocess) + 73 skipped (in-process @needs_hook; they un-skip once post-action.py can be imported via importlib).

### Docs plan (Fase N+3)

- ROADMAP.md: row D5 ✅ (Fase D closed after merge: 5/5 hooks).
- HANDOFF.md: §1 current phase, §9 next branch → D6, §10 Estado D5.
- MASTER_PLAN.md § Rama D5: Status ✅ + adjustments (B out, hierarchical matchers, tiered emission).
- docs/ARCHITECTURE.md §7: 5th block in Capa 1 (post-action). First documented non-blocking PostToolUse variant.
- .claude/rules/hooks.md: "Quinto hook entregado — post-action (D5)".
- policy.yaml: untouched (section L105-120 already exists and is mirrored).

Context traceability: session started from main after the D4 merge (PR #15). Neither /clear nor /compact was used, so there is no resume prompt to reference.

Marker: .claude/branch-approvals/feat_d5-hook-post-action-compound.approved (gitignored by design, same as D1/D2/D3/D4).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(d5): impl hooks/post-action.py - PostToolUse compound trigger (GREEN)

Fifth hook of the pos plugin.
First application of the non-blocking PostToolUse pattern: always exit 0, never emits permissionDecision.

Hierarchical detection:
- Tier 1: shlex-parse of the Bash command. Matcher A = `git merge <ref>` (excludes --abort/--quit/--continue/--skip). Matcher C = `git pull` (excludes --rebase/-r).
- Tier 2: post-hoc confirmation via `git reflog HEAD -1 --format=%gs`. A expects the prefix "merge "; C expects "pull:" or "pull " (and not "pull --rebase"). This avoids firing on `git merge --abort`, or when the pull was really a rebase even though the shell command did not say so.

When both tiers confirm, the hook derives the touched paths via `git diff --name-only HEAD@{1} HEAD` and runs fnmatch against a literal mirror of policy.yaml.lifecycle.post_merge.skills_conditional[0].trigger: TRIGGER_GLOBS (generator/lib, generator/renderers, hooks, skills, templates/**/*.hbs), SKIP_IF_ONLY_GLOBS (docs/**, *.md, .claude/patterns/**), MIN_FILES_CHANGED=2. On a match it emits additionalContext suggesting `/pos:compound`. It never dispatches the skill.

Canonical double log (D1..D4 shape): post-action.jsonl + phase-gates.jsonl (event `post_merge`). Four distinct statuses: tier2_unconfirmed and diff_unavailable write only to the hook log; confirmed_no_triggers and confirmed_triggers_matched write to both.

Reuses `_lib/jsonl.append_jsonl` and `_lib/time.now_iso`. Hardcoded mirror of policy.yaml (CLAUDE.md regla #7: the two repetitions D4+D5 meet the precondition for a policy-loader in a dedicated branch).

Coverage: 97% lines on hooks/post-action.py (target ≥90%). Global hooks/** suite: 432 passed (D1+D2+D3+D4+D5, 110 new).

Closes the D5 kickoff.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(d5): sync ROADMAP/HANDOFF/MASTER_PLAN/ARCHITECTURE/rules/hooks.md

Docs-sync inside the D5 branch (Fase N+3, CLAUDE.md regla #2).

- ROADMAP: D5 marked in the table + entry feat/d5-hook-post-action-compound with the complete deliverables, the contract of 4 distinct statuses, and adjustments vs the original plan.
- HANDOFF: section 1 snapshot (D5 closed, next is D6); section 7 gotchas gains the post-action bullet (non-blocking PostToolUse, tiers, 4 statuses, advisory-only); section 9 next branch D6 with an updated minimum reading list; section 10 renamed to "Estado D5 (cerrada en rama)" with an actionable summary.
- MASTER_PLAN section Rama D5 expanded: closed status, context to read, key decisions (hierarchical detection, gh pr merge discarded, advisory-only, second policy.yaml repetition), per-status contract, adjustments, exit criterion met.
- docs/ARCHITECTURE section 7: Capa 1 goes from "two canonical variants" to "three variants" (adds non-blocking PostToolUse). New block "Implementacion canonica PostToolUse non-blocking".
- .claude/rules/hooks.md: section "Quinto hook entregado" with the pattern's shape, full contract, differences vs blocker/informative, and a note on the pre-PR simplify pass.

Tests untouched (docs-only): 432 passed + 1 skipped in hooks/**.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(d5): address PR #16 review - 2 Copilot issues fixed

1. _log_hook / _log_phase now go through _safe_append (try/except OSError). Without the wrapper, a full disk or read-only filesystem raised OSError and broke the "always exit 0" contract of the non-blocking PostToolUse pattern. Direct mirror of hooks/session-start.py::_safe_append (D2), for consistency with the second canonical pattern.
2. match_triggers moves from fnmatch.fnmatch to fnmatch.fnmatchcase. fnmatch.fnmatch applies os.path.normcase, which is case-insensitive on Windows and made the evaluation of TRIGGER_GLOBS / SKIP_IF_ONLY_GLOBS non-deterministic across operating systems. fnmatchcase removes that dependency.

Tests untouched: 110 passed + 1 skipped. No contract changes in the suite (_safe_append is private; fnmatchcase is a drop-in for the lowercase POSIX paths the tests already use).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Javier <javier.abril@glassnode.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
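
To make the D5 detection rules easier to follow, here is a minimal Python sketch of Tier 1 classification and trigger matching. The constant values and the names `classify_command` and `match_triggers` come from the commit messages above; the function bodies, return values, and the `MERGE_CONTROL_FLAGS` name are assumptions for illustration, not the code in hooks/post-action.py. Tier 2 (reflog confirmation) and logging are left out.

```python
from __future__ import annotations

import shlex
from fnmatch import fnmatchcase

# Values mirrored from the commit message above.
TRIGGER_GLOBS = (
    "generator/lib/**",
    "generator/renderers/**",
    "hooks/**",
    "skills/**",
    "templates/**/*.hbs",
)
SKIP_IF_ONLY_GLOBS = ("docs/**", "*.md", ".claude/patterns/**")
MIN_FILES_CHANGED = 2
# Hypothetical name for the excluded `git merge` control flags.
MERGE_CONTROL_FLAGS = {"--abort", "--quit", "--continue", "--skip"}


def classify_command(command: str) -> str | None:
    """Tier 1: return "merge" (A), "pull" (C), or None for pass-through."""
    try:
        tokens = shlex.split(command)
    except ValueError:  # e.g. an unterminated quote
        return None
    if tokens[:2] == ["git", "merge"]:
        if tokens[2:3] and tokens[2] in MERGE_CONTROL_FLAGS:
            return None
        return "merge"
    if tokens[:2] == ["git", "pull"]:
        if "--rebase" in tokens or "-r" in tokens:
            return None
        return "pull"
    return None


def match_triggers(paths: list[str]) -> list[str]:
    """Return the TRIGGER_GLOBS entries matched by paths, in policy order."""
    if len(paths) < MIN_FILES_CHANGED:
        return []
    only_skippable = all(
        any(fnmatchcase(p, g) for g in SKIP_IF_ONLY_GLOBS) for p in paths
    )
    if only_skippable:
        return []
    # fnmatchcase never normalizes case, so results do not depend on the OS.
    return [g for g in TRIGGER_GLOBS if any(fnmatchcase(p, g) for p in paths)]
```

In fnmatch patterns `*` also matches `/`, so `hooks/**` covers nested files, while `templates/**/*.hbs` still needs at least one literal `/` after `templates/`.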

…#17)

* chore(d5b): kickoff refactor/d5-policy-loader

## Kickoff

**Branch**: refactor/d5-policy-loader
**MASTER_PLAN phase**: D5b - policy-loader (inserted between D5 and D6)
**Type**: refactor (no observable behavior change, except the `hooks/tests/` convergence)

### Scope (Alt γ approved in Fase -1)

Unify the reading of `policy.yaml` in the 3 hooks D3/D4/D5 behind a central loader in `hooks/_lib/policy.py`. D4 + D5 supplied the 2 repetitions that CLAUDE.md regla #7 requires before abstracting; D6 will be built on the loader from the start.

### Files

New:
- hooks/_lib/policy.py
- hooks/tests/test_lib_policy.py
- hooks/tests/fixtures/policy/{minimal,full,malformed,missing-section}.yaml

Modified:
- hooks/pre-write-guard.py: consumes pre_write_rules()
- hooks/pre-pr-gate.py: consumes docs_sync_rules() + advisory_checks()
- hooks/post-action.py: consumes post_merge_trigger()
- policy.yaml: adds lifecycle.pre_write + an optional `excludes` field in docs_sync_conditional
- requirements-dev.txt: exact pyyaml pin
- ROADMAP.md + HANDOFF.md + MASTER_PLAN.md + docs/ARCHITECTURE.md + .claude/rules/hooks.md

### Fase -1 decisions (frozen)

- Alt γ (migrate all 3 hooks, no scope cut).
- (b.1) strings/globs go to YAML; test-pair derivation stays in code.
- (c.2) failure mode: policy cannot be loaded → pass-through advisory + log, never deny.
- Exact pyyaml pin (not a range).
- MASTER_PLAN location: Rama D5b, a sub-section of Fase D.
- templates/policy.yaml.hbs is NOT touched → temporary meta↔template drift, documented explicitly in docs/plan/PR. This branch must NOT be read as "the template already reflects the new shape". Convergence is deferred to its own branch, triggered by at least 1 generated project that needs the shape.

### Risks

- pyyaml is the first runtime dependency of the hooks, and so the first supply-chain change. Mitigated with an exact pin + tests in CI.
- policy.yaml is extended with new fields, so consumers outside hooks (if any) must tolerate unknown fields. Today skills/audit-session/ and skills/audit-plugin/ do not exist, so there is no real impact.
- The hooks/tests/ divergence is converged via the `excludes` field; an explicit test ensures D4 behavior is identical after the migration.

### Test plan

- hooks/tests/test_lib_policy.py (~40-60 tests): happy path, missing sections, invalid YAML, missing file, shape validation, in-process cache, `excludes`.
- Regression: the 432 D1..D5 tests run unchanged, with zero changes to the observable contract. A cross-hook test verifies outputs against the repo's real `policy.yaml`.
- Failure mode (c.2): corrupt policy → pass-through + log `policy_unavailable`.
- Coverage: _lib/policy.py ≥90% lines / ≥85% branches; global hooks/** without regression.

### Docs plan

- ROADMAP.md: row D5b + entry "Progreso Fase D".
- HANDOFF.md §1, §7, §9, §10: remove the note "policy.yaml declarado pero no enforced"; point to D6 as "built on the loader".
- MASTER_PLAN.md: new section "Rama D5b — policy-loader" with Ajustes.
- docs/ARCHITECTURE.md §7: sub-section "Loader declarativo".
- .claude/rules/hooks.md: section "Policy loader" + adjustment in D3/D4/D5 (remove "hardcoded", point to the loader). Note on the pyyaml dependency under "Runtime".
- Meta↔template drift documented explicitly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(d5b): declarative policy-loader + migrate D3/D4/D5 consumers

Closes the CLAUDE.md regla #7 precondition opened by D4 + D5 (two hardcoded mirrors of policy.yaml inside hooks). Adds hooks/_lib/policy.py as the single source of truth and migrates pre-write-guard / pre-pr-gate / post-action to consume it in the same PR.

Shape (Fase -1 decisions):
- (b.1) strings/globs declarative in YAML, derivation in Python keyed by the pattern's `label`. derive_test_pair(rel_path, label) covers two labels: hooks_top_level_py and generator_ts (two YAML entries share the generator_ts label because fnmatch's middle `/` in `**` is literal, not recursive; one entry covers top-level, the other recursive subdirs).
- (c.2) policy.yaml missing/corrupt → loader returns None → consumer hooks degrade to pass-through advisory with a `status: policy_unavailable` log entry. Never deny blindly (avoids bricking the repo on a bad YAML edit). policy.yaml changes: - New lifecycle.pre_write.enforced_patterns (3 entries). - lifecycle.pre_pr.docs_sync_conditional.hooks/** now carries excludes: ["hooks/tests/**"] — closes the deliberate D4 hook↔policy divergence. Dependency: pyyaml==6.0.2 (exact pin). First non-stdlib line in hooks/_lib/; justified in the kickoff commit. Templates intentionally NOT touched in this branch — drift meta-repo ↔ template is documented in the docs-sync commit and in the PR body. Tests: 462 passed + 1 skipped. New hooks/tests/test_lib_policy.py (57 cases); redundant TestIsEnforcedUnit / TestExpectedTestPairUnit / TestPolicyConstants removed (coverage moved into the loader suite). Coverage: _lib/policy.py 97%, pre-write-guard 93%, pre-pr-gate 93%, post-action 94%. Simplify pass: classify() in pre-write-guard now returns `label: str` instead of `(label, match_glob)` — the second element was dead. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs(d5b): sync ROADMAP/HANDOFF/MASTER_PLAN/ARCHITECTURE/hooks.md + drift note Docs-sync for refactor/d5-policy-loader (CLAUDE.md regla #2). Captures: - ROADMAP: new rama row refactor/d5-policy-loader (✅) + Progreso Fase D entry (loader shape, 3 hooks migrated, test counts, coverage). - HANDOFF: snapshot now points at D5b in-flight; new gotchas for loader and drift meta↔template; §11 Estado D5b; §9 Próxima rama updated so D6 starts consuming the loader (no new hardcode permitted). - MASTER_PLAN: new § Rama D5b sub-section under Fase D with scope, decisions, contract, ajustes, drift note and exit criteria. - docs/ARCHITECTURE §7: loader canonicalized as single source of truth for hooks consuming policy.yaml; failure mode (c.2) documented as third safe-fail variant; explicit drift note. 
- .claude/rules/hooks.md: new § Policy loader with consumer contract, failure-mode table, shape, dependency note, fnmatch middle-slash note, loader test summary, drift note. Drift meta-repo ↔ template explicitly documented in all five locations (explicit user request): templates/policy.yaml.hbs, generator/renderers/ policy.ts and snapshots were NOT touched in this branch. Projects generated with `pos` today still emit a policy.yaml with the pre-D5b shape. Reconciliation (template + renderer + snapshots + pyyaml in requirements-dev for generated Python stacks) deferred to a dedicated rama post-D6. This rama must not be read as "the template already reflects the new shape". Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(d5b): address PR #17 review — align cache contract + log status + hygiene Applied after Copilot review surfaced concrete mismatches between docs, code and log shape. Per user direction on the cache contract: option 2 (correct docs to match code) — the in-process cache is small enough and the hooks are ephemeral enough that implementing mtime/size keying would be abstraction ahead of need. Cache contract (6 Copilot comments + user's primary ask): - PR body and 5 docs said "cache keyed by path + mtime + size" with implicit invalidation on edits. Reality: load_policy() keys the cache by absolute path only. Updated hooks/_lib/policy.py docstrings to sharpen the "no implicit invalidation on edits" note, and corrected ROADMAP.md, HANDOFF.md, MASTER_PLAN.md, docs/ARCHITECTURE.md and .claude/rules/hooks.md to match. PR body edited via `gh pr edit`. Log status alignment (2 Copilot comments): - pre-pr-gate.py:_log_skip() was hardcoding status: "skipped" for every skip reason, including the policy-unavailable case — which the loader contract (and pre-write-guard / post-action siblings) emit as status: "policy_unavailable". Added optional `status` kwarg to _log_skip and pass "policy_unavailable" at the one relevant call site. 
Other skip reasons keep the default. - .claude/rules/hooks.md § Policy loader — consumer-contract example updated to reflect the new kwarg shape; aligns with the failure-mode table directly below. _safe_str_list stricter shape (1 Copilot comment): - Was silently dropping non-string entries (`["ROADMAP.md", 123]` → `["ROADMAP.md"]`), producing partial under-enforcement while still treating the policy as valid. Now returns None if any element is not a string — consistent with the "wrong-shape → None" contract the module docstring already claimed. Test-fixture hygiene (3 Copilot comments): - Three autouse `_reset_policy_cache` fixtures (test_pre_write_guard, test_pre_pr_gate, test_post_action) did unconditional sys.path.insert(0, ...) without guard or teardown. Switched to the guarded "insert only if missing + remove in teardown" pattern that test_lib_policy.py already uses. One test adjusted: - test_pre_pr_gate.py::TestGitUnavailable::test_not_a_git_repo_... was implicitly exercising both the no-policy.yaml path and the no-git path simultaneously. It now writes POLICY_YAML_FOR_TESTS so it actually tests what its name claims (git-unavailable path reaching the skip log). Tests: 462 passed + 1 skipped (unchanged). Dogfooding: pre-pr-gate with the updated status field passes this PR through its own gate. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(d5b): address PR #17 review round 2 — wrong-shape guards + (c.2) coverage Second review round surfaced edge cases on the loader's failure-mode contract. All FIX, no SKIPs. 12 new regression tests. Loader wrong-shape guards (3 Copilot comments, high-value): - Each of the three accessors (docs_sync_rules, post_merge_trigger, pre_write_rules) could raise AttributeError if `lifecycle` or the section itself was present but not a mapping (e.g. `lifecycle: not_a_dict` or `lifecycle.pre_pr: 42`). That broke the "never propagate exception" contract. 
Extracted `_lifecycle_section()` helper with isinstance checks; all three accessors now return None on wrong shape. Optional list fields — missing vs wrong-type (1 Copilot comment, medium): - `excludes` / `skip_if_only` / `exclude_globs` previously used `_safe_str_list(...) or []`, which silently coerced wrong-type values (e.g. `excludes: "hooks/tests/**"` as a string) to empty lists — potentially disabling a declared exclusion. Added `_optional_str_list` that distinguishes absent key (`→ []`) from present-but-wrong-shape (`→ None`, signalling the caller to skip the rule/pattern, or to return None for the whole accessor when the field is required-inside- trigger like `skip_if_only`). post-action.py docstring drift (1 Copilot comment): - Docstring still said "hardcoded mirror of policy.yaml" — outdated since D5b kickoff. Rewritten to reference the loader path and document the (c.2) pass-through behavior explicitly. pre-write-guard (c.2) for unknown labels (1 suppressed low-confidence comment): - `derive_test_pair` returning None (policy.yaml label typo or a new `enforced_patterns[*].label` added without a matching code branch in the derivation switch) previously fell through to a deny with an empty expected-path. That violated the (c.2) contract ("never deny blindly on policy issues"). Now treated the same as "policy unavailable": log `status: policy_unavailable` + pass-through exit 0. Preserves the "YAML typo cannot brick the repo" invariant. Tests: 474 passed + 1 skipped (was 462 + 1, +12 new cases): - TestWrongShapeGuards — 7 cases covering non-mapping lifecycle / non- mapping section across the three accessors. - TestOptionalListShape — 4 cases covering wrong-type optional-list on each accessor + a `_safe_str_list` mixed-type propagation test that locks in the strict contract introduced in round 1. 
- TestMainInProcess::test_unknown_label_passes_through_with_policy_unavailable — integration test for pre-write-guard's (c.2) handling of an unknown label injected via policy.yaml. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Javier <javier.abril@glassnode.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>


* chore(d6): kickoff - pre-compact.py + stop-policy-check.py

## Kickoff

### Scope

Sixth + seventh Python hooks. Closes Fase D before opening Fase E.

- hooks/pre-compact.py: PreCompact informative (shape D2)
- hooks/stop-policy-check.py: Stop blocker-scaffold (shape D1, deferred)

D6 consumes hooks/_lib/policy.py from the first commit (the loader is live after D5b). Any new hardcoded policy is an explicit regression.

### Files

New:
- hooks/pre-compact.py
- hooks/stop-policy-check.py
- hooks/tests/test_pre_compact.py
- hooks/tests/test_stop_policy_check.py

Modified:
- hooks/_lib/policy.py (+ accessors pre_compact_rules, skills_allowed_list)
- hooks/tests/test_lib_policy.py (+ cases for the new accessors)
- ROADMAP.md HANDOFF.md MASTER_PLAN.md
- docs/ARCHITECTURE.md §7
- .claude/rules/hooks.md

### Fase -1 decisions (approved)

- (A2) pre-compact is INFORMATIVE, not a blocker. Reason: blocking an intentional /compact is destructive; the hook's value is emitting additionalContext with the policy's persist checklist so the model persists state before the compact.
- (c.3) stop-policy-check is a BLOCKER-SCAFFOLD. Shape D1 (safe-fail deny + double log + permissionDecision available), but ZERO real enforcement today. policy.yaml.skills_allowed does not exist yet, so skills_allowed_list() returns None and the hook degrades to a log entry with status=deferred. It can be activated without a refactor once E1a lands skills_allowed. Strict framing: "it can block by contract, but today it is in deferred mode apart from safe-fail + future-proof tests".
- Both hooks in the same branch. Shared loader reuse + docs-sync.
- Canonical failure mode (c.2) reapplied: accessor returns None -> pass-through advisory + log status=policy_unavailable. Never a blind deny.

### Explicit framing (against overstatement)

In the docs + PR body, stop-policy-check.py is NOT presented as enforcement that is useful in production. It is described as:
- a hook with a ready blocker shape
- in deferred mode while skills_allowed does not exist in policy.yaml
- with safe-fail active (deny on a malformed payload)
- with tests covering the future enforcement, so that E1a only has to declare skills_allowed in the policy

### Tests

Strict TDD. Order:
1. red commit: tests that fail because the accessors + hooks are missing
2. accessors in _lib/policy.py (accessor tests green)
3. pre-compact.py (pre_compact tests green)
4. stop-policy-check.py (stop tests green)
5. docs-sync
6. simplify
7. review

Target coverage: >=80% lines / >=75% branches per hook, >=90% on the accessors. Global hooks/** suite: >=500 green tests.

### Docs plan

Inside the same PR (docs_sync_conditional is active for hooks/**):
- ROADMAP.md: row D6 marked.
- HANDOFF.md: §9 next branch (E1a), §12 D6 status, §7 hook counter 5->7.
- MASTER_PLAN.md § Rama D6: close with adjustments vs the original plan.
- docs/ARCHITECTURE.md §7: hook counter + phase-gates events (pre_compact, stop).
- .claude/rules/hooks.md: Sexto hook + Septimo hook + extend Policy loader with pre_compact_rules + skills_allowed_list.

### NOT included

- No real persistence of LLM state (pre-compact emits a prompt, it does not write).
- No active enforcement of skills_allowed (scaffold).
- No changes to templates/policy.yaml.hbs (meta-repo vs template drift documented since D5b; reconciliation branch after D6).
- No skills, no runtime.

* test(d6): red tests - accessors + hooks missing

These fail by design (strict TDD, Fase 1 of the branch):
- _lib.policy.pre_compact_rules() does not exist
- _lib.policy.skills_allowed_list() does not exist
- hooks/pre-compact.py does not exist (collection error)
- hooks/stop-policy-check.py does not exist (collection error)

Contract lock-down before implementation:
- PreCompactRules frozen dataclass; persist: tuple[str, ...]
- skills_allowed_list: tuple[str, ...] | None (None=deferred, ()=deny-all)
- Pre-compact hook: informative shape (exit 0 always, no permissionDecision)
- Stop hook: blocker-scaffold shape; c.3 deferred until skills_allowed is declared; safe-fail blocker (deny exit 2 on malformed payload)

23 failures in test_lib_policy; 2 collection errors in the hook tests. No test is green before implementation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(d6): impl pre-compact + stop-policy-check hooks (+ 2 accessors)

Sixth hook: hooks/pre-compact.py (shape D2 informative):
- Reads lifecycle.pre_compact.persist via pre_compact_rules() and emits hookSpecificOutput.additionalContext as a checklist for the model.
- Always exit 0; never permissionDecision. Never blocks /compact.
- Failure mode (c.2): policy None → minimal additionalContext + status: policy_unavailable in the hook log. Canonical pass-through advisory.
- Double log: pre-compact.jsonl (always) + phase-gates.jsonl (event: pre_compact only on the happy path; policy_unavailable stays in the hook log only).
- Informative safe-fail: malformed payload → additionalContext with "(error reading payload: ...)" + status: payload_error, exit 0.

Seventh hook: hooks/stop-policy-check.py (shape D1 blocker-scaffold):
- Reads skills_allowed_list() + .claude/logs/skills.jsonl.
- c.3 scaffold: skills_allowed absent → status: deferred pass-through; the meta-repo does not declare the field today, so enforcement is DEFERRED in prod. The whole chain exists for when E1a adds the field.
- Activatable: skills_allowed declared → _validate(invoked, allowed), deny exit 2 with the first violator in decisionReason; allow exit 0.
- Failure mode (c.2): policy None → status: policy_unavailable pass-through. Canonical blocker safe-fail: malformed payload → deny exit 2.
- Double log only on real decisions (allow/deny). Deferred and policy_unavailable stay in the hook log only.
- _extract_invoked_skills and _validate are private helpers but testable as units (the suite asserts on `sp._extract_invoked_skills(...)` and `sp._validate(...)`).

Loader: hooks/_lib/policy.py:
- pre_compact_rules(repo_root) → PreCompactRules | None (frozen dataclass with persist: tuple[str, ...]).
- skills_allowed_list(repo_root) → tuple[str, ...] | None (None=deferred, absent; ()=explicit deny-all).

555 passed (+1 intentional D5 skip) in hooks/**. No regression.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(d6): sync ROADMAP + HANDOFF + MASTER_PLAN + ARCHITECTURE + hooks.md

Closes the D6 docs-sync (feat/d6-hook-pre-compact-stop). Two deliverables: pre-compact.py (PreCompact informative, shape D2) + stop-policy-check.py (Stop blocker scaffold; NO enforcement in production today: skills_allowed is absent from the meta-repo's policy.yaml → status: deferred, pass-through). The None/() contract is documented as a semantic distinction of the scaffold.

Two new accessors in hooks/_lib/policy.py: pre_compact_rules + skills_allowed_list (5 accessors in total after D5b+D6).

Framing against overstatement (MASTER_PLAN + ARCHITECTURE + hooks.md): the Stop hook validates its own shape, with no real enforcement until E1a populates skills_allowed. The precondition is ready: activation needs no code change once the first /pos:* skill exists.

* refactor(d6): simplify pre-compact - inline _log_hook/_log_phase wrappers

The _log_hook / _log_phase wrappers in pre-compact.py were trivial (3 + 1 call sites). They are inlined directly to _safe_append(cwd / HOOK_LOG, ...) / _safe_append(cwd / PHASE_LOG, ...): -8 lines of wrappers, same contract. This gains stylistic consistency with stop-policy-check.py (the other D6 hook), which already used the inline shape.

Does not affect tests or behavior. 555 passed + 1 skipped (no regression).

* fix(d6): address PR #18 review - session scoping + tri-state policy contract

Three substantive fixes on top of Copilot review:

1.
Scope skills.jsonl reads by session_id (review concern #1). _extract_invoked_skills(repo_root, session_id) now streams line-by-line and only counts entries whose session_id matches the Stop payload. Entries without session_id, with non-string session_id, or from prior sessions are silently ignored — the log is append-only and accumulates across sessions. Payload Stop without session_id -> safe-fail deny (enforcement cannot scope safely). Tests: new TestSessionScoping class (6 cases incl. 5-session mixed log) + safe-fail cases for missing/empty/non-string session_id. 2. Tri-state skills_allowed_list — stop collapsing absent vs invalid (review concern #2). Added SKILLS_ALLOWED_INVALID sentinel in _lib/policy.py. None = section absent (deferred, prod state), sentinel = present but wrong-shape (misconfigured, observable), () = explicit deny-all, tuple = live enforcement. Stop hook emits status: policy_misconfigured on sentinel with literal reason. A typo in policy.yaml no longer silently turns enforcement off. Tests: new TestMisconfiguredPolicy class + test_three_states_are_all_distinct + test_invalid_sentinel_distinct_from_none in loader suite. 3. Remove exact-string quotes of pre-compact output from docs (review concern #3). HANDOFF/MASTER_PLAN/ARCHITECTURE/hooks.md no longer pin the literal advisory wording — the suite validates shape + presence, not the string. Frees the hook to refine copy without doc drift. Suite: 575 passed + 1 skipped (+20 new tests). No regressions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs(d6): strip "PR #18" from in-repo cross-refs; keep rationale Copilot flagged 7 bullets across hooks.md / MASTER_PLAN.md / ARCHITECTURE.md that cite "post-review PR #18" inline. For long-lived rules docs the PR number is not a stable rendered identifier (forks/rebases lose it), while the rationale ("post-review") carries the same meaning. Drop the number. No contract change. Tests untouched. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Javier <javier.abril@glassnode.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
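The session scoping and tri-state contract described in the fix(d6) commit can be sketched roughly as follows. This is a minimal illustration, not the code in hooks/_lib/policy.py or stop-policy-check.py: the function names, the `skill` field in each log entry, and the `allow`/`deny` return values are assumptions; only `session_id`, `SKILLS_ALLOWED_INVALID`, `deferred`, and `policy_misconfigured` come from the commit message.

```python
import json
from typing import Iterable, List, Optional

# Stand-in for the SKILLS_ALLOWED_INVALID sentinel named in the commit message.
SKILLS_ALLOWED_INVALID = object()


def skills_allowed_from_policy(policy: Optional[dict]):
    """Return None (section absent), the sentinel (wrong shape), or a tuple."""
    if not isinstance(policy, dict) or "skills_allowed" not in policy:
        return None
    raw = policy["skills_allowed"]
    if not isinstance(raw, list) or not all(isinstance(s, str) for s in raw):
        return SKILLS_ALLOWED_INVALID
    return tuple(raw)  # () means explicit deny-all


def invoked_skills(lines: Iterable[str], session_id: str) -> List[str]:
    """Stream JSONL lines and keep only entries for the given session."""
    found = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # assumption: malformed lines are skipped
        if not isinstance(entry, dict):
            continue
        # Missing, non-string, or foreign session_id never equals ours.
        if entry.get("session_id") != session_id:
            continue
        skill = entry.get("skill")  # hypothetical field name
        if isinstance(skill, str):
            found.append(skill)
    return found


def stop_status(policy, lines, session_id) -> str:
    """Combine both checks into a single status string."""
    if not isinstance(session_id, str) or not session_id:
        return "deny"  # safe-fail: enforcement cannot be scoped
    allowed = skills_allowed_from_policy(policy)
    if allowed is None:
        return "deferred"
    if allowed is SKILLS_ALLOWED_INVALID:
        return "policy_misconfigured"
    used = invoked_skills(lines, session_id)
    return "deny" if any(s not in allowed for s in used) else "allow"
```

Comparing against the sentinel with `is` keeps the four states distinct: an empty list in the policy still yields `()`, which denies any invoked skill, while a string where a list was expected is reported as misconfigured rather than treated as absent.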

…icy.yaml vs .claude/logs/ (#25)

* test(f1): RED - extend ALLOWED_SKILLS 13->14 + behavior tests for audit-session

Phase 0 kickoff + Phase 1 RED-first per CLAUDE.md rule #3 and .claude/rules/tests.md. Plan ratified by the user: decisions A1.a..A6.a + 3 mandatory adjustments.

Scope: skill /pos:audit-session, a read-only advisory main-strict skill that compares 3 surfaces of policy.yaml against the real .claude/logs/:
1. policy.yaml.skills_allowed vs skills.jsonl invocations.
2. policy.yaml.lifecycle.*.hooks_required vs per-hook logs (existence + non-emptiness of the expected log file).
3. policy.yaml.audit.required_logs vs existence/age/non-emptiness.

RED state confirmed: 16 expected failures.
- 10 parametrized [audit-session] cases in TestStructure / TestFrontmatter / TestBody. .claude/skills/audit-session/SKILL.md does not exist yet.
- 5 TestAuditSessionBehavior:
  * test_body_declares_three_audit_surfaces
  * test_body_declares_advisory_only
  * test_body_declares_main_strict_no_delegation
  * test_body_declares_30day_review_window
  * test_body_declares_prefix_normalization_assumption
- 1 test_real_skills_allowed_populated_by_f1. policy.yaml still declares 13; ALLOWED_SKILLS has already grown to 14.

The behavior tests follow the pattern of TestPatternAuditBehavior from E3a, the closest reference: read-only advisory main-strict. User adjustment 3 applied: the 30-day window test validates what the body DECLARES, not the execution of date math.

Renames:
- test_real_skills_allowed_populated_by_e3b -> _by_f1. Tuple 13 -> 14 via shared ALLOWED_SKILLS.
- test_all_thirteen_e1_e3b_skills_end_to_end -> test_all_fourteen_e1_e3b_f1_skills_end_to_end.

Next, GREEN phase: create SKILL.md + bump policy.yaml.skills_allowed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(f1): GREEN - audit-session skill + bump skills_allowed 13->14

GREEN phase per CLAUDE.md rule #3 and .claude/rules/tests.md. RED commit (5d6091d ancestor + RED) introduced 16 failures; this commit turns all 16 green without touching any unrelated test.

Skill body (.claude/skills/audit-session/SKILL.md ~110 lines):
- Frontmatter minimal canonical: name=audit-session, description starts "Use when ...", allowed-tools list of 6 entries (Glob, Grep, Read, Bash(find:*), Bash(wc:*), Bash(.claude/skills/_shared/log-invocation.sh:*)). No Bash(git log:*) per user adjustment 2.
- Read-only advisory main-strict: explicit MAY/MUST NOT scope.
- Three audit surfaces declared (Phase -1 decision A1.a):
  Bucket 1: skills_allowed vs skills.jsonl invocations.
  Bucket 2: lifecycle.*.hooks_required vs per-hook log files.
  Bucket 3: audit.required_logs vs file existence/nonempty/mtime.
- 30-day review window declared as textual guidance (A2.a + user adjustment 3): the skill does NOT execute date math, the human applies the lens when reading the report.
- Prefix normalization assumption (A3.a): pos:<slug> stripped before cross-comparing with policy.yaml.skills_allowed.
- Pre-existing drift expected (A4.a): hooks.jsonl declared in audit.required_logs but no such file exists. Skill reports it as Bucket 3 candidate, does NOT auto-fix.
- Report structured by surface (A5.a): three sections + summary line.
- audit.session_audit.schedule (e.g. weekly) explicitly NOT enforced (A6.a): documentary cadence, no cron/CI hook in F1.
- Out of scope: external fork delegation (main-strict by design), cross-session aggregation, date arithmetic, mutating policy or logs.

Body satisfies all 5 TestAuditSessionBehavior tests literally:
- skills_allowed + lifecycle + hooks_required + required_logs tokens.
- "advisory"/"read-only"/"does not modify"/"no modifica" tokens.
- No "subagent"/"code-architect"/"agent(" tokens (uses "fork" for external delegation refusal).
- "30" + "day"/"review window" tokens.
- "pos:" + "normaliz" tokens.

policy.yaml:
- skills_allowed: 13 -> 14 entries (audit-session appended).
- Comment line 268 updated: "E3b 13 skills -> F1 14 skills".

Test deltas (793 passed + 1 skipped, zero regression):
- 10 parametrized [audit-session] in TestStructure / TestFrontmatter / TestBody pass.
- 5 TestAuditSessionBehavior pass.
- test_real_skills_allowed_populated_by_f1 passes (tuple is now 14).
- test_all_fourteen_e1_e3b_f1_skills_end_to_end passes (logger -> Stop hook end-to-end with all 14 skills allowlisted).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(f1): docs-sync - ROADMAP + HANDOFF + MASTER_PLAN + skills-map

Phase N+3 docs-sync per CLAUDE.md rule #2 (docs-sync within branch) and pre-pr-gate.py canonical baseline (ROADMAP + HANDOFF mandatory) plus conditional `.claude/rules/skills-map.md` for `skills/` paths.

ROADMAP.md:
- Top table: Fase F status `pendiente` -> `1/4 (F1 ok, F2..F4 pending)`.
- F1 row: status `pending` -> `done (PR pending)` with concrete scope.
- New section "Progreso Fase F" with feat/f1 detail block: deliverables, allowed-tools rationale, contract locked by suite, A1.a..A6.a decisions, 3 mandatory user adjustments, exit criteria.

HANDOFF.md:
- Section 1 snapshot: current branch F1 (PR pending); next branch F2; F1 deliverables one-liner.
- Section 9 "Proxima rama": F2 feat/f2-agents-subagents with scope (3 subagent definitions, naming-conflict question, agents_allowed evaluation).
- New section 19 "Estado F1": full closure block parallel to E3a/E3b, with deliverables + contract + 3 mandatory adjustments + YAML gotcha avoided + result (793 + 1 skip) + cross-references.

MASTER_PLAN.md Rama F1:
- Replaced 1-line stub with full closing block: concrete scope (3 surfaces), A1.a..A6.a decisions, 3 mandatory adjustments, context to read, exit criteria, carry-overs to F2..F4. Branch marker set to "PR pendiente".

.claude/rules/skills-map.md:
- Audit + Release section: audit-session row populated with concrete contract (3 surfaces + main-strict + 30-day textual guidance + no auto-fix + allowed-tools list). Replaces the 1-line stub from F0.

Tests: 793 passed + 1 skipped (D5 intentional subprocess-no-cover). Zero regression D1..D6 + E1a..E3b. Behavior contract for audit-session locked across 5 tests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Javier <javier.abril@glassnode.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
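The audit-session skill itself is a SKILL.md body rather than code, but the Bucket 1 comparison it declares (strip the `pos:` prefix, then cross-compare with policy.yaml.skills_allowed) can be illustrated with a small sketch. The function names, the report keys, and the `kickoff` skill name in the usage note are hypothetical and only serve the illustration.

```python
from typing import Dict, Iterable, List


def normalize_skill(name: str) -> str:
    """Strip a leading 'pos:' so logged names compare with policy slugs."""
    prefix = "pos:"
    return name[len(prefix):] if name.startswith(prefix) else name


def audit_skills_surface(
    allowed: Iterable[str], invoked: Iterable[str]
) -> Dict[str, List[str]]:
    """Report drift between the allowlist and the logged invocations."""
    allowed_set = set(allowed)
    invoked_set = {normalize_skill(n) for n in invoked}
    return {
        # Invoked in the log but missing from skills_allowed.
        "invoked_not_allowed": sorted(invoked_set - allowed_set),
        # Allowlisted but never seen in the log.
        "allowed_never_invoked": sorted(allowed_set - invoked_set),
    }
```

For example, `audit_skills_surface(["audit-session", "kickoff"], ["pos:audit-session"])` would list `kickoff` under `allowed_never_invoked`. In keeping with the advisory contract, the sketch only reports; it does not modify the policy or the logs.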

…#28)

* test(f4): RED - marketplace.json + release.yml + plugin.json version pin

Phase 0 (kickoff) + Phase 1 (RED tests) for feat/f4-marketplace-public-repo.

Scope (per Phase -1, ratified with 8 user adjustments):
- Land the local marketplace infra + release flow without depending on javiAI/pos-marketplace existing yet (A1.b).
- .claude-plugin/marketplace.json with the official Claude Code schema: top-level {name, owner, plugins}; owner.name; plugin {name, source} with source.{source=github, repo, ref="v"+version}.
- .github/workflows/release.yml with trigger tag:v* and jobs version-match / selftest / build-bundle / publish-release / mirror-marketplace (mirror conditional/skippable until the public repo exists).
- Bump plugin.json.version 0.0.1 → 0.1.0 (first public release).

Files in this commit (RED tests, expected failures):
- bin/tests/test_marketplace_json_schema.py (12 tests)
- bin/tests/test_release_workflow_smoke.py (6 tests)
- bin/tests/test_plugin_json_version_bump.py (3 tests)

The tests verify:
- marketplace.json minimal official schema (top-level + owner + plugin)
- plugin name/version/ref sync between marketplace ↔ plugin.json
- release.yml trigger v*, expected jobs, publish-release.needs ⊇ {version-match, selftest, build-bundle}, mirror-marketplace conditional/skippable
- plugin.json.version pin = "0.1.0"

Current RED state: 19 failed + 12 passed (12 = F3 baseline 9 + 3 plugin.json exists/parses). No regressions in F3.

Deferred in F4 (CLAUDE.md rule #7):
- audit.yml nightly (no consumer today; own branch post-F4).
- /pos:pr-description, /pos:release skills (no demonstrated repetition).
- CHANGELOG.md enforced (auto-generated from git log between tags).
- refactor/template-policy-d5b-migration (independent drift).
- Phase G (Knowledge Plane).

GREEN impl + docs (RELEASE.md/ARCHITECTURE/ci-cd/MASTER_PLAN/ROADMAP/HANDOFF) land in the following commits within this branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(f4): GREEN - marketplace.json + release.yml + plugin.json 0.1.0

Phase 2 (GREEN): flips the 19 RED tests from the previous commit.

.claude-plugin/marketplace.json (NEW):
- Official Claude Code marketplace schema.
- top-level: name="pos-marketplace", owner.name="javiAI", plugins[].
- plugins[0]: name=pos, source={source:github, repo:javiAI/project-operating-system, ref:v0.1.0}, version=0.1.0.
- metadata.{description, version} for humans.

.claude-plugin/plugin.json:
- version 0.0.1 → 0.1.0 (first public release; pre-1.0).
- Single source of truth; the git tag must be v${version}.

.github/workflows/release.yml (NEW):
- Trigger: push tags v*.
- Jobs:
  - version-match: assert plugin.json.version == ${tag#v}.
  - selftest: pytest bin/tests -q (reuses the F3 contract).
  - build-bundle: tar.gz curated plugin-only (.claude-plugin/, .claude/skills/, .claude/rules/, hooks/, agents/, policy.yaml, bin/pos-selftest.sh, bin/_selftest.py, docs/RELEASE.md). Excludes generator/, tools/, templates/, questionnaire/.
  - publish-release: needs [version-match, selftest, build-bundle]; gh release create with the bundle as an asset.
  - mirror-marketplace: conditional via vars.POS_MARKETPLACE_REPO; if empty, it skips without failing the release. Opens a PR against the public repo once configured.
- Actions pinned by SHA (ci-cd.md rule #2).
- permissions.contents=write for gh release create.

Tests post-GREEN: 21 passed (12 marketplace + 6 release.yml + 3 plugin version), suite total 644 passed + 1 skipped (intentional D5 skip, F3). No regressions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(f4): sync - RELEASE runbook + ARCH §13 + ci-cd + ROADMAP/HANDOFF/MASTER_PLAN

Phase N+3: docs-sync within the branch (CLAUDE.md rule #2, docs.md § Docs-sync en cada rama).

docs/RELEASE.md (NEW):
- User-facing runbook for versioning + bundle + flow + recovery.
- Versioning contract: plugin.json.version is the source of truth; tag = v${version}; marketplace.json.source.ref mirrors it.
- Bundle scope plugin-only curated (explicit includes/excludes).
- Flow in 5 steps: bump → tag → workflow → verify → recovery.
- Mirror activation once the public repo exists (3 steps: create repo + gh variable set POS_MARKETPLACE_REPO + gh secret set POS_MARKETPLACE_TOKEN).
- User-facing installation (/plugin marketplace add + /plugin install pos).
- Deferred items enumerated.

docs/ARCHITECTURE.md § 13 (Marketplace + Release flow):
- Rewritten from a 6-line placeholder to a full sub-section.
- Manifest, version source of truth, workflow jobs, curated bundle scope, deferral of the public repo, determinism of the flow, user-facing installation, deferrals.

.claude/rules/ci-cd.md:
- release.yml bullet promoted from "Diferidos" to "Aterrizado" (delivered in F4).
- New H3 "### Job release (entregado en F4)" with full scope (5 jobs + curated bundle + source of truth).

ROADMAP.md:
- Table: F4 marked ✅ (PR pending).
- New section § feat/f4-marketplace-public-repo in "Progreso Fase F": scope, deliverables, Phase -1 decisions (A1.b..A8), fixed contract, carry-overs, exit criteria (665 passed + 1 skipped).

HANDOFF.md:
- §1 Snapshot: current branch F4 (delivery + suite update).
- §9 Next branch: Phase F closed; carry-overs (template-policy d5b migration, marketplace activation, deferred skills, audit.yml).
- New §22: F4 status with deliverables + contract + decisions + carry-overs (parallel to §19 F1, §20 F2, §21 F3).

MASTER_PLAN.md § Rama F4:
- Expanded from 3 lines to a full section: delivered scope, delivered files with per-path detail, Phase -1 decisions (A1.b..A8), context read, exit criteria, carry-overs.

Simplify pre-commit:
- Trimmed 3 bullets from "Ajustes durante implementación" (heredoc syntax glitch + rtk wrapper output filter): debug ephemera that belongs in commit history.
- Kept the only persistent gotcha: PyYAML (YAML 1.1) parses `on:` as the Python bool True (a reusable pattern for future workflow YAML tests).

Untouched (per A8, ratified in Phase -1): policy.yaml, hooks/**, .claude/skills/**, agents/**, generator/**, templates/**, .claude/rules/skills-map.md.

Tests: no changes (GREEN already green with 665 passed + 1 skipped).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(f4): release.yml - version-match gating + mirror idempotency (PR #28 review)

Applies the 4 findings from the Copilot review of PR #28. All 4 are real correctness/idempotency bugs, not style. Value/effort triage: all high/trivial-low → FIX.

version-match gating (findings 1 + 4):
- selftest: needs [version-match].
- build-bundle: needs [version-match].
Previously they ran in parallel with version-match → CI spent time on mismatched tags and contradicted the documented "strict order". Now: version-match → (selftest + build-bundle) → publish-release → mirror-marketplace.

mirror-marketplace idempotency (findings 2 + 3):
- After `git add marketplace.json`, if `git diff --cached --quiet` reports no changes → exit 0. Previously a no-op `git commit` failed the workflow re-run.
- Before `gh pr create`, run `gh pr list --head $branch --state open`. If an open PR already exists → skip create with a message. Previously `gh pr create` with an existing PR failed the re-run.

Tests: bin/tests 31/31 green; full explicit run 850 passed + 1 skipped (intentional D5 skip, F3). No regressions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Javier <javier.abril@glassnode.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
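The versioning contract from the F4 commits (plugin.json.version as source of truth, tag = v${version}, marketplace source.ref mirroring it) can be expressed as a small check. This is a hedged sketch and not the actual version-match job, which the commit describes as a shell comparison against `${tag#v}`. The function name and the message wording are assumptions, and unlike `${tag#v}` the sketch rejects tags without the leading `v`, on the assumption that the workflow only triggers on `v*` tags.

```python
import json
from typing import List


def check_release_contract(
    tag: str, plugin_json: str, marketplace_json: str
) -> List[str]:
    """Return a list of contract violations; an empty list means consistent."""
    problems = []
    version = json.loads(plugin_json)["version"]
    expected_ref = f"v{version}"

    # The git tag must be exactly "v" + plugin.json.version.
    if tag != expected_ref:
        problems.append(f"tag {tag!r} does not match {expected_ref!r}")

    # Every marketplace entry's source.ref must mirror the same version.
    for plugin in json.loads(marketplace_json).get("plugins", []):
        ref = plugin.get("source", {}).get("ref")
        if ref != expected_ref:
            name = plugin.get("name", "<unnamed>")
            problems.append(
                f"plugin {name!r} ref {ref!r} does not match {expected_ref!r}"
            )
    return problems
```

Running a check like this first, as the gated job order above does, avoids spending CI time on selftest and bundling for a tag that cannot be released.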

Javier and others added 9 commits April 20, 2026 01:26

chore(b2): drop questionnaire/profiles/.gitkeep (dir no longer empty)

9bc87d8

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings April 20, 2026 07:45

Copilot started reviewing on behalf of javiAI April 20, 2026 07:45

Copilot AI reviewed Apr 20, 2026


javiAI merged commit f361c19 into main Apr 20, 2026
2 checks passed

javiAI deleted the feat/b2-profiles-starter branch April 20, 2026 07:57

javiAI mentioned this pull request Apr 21, 2026

feat(d4): hook pre-pr-gate — docs-sync enforcer on gh pr create #15

Merged

4 tasks

javiAI mentioned this pull request Apr 22, 2026

feat(d6): hooks pre-compact + stop-policy-check — cierre Fase D #18

Merged

9 tasks

javiAI merged 10 commits into main from feat/b2-profiles-starter


		if (field.pattern !== undefined && !new RegExp(field.pattern).test(value)) {
			issues.push(violation(path, `value '${value}' does not match pattern /${field.pattern}/`));
		}
