feat(b2): profiles starter + profile validator CLI #2

Merged
javiAI merged 10 commits into main from feat/b2-profiles-starter
Apr 20, 2026

Owner

@javiAI javiAI commented Apr 20, 2026

Summary

  • 3 canonical profiles (nextjs-app, agent-sdk, cli-tool) in questionnaire/profiles/. Each answers ~60% of the schema; 3 user-specific fields are omitted by design (partiality).
  • New profile validator: tools/lib/profile-validator.ts (zod-strict parser + issue emitter) + tools/validate-profile.ts CLI (exit 0/1/2).
  • Fixtures in tools/__fixtures__/profiles/{valid,invalid}/ cover all 5 emitted ProfileIssueKinds.
  • CI step Validate profiles + npm script validate:profiles run all 3 canonical profiles against the schema on every push.
  • Shared YAML I/O extracted to tools/lib/read-yaml.ts (dedupes validate-profile + validate-questionnaire — meets 2x pattern threshold).

Scope decisions (vs MASTER_PLAN)

  • Fixtures location: tools/__fixtures__/profiles/ (not generator/__fixtures__/profiles/) because the generator does not exist yet. Consolidation deferred to B3 if applicable. Documented in MASTER_PLAN.md section B2.
  • Known gap: answer-value-not-in-array-allowlist is not validated at instance level in this branch. ArrayField.values exists in the schema (integrations.mcps) but the per-item allowlist check is deferred. Documented in MASTER_PLAN.md section B2 and docs/ARCHITECTURE.md section Profiles.

Test plan

  • Unit tests: tools/lib/profile-validator.test.ts — 21 tests, one per issue kind + multi-issue aggregation + partial-profile acceptance.
  • Integration tests: tools/validate-profile.test.ts — 14 tests via spawnSync on the CLI (exit codes, stderr, formatReport).
  • Full suite: 106 tests passing.
  • Coverage: 95.92% lines / 89.91% branches / 100% functions (thresholds 90/85/90/90).
  • Typecheck: clean (tsc --noEmit).
  • CI: validate:profiles and validate:questionnaire both exit 0 against committed profiles.

Docs-sync

  • ROADMAP.md, HANDOFF.md, MASTER_PLAN.md section B2, docs/ARCHITECTURE.md section Profiles, .claude/rules/generator.md updated in-branch.

Simplify pass

  • Extracted tools/lib/read-yaml.ts; inlined 3 single-call helpers in profile-validator.ts; collapsed 3 duplicate canonical CLI tests to one it.each. Net -45 LOC.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Javier and others added 9 commits April 20, 2026 01:26
- CLAUDE.md: add Fase N+7 (Context gate) to branch flow table as last
  phase of the previous branch.
- AGENTS.md: add Context gate as non-negotiable rule #1 and as step 3
  of the "continúa" autonomous execution flow.
- HANDOFF.md §3: rename to "Decisión /clear vs /compact vs sesión
  nueva (Fase N+7 Context gate)" + add checklist pre-Fase-1 + §6b
  carry-over to propagate the rule to templates/*.hbs in C1.
- .claude/rules/docs.md: add trazabilidad checkbox to docs-sync list
  (first kickoff commit references the resume prompt when the branch
  was started post-/compact or post-/clear).

Establishes the context-management decision as the final phase of the
previous branch, enforcing explicit evaluation of
continuar | /compact | /clear | sesión nueva before Fase -1.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
## Kickoff

**Scope**: Create 3 canonical profiles (nextjs-app, agent-sdk, cli-tool)
+ profile validator CLI + fixtures + unit/integration tests + CI step.
This establishes the canonical starting points of the questionnaire:
each profile pre-bakes the typical decisions of a stack, covers every
non-user-specific `required` field (domain.type, stack.language,
testing.unit_framework), and leaves open only the 3 user-specific ones
(identity.name, identity.description, identity.owner).

**Files to create**:
- questionnaire/profiles/nextjs-app.yaml, agent-sdk.yaml, cli-tool.yaml
- tools/lib/profile-validator.ts (+ .test.ts)
- tools/validate-profile.ts (+ .test.ts)
- tools/__fixtures__/profiles/valid/<3 canonical>.yaml (literal copy
  by value; consolidate in B3 if the runner reveals a better mechanism)
- tools/__fixtures__/profiles/invalid/*.yaml (3-4 negative cases: unknown
  path, type mismatch, enum-out-of-values, pattern violation)

**Files to modify**:
- .github/workflows/ci.yml → step validate-profiles (matrix
  ubuntu+macos, node 20, actions pinned by SHA, reuses the B1 toolchain)
- docs/ARCHITECTURE.md → §2 Profiles: canonical shape + Profile
  validator subsection with issue kinds
- .claude/rules/generator.md → Profiles block (location + shape + CLI)
- ROADMAP.md → carries over the B1 drift (✅ PR #1) + B2 in progress → ✅ on close
- HANDOFF.md → §1 snapshot + §9 next branch B3 + §10 B2 status
- MASTER_PLAN.md § Rama B2 → ✅ on close

**Canonical profile shape**:

  version: "0.1.0"
  profile:
    name: <string>
    description: <string>
  answers:
    "<path.dotted>": <value>

Dotted keys are aligned 1:1 with field.path in the schema. This makes
key-by-key overrides easy in the runner (B3). The nested alternative
was rejected because it couples tightly to field renames.
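
To illustrate why flat dotted keys make overrides simple, here is a minimal sketch. The `Profile` type, the `applyOverrides` helper, and the sample values are hypothetical and not taken from the branch; the actual merge logic is deferred to B3.

```typescript
// Hypothetical types: the kickoff only fixes the YAML shape, not these names.
type AnswerValue = string | number | boolean | string[];
type Answers = Record<string, AnswerValue>;

interface Profile {
  version: string;
  profile: { name: string; description: string };
  answers: Answers;
}

// With flat dotted keys an override is a plain key-by-key merge:
// later sources win, and untouched paths keep the profile's value.
function applyOverrides(profile: Profile, overrides: Answers): Answers {
  return { ...profile.answers, ...overrides };
}

const cliTool: Profile = {
  version: "0.1.0",
  profile: { name: "cli-tool", description: "CLI starter (illustrative values)" },
  answers: { "domain.type": "cli", "stack.language": "typescript" },
};

// Only stack.language changes; domain.type is inherited from the profile.
const merged = applyOverrides(cliTool, { "stack.language": "python" });
```

A nested shape would need a recursive merge and would break whenever a parent key is renamed, which is the coupling the kickoff cites.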

**Profile validator issue kinds (B2)**:
- answer-unknown-path
- answer-type-mismatch
- answer-value-not-in-enum
- answer-array-item-type-mismatch
- answer-constraint-violation (pattern / minLength / maxLength /
  min / max / minItems / maxItems)

**Known gap (explicit user decision)**:
answer-value-not-in-array-allowlist is NOT implemented in B2.
ArrayField.values exists in tools/lib/meta-schema.ts:43 and
questionnaire/schema.yaml:95-100 uses the capability (integrations.mcps
with allowlist ["mempalace","notebooklm"]). Instance-level validation
is deferred. If ArrayField.values is formally introduced in a later
branch or before closing B2, add the corresponding check to the
profile validator.

**Principle**: profiles are PARTIAL. They do not have to cover every
`required` field of the final project_profile. The validator only
checks that the declared paths exist in the schema and that their
values respect the field's constraints. User-specific fields stay out
of profiles by design.

**Not included in B2 (arrives later)**:
- Interactive execution of the questionnaire (B3 runner).
- Merging profile + CLI overrides (B3).
- Actual file generation (C1+).
- Resolution of `when:` to decide conditional requiredness (B3).

**Discarded alternatives**:
- (B) Extend tools/validate-questionnaire.ts with a --profile flag:
  couples responsibilities (meta-schema vs instance).
- (C) Tests only, no CLI: blocks the CI step and future use from the
  pre-PR gate (D4).

**Risks**:
- Data duplication between questionnaire/profiles/ and
  tools/__fixtures__/profiles/valid/. Mitigation: literal copy;
  consolidate if B3 calls for it. Scope is contained (~150 lines of YAML).
- The array allowlist gap documented above; it does not block the MVP.

**Test plan**:
- Unit (tools/lib/profile-validator.test.ts): each issue kind + 3
  valid canonical profiles + partial profile OK (no user-specific fields)
  + profile with undeclared extra fields → issue.
- Integration (tools/validate-profile.test.ts): CLI exit 0 on the
  canonical profiles; exit 1 on the negative cases with the issue kind
  on stderr; exit 2 on a missing file or unreadable YAML.
- CI: the validate-profiles step runs the CLI on the 3 profiles in the
  ubuntu+macos matrix.
- Coverage: current thresholds (90/85/90/90) must keep passing.

**Docs plan** (Fase N+3):
- docs/ARCHITECTURE.md §2 Profiles → shape + Profile validator
  subsection with issue kinds + note on the deferred gap.
- .claude/rules/generator.md → Profiles block.
- ROADMAP drift (B1 ✅ PR #1) + B2 ✅ + Fase B progress.
- HANDOFF snapshot + next branch B3 + B2 status.
- MASTER_PLAN § Rama B2 → ✅.

**Fase N+7 traceability** (applied per AGENTS.md rule #1 and the
checkbox in .claude/rules/docs.md): this branch was started
post-/compact with focus="B1 merged + Fase -1 B2 draft + ROADMAP drift
+ sistematización Fase N+7 aplicada". Files re-read post-compact to
resume Fase -1: MASTER_PLAN.md § Rama B2 (L67-71),
docs/ARCHITECTURE.md §2 Profiles (L54-60) + §Schema DSL (L62-89),
.claude/rules/generator.md (entire file), questionnaire/schema.yaml,
questionnaire/questions.yaml, tools/lib/meta-schema.ts, HANDOFF.md §3
+ §6b. Decisions preserved from before the compact: alternative (A)
CLI validator, answers-dotted shape, coverage denominator =
required-fields-no-user-specific, Fase N+7 systematization applied in
CLAUDE/AGENTS/HANDOFF/rules (previous commit c9e3de5 on this same
branch).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Introduces ProfileFile zod schema (strict) with shape {version, profile:
{name,description}, answers:{path.dotted:value}} and the validateProfile()
function that walks answers, looks up each path in the meta-schema, and
emits ProfileIssue[] covering:

- answer-unknown-path
- answer-type-mismatch (scalar type disagreement with field.type)
- answer-value-not-in-enum
- answer-array-item-type-mismatch
- answer-constraint-violation (pattern / minLength / maxLength /
  min / max / minItems / maxItems)

Profiles are treated as partial by design: fields not mentioned in
answers are not flagged, and the user-specific required fields
(identity.name, identity.description, identity.owner) are expected to
be missing from profiles.

Tests (TDD, 21 cases): canonical shape + each issue kind + multi-issue
aggregation + partial-profile acceptance. Reuses the existing
meta-schema parser; no new deps.
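
The walk described above can be sketched roughly as follows. This is a simplified illustration, not the code in tools/lib/profile-validator.ts: the `Field` model, `validateAnswers`, and the sample schema entries are assumptions, and only three field types and four issue kinds are shown.

```typescript
// Hypothetical, simplified field model. The real meta-schema in
// tools/lib/meta-schema.ts is richer; names here are illustrative only.
type Field =
  | { type: "string"; pattern?: string }
  | { type: "boolean" }
  | { type: "enum"; values: readonly string[] };

type IssueKind =
  | "answer-unknown-path"
  | "answer-type-mismatch"
  | "answer-value-not-in-enum"
  | "answer-constraint-violation";

interface Issue {
  kind: IssueKind;
  path: string;
  detail: string;
}

function validateAnswers(
  fields: ReadonlyMap<string, Field>,
  answers: Record<string, unknown>,
): Issue[] {
  const issues: Issue[] = [];
  for (const [path, value] of Object.entries(answers)) {
    const field = fields.get(path);
    if (field === undefined) {
      issues.push({ kind: "answer-unknown-path", path, detail: "path is not declared in the schema" });
      continue;
    }
    switch (field.type) {
      case "boolean":
        if (typeof value !== "boolean") {
          issues.push({ kind: "answer-type-mismatch", path, detail: "expected boolean" });
        }
        break;
      case "string":
        if (typeof value !== "string") {
          issues.push({ kind: "answer-type-mismatch", path, detail: "expected string" });
        } else if (field.pattern !== undefined && !new RegExp(field.pattern).test(value)) {
          issues.push({ kind: "answer-constraint-violation", path, detail: `does not match /${field.pattern}/` });
        }
        break;
      case "enum":
        if (typeof value !== "string" || !field.values.includes(value)) {
          issues.push({ kind: "answer-value-not-in-enum", path, detail: `allowed: ${field.values.join(", ")}` });
        }
        break;
    }
  }
  // Paths absent from `answers` are never flagged: profiles are partial by design.
  return issues;
}

// Illustrative schema entries; the enum values are assumptions.
const demoFields = new Map<string, Field>([
  ["domain.type", { type: "enum", values: ["web-app", "cli", "agent-sdk"] }],
  ["stack.language", { type: "enum", values: ["typescript", "python"] }],
]);

const demoIssues = validateAnswers(demoFields, {
  "domain.type": "cli",
  "stack.language": "cobol",
  "nope.path": true,
});
```

In this sketch all issues are collected rather than stopping at the first one, which matches the multi-issue aggregation the tests mention.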

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
CLI entry point `npx tsx tools/validate-profile.ts <profile.yaml>
[--schema questionnaire/schema.yaml]` with exit codes mirroring the
questionnaire validator: 0 on OK, 1 on semantic issues, 2 on missing
file / unreadable YAML / missing profile arg.

formatReport() emits a human-readable block with schema + profile
paths, status, and one line per issue (kind, path, detail). Stdout for
diagnostics so CI captures them; stderr reserved for CLI usage errors.

Integration tests (15 cases) cover: valid canonical profiles (3 × exit
0), each invalid fixture (4 × exit 1 with matching issue kind in
stdout), missing file and missing arg (2 × exit 2), plus the unit-level
coverage of formatReport and validateProfileFile.
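
As a rough illustration of the exit-code contract and the report layout described above, consider the sketch below. The `Outcome` type, `exitCodeFor`, `renderReport`, and the exact report wording are hypothetical; the real formatReport() output may differ.

```typescript
interface ReportIssue {
  kind: string;
  path: string;
  detail: string;
}

// Three outcomes, one per exit code described in the commit message.
type Outcome =
  | { status: "ok" }
  | { status: "issues"; issues: ReportIssue[] }
  | { status: "unreadable"; message: string };

function exitCodeFor(outcome: Outcome): 0 | 1 | 2 {
  switch (outcome.status) {
    case "ok":
      return 0; // profile is valid
    case "issues":
      return 1; // semantic issues found
    case "unreadable":
      return 2; // missing file, unreadable YAML, or bad arguments
  }
}

// One header block plus one line per issue (kind, path, detail).
function renderReport(schemaPath: string, profilePath: string, outcome: Outcome): string {
  const lines = [`schema:  ${schemaPath}`, `profile: ${profilePath}`];
  if (outcome.status === "ok") {
    lines.push("status:  OK");
  } else if (outcome.status === "issues") {
    lines.push(`status:  FAIL (${outcome.issues.length} issue(s))`);
    for (const issue of outcome.issues) {
      lines.push(`  - [${issue.kind}] ${issue.path}: ${issue.detail}`);
    }
  } else {
    lines.push(`status:  ERROR (${outcome.message})`);
  }
  return lines.join("\n");
}

const failing: Outcome = {
  status: "issues",
  issues: [{ kind: "answer-unknown-path", path: "nope.path", detail: "path is not declared in the schema" }],
};
const report = renderReport("questionnaire/schema.yaml", "profile.yaml", failing);
```

Keeping the exit-code mapping separate from rendering means both can be unit-tested without spawning the CLI.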

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Canonical profiles (questionnaire/profiles/*.yaml):
- nextjs-app — web-app + TS + postgres + vitest + playwright + sentry
  + changesets + team (12 answers).
- agent-sdk — agent-sdk + python + pytest + opentelemetry + manual
  release + opus as default model (11 answers).
- cli-tool — cli + TS + vitest + semantic-release + solo (11 answers).

Each profile omits the 3 user-specific required fields (identity.name,
identity.description, identity.owner) by design. Coverage of the
remaining required fields (domain.type, stack.language,
testing.unit_framework) is 100%; overall answers land at ~55-65% of
the 18 schema fields per profile (MASTER_PLAN target ~60%).

Valid fixtures (tools/__fixtures__/profiles/valid/*.yaml) are literal
duplicates of the canonical profiles — kept in the tools scope since
the generator does not exist yet (Fase B3+). Consolidation with the
generator-side fixtures is deferred until B3 reveals a better
mechanism (e.g., loader or symlink).

Invalid fixtures (tools/__fixtures__/profiles/invalid/*.yaml) one per
issue kind exercised by the CLI integration tests:
- unknown-path.yaml         → answer-unknown-path
- type-mismatch.yaml        → answer-type-mismatch
- enum-out-of-values.yaml   → answer-value-not-in-enum
- pattern-violation.yaml    → answer-constraint-violation

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Runs `npm run validate:profiles` (new script) after validate:questionnaire
on the ubuntu+macos × node 20 matrix. Invokes the CLI on each of the
3 canonical profiles; any non-zero exit fails the job.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- ROADMAP: carries over the B1 drift (🔄 open → ✅ PR #1) and adds B2
  (🔄 open) with a list of deliverables + the known gap. Fase B
  marked "in progress (B2)".
- HANDOFF §1: snapshot points to B2 in progress + B3 next.
- HANDOFF §9 rewritten: next branch = B3, with a minimal reading list
  that includes profile-validator and the Fase N+7 checkbox as the
  first pre-flight item.
- HANDOFF §10 replaced: B2 status (closing) with deliverables, the
  systematization meta commit, 106 tests, coverage 95.97%, known gap.
- MASTER_PLAN: Rama B1 marked ✅ PR #1. Rama B2 marked ✅ with:
  adjustment vs the original plan (fixtures in tools/, not generator/),
  known gap, updated exit criterion.
- docs/ARCHITECTURE.md §2 Profiles: adds the canonical shape + the
  partiality principle + the 5 profile validator issue kinds + the gap +
  CLI command + CI integration.
- .claude/rules/generator.md: new "Profiles" block with location,
  shape, partiality, validator, fixtures, and the steps to add a
  new profile.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- extract tools/lib/read-yaml.ts (readAndParseYaml + errorMessage)
  shared by validate-profile and validate-questionnaire (2 call sites, meets
  pattern-before-abstraction threshold)
- inline 3 single-call helpers in profile-validator.ts
  (checkStringConstraints, checkArrayItems, constraintViolation) — switch
  branches become self-contained and readable
- collapse 3 duplicate canonical CLI smoke tests into one parameterised
  it.each — same coverage, less scaffolding

Net -45 LOC. Tests 106 green, coverage 95.92% lines / 89.91% branches
(above thresholds 90/85/90/90). Typecheck clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 20, 2026 07:45

Copilot AI left a comment


Pull request overview

Introduces a “profiles” layer for the questionnaire system, including canonical starter profiles and a CLI validator to ensure profiles stay consistent with questionnaire/schema.yaml.

Changes:

  • Added tools/lib/profile-validator.ts (Zod profile parser + schema-based answer validation) and tools/validate-profile.ts CLI with exit codes and reporting.
  • Added canonical profiles in questionnaire/profiles/ plus fixtures and tests for validator + CLI.
  • Extracted shared YAML read/parse helper to tools/lib/read-yaml.ts and wired CI/package scripts to validate profiles.

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
| --- | --- |
| tools/validate-questionnaire.ts | Reuses shared YAML read/parse helper to dedupe CLI I/O logic. |
| tools/validate-profile.ts | New CLI entrypoint to validate a profile YAML against the schema and emit issues. |
| tools/validate-profile.test.ts | Adds unit + CLI integration coverage for the validate-profile tool. |
| tools/lib/read-yaml.ts | New shared YAML reader/parser utility with consistent error formatting. |
| tools/lib/profile-validator.ts | New profile parser + validator emitting typed issue kinds. |
| tools/lib/profile-validator.test.ts | Unit tests covering parsing and validator issue emission/aggregation. |
| tools/__fixtures__/profiles/valid/nextjs-app.yaml | Valid profile fixture mirroring the canonical nextjs-app profile. |
| tools/__fixtures__/profiles/valid/cli-tool.yaml | Valid profile fixture mirroring the canonical cli-tool profile. |
| tools/__fixtures__/profiles/valid/agent-sdk.yaml | Valid profile fixture mirroring the canonical agent-sdk profile. |
| tools/__fixtures__/profiles/invalid/unknown-path.yaml | Invalid fixture to trigger answer-unknown-path. |
| tools/__fixtures__/profiles/invalid/type-mismatch.yaml | Invalid fixture to trigger answer-type-mismatch. |
| tools/__fixtures__/profiles/invalid/pattern-violation.yaml | Invalid fixture to trigger answer-constraint-violation (pattern). |
| tools/__fixtures__/profiles/invalid/enum-out-of-values.yaml | Invalid fixture to trigger answer-value-not-in-enum. |
| questionnaire/profiles/nextjs-app.yaml | Adds canonical Next.js app starter profile (partial by design). |
| questionnaire/profiles/cli-tool.yaml | Adds canonical CLI tool starter profile (partial by design). |
| questionnaire/profiles/agent-sdk.yaml | Adds canonical agent SDK starter profile (partial by design). |
| package.json | Adds validate:profiles script to validate all canonical profiles. |
| docs/ARCHITECTURE.md | Documents profile shape, partiality, validator semantics, and CLI usage. |
| ROADMAP.md | Updates phase/progress tracking to reflect B2 work. |
| MASTER_PLAN.md | Updates B1 completion and B2 scope/acceptance criteria to include validator/CI. |
| HANDOFF.md | Updates current phase and adds/propagates the “Context gate” process details. |
| CLAUDE.md | Adds “Fase N+7 Context gate” to the documented lifecycle. |
| AGENTS.md | Updates non-negotiable rules and “continúa/siguiente” flow to include context gate. |
| .github/workflows/ci.yml | Adds CI step to run validate:profiles. |
| .claude/rules/generator.md | Documents profile location/shape/partiality and validation/fixture expectations. |
| .claude/rules/docs.md | Adds context-traceability checklist item for branches started via compact/clear. |


Comment on lines +84 to +100
it("exits 1 for type-mismatch fixture", () => {
  const r = runCli(["tools/__fixtures__/profiles/invalid/type-mismatch.yaml"]);
  expect(r.code).toBe(1);
  expect(r.stdout).toMatch(/answer-type-mismatch/);
}, 30000);

it("exits 1 for enum-out-of-values fixture", () => {
  const r = runCli(["tools/__fixtures__/profiles/invalid/enum-out-of-values.yaml"]);
  expect(r.code).toBe(1);
  expect(r.stdout).toMatch(/answer-value-not-in-enum/);
}, 30000);

it("exits 1 for pattern-violation fixture", () => {
  const r = runCli(["tools/__fixtures__/profiles/invalid/pattern-violation.yaml"]);
  expect(r.code).toBe(1);
  expect(r.stdout).toMatch(/answer-constraint-violation/);
}, 30000);

Comment on lines +73 to +74
if (field.pattern !== undefined && !new RegExp(field.pattern).test(value)) {
  issues.push(violation(path, `value '${value}' does not match pattern /${field.pattern}/`));
Owner Author

javiAI commented Apr 20, 2026

Good close for B2. The foundation is solid and I see nothing blocking the merge.

A single non-blocking note for B3 or a small follow-up:

  • In profile-validator.ts, enum fields do not distinguish between a type mismatch and a value that is not in the enum. Right now, if someone supplies an array or object where the field is an enum, the emitted issue would be answer-value-not-in-enum instead of answer-type-mismatch. That does not break B2, but it does make reporting slightly less precise for the future runner and for CLI overrides.

I would not request changes for this; I would just leave it noted so the error taxonomy stays as clean as possible when B3 arrives.
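
One possible shape for the follow-up this comment suggests is to check the value's type before checking enum membership. The helper below is a hypothetical sketch, not code from profile-validator.ts, and it is not what B2 ships.

```typescript
type EnumIssueKind = "answer-type-mismatch" | "answer-value-not-in-enum";

// Hypothetical helper: returns null when the answer is acceptable.
function checkEnumAnswer(allowed: readonly string[], value: unknown): EnumIssueKind | null {
  // Arrays, objects, numbers and booleans are a type problem,
  // not a "wrong member" problem, so they are reported first.
  if (typeof value !== "string") return "answer-type-mismatch";
  return allowed.includes(value) ? null : "answer-value-not-in-enum";
}

const languages = ["typescript", "python"]; // illustrative enum values

const fromArray = checkEnumAnswer(languages, ["typescript"]);
const fromUnknownMember = checkEnumAnswer(languages, "cobol");
const fromValidMember = checkEnumAnswer(languages, "python");
```

Ordering the checks this way keeps each issue kind tied to a single cause, which is the taxonomy concern raised in the comment.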

- add invalid fixture array-item-type-mismatch.yaml + CLI test
  (Copilot: CLI coverage gap for answer-array-item-type-mismatch)
- validate pattern is a compilable regex at meta-schema parse time
  (Copilot: new RegExp(field.pattern) could throw — now a clear
  schema-scoped error via zod .refine, exit 2 instead of uncaught)
- document deferred B2 gap: enum fields emit value-not-in-enum
  instead of type-mismatch when given array/object (per user PR
  comment; non-blocking, noted for B3)

107 tests green, coverage still above thresholds, typecheck clean.
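
The compilable-regex check mentioned above comes down to a small predicate. The sketch below is an assumption about its shape: the name `isCompilablePattern` is hypothetical, and in the branch the check is wired through zod's `.refine` in tools/lib/meta-schema.ts rather than called directly.

```typescript
// Hypothetical predicate of the kind that could be passed to z.string().refine(...).
// `new RegExp` throws a SyntaxError on an invalid pattern, so a try/catch
// turns "would throw later in the validator" into a boolean at parse time.
function isCompilablePattern(pattern: string): boolean {
  try {
    new RegExp(pattern);
    return true;
  } catch {
    return false;
  }
}

const goodPattern = isCompilablePattern("^[a-z][a-z0-9-]*$");
const badPattern = isCompilablePattern("("); // unterminated group
```

Rejecting the pattern while parsing the schema means the profile validator can call `new RegExp(field.pattern)` without guarding against a throw.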

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Owner Author

javiAI commented Apr 20, 2026

Addressed review feedback in 95515d3:

FIX #1 (tools/validate-profile.test.ts:100): added fixture tools/__fixtures__/profiles/invalid/array-item-type-mismatch.yaml + CLI test assertion. All 5 ProfileIssueKinds now covered at CLI level.

FIX #2 (tools/lib/profile-validator.ts:74): pushed the compilable-regex check one level up to tools/lib/meta-schema.ts via z.string().refine(...). An invalid regex in questionnaire/schema.yaml now fails schema parsing with a field-scoped zod error (exit 2), so the validator never sees an uncompilable pattern. Applied to both StringField.pattern and TextQuestion.validation.pattern.

Deferred: user issue comment on enum type-mismatch taxonomy. Per javiAI's own comment ("I would not request changes for this; I would just leave it noted"), documented as a known gap in MASTER_PLAN.md §B2 for B3.

107 tests green, typecheck clean.

@javiAI javiAI merged commit f361c19 into main Apr 20, 2026
2 checks passed
@javiAI javiAI deleted the feat/b2-profiles-starter branch April 20, 2026 07:57
javiAI pushed a commit that referenced this pull request Apr 20, 2026
Commit 1 of feat/b3-generator-runner. A bundle of two meta changes
that precede the TDD implementation of the runner:

1. Context-gate hardening (inherited from a previous session):
   - AGENTS.md rule #1: Claude presents the 4 options and WAITS for
     the user's explicit choice. It never decides on its own.
   - AGENTS.md §3 step 3 "continúa": same hardening
     (stop + wait), aligned with MEMORY.md feedback_context_gate.
   - HANDOFF.md §3 checklist: present + wait explicitly before
     emitting the resume prompt or proceeding to Fase -1.

2. Docs sync ahead of B3 (Fase N+3 applied in commit 1):
   - ROADMAP.md: Fase B moves from B2 in progress to B3 in progress;
     B2 marked as completed (PR #2) with 2 documented gaps;
     B3 opened with scope and adjustment (token-budget deferred).
   - HANDOFF.md §1: B3 in progress, B2 closed at f361c19, C1 next.
   - HANDOFF.md §9: the next branch becomes C1 (core docs renderers).
   - HANDOFF.md §10: replaces "Estado B2 cerrando" with
     "Estado B3 en curso", including the approved Fase -1 decisions
     + planned files + inherited gaps.
   - MASTER_PLAN.md §B3: explicit note on the deferral of
     token-budget.ts, the re-export from tools/lib/, the rejected
     --out and --dry-run flags, and the exit-code semantics for
     user-specific fields.

NOT a functional part of the runner. Implementation starts in
commit 2 with strict TDD (red tests first, per CLAUDE.md
rule #3 and AGENTS.md rule #4).

Fase -1 traceability: explicitly approved by the user
in this session after the presentation of scope + ambiguities +
alternatives + test plan + docs plan. Marker created at
.claude/branch-approvals/feat_b3-generator-runner.approved.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
javiAI added a commit that referenced this pull request Apr 20, 2026
* chore(meta): pre-kickoff B3 — context-gate hardening + docs sync


* test(b3): red tests for runner — fixtures + loader + validators + CLI

TDD step 1 (per CLAUDE.md rule #3 + .claude/rules/tests.md):
failing tests written before implementation. All three test
suites fail to load because profile-loader.ts / validators.ts /
run.ts don't exist yet (commit 3 will turn them green).

Fixtures (generator/__fixtures__/profiles/):
- valid-partial/profile.yaml — all non-user-specific required
  present, 3 user-specific missing. Expects exit 0 + warning.
- missing-required/profile.yaml — omits domain.type. Expects
  exit 1 (completeness error).
- invalid-value/profile.yaml — stack.language out of enum.
  Expects exit 1 (profile-validator issue answer-value-not-in-enum).

Test files:
- generator/lib/profile-loader.test.ts — 5 tests: happy, missing
  file, malformed YAML, invalid shape (missing profile key),
  strict rejection of unknown top-level key.
- generator/lib/validators.test.ts — 5 tests for completenessCheck:
  only user-specific missing → 3 warnings; all present → clean;
  1 required missing → 1 error + 3 warnings; 2 required missing
  → 2 errors; required with default value → satisfied (uses a
  synthetic schema to isolate default semantics from canonical).
- generator/run.test.ts (18 tests split in three describes):
    runValidation (unit) — 5 tests covering 0/1/2 exit codes
      across fixtures + missing file + malformed YAML.
    formatReport — 4 tests covering OK / WARN / FAIL rendering
      and required-missing + enum issue lines.
    CLI integration (spawnSync) — 9 tests covering valid,
      --validate-only, missing-required, invalid-value,
      rejection of --out and --dry-run with exact deferral
      message ("flag --X not supported in B3; planned for C1"),
      missing --profile, missing file, unknown flag.

Decisions locked in tests:
- loadProfile return shape: { ok: true, profile } | { ok: false, error }.
- completenessCheck return shape: { errors[], warnings[] }.
- USER_SPECIFIC_PATHS exported from validators.ts for reuse + test assertion.
- runValidation takes only profilePath (schema hard-coded inside).
- formatReport takes (result, profilePath); no schema param needed.
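
The loadProfile return shape locked in above is a discriminated union on `ok`. A minimal sketch of how a caller can consume it follows; `LoadedProfile` and `summarize` are hypothetical names used only for illustration.

```typescript
// Hypothetical stand-in for the parsed profile type.
interface LoadedProfile {
  name: string;
}

// Same shape as in the list above: { ok: true, profile } | { ok: false, error }.
type LoadResult =
  | { ok: true; profile: LoadedProfile }
  | { ok: false; error: string };

// Checking `ok` narrows the union, so each branch only sees its own fields.
function summarize(result: LoadResult): string {
  if (result.ok) {
    return `loaded ${result.profile.name}`;
  }
  return `failed: ${result.error}`;
}

const okSummary = summarize({ ok: true, profile: { name: "cli-tool" } });
const errSummary = summarize({ ok: false, error: "file not found" });
```

Returning a result value instead of throwing lets the CLI map each failure to an exit code in one place.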

Vitest output (expected): 3 failed suites, all "Failed to load url
./<module>.ts. Does the file exist?" — classic TDD red state.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(b3): generator runner — profile loader + completeness + CLI

Minimal implementation that turns the previous commit green (135/135
tests in the project; 28/28 in generator/). The B3 runner closes the
loop profile YAML → zod-validated → completeness-check → exit
0/1/2. No renderers yet (they arrive in C*).

Files:

- generator/lib/schema.ts: pure re-export of parseSchemaFile /
  parseProfileFile / validateProfile + the types ProfileFile /
  ProfileIssue / ProfileIssueKind / SchemaFile from tools/lib/.
  3rd application of pattern-before-abstraction (the 2nd was
  tools/lib/read-yaml.ts in B2). No duplicated logic.

- generator/lib/profile-loader.ts: loadProfile(path) returns a
  discriminated union { ok: true, profile } | { ok: false, error }.
  Reuses readAndParseYaml + errorMessage from tools/lib/read-yaml.ts.

- generator/lib/validators.ts: completenessCheck(schema, profile)
  returns { errors, warnings }. It scans the schema's required fields;
  if a path is absent from the profile AND the schema declares no
  default, it emits an error or a warning depending on
  USER_SPECIFIC_PATHS (identity.name / description / owner → warning;
  everything else → error). The constant is exported so the tests can
  assert the exact list without duplicating it.

- generator/run.ts: CLI entrypoint:
    * strict parseArgs with --profile (required) and --validate-only;
      --out and --dry-run are declared but explicitly rejected
      with the exact message "flag --X not supported in B3; planned
      for C1". This avoids a false sense of functionality.
    * Schema hard-coded to questionnaire/schema.yaml (no flag).
    * runValidation(profilePath) + formatReport(result, profilePath)
      exported for unit tests; main() carries /* v8 ignore */ to
      exclude parseArgs + exit paths from coverage (same pattern
      as tools/validate-profile.ts).
    * Exit semantics: profile OK with no blockers → 0 (warnings allowed);
      issues or completeness errors → 1; I/O failure, broken YAML,
      invalid args, or a deferred flag → 2. An invalid profile shape
      (top-level strict) maps to exit 1 because it is a content error,
      not an I/O error.
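
The completeness rule described for generator/lib/validators.ts can be sketched as below. This is an illustration under assumptions: `RequiredField` is a hypothetical flattened view of the schema, and the real completenessCheck takes the parsed schema and profile rather than these simplified inputs.

```typescript
// The three user-specific paths named in the commit message.
const USER_SPECIFIC_PATHS: readonly string[] = [
  "identity.name",
  "identity.description",
  "identity.owner",
];

// Hypothetical flattened view of a schema field.
interface RequiredField {
  path: string;
  required: boolean;
  hasDefault: boolean;
}

function completenessCheck(
  fields: readonly RequiredField[],
  answers: Record<string, unknown>,
): { errors: string[]; warnings: string[] } {
  const errors: string[] = [];
  const warnings: string[] = [];
  for (const field of fields) {
    const answered = Object.prototype.hasOwnProperty.call(answers, field.path);
    // Optional fields, answered fields, and fields with a default are satisfied.
    if (!field.required || answered || field.hasDefault) continue;
    // Missing user-specific fields only warn; any other missing field blocks.
    if (USER_SPECIFIC_PATHS.includes(field.path)) {
      warnings.push(field.path);
    } else {
      errors.push(field.path);
    }
  }
  return { errors, warnings };
}

const sampleFields: RequiredField[] = [
  { path: "identity.name", required: true, hasDefault: false },
  { path: "identity.description", required: true, hasDefault: false },
  { path: "identity.owner", required: true, hasDefault: false },
  { path: "domain.type", required: true, hasDefault: false },
  { path: "stack.language", required: true, hasDefault: false },
  { path: "example.with_default", required: true, hasDefault: true }, // hypothetical path
];

const check = completenessCheck(sampleFields, { "stack.language": "typescript" });
```

With these inputs the missing domain.type is an error while the three identity paths are warnings, matching the "1 required missing → 1 error + 3 warnings" case listed in the test commit.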

Local checks:
  - tsc --noEmit: clean.
  - vitest run: 135 tests in 13 suites (28 new in generator/).
  - vitest run --coverage: lines 95.36%, functions 98.52%,
    branches 88.83%, statements 95.36%. All above the project
    thresholds (90/90/85/90).
  - npm run validate:profiles: OK x3 (no B2 regression).
  - npm run validate:questionnaire: OK (no B1 regression).
  - Smoke E2E: tsx generator/run.ts --profile nextjs-app.yaml →
    exit 0 + 3 user-specific warnings. --out rejected with the
    exact expected message, exit 2.

Pending for later commits (Fase N+3 docs-sync + possibly a
package.json script validate:generator):
  - Update docs/ARCHITECTURE.md §3 with the real shape (exported
    signatures, final exit codes) if it differs from the current
    snippet.
  - Add a "Deferrals" section to .claude/rules/generator.md
    (token-budget.ts) + a "Reuso desde tools/lib" section (3rd
    application of pattern-before-abstraction).
  - Script npm run validate:generator: a minimal invocation that
    runs the runner over the 3 canonical profiles to detect
    regressions in future CI.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore(b3): docs-sync + CI smoke + rules/generator deferrals

Fase N+3 docs-sync + CI hardening ahead of the pre-PR gate.

CI smoke (no runner regression):
- package.json: new script `validate:generator` that runs
  generator/run.ts over the 3 canonical profiles (nextjs-app,
  agent-sdk, cli-tool). Exit 0 expected, with user-specific
  warnings.
- .github/workflows/ci.yml: new step "Validate generator
  (smoke — 3 canonical profiles)" between "Validate profiles" and
  "Test (with coverage)". It detects integration regressions
  before the unit tests run.

Docs ARCHITECTURE.md §3 "Generador":
- Replaces the aspirational pre-B3 snippet (which imported
  renderers + an fs-writer that do not exist yet) with the real
  signatures delivered: RunResult shape, runValidation,
  formatReport.
- Documents exit codes 0/1/2 + the B3 deferrals (token-budget,
  --schema, --out, --dry-run).
- Adds relative links to the 4 new files
  (generator/run.ts, lib/schema.ts, lib/profile-loader.ts,
  lib/validators.ts).

Rules .claude/rules/generator.md:
- New section "Runner (entregado en B3)": integration fixtures,
  exit-code semantics, deferred flags, CI smoke.
- New section "Deferrals (B3)": token-budget.ts, --schema,
  --out/--dry-run, each with an explicit reason so that
  future branches can decide when to reintroduce them.
- New section "Reuso desde tools/lib/
  (pattern-before-abstraction, 3ª aplicación)": a
  no-duplication norm + the history of the 3 applications (B1
  condition-parser, B2 read-yaml, B3 schema re-export). It sets
  the threshold for forking when a 4th application with
  generator-only logic appears.

Checks:
  - tsc --noEmit: OK.
  - vitest run: 135/135 tests.
  - npm run validate:generator: 3 x "status: OK" with the expected
    user-specific warnings.
  - npm run validate:profiles: 3 x OK (no B2 regression).
  - npm run validate:questionnaire: OK (no B1 regression).

Post-commit: branch ready for /pos:pre-commit-review
(manual equivalent: the code-reviewer subagent over the full
branch diff) and the manual pre-PR gate. ROADMAP + HANDOFF
+ MASTER_PLAN were already synced in commit 1 of the branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(b3): address pre-commit review — tighter flag guard + pinned enum test

Two findings from the manual code-reviewer pass (equivalent of
/pos:pre-commit-review) on feat/b3-generator-runner.

1. generator/run.ts line 129: changed --dry-run guard from truthy
   check to `values["dry-run"] !== undefined` for consistency with
   the --out guard above. Node's parseArgs with type:"boolean"
   does not emit false natively (only true on presence or
   undefined on absence), so the observable behavior is unchanged
   in practice — but the defensive shape now matches --out and
   survives any future parseArgs or caller variant that could
   produce an explicit false.

2. generator/run.test.ts: the "exit 1 value-not-in-enum" test used
   `r.issues.some(...)` as its only assertion, which would still
   pass if a regression also populated r.errors or changed r.issues
   length. Tightened to pin the exact shape: issues length 1,
   kind + path on issues[0], errors empty, warnings equal to the
   3 user-specific paths. Matches the assertion style already
   used in the sibling "missing-required" test and in
   validators.test.ts.

Reviewer's third finding (validate:generator CI step semantically
redundant with Test-with-coverage) was considered and kept as-is.
Reason documented in .claude/rules/generator.md § Runner: the
smoke step catches broken tsx invocation / main() wiring before
unit tests run, and the 3-profile loop adds ~3s to CI for
earlier signal. Not inertia — deliberate design.

Verification:
  - tsc --noEmit: OK.
  - vitest run generator/: 28/28.
  - No production behavior change for --dry-run in common usage
    (parseArgs produces true or undefined, not false).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(b3): address PR #3 review — docs stdout correction

Copilot flagged a mismatch between docs and runner behavior: 2 doc
sites claimed user-specific warnings land in stderr, while the CLI
prints the full formatReport (including warnings) to stdout and the
integration tests assert on stdout. Align docs with implementation;
do not change the CLI/tests contract.

- MASTER_PLAN.md §B3 exit-codes line: "stderr" → "stdout (dentro del reporte)".
- generator/__fixtures__/profiles/valid-partial/profile.yaml:5 header: same correction.

Typecheck + runner tests (18/18) still pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Javier <javier.abril@glassnode.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
javiAI pushed a commit that referenced this pull request Apr 21, 2026
Closes the 3 points raised in PR #11 review (Copilot + human feedback).

1) Hook `tool_input` validation (BLOCKER):
   - `tool_input is None` → `{}` (pass-through).
   - Non-dict `tool_input` (list/string/etc.) → `deny` exit 2 with `decisionReason`.
     Previously `payload.get("tool_input") or {}` + `.get()` would raise `AttributeError`
     on list/string payloads, crashing the hook instead of responding with a controlled
     contract. +3 in-process tests (null, list, string) cover the new branches.

2) Docs alignment with actual safe-fail policy:
   - `docs/ARCHITECTURE.md §7` and `.claude/rules/hooks.md`: explicit that a malformed
     payload (empty stdin, invalid JSON, non-dict top level, non-dict `tool_input` for a
     Bash call) maps to `deny` exit 2 + `decisionReason`, NOT pass-through. This matches
     the actual hook behavior and establishes a canonical policy for D2..D6 hooks.

3) CI coverage (IMPORTANT):
   - New `python` job in `.github/workflows/ci.yml` — matrix `ubuntu-latest` +
     `macos-latest` × Python `3.10` + `3.11`, running `pytest hooks/tests -q
     --cov=hooks --cov-report=term-missing`. Pins `actions/setup-python@v5.6.0` by SHA
     per `.claude/rules/ci-cd.md §Reglas duras #2`. D1 now enforced by real CI, not
     "passed locally".

Totals: 60 tests (was 57), 99% coverage maintained (same single uncovered line
`sys.exit(main())` under `__main__`).
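
The guard in point 1 can be sketched roughly as follows. This is a minimal illustration, not the hook's actual code: the helper name `read_tool_input` and the reason strings are assumptions, and the sketch denies a non-dict `tool_input` for any tool rather than only for Bash calls.

```python
import json
from typing import Any, Optional, Tuple

DENY_EXIT = 2  # exit code the hook uses for a controlled deny


def read_tool_input(raw_stdin: str) -> Tuple[Optional[dict], Optional[str]]:
    """Return (tool_input, None) on success, or (None, deny_reason) on a malformed payload.

    Hypothetical helper: the real hook's function names are not shown in this message.
    """
    try:
        payload: Any = json.loads(raw_stdin)
    except json.JSONDecodeError:
        # Covers both empty stdin and invalid JSON.
        return None, "malformed payload: stdin is empty or not valid JSON"
    if not isinstance(payload, dict):
        return None, "malformed payload: top-level value is not an object"

    tool_input = payload.get("tool_input")
    if tool_input is None:
        # Absent or null: normalise to {} so the caller passes through.
        return {}, None
    if not isinstance(tool_input, dict):
        # list/string/etc. would make a later .get() raise AttributeError.
        return None, "malformed payload: tool_input is not an object"
    return tool_input, None
```

A caller would print the reason as `decisionReason` and exit with `DENY_EXIT` whenever the second element is not `None`.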

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
javiAI added a commit that referenced this pull request Apr 21, 2026
* test(d1): failing tests for pre-branch-gate + test-env bootstrap

Kickoff block (Phase 0), branch feat/d1-hook-pre-branch-gate:

Scope (Phase -1 closed):
- hooks/pre-branch-gate.py (implementation in the next commit): PreToolUse(Bash)
  hook that blocks `git checkout -b`, `git switch -c`, `git worktree add -b` when
  the marker `.claude/branch-approvals/<slug-sanitized>.approved` is missing.
- pytest test pair for the hook (this commit, RED).
- Minimal test-env bootstrap: local `.venv` + `requirements-dev.txt`
  (pytest + pytest-cov). No ruff, no selftest, no abstracted hooks/_lib/.

Phase -1 decisions closed (vs MASTER_PLAN §D1):
1. Scope: covers checkout -b, switch -c, worktree add -b. Excludes
   `git branch <slug>` (creates a ref without starting work).
2. No bypass env var. The legitimate bypass is creating the marker explicitly.
3. Double log: `.claude/logs/pre-branch-gate.jsonl` +
   `.claude/logs/phase-gates.jsonl` (branch_creation event).
4. Parsing with shlex.split (robust to quoting). Supports global options
   before the subcommand.
5. Message on block: exact marker path + suggested `touch` command
   + textual reference to `MASTER_PLAN.md`. No parsing of the plan.
6. Silent pass-through: zero noise except on branch creation.
7. No shared `hooks/_lib/` (CLAUDE.md rule 7: ≥2 repetitions before
   abstracting; D1 is the first).

Tests added (intentionally RED):
- hooks/tests/test_pre_branch_gate.py
  · branch-creation detection
  · silent pass-through
  · slug sanitization
  · double log on allow/deny
  · robustness against invalid stdin/commands
- Fixtures: 6 JSON files in hooks/tests/fixtures/payloads/.

Env bootstrap:
- requirements-dev.txt: pytest>=7, pytest-cov>=4
- .gitignore: /.venv/, __pycache__/, *.pyc, .pytest_cache/
- local run: `.venv/bin/pytest hooks/tests/`

Planned next commits:
- feat(d1): implement hook + chmod +x
- docs(d1): docs-sync

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(d1): implement pre-branch-gate hook + in-process coverage tests

Implementation:
- hooks/pre-branch-gate.py (executable, 4.6K, stdlib-only):
  · Detects `git checkout -b`, `git switch -c`, `git worktree add -b` via
    shlex tokenisation. Handles git global options pre-subcommand
    (`git -c k=v ...`, `git --git-dir=X ...`, `git -C /p ...`).
  · `extract_branch_slug()` returns None for non-branch commands
    (`git status`, `git branch <x>`, `git worktree list`, etc.).
  · On branch-creation command: sanitizes slug (`/` → `_`) and checks
    `.claude/branch-approvals/<sanitized>.approved`.
  · Marker present → allow (silent, exit 0) + append allow event to both
    logs.
  · Marker absent → deny (exit 2) with `decisionReason` containing:
    exact marker path, suggested `touch` command, textual reference to
    MASTER_PLAN.md, and the blocked command.
  · Pass-through silent on all non-branch Bash, all non-Bash tools,
    missing/empty fields, and shlex-unparseable commands.
  · Malformed JSON stdin → deny (exit 2).
  · Double logging: `.claude/logs/pre-branch-gate.jsonl` +
    `.claude/logs/phase-gates.jsonl` (event: branch_creation).
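
The detection and marker lookup listed above can be sketched as follows. This is a simplified reconstruction under assumptions: the names `extract_branch_slug`, `sanitize_slug`, `_flag_value` and `GIT_GLOBAL_OPTS_WITH_ARG` appear in these commit messages, but the bodies, the subset of global options, and the helper `marker_path` are illustrative only.

```python
import shlex
from pathlib import Path
from typing import List, Optional

# Global git options that consume a separate argument (illustrative subset).
GIT_GLOBAL_OPTS_WITH_ARG = {"-c", "-C", "--git-dir", "--work-tree"}


def sanitize_slug(slug: str) -> str:
    """Map a branch name to a flat file name: '/' becomes '_'."""
    return slug.replace("/", "_")


def _flag_value(tokens: List[str], flag: str) -> Optional[str]:
    """Return the token following `flag`, or None if the flag is absent or last."""
    if flag in tokens:
        idx = tokens.index(flag)
        if idx + 1 < len(tokens):
            return tokens[idx + 1]
    return None


def extract_branch_slug(command: str) -> Optional[str]:
    """Return the new branch name for branch-creating commands, else None."""
    try:
        tokens = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes: treated as pass-through
        return None
    if not tokens or tokens[0] != "git":
        return None
    i = 1
    while i < len(tokens) and tokens[i].startswith("-"):
        # "--git-dir=X" carries its value inline; "-C /p" consumes the next token.
        i += 2 if tokens[i] in GIT_GLOBAL_OPTS_WITH_ARG else 1
    rest = tokens[i:]
    if not rest:
        return None
    if rest[0] == "checkout":
        return _flag_value(rest, "-b")
    if rest[0] == "switch":
        return _flag_value(rest, "-c")
    if rest[:2] == ["worktree", "add"]:
        return _flag_value(rest, "-b")
    return None


def marker_path(slug: str) -> Path:
    """Hypothetical helper: approval marker expected for a branch slug."""
    return Path(".claude/branch-approvals") / f"{sanitize_slug(slug)}.approved"
```

Skipping global options before matching the subcommand is what keeps `git -c k=v checkout -b slug` from being read as a non-branch command.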

Test suite (55 passing, 99% coverage on pre-branch-gate.py):
- 23 subprocess integration tests (pre-existing, from commit 1).
- 32 in-process unit tests added for coverage visibility, covering:
  · sanitize_slug: 3 cases.
  · extract_branch_slug: 20 branches (all subcommand/flag/global-opt
    combinations + negative cases).
  · build_deny_reason: shape assertions.
  · main() direct calls with monkeypatched chdir + stdin: 8 paths
    (malformed, non-dict, non-bash, missing input, empty command,
    non-branch, branch-with-marker, branch-without-marker).
- The single uncovered line (175) is `sys.exit(main())` under
  `if __name__ == "__main__":` — not reachable from in-process tests
  and intrinsic to script entry.

Run locally:
  .venv/bin/pytest hooks/tests/ --cov=hooks --cov-report=term-missing

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(d1): docs-sync, mark D1 ✅ + record adjustments vs plan

- ROADMAP.md:
  · Phase D row: ⏳ pending → ⏳ partial (D1 ✅).
  · feat/d1-hook-pre-branch-gate row: ⏳ → ✅ (PR pending).
  · New section "Progreso Fase D" with deliverables + adjustments vs the
    original plan (scope widened to worktree add -b, no bypass env var,
    no hooks/_lib/, test-env bootstrap, in-process tests for
    coverage).

- HANDOFF.md:
  · § Snapshot: current phase C5 → D1 closed. Next: D1 → D2.
  · § Gotchas: pre-branch-gate.py "does not exist yet" → live since D1;
    the remaining hooks (session-start, pre-write-guard, post-action,
    pre-compact, stop-policy-check) are still absent, as tolerated
    stubs.
  · § Próxima rama: rewritten D1 → D2 with scope + minimal reading.
  · § Estado C5 → Estado D1: summary of the deliverable + "what D1 does
    NOT do" + notes for D2 (hook pattern consolidated, signal to
    extract hooks/_lib/ in D2 once there is a 2nd repetition).

- MASTER_PLAN.md § Rama D1:
  · Status: ✅ COMPLETED (PR pending).
  · Scope delivered (actual detail) + adjustments vs the original plan
    (scope, parsing, logging, decision reason, deferrals).

- docs/ARCHITECTURE.md § 7 Capa 1:
  · Reference to hooks/pre-branch-gate.py as the canonical implementation
    of the enforcer-hook pattern (shebang + stdlib-only + silent
    pass-through + shlex parsing + double log shape).

- .claude/rules/hooks.md:
  · New sub-section "Primer hook entregado" with the consolidated
    structure: silent pass-through, shlex, local sanitization
    (no helper yet), constructive decisionReason, double log,
    test pattern (subprocess integration + in-process unit via
    importlib.util because of the hyphen in the file name).

Local tests: .venv/bin/pytest hooks/tests/ --cov=hooks
→ 55 passed, 99% coverage on hooks/pre-branch-gate.py.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(d1): simplify hook + close review gap on git global opts

- _flag_value(): extract shared -b/-c lookup (was duplicated across checkout/switch/worktree).
- log(): collapse dual-log (hook-scoped + phase-gates) into a single local helper in main().
- GIT_GLOBAL_OPTS_WITH_ARG: add --exec-path and --upload-pack (pre-commit-review gap: space-form of these options previously consumed the subcommand as the argument, causing a detection miss on `git --exec-path /x checkout -b slug`).
- Tests: +2 cases covering the new global opts (space-form), 57 passed, 99% coverage (line 166 `sys.exit(main())` only miss; __main__-gated, intrinsic).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(d1): review follow-up — tool_input guard, docs alignment, CI job

* Update ROADMAP.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* docs(d1): close count drift + CI contract gap flagged in review

Follow-up to the second Copilot review pass on PR #11. All local repo
edits — no hook/test code changes.

1) Count drift (Copilot: ROADMAP.md:225 + HANDOFF.md:141):
   - ROADMAP.md: remove hardcoded "55 tests en 8 clases" for the D1 test
     suite. Continues the pattern set by the user's own UI edit in ecdcbbc
     (removed "32 unit tests" from line 237) — docs describe the suite by
     shape, not by a brittle number that drifts with every new test.
   - HANDOFF.md §10: reflect the actual safe-fail contract (malformed
     stdin → deny, non-dict tool_input → deny) instead of the outdated
     "stdin vacío/malformado → exit 2 sin crash pero no loggea" bullet,
     and drop the "55 tests pytest" number.

2) CI contract gap (Copilot: ci.yml:74 — mypy/ruff declared in
   policy.yaml:68-74 + ci-cd.md:21-24 but not in the workflow):
   - policy.yaml.pre_push: inline comment clarifies that `command_meta`
     declares the aspirational contract; actual enforcement lands
     incrementally in CI + pre-pr-gate.py (Fase D4). Lists which checks
     are live today (tsc, vitest, pytest hooks) and which are deferred
     (mypy hooks, eslint, prettier, ruff) so the doc no longer reads as
     a broken promise.
   - .claude/rules/ci-cd.md §Workflows obligatorios §1: split into
     "Aterrizado" vs "Diferidos a rama dedicada", matching the actual
     state, plus an invariant that future branches adding a check must
     also move the bullet. Keeps the rule honest and makes drift
     explicit going forward.

Scope preserved: no mypy/ruff added to CI (D1 Fase -1 explicitly
excluded them). The fix is docs/contract-alignment, not tooling
expansion.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Javier <javier.abril@glassnode.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
javiAI pushed a commit that referenced this pull request Apr 21, 2026
Phase 2 (GREEN). Blocker hook (D1 shape) that enforces CLAUDE.md rule #2
within the branch, triggered on "gh pr create".

Behavior
--------
- Matcher: shlex.split(command); gates only when tokens[:3] ==
  ["gh","pr","create"] (covers --draft / --title / --body / --base).
  Everything else (gh pr list, gh pr view, gh pr edit, gh issue create,
  git push, git status, non-Bash) → silent pass-through, zero logging.
- Advisory skip (pass-through + entry in the hook log, NOT the phase log):
    * branch main / master / detached HEAD
    * git unavailable (cwd is not a repo)
    * merge-base HEAD main not resolvable (main deleted locally)
- Empty diff (HEAD == base) → deny exit 2 with a dedicated "empty PR"
  reason that does NOT mention docs-sync / ROADMAP / HANDOFF.
- Docs-sync check (hardcoded rules, mirroring
  policy.yaml.lifecycle.pre_pr.docs_sync_*):
    baseline   : ROADMAP.md + HANDOFF.md always.
    conditional: generator/**            → docs/ARCHITECTURE.md
                 hooks/** (no tests/)    → docs/ARCHITECTURE.md
                 skills/**               → .claude/rules/skills-map.md
                 .claude/patterns/**     → docs/ARCHITECTURE.md
  Missing docs → deny exit 2. Dedupe: ARCHITECTURE.md appears only once
  even when several prefixes require it. Triggering paths are capped at
  3 per doc in the reason, with a "... (+N more)" suffix when >3.
- Canonical D1 blocker safe-fail: empty stdin / invalid JSON /
  non-dict top level / non-dict tool_input → deny exit 2. Missing /
  non-string / empty command or shlex-unparsable → pass-through 0.
- Double log on decisions (allow/deny):
    .claude/logs/pre-pr-gate.jsonl   {ts, hook, command, decision, reason}
    .claude/logs/phase-gates.jsonl   {ts, event:"pre_pr", decision}
  + 3 status:"deferred" entries in the hook log for each real decision
    (skills_required, ci_dry_run_required, invariants_check). These
    are NOT emitted on skip or pass-through; the test requires them
    only for gated decisions.
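
The docs-sync rules above (baseline, conditional prefixes, dedupe, and the cap of 3 triggering paths) can be sketched as below. This is an illustration under assumptions: `CONDITIONAL_RULES` is a name used in this PR, but its structure here, the function signatures, and the `format_triggers` helper are hypothetical.

```python
from typing import Dict, List, Tuple

BASELINE_DOCS = ("ROADMAP.md", "HANDOFF.md")
# (path prefix, doc that must also appear in the diff)
CONDITIONAL_RULES = (
    ("generator/", "docs/ARCHITECTURE.md"),
    ("hooks/", "docs/ARCHITECTURE.md"),
    ("skills/", ".claude/rules/skills-map.md"),
    (".claude/patterns/", "docs/ARCHITECTURE.md"),
)
EXCLUDED_PREFIXES = ("hooks/tests/",)  # editing tests does not require ARCHITECTURE.md
MAX_TRIGGERS_SHOWN = 3


def check_docs_sync(changed: List[str]) -> Tuple[List[str], Dict[str, List[str]]]:
    """Return (missing docs, triggering paths per conditionally required doc)."""
    changed_set = set(changed)
    triggers: Dict[str, List[str]] = {}
    for path in changed:
        if path.startswith(EXCLUDED_PREFIXES):
            continue
        for prefix, doc in CONDITIONAL_RULES:
            if path.startswith(prefix):
                # Keyed by doc, so a doc required by several prefixes appears once.
                triggers.setdefault(doc, []).append(path)
    required = list(BASELINE_DOCS) + [d for d in triggers if d not in BASELINE_DOCS]
    missing = [doc for doc in required if doc not in changed_set]
    return missing, triggers


def format_triggers(paths: List[str]) -> str:
    """Show at most MAX_TRIGGERS_SHOWN paths, then a '... (+N more)' suffix."""
    shown = ", ".join(paths[:MAX_TRIGGERS_SHOWN])
    extra = len(paths) - MAX_TRIGGERS_SHOWN
    return f"{shown} ... (+{extra} more)" if extra > 0 else shown
```

A non-empty `missing` list would map to deny exit 2, with `format_triggers` feeding the per-doc part of the reason.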

Simplify pass (N+1)
-------------------
- Docstring: 10 → 6 lines (external references redundant with the kickoff).
- _conditional_triggers docstring: removed (private, self-explanatory name).
- main(): missing, _triggers → missing, _ (unused variable, no alias).

Tests
-----
- 96/96 green in hooks/tests/test_pre_pr_gate.py (47 previously failing
  now pass, 10 previous false positives confirmed as genuine
  passes, 39 @needs_hook unblocked now that the module exists).
- Full hooks/ suite: 317 passed (D1+D2+D3+D4). Zero regressions.
- Coverage on pre-pr-gate.py: 93% (the 10 uncovered lines are the
  except FileNotFoundError/SubprocessError in _run_git and one
  out.strip()=="" branch; above the 90% target).

Explicit deferrals documented in Phase -1 (not touched in D4)
-------------------------------------------------------------
- Migration of hardcoded rules → policy.yaml parser (own branch).
- Migration of D3 paths (pre-write-guard) → policy-driven (same branch).
- Matcher for git push --force (not gated by D4).
- pre-write-guard.py untouched (zero edits).
- policy.yaml untouched (zero edits).
- requirements-dev.txt untouched (no pyyaml).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
javiAI pushed a commit that referenced this pull request Apr 21, 2026
…+ ARCHITECTURE + rules/hooks

Syncs the 5 canonical docs with the final state of branch D4 (CLAUDE.md rule #2,
Phase N+3). The hook introduced by this very branch (hooks/pre-pr-gate.py) requires this
sync before allowing gh pr create: dogfooding the blocker itself.

- ROADMAP.md: the D4 row moves to closed status with deliverables listed (96 tests, 93%
  coverage, double log, advisory scaffold deferred). Phase D keeps D5..D6 next in
  the queue.
- HANDOFF.md: section 1 reflects D4 closed on the branch; section 7 adds the now-live
  pre-pr-gate to the gotchas list; section 9 points the next branch to D5; section 10
  summarizes the closing numbers.
- MASTER_PLAN.md section Rama D4: scope delivered + adjustments vs the original plan
  (hardcoded rules mirroring policy.yaml, docs-sync as the only enforcement, advisory skip,
  dedicated reason for empty diff, canonical D1 safe-fail, _lib reuse with no new
  helpers, 3 simplify cuts).
- docs/ARCHITECTURE.md section 7: third blocker application canonicalized alongside
  D1 and D3; shlex matcher, docs-sync baseline + conditional, advisory deferred, advisory
  skip, dedicated empty-diff reason.
- .claude/rules/hooks.md: section "Cuarto hook entregado" with the contract, docs-sync
  rules in tables + _lib reuse + 96 tests / 93% cov.

No code changes in this commit. The hook will still approve its own PR because
ROADMAP.md and HANDOFF.md appear in the main..HEAD diff together with docs/ARCHITECTURE.md,
which is required by the hooks/ paths touched in Phase 2.
javiAI added a commit that referenced this pull request Apr 22, 2026
* test(d4): kickoff — failing suite for pre-pr-gate hook (docs-sync enforcement on gh pr create)

## Kickoff — D4 (feat/d4-hook-pre-pr-gate)

Context: continuation after the D3 merge (9aed1ee on main) and PR #14 docs (fx
Knowledge Plane. Phase -1 was run and approved in the same session (v1
rejected for inflated scope; v2 trimmed and approved). Decisions closed:
only  as trigger; docs-sync as the only real enforcement;
skills_required + ci_dry_run_required + invariants_check as a non-blocking
advisory scaffold; no pyyaml; no D3 hardcode→policy migration;
pre-write-guard.py untouched; merge-base HEAD main as baseline; git
unavailable / base unresolved → pass-through with an explicit advisory log
(not silent); empty diff → deny exit 2 with a message distinct from the docs-sync one.

### Scope
- New blocker hook  (D1 shape, not D2).
  PreToolUse(Bash) matcher  only.  deferred.
- Real docs-sync enforcement (blocker, exit 2) with rules hardcoded in the
  hook (textual mirror of ;
  migration to policy-driven is handled in a dedicated policy-loader branch,
  together with D3's hardcoded paths):
  * baseline: ROADMAP.md + HANDOFF.md in the diff.
  * conditional: generator/** | hooks/** → docs/ARCHITECTURE.md;
    skills/** → .claude/rules/skills-map.md;
    .claude/patterns/** → docs/ARCHITECTURE.md.
- Non-blocking advisory scaffold (logged, does not deny). Can be activated
  without a shape change once its dedicated branches provide the substrate:
  * skills_required → skills not yet landed (Fase E*).
  * ci_dry_run_required → ci_dry_run deferred to dedicated rama.
  * invariants_check → invariants directory empty — deferred.
- Pass-through + advisory log (not silent) on: main / master / detached
  HEAD; git unavailable / not a repo; merge-base HEAD main does not resolve.
- Empty diff → deny exit 2 with reason empty PR (does NOT mention docs-sync).
- Double log:  +
  (event ) on real decisions (allow/deny). Advisory skips go
  only to the hook log, NOT to phase-gates. Silent pass-through (no match)
  writes no log (same pattern as D1/D3).
- Canonical D1 blocker safe-fail: empty stdin / malformed JSON / non-dict
  top level / non-dict tool_input → deny exit 2. Non-string or empty command /
  shlex-unparsable → pass-through.
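
The matcher and the double log from the scope above could look roughly like this. It is a sketch under assumptions: the log file names and entry keys come from the other commits in this PR, while `log_decision`, `_append_jsonl`, the timestamp format, and the `"pre-pr-gate"` hook label are hypothetical.

```python
import json
import shlex
import time
from pathlib import Path


def is_gh_pr_create(command: str) -> bool:
    """True only when the first three shell tokens are exactly gh, pr, create."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # shlex-unparsable command: not gated, pass-through
    return tokens[:3] == ["gh", "pr", "create"]


def _append_jsonl(path: Path, entry: dict) -> None:
    """Append one JSON object per line, creating the log directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")


def log_decision(log_dir: Path, command: str, decision: str, reason: str) -> None:
    """Write a real decision (allow/deny) to both the hook log and the phase log."""
    ts = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())  # assumed format
    _append_jsonl(
        log_dir / "pre-pr-gate.jsonl",
        {"ts": ts, "hook": "pre-pr-gate", "command": command,
         "decision": decision, "reason": reason},
    )
    _append_jsonl(
        log_dir / "phase-gates.jsonl",
        {"ts": ts, "event": "pre_pr", "decision": decision},
    )
```

An advisory skip would call only the first `_append_jsonl`, and a non-matching command would log nothing at all.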

### Files to create on the branch
- hooks/pre-pr-gate.py (Phase 2, GREEN; not in this commit).
- hooks/tests/test_pre_pr_gate.py (this commit, RED).
- hooks/tests/fixtures/payloads/gh_pr_create.json (this commit).
- hooks/tests/fixtures/payloads/gh_pr_create_draft.json (this commit).
- hooks/tests/fixtures/payloads/gh_pr_list.json (this commit).
  git_status.json and non_bash.json are reused from D1/D2.

### Files explicitly NOT touched (documented deferrals)
- hooks/pre-write-guard.py: no migration of hardcoded paths to policy.yaml.
- policy.yaml: no new key; remains declarative and unparsed.
- hooks/_lib/: no policy.py; zero new helpers.
- requirements-dev.txt: no pyyaml (explicit D4 scope blocker).

### Risks
- Tests with real-git subprocess setup (git init + config + commits) are
  heavier than D3's (D3 did not need git). Mitigation: a fixture
  encapsulates the setup; helpers  /  keep the tests
  readable.
- Detached HEAD returns HEAD from . It is treated
  as an explicit skip (not gated, no implicit deny).
-  requires a local main to be present. If main was
  deleted → advisory skip (tested explicitly).
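
The skip conditions in these risks can be sketched as follows, assuming the branch name is read with `git rev-parse --abbrev-ref HEAD` and the baseline with `git merge-base HEAD main`, as the follow-up commit states. `run_git`, `skip_reason`, the 10-second timeout, and the reason strings are hypothetical.

```python
import subprocess
from typing import List, Optional

# "HEAD" is what `git rev-parse --abbrev-ref HEAD` prints on a detached HEAD.
SKIP_BRANCHES = {"main", "master", "HEAD"}


def run_git(args: List[str], cwd: str = ".") -> Optional[str]:
    """Return stripped stdout, or None if git is missing, times out, or exits non-zero."""
    try:
        proc = subprocess.run(
            ["git", *args], cwd=cwd, capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return proc.stdout.strip() if proc.returncode == 0 else None


def skip_reason(cwd: str = ".") -> Optional[str]:
    """Return an advisory-skip reason, or None when the command should be gated."""
    branch = run_git(["rev-parse", "--abbrev-ref", "HEAD"], cwd)
    if branch is None:
        return "git unavailable"
    if branch in SKIP_BRANCHES:
        return f"branch {branch} is not gated"
    if run_git(["merge-base", "HEAD", "main"], cwd) is None:
        return "merge-base HEAD main unresolved"
    return None
```

Returning a reason instead of raising keeps every git failure on the advisory-skip path rather than turning it into an implicit deny.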

### Test plan (Phase 1, this commit)
- TestMatcherDetection (11): gh pr create + variants (title/draft/body/
  base) → gate; gh pr list/view, gh issue create, git status, git push,
  non-Bash → pass-through.
- TestBranchSkip (3): main, master, detached → advisory skip + no phase log.
- TestGitUnavailable (1): cwd without a git repo → advisory skip.
- TestMergeBaseUnresolved (1): main deleted → advisory skip with reason
  merge-base.
- TestEmptyDiff (2): empty PR → deny + message without docs-sync; reason
  includes base.
- TestDocsSyncBaseline (4): ROADMAP / HANDOFF / both missing → deny;
  both present with no conditional → allow.
- TestDocsSyncConditional (9): generator/hooks/skills/patterns triggers;
  multi-conditional dedup of ARCHITECTURE.md; tests/** outside conditional.
- TestDecisionReason (3): reason mentions CLAUDE.md + docs-sync;
  triggering paths listed; cap at 3 with a "more" indicator.
- TestAdvisoryLogs (4): deny / allow / empty-diff → 3 deferred entries;
  skip → 0 deferred entries.
- TestLogging (8): double log only on real decisions; skip only in the hook
  log; no-match / non-Bash / gh pr list → zero log; entry shape.
- TestRobustness (11): canonical D1 blocker safe-fail.
- TestIsGhPrCreateUnit (14): matcher classifier in-process.
- TestCheckDocsSyncUnit (13): docs-sync classifier in-process.
- TestMainInProcess (12): coverage paths the subprocess does not measure.

Target: 96 tests. Coverage ≥90% on pre-pr-gate.py, ≥90% combined for hooks/.
D1 (60) + D2 (66) + D3 (83) = 209 tests intact. Expected total: 305.

RED state now: 47 failed + 10 passed + 39 skipped. The 39 skipped are
@needs_hook (in-process) tests that activate once the module exists. The
10 passed are false positives:  returns exit 2, which
coincides with the deny exit 2 expectation of the gated tests; once the
implementation lands, those 10 must keep passing with the correct deny
driven by real logic, and the 47 failed must turn into passes.

### Docs plan (Phase N+3)
- ROADMAP.md: D4 row ✅.
- HANDOFF.md: §1 Fase actual updated (it carries stale text from
  after the D3 merge, PR #13); §9 Próxima rama → D5; §10 renamed
  Estado D4 with a summary.
- MASTER_PLAN.md § Rama D4: Status ✅ + adjustments vs the original plan
  (v2 scope cut: hardcoded rules, only gh pr create, advisory
  skills/CI/invariants, D3 migration deferred to the policy-loader branch).
- docs/ARCHITECTURE.md §7: pre-pr-gate as the 4th canonical hook in Capa 1.
- .claude/rules/hooks.md: section "Cuarto hook entregado — pre-pr-gate (D4)".
- policy.yaml: not touched in D4 (declarative contract with no real
  parsing enforcer; handled in a dedicated branch).
- pre-write-guard.py: not touched in D4.

Context traceability: session started  from main
after the D3 merge, PR #13 (and PR #14, Knowledge Plane docs). Neither /clear
nor /compact was used in this session, so there is no resume prompt to reference.

Marker: .claude/branch-approvals/feat_d4-hook-pre-pr-gate.approved
(gitignored by design, same as D1/D2/D3).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
EOF
)

* docs(d4): restore kickoff commit message context (e73416b)

The kickoff commit e73416b ended up with a damaged message: the inline
backticks inside the HEREDOC (cat <<'EOF' ... EOF) were interpreted by
the outer $(...) as command substitution and replaced with empty
strings. The committed code (4 files, 903 insertions) is correct;
only the inline-code spans of the message were lost.

Restored here, without rewriting history, are the textual references
left blank in e73416b (backticks are avoided throughout this commit):

Decisions closed
----------------
- The hook's only trigger: "gh pr create" (not gh issue, not git push).
- Explicit branch skip: main / master / detached HEAD.
- Hook hooks/pre-write-guard.py (D3) is not touched in D4.

Scope
-----
- New blocker hook: hooks/pre-pr-gate.py (D1 shape, not D2).
  PreToolUse(Bash) matcher on command == "gh pr create" + flags.
- Textual mirror (hardcoded in the hook) of:
    policy.yaml -> lifecycle.pre_pr.docs_sync_baseline
    policy.yaml -> lifecycle.pre_pr.docs_sync_conditional
  Migration to a declarative parser is deferred to a dedicated policy-loader branch.
- Double log:
    .claude/logs/pre-pr-gate.jsonl   (the hook's own shape)
    .claude/logs/phase-gates.jsonl   (event "pre_pr")

Risks
-----
- Tests with real-git subprocess setup. Helpers _git and
  _gh_pr_create_payload encapsulate init + commits.
- Detached HEAD returns the literal HEAD from
    git rev-parse --abbrev-ref HEAD
  -> treated as an advisory skip, not as gated.
- Baseline resolution requires a local main to be present:
    git merge-base HEAD main
  -> if main was deleted, advisory skip with reason "merge-base".

Context traceability
--------------------
- Session started without /clear from main, after the merges of PR #13 (D3)
  and PR #14 (Knowledge Plane docs).
- Marker:
    .claude/branch-approvals/feat_d4-hook-pre-pr-gate.approved
  (gitignored by design, same as D1/D2/D3).

Follow-up empty commit (--allow-empty); zero code changes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(d4): impl hooks/pre-pr-gate.py — docs-sync enforcer on gh pr create

* docs(d4): docs-sync dentro de rama — ROADMAP + HANDOFF + MASTER_PLAN + ARCHITECTURE + rules/hooks

* fix(d4): review PR#15: distinguish empty-diff from diff-unavailable + rename docs key

Addresses the 5 inline comments from the review on PR#15 plus the 3 explicit items from
the user. Triage: 5 FIX (1 BLOCKER + 4 trivial/docs), 0 SKIP, 0 DISCUSS.

BLOCKER (code): hooks/pre-pr-gate.py

- diff_files returned [] both when the diff was empty and when the git subprocess failed
  (timeout, FileNotFoundError, returncode != 0). In main that was treated as an empty PR
  and emitted deny: a false deny on transient git failures.
- Change: diff_files now returns list[str] | None. None = git unavailable
  (skip advisory with status: skipped, reason: git diff unavailable). [] = genuinely
  empty diff (deny with the dedicated empty PR reason). No external call sites for diff_files.
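
A minimal sketch of the `list[str] | None` contract described above, assuming an
injectable git runner so the three outcomes can be exercised without a repository.
The helper name `run_git`, the `git` parameter and the 5-second timeout are
illustrative assumptions, not the hook's actual code.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

GitRunner = Callable[[Sequence[str]], Optional[str]]


def run_git(args: Sequence[str], timeout: float = 5.0) -> Optional[str]:
    """Run `git <args>`; return stdout, or None when git could not answer."""
    try:
        proc = subprocess.run(
            ["git", *args], capture_output=True, text=True, timeout=timeout
        )
    except (FileNotFoundError, subprocess.SubprocessError):
        # git binary missing, timeout, or another subprocess failure
        return None
    return proc.stdout if proc.returncode == 0 else None


def diff_files(base_branch: str = "main", git: GitRunner = run_git) -> Optional[List[str]]:
    """None = diff unavailable; [] = real empty diff; otherwise changed paths."""
    base = git(["merge-base", "HEAD", base_branch])
    if base is None or not base.strip():
        return None
    out = git(["diff", "--name-only", base.strip(), "HEAD"])
    if out is None:
        return None
    return [line for line in out.splitlines() if line.strip()]
```

A caller can then branch on `is None` (skip advisory) before checking for an empty
list (deny with the dedicated reason), so a transient git failure is never read as an
empty PR.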

Docs/naming:

- policy.yaml.lifecycle.pre_pr exposes the key as docs_sync_required, not
  docs_sync_baseline. Three references aligned:
  - MASTER_PLAN.md section Rama D4, hardcoded-rules bullet.
  - .claude/rules/hooks.md section Cuarto hook, hardcoded-rules bullet.
  - docs/ARCHITECTURE.md section 7, docs-sync bullet + description of the actual
    git command (merge-base HEAD main + diff --name-only base HEAD, not diff main..HEAD).

Deliberate hooks/tests/ divergence (docs-only, no logic change):

- The hook's CONDITIONAL_RULES excludes hooks/tests/, while policy.yaml lists hooks/**
  uniformly. The hook has the correct logic (editing tests does not alter architecture);
  the policy stays coarser. Recorded as an explicit D4 decision in:
  - hooks/pre-pr-gate.py (comment above CONDITIONAL_RULES).
  - MASTER_PLAN.md section Rama D4 (new divergence bullet).
  - .claude/rules/hooks.md section Cuarto hook (new bullet).
  - docs/ARCHITECTURE.md section 7 (new bullet).
  Hook ↔ policy convergence is deferred to the policy-loader branch, where the loader
  will decide whether it represents granular exclusions or the policy becomes specific.

Tests: 322 passed (317 pre-fix + 5 new in TestDiffUnavailable). Coverage on
hooks/pre-pr-gate.py rises to 94% (+1% vs the D4 baseline). TestDiffUnavailable includes
unit tests of diff_files with monkeypatch + in-process main tests verifying that
the skip emits neither a phase-gate entry nor a false-deny empty PR.

* docs(d4): align wording — docs_sync_required + git merge-base phrasing + 322 tests

Post-review sweep requested by reviewer:

- Strip ambiguous `docs_sync_*` wildcard; use exact
  `policy.yaml.lifecycle.pre_pr.docs_sync_required` +
  `docs_sync_conditional` everywhere.
- Replace `git diff main..HEAD` references (wrong for D4 impl)
  with `git merge-base HEAD main` + `git diff --name-only <base> HEAD`.
  Session-start (D2) references to `main..HEAD` preserved (hook
  literally uses `{base}..HEAD` there).
- Make `hooks/tests/` deliberate-divergence note explicit in
  MASTER_PLAN, rules/hooks.md, docs/ARCHITECTURE.md, ROADMAP.
- Bump test counts to 322 / 101 and coverage ≥94% after
  adding TestDiffUnavailable.

No logic change; docstring + docs only.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Javier <javier.abril@glassnode.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
javiAI added a commit that referenced this pull request Apr 22, 2026
…ection) (#16)

* test(d5): kickoff — failing suite for post-action hook (PostToolUse compound trigger)

## Kickoff — D5 (feat/d5-hook-post-action-compound)

Context: post-merge continuation of D4 (992137f on main). Fase -1 executed and
approved in this session (v1 delivered with B=gh-pr-merge included; v2 trimmed
to drop B after confirming that the official Claude Code docs do not guarantee
tool_response.exit_code in PostToolUse(Bash)). Closed decisions:

- Final matchers: A (git merge) + C (git pull without --rebase). B excluded.
- Hardcode, 3rd application (policy-loader stays deferred until after D5/D6).
- Literal mirror of policy.yaml.lifecycle.post_merge.skills_conditional[0].trigger
  (TRIGGER_GLOBS + SKIP_IF_ONLY_GLOBS + MIN_FILES_CHANGED=2).
- PostToolUse non-blocking. Always exit 0. Never permissionDecision.
- No real skill dispatch (future E3a); only additionalContext + advisory log.
- Coverage ≥90% lines / ≥85% branches on post-action.py.
- Web UI merges are out of scope (not observable via Bash); the user's pull
  captures them when the code lands locally.

### Scope

New PostToolUse(Bash) hook: hooks/post-action.py. Its shape is related to the D1
blocker (shlex, double log, importlib in-process) but it is NOT a blocker: it never
denies.

Hierarchical detection strategy:

- Tier 1 (command match, shlex-parsed):
  * A = tokens[:2] == ["git","merge"] and tokens[2:3] ∉ {--abort, --quit,
    --continue, --skip}.
  * C = tokens[:2] == ["git","pull"] and "--rebase"/"-r" absent.
  * Everything else → silent pass-through (zero log, zero stdout).

- Tier 2 (deterministic post-hoc reflog):
  * git reflog HEAD -1 --format=%gs.
  * A requires the prefix "merge ".
  * C requires the prefix "pull:" or "pull " (without "--rebase").
  * Failure → status "tier2_unconfirmed" (advisory log; phase-gates untouched).

- touched_paths derivation:
  * git diff --name-only HEAD@{1} HEAD → list[str] | None.
  * None → status "diff_unavailable" (advisory log; phase-gates untouched).
  * [] → status "confirmed_no_triggers" (both logs; no additionalContext).

- Hardcoded mirror (policy.yaml L105-120):
  * TRIGGER_GLOBS: generator/lib/** | generator/renderers/** | hooks/** |
    skills/** | templates/**/*.hbs.
  * SKIP_IF_ONLY_GLOBS: docs/** | *.md | .claude/patterns/**.
  * MIN_FILES_CHANGED: 2.
  * A match is emitted only if: len ≥ 2 AND NOT all-skip_if_only AND at least 1
    path matches a TRIGGER_GLOBS entry.

- additionalContext emission (4 simultaneous conditions):
  1. Tier 1 match.
  2. Tier 2 confirmed.
  3. touched_paths is not None and len ≥ MIN_FILES_CHANGED (=2).
  4. match_triggers() returns a non-empty list.
  Message: matched triggers + touched paths (capped at 3 with "... (+N more)") +
  the literal suggestion "/pos:compound". NEVER attempts dispatch.

- Double log (mirrors D1/D3/D4):
  * .claude/logs/post-action.jsonl (own shape per status).
  * .claude/logs/phase-gates.jsonl event "post_merge" ONLY on
    confirmed_triggers_matched + confirmed_no_triggers (Tier 2 ok).
    tier2_unconfirmed and diff_unavailable go only to the hook log.

- PostToolUse safe-fail (not the canonical D1 blocker):
  empty stdin / invalid JSON / top-level non-dict / tool_input non-dict /
  command non-string / shlex error → exit 0, no log, no stdout.

- hooks/_lib/ reuse: append_jsonl + now_iso. No new helpers (regla #7:
  add one only if ≥2 hooks consume the new helper).
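
The Tier 1 rules above can be sketched as a small classifier. This is an
illustration under stated assumptions: the return labels "merge" and "pull" stand in
for matchers A and C, and the body is a guess at what a function such as
classify_command might look like, not the hook's actual implementation.

```python
import shlex
from typing import Optional

# Flags that manage an in-progress merge instead of starting a new one.
MERGE_CONTROL_FLAGS = {"--abort", "--quit", "--continue", "--skip"}


def classify_command(command: str) -> Optional[str]:
    """Return "merge" (matcher A), "pull" (matcher C), or None for pass-through."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # shlex could not parse the command (for example, unbalanced quotes)
        return None
    if tokens[:2] == ["git", "merge"]:
        if tokens[2:3] and tokens[2] in MERGE_CONTROL_FLAGS:
            return None
        return "merge"
    if tokens[:2] == ["git", "pull"]:
        if any(tok in ("--rebase", "-r") for tok in tokens[2:]):
            return None
        return "pull"
    return None
```

Anything that is not classified returns None, which corresponds to the silent
pass-through case: no log entry and no stdout.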

### Files

This commit (kickoff RED):
- hooks/tests/test_post_action.py: 111 tests (38 failed + 73 skipped in
  RED; the 38 subprocess tests fail because the hook does not exist, and the 73
  in-process @needs_hook tests skip until the module can be imported).
- hooks/tests/fixtures/payloads/git_merge.json
- hooks/tests/fixtures/payloads/git_merge_no_ff.json
- hooks/tests/fixtures/payloads/git_merge_abort.json
- hooks/tests/fixtures/payloads/git_pull.json
- hooks/tests/fixtures/payloads/git_pull_rebase.json
- hooks/tests/fixtures/payloads/gh_pr_merge.json
- hooks/tests/fixtures/payloads/git_rebase.json

Fase 2 (GREEN implementation):
- hooks/post-action.py: the hook (classify_command, reflog_message,
  reflog_confirms, touched_paths, match_triggers, emit helpers, main).

Fase N+3 (docs-sync):
- ROADMAP.md: D5 row ✅.
- HANDOFF.md: §1 current phase → D5 closed; §9 next branch → D6;
  §10 renamed Estado D5.
- MASTER_PLAN.md § Rama D5: status ✅ + adjustments vs the original plan (B out,
  matchers A+C confirmed, emission tiered).
- docs/ARCHITECTURE.md §7: post-action.py as the 5th canonical hook in Layer 1
  (first PostToolUse; variant of the blocker shape: not a blocker, exit 0).
- .claude/rules/hooks.md: section "Quinto hook entregado" for post-action.
- policy.yaml: not touched (hardcoded mirror; the section already exists).

### Tests (matrix pinned by the suite)

- TestMatcherDetection (21 cases): Tier 1 for A/C + exclusions (abort,
  quit, continue, skip, rebase, rebase shorthand, gh pr merge, cherry-pick,
  rebase, status, push, empty strings, shlex unparsable).
- TestTier2Reflog (15 cases): reflog_message on real repos after_merge
  / after_ff_merge / after_pull / clean_repo / non-repo; reflog_confirms
  truth table by kind × message.
- TestTouchedPaths (5 cases): git diff HEAD@{1} HEAD on each repo type +
  the no-previous-reflog edge case.
- TestPolicyConstants (3 cases): literal verification of the mirror.
- TestMatchTriggers (15 cases): min_files, skip_if_only semantics (all vs
  any), policy-driven order, dedupe, templates with/without subdir.
- TestIntegrationMergeTriggersMatch (4 cases): end-to-end real merge.
- TestIntegrationPullTriggersMatch (3 cases): end-to-end real pull (topology
  upstream/src/local).
- TestIntegrationMergeFF (1 case): ff-merge also emits.
- TestIntegrationTier2Unconfirmed (2 cases): command vs reflog mismatch.
- TestIntegrationConfirmedNoTriggers (2 cases): docs-only merge +
  single-file merge (min_files).
- TestIntegrationDiffUnavailable (1 case, delegates to TestMainInProcess).
- TestNonMatcherPassthrough (6 cases): gh pr merge, git rebase, pull
  --rebase, merge --abort, git status, non-Bash tool.
- TestSafeFail (10 cases): empty / malformed JSON / top-level list or string
  / missing tool_name / non-Bash / tool_input non-dict / command non-string or
  empty / shlex error.
- TestAdditionalContextShape (5 cases): content emitted on stdout.
- TestMainInProcess (13 cases): fine-grained coverage of main() via monkeypatch
  (includes forced diff_unavailable).
- TestLogShape (3 cases): jsonl shape per status.
- TestIdempotence (2 cases): 2 runs → 2 entries, both emit context.

Total: 111 tests. Initial RED state: 38 failed (subprocess) + 73 skipped
(in-process @needs_hook, un-skipped once post-action.py can be
imported via importlib).

### Docs plan (Fase N+3)

- ROADMAP.md: D5 row ✅ (Fase D closed after merge: 5/5 hooks).
- HANDOFF.md: §1 current phase, §9 next branch → D6, §10 Estado D5.
- MASTER_PLAN.md § Rama D5: Status ✅ + adjustments (B out, hierarchical
  matchers, emission tiered).
- docs/ARCHITECTURE.md §7: 5th Layer 1 block (post-action). First non-blocking
  PostToolUse variant documented.
- .claude/rules/hooks.md: "Quinto hook entregado" for post-action (D5).
- policy.yaml: untouched (section L105-120 already exists, mirrored).

Context traceability: session started from main after the merge of D4 PR #15.
Neither /clear nor /compact was used, so there is no resume prompt to reference.

Marker: .claude/branch-approvals/feat_d5-hook-post-action-compound.approved
(gitignored by design, same as D1/D2/D3/D4).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(d5): impl hooks/post-action.py — PostToolUse compound trigger (GREEN)

Fifth hook of the pos plugin. First application of the non-blocking PostToolUse
pattern: always exit 0, never emits permissionDecision.

Hierarchical detection:
- Tier 1: shlex-parse of the Bash command. Matcher A = `git merge <ref>`
  (excludes --abort/--quit/--continue/--skip). Matcher C = `git pull`
  (excludes --rebase/-r).
- Tier 2: post-hoc confirmation via `git reflog HEAD -1 --format=%gs`.
  A expects the prefix "merge "; C expects "pull:" or "pull " (and not
  "pull --rebase"). This avoids firing on `git merge --abort` or when the
  pull was actually a rebase even though the shell command did not say so.
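
The Tier 2 truth table can be sketched as a pure function over the reflog subject
line. The kind labels "merge" and "pull" are illustrative stand-ins for matchers A
and C, and `message` is assumed to be whatever `git reflog HEAD -1 --format=%gs`
printed, or None when git could not be queried.

```python
from typing import Optional


def reflog_confirms(kind: str, message: Optional[str]) -> bool:
    """Check that the latest reflog subject agrees with the Tier 1 classification."""
    if not message:
        return False  # no reflog entry available: cannot confirm
    if kind == "merge":
        return message.startswith("merge ")
    if kind == "pull":
        if message.startswith("pull --rebase"):
            return False  # the pull was actually a rebase
        return message.startswith("pull:") or message.startswith("pull ")
    return False
```

A False result would map to the tier2_unconfirmed status rather than to an error,
since the hook never blocks.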

When both tiers confirm, it derives the touched paths via
`git diff --name-only HEAD@{1} HEAD` and runs fnmatch against a literal mirror
of policy.yaml.lifecycle.post_merge.skills_conditional[0].trigger:
TRIGGER_GLOBS (generator/lib, generator/renderers, hooks, skills,
templates/**/*.hbs), SKIP_IF_ONLY_GLOBS (docs/**, *.md,
.claude/patterns/**), MIN_FILES_CHANGED=2. On a match, it emits
additionalContext suggesting `/pos:compound`. It never dispatches the skill.
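
A sketch of the trigger evaluation, assuming the three rules listed in the kickoff
(minimum file count, skip when every path is skip-only, at least one trigger glob
hit). The constants copy the values quoted above; the function body is an
illustration, and it uses `fnmatch.fnmatchcase`, which the later review fix adopts
for case-sensitive matching on every OS.

```python
from fnmatch import fnmatchcase
from typing import List, Sequence

TRIGGER_GLOBS = (
    "generator/lib/**",
    "generator/renderers/**",
    "hooks/**",
    "skills/**",
    "templates/**/*.hbs",
)
SKIP_IF_ONLY_GLOBS = ("docs/**", "*.md", ".claude/patterns/**")
MIN_FILES_CHANGED = 2


def _matches_any(path: str, globs: Sequence[str]) -> bool:
    return any(fnmatchcase(path, glob) for glob in globs)


def match_triggers(paths: Sequence[str]) -> List[str]:
    """Return the trigger globs hit by `paths`, in TRIGGER_GLOBS order."""
    if len(paths) < MIN_FILES_CHANGED:
        return []
    if all(_matches_any(p, SKIP_IF_ONLY_GLOBS) for p in paths):
        return []  # only docs-like files changed
    return [g for g in TRIGGER_GLOBS if any(fnmatchcase(p, g) for p in paths)]
```

Note that in fnmatch patterns `*` also matches `/`, so `hooks/**` covers nested
paths, while `templates/**/*.hbs` still needs a literal `/` after the wildcard and
therefore does not match a file directly under `templates/`.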

Canonical double log (D1..D4 shape): post-action.jsonl + phase-gates.jsonl
(event `post_merge`). Four distinct statuses: tier2_unconfirmed and
diff_unavailable write only to the hook log; confirmed_no_triggers and
confirmed_triggers_matched write to both.
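
The four statuses and their log routing can be summarized in a short sketch. The
function names `resolve_status` and `log_targets` are hypothetical helpers introduced
for illustration; only the status strings and log paths come from the commit text.

```python
from typing import List, Optional

HOOK_LOG = ".claude/logs/post-action.jsonl"
PHASE_LOG = ".claude/logs/phase-gates.jsonl"

# Only Tier-2-confirmed outcomes reach the phase-gates log.
PHASE_GATE_STATUSES = frozenset({"confirmed_triggers_matched", "confirmed_no_triggers"})


def resolve_status(tier2_ok: bool, touched: Optional[List[str]], triggers: List[str]) -> str:
    """Map the detection results to one of the four documented statuses."""
    if not tier2_ok:
        return "tier2_unconfirmed"
    if touched is None:
        return "diff_unavailable"
    if not triggers:
        return "confirmed_no_triggers"
    return "confirmed_triggers_matched"


def log_targets(status: str) -> List[str]:
    """Every status goes to the hook log; confirmed ones also go to phase-gates."""
    targets = [HOOK_LOG]
    if status in PHASE_GATE_STATUSES:
        targets.append(PHASE_LOG)
    return targets
```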

Reuses `_lib/jsonl.append_jsonl` and `_lib/time.now_iso`. Hardcoded mirror
of policy.yaml (CLAUDE.md regla #7: two repetitions, D4+D5, meet the
precondition for a policy-loader in a dedicated branch).

97% line coverage on hooks/post-action.py (target ≥90%). Global
hooks/** suite: 432 passed (D1+D2+D3+D4+D5, 110 new). Closes the
D5 kickoff.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(d5): sync ROADMAP/HANDOFF/MASTER_PLAN/ARCHITECTURE/rules/hooks.md

Docs-sync within the D5 branch (Fase N+3, CLAUDE.md regla #2).

- ROADMAP: D5 marked in the table + feat/d5-hook-post-action-compound entry
  with full deliverables, the contract of 4 distinct statuses, and
  adjustments vs the original plan.
- HANDOFF: section 1 snapshot (D5 closed, D6 next); section 7
  gotchas adds the post-action bullet (non-blocking PostToolUse,
  tiers, 4 statuses, advisory-only); section 9 next branch D6 with
  updated minimum reading; section 10 renamed to "Estado D5
  (cerrada en rama)" with an actionable summary.
- MASTER_PLAN section Rama D5 expanded: closed status, context to
  read, key decisions (hierarchical detection, gh pr merge
  discarded, advisory-only, second policy.yaml repetition),
  contract per status, adjustments, exit criterion met.
- docs/ARCHITECTURE section 7: Layer 1 goes from "two canonical
  variants" to "three variants" (adds non-blocking PostToolUse).
  New block "Implementacion canonica PostToolUse non-blocking".
- .claude/rules/hooks.md: section "Quinto hook entregado" with the
  pattern's shape, full contract, differences vs blocker/informative,
  and a note on the pre-PR simplify pass.

Tests untouched (docs-only): 432 passed + 1 skipped in hooks/**.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(d5): address PR #16 review — 2 Copilot issues fixed

1. _log_hook / _log_phase now go through _safe_append (try/except
   OSError). Without the wrapper, a full disk or read-only filesystem raised
   OSError and broke the "always exit 0" contract of the non-blocking
   PostToolUse pattern. Direct mirror of hooks/session-start.py::_safe_append
   (D2). Consistent with the 2nd canonical pattern.

2. match_triggers moves from fnmatch.fnmatch to fnmatch.fnmatchcase.
   fnmatch.fnmatch applies os.path.normcase, which is case-insensitive on
   Windows, introducing cross-OS non-determinism in the evaluation of
   TRIGGER_GLOBS / SKIP_IF_ONLY_GLOBS. fnmatchcase removes that
   dependency.

Tests untouched: 110 passed + 1 skipped. No contract changes in the
suite (_safe_append is private; fnmatchcase is a drop-in for the lowercase
POSIX paths the tests already use).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Javier <javier.abril@glassnode.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
javiAI added a commit that referenced this pull request Apr 22, 2026
…#17)

* chore(d5b): kickoff refactor/d5-policy-loader

## Kickoff

**Branch**: refactor/d5-policy-loader
**MASTER_PLAN phase**: D5b, policy-loader (inserted between D5 and D6)
**Type**: refactor (no observable behavior change, except the `hooks/tests/` convergence)

### Scope (Alt γ approved in Fase -1)

Unify the reading of `policy.yaml` across the 3 hooks D3/D4/D5 on a central loader
in `hooks/_lib/policy.py`. D4 + D5 supplied the 2 repetitions that CLAUDE.md
regla #7 requires before abstracting; D6 will be built on the loader from the start.

### Files

New:
- hooks/_lib/policy.py
- hooks/tests/test_lib_policy.py
- hooks/tests/fixtures/policy/{minimal,full,malformed,missing-section}.yaml

Modified:
- hooks/pre-write-guard.py: consumes pre_write_rules()
- hooks/pre-pr-gate.py: consumes docs_sync_rules() + advisory_checks()
- hooks/post-action.py: consumes post_merge_trigger()
- policy.yaml: adds lifecycle.pre_write + optional `excludes` field in docs_sync_conditional
- requirements-dev.txt: exact pyyaml pin
- ROADMAP.md + HANDOFF.md + MASTER_PLAN.md + docs/ARCHITECTURE.md + .claude/rules/hooks.md

### Fase -1 decisions (frozen)

- Alt γ (migrate all 3 hooks, no scope cut).
- (b.1) strings/globs go to YAML; test-pair derivation stays in code.
- (c.2) failure mode: policy cannot be loaded → pass-through advisory + log, never deny.
- pyyaml pinned exactly (no range).
- MASTER_PLAN location: Rama D5b, sub-section of Fase D.
- templates/policy.yaml.hbs is NOT touched → temporary meta↔template drift documented
  explicitly in docs/plan/PR. This branch must NOT be read as "the template already
  reflects the new shape". Convergence is deferred to its own branch, triggered by ≥1
  generated project that requires the shape.

### Risks

- pyyaml is the first runtime dependency of hooks, so it is the first supply-chain change.
  Mitigated with an exact pin + tests in CI.
- policy.yaml is extended with new fields: consumers outside hooks (if there
  were any) must tolerate unknown fields. Today skills/audit-session/ and
  skills/audit-plugin/ do not exist. No real impact.
- The hooks/tests/ divergence converges via the `excludes` field; an explicit test
  ensures identical D4 behavior after migration.

### Test plan

- hooks/tests/test_lib_policy.py (~40-60 tests): happy path, missing sections,
  invalid YAML, missing file, shape validation, in-process cache, `excludes`.
- Regression: the 432 D1..D5 tests run unchanged, with zero changes to the observable
  contract. A cross-hook test verifies outputs against the repo's real `policy.yaml`.
- Failure mode (c.2): corrupt policy → pass-through + log `policy_unavailable`.
- Coverage: _lib/policy.py ≥90% lines / ≥85% branches; global hooks/** with no
  regression.

### Docs plan

- ROADMAP.md: D5b row + "Progreso Fase D" entry.
- HANDOFF.md §1, §7, §9, §10: remove the note saying policy.yaml is declared but not
  enforced; point to D6 as "built on the loader".
- MASTER_PLAN.md: new section for Rama D5b (policy-loader) with adjustments.
- docs/ARCHITECTURE.md §7: sub-section "Loader declarativo".
- .claude/rules/hooks.md: section "Policy loader" + adjustment in D3/D4/D5 (remove
  "hardcoded", point to the loader). pyyaml dependency note under "Runtime".
- Meta↔template drift documented explicitly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(d5b): declarative policy-loader + migrate D3/D4/D5 consumers

Closes the CLAUDE.md regla #7 precondition opened by D4 + D5 (two hardcoded
mirrors of policy.yaml inside hooks). Adds hooks/_lib/policy.py as the
single source of truth and migrates pre-write-guard / pre-pr-gate /
post-action to consume it in the same PR.

Shape (Fase -1 decisions):

- (b.1) strings/globs declarative in YAML, derivation in Python keyed by
  the pattern's `label`. derive_test_pair(rel_path, label) covers two
  labels: hooks_top_level_py and generator_ts (two YAML entries share the
  generator_ts label because fnmatch's middle `/` in `**` is literal, not
  recursive — one entry covers top-level, the other recursive subdirs).
- (c.2) policy.yaml missing/corrupt → loader returns None → consumer
  hooks degrade to pass-through advisory with a `status: policy_unavailable`
  log entry. Never deny blindly (avoids bricking the repo on a bad YAML
  edit).

policy.yaml changes:

- New lifecycle.pre_write.enforced_patterns (3 entries).
- lifecycle.pre_pr.docs_sync_conditional.hooks/** now carries
  excludes: ["hooks/tests/**"] — closes the deliberate D4 hook↔policy
  divergence.

Dependency: pyyaml==6.0.2 (exact pin). First non-stdlib line in
hooks/_lib/; justified in the kickoff commit.

Templates intentionally NOT touched in this branch — drift meta-repo ↔
template is documented in the docs-sync commit and in the PR body.

Tests: 462 passed + 1 skipped. New hooks/tests/test_lib_policy.py (57
cases); redundant TestIsEnforcedUnit / TestExpectedTestPairUnit /
TestPolicyConstants removed (coverage moved into the loader suite).
Coverage: _lib/policy.py 97%, pre-write-guard 93%, pre-pr-gate 93%,
post-action 94%.

Simplify pass: classify() in pre-write-guard now returns `label: str`
instead of `(label, match_glob)` — the second element was dead.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(d5b): sync ROADMAP/HANDOFF/MASTER_PLAN/ARCHITECTURE/hooks.md + drift note

Docs-sync for refactor/d5-policy-loader (CLAUDE.md regla #2). Captures:

- ROADMAP: new rama row refactor/d5-policy-loader (✅) + Progreso Fase D
  entry (loader shape, 3 hooks migrated, test counts, coverage).
- HANDOFF: snapshot now points at D5b in-flight; new gotchas for loader
  and drift meta↔template; §11 Estado D5b; §9 Próxima rama updated so D6
  starts consuming the loader (no new hardcode permitted).
- MASTER_PLAN: new § Rama D5b sub-section under Fase D with scope,
  decisions, contract, ajustes, drift note and exit criteria.
- docs/ARCHITECTURE §7: loader canonicalized as single source of truth
  for hooks consuming policy.yaml; failure mode (c.2) documented as
  third safe-fail variant; explicit drift note.
- .claude/rules/hooks.md: new § Policy loader with consumer contract,
  failure-mode table, shape, dependency note, fnmatch middle-slash note,
  loader test summary, drift note.

Drift meta-repo ↔ template explicitly documented in all five locations
(explicit user request): templates/policy.yaml.hbs, generator/renderers/
policy.ts and snapshots were NOT touched in this branch. Projects
generated with `pos` today still emit a policy.yaml with the pre-D5b
shape. Reconciliation (template + renderer + snapshots + pyyaml in
requirements-dev for generated Python stacks) deferred to a dedicated
rama post-D6. This rama must not be read as "the template already
reflects the new shape".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(d5b): address PR #17 review — align cache contract + log status + hygiene

Applied after Copilot review surfaced concrete mismatches between docs,
code and log shape. Per user direction on the cache contract: option 2
(correct docs to match code) — the in-process cache is small enough and
the hooks are ephemeral enough that implementing mtime/size keying would
be abstraction ahead of need.

Cache contract (6 Copilot comments + user's primary ask):

- PR body and 5 docs said "cache keyed by path + mtime + size" with
  implicit invalidation on edits. Reality: load_policy() keys the cache
  by absolute path only. Updated hooks/_lib/policy.py docstrings to
  sharpen the "no implicit invalidation on edits" note, and corrected
  ROADMAP.md, HANDOFF.md, MASTER_PLAN.md, docs/ARCHITECTURE.md and
  .claude/rules/hooks.md to match. PR body edited via `gh pr edit`.

Log status alignment (2 Copilot comments):

- pre-pr-gate.py:_log_skip() was hardcoding status: "skipped" for every
  skip reason, including the policy-unavailable case — which the loader
  contract (and pre-write-guard / post-action siblings) emit as
  status: "policy_unavailable". Added optional `status` kwarg to
  _log_skip and pass "policy_unavailable" at the one relevant call
  site. Other skip reasons keep the default.
- .claude/rules/hooks.md § Policy loader — consumer-contract example
  updated to reflect the new kwarg shape; aligns with the failure-mode
  table directly below.

_safe_str_list stricter shape (1 Copilot comment):

- Was silently dropping non-string entries (`["ROADMAP.md", 123]` →
  `["ROADMAP.md"]`), producing partial under-enforcement while still
  treating the policy as valid. Now returns None if any element is not
  a string — consistent with the "wrong-shape → None" contract the
  module docstring already claimed.

Test-fixture hygiene (3 Copilot comments):

- Three autouse `_reset_policy_cache` fixtures (test_pre_write_guard,
  test_pre_pr_gate, test_post_action) did unconditional
  sys.path.insert(0, ...) without guard or teardown. Switched to the
  guarded "insert only if missing + remove in teardown" pattern that
  test_lib_policy.py already uses.

One test adjusted:

- test_pre_pr_gate.py::TestGitUnavailable::test_not_a_git_repo_...
  was implicitly exercising both the no-policy.yaml path and the
  no-git path simultaneously. It now writes POLICY_YAML_FOR_TESTS so
  it actually tests what its name claims (git-unavailable path
  reaching the skip log).

Tests: 462 passed + 1 skipped (unchanged). Dogfooding: pre-pr-gate
with the updated status field passes this PR through its own gate.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(d5b): address PR #17 review round 2 — wrong-shape guards + (c.2) coverage

Second review round surfaced edge cases on the loader's failure-mode
contract. All FIX, no SKIPs. 12 new regression tests.

Loader wrong-shape guards (3 Copilot comments, high-value):

- Each of the three accessors (docs_sync_rules, post_merge_trigger,
  pre_write_rules) could raise AttributeError if `lifecycle` or the
  section itself was present but not a mapping (e.g. `lifecycle: not_a_dict`
  or `lifecycle.pre_pr: 42`). That broke the "never propagate exception"
  contract. Extracted `_lifecycle_section()` helper with isinstance
  checks; all three accessors now return None on wrong shape.

Optional list fields — missing vs wrong-type (1 Copilot comment, medium):

- `excludes` / `skip_if_only` / `exclude_globs` previously used
  `_safe_str_list(...) or []`, which silently coerced wrong-type values
  (e.g. `excludes: "hooks/tests/**"` as a string) to empty lists —
  potentially disabling a declared exclusion. Added `_optional_str_list`
  that distinguishes absent key (`→ []`) from present-but-wrong-shape
  (`→ None`, signalling the caller to skip the rule/pattern, or to
  return None for the whole accessor when the field is required-inside-
  trigger like `skip_if_only`).

post-action.py docstring drift (1 Copilot comment):

- Docstring still said "hardcoded mirror of policy.yaml" — outdated
  since D5b kickoff. Rewritten to reference the loader path and document
  the (c.2) pass-through behavior explicitly.

pre-write-guard (c.2) for unknown labels (1 suppressed low-confidence comment):

- `derive_test_pair` returning None (policy.yaml label typo or a new
  `enforced_patterns[*].label` added without a matching code branch in
  the derivation switch) previously fell through to a deny with an empty
  expected-path. That violated the (c.2) contract ("never deny blindly
  on policy issues"). Now treated the same as "policy unavailable":
  log `status: policy_unavailable` + pass-through exit 0. Preserves the
  "YAML typo cannot brick the repo" invariant.

Tests: 474 passed + 1 skipped (was 462 + 1, +12 new cases):

- TestWrongShapeGuards — 7 cases covering non-mapping lifecycle / non-
  mapping section across the three accessors.
- TestOptionalListShape — 4 cases covering wrong-type optional-list on
  each accessor + a `_safe_str_list` mixed-type propagation test that
  locks in the strict contract introduced in round 1.
- TestMainInProcess::test_unknown_label_passes_through_with_policy_unavailable
  — integration test for pre-write-guard's (c.2) handling of an unknown
  label injected via policy.yaml.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Javier <javier.abril@glassnode.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
javiAI pushed a commit that referenced this pull request Apr 22, 2026
…ontract

Three substantive fixes on top of Copilot review:

1. Scope skills.jsonl reads by session_id (review concern #1).
   _extract_invoked_skills(repo_root, session_id) now streams line-by-line
   and only counts entries whose session_id matches the Stop payload.
   Entries without session_id, with non-string session_id, or from prior
   sessions are silently ignored — the log is append-only and accumulates
   across sessions. Payload Stop without session_id -> safe-fail deny
   (enforcement cannot scope safely). Tests: new TestSessionScoping class
   (6 cases incl. 5-session mixed log) + safe-fail cases for
   missing/empty/non-string session_id.

2. Tri-state skills_allowed_list — stop collapsing absent vs invalid
   (review concern #2). Added SKILLS_ALLOWED_INVALID sentinel in
   _lib/policy.py. None = section absent (deferred, prod state),
   sentinel = present but wrong-shape (misconfigured, observable),
   () = explicit deny-all, tuple = live enforcement. Stop hook emits
   status: policy_misconfigured on sentinel with literal reason. A typo
   in policy.yaml no longer silently turns enforcement off. Tests: new
   TestMisconfiguredPolicy class + test_three_states_are_all_distinct
   + test_invalid_sentinel_distinct_from_none in loader suite.

3. Remove exact-string quotes of pre-compact output from docs (review
   concern #3). HANDOFF/MASTER_PLAN/ARCHITECTURE/hooks.md no longer pin
   the literal advisory wording — the suite validates shape + presence,
   not the string. Frees the hook to refine copy without doc drift.
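
A sketch of the session scoping from item 1, under stated assumptions: the function
below takes an iterable of JSONL lines instead of a repo root, and the `skill` field
name is hypothetical. Only the filtering rule (count an entry only when its
session_id equals the Stop payload's) comes from the commit text.

```python
import json
from typing import Iterable, Set


def skills_for_session(lines: Iterable[str], session_id: str) -> Set[str]:
    """Collect skill names from log lines that belong to `session_id`."""
    skills: Set[str] = set()
    for raw in lines:  # streamed one line at a time
        try:
            entry = json.loads(raw)
        except ValueError:
            continue  # malformed or blank line: ignore
        if not isinstance(entry, dict):
            continue
        # Missing or non-string session_id never equals a str session_id.
        if entry.get("session_id") != session_id:
            continue
        skill = entry.get("skill")  # hypothetical field name
        if isinstance(skill, str):
            skills.add(skill)
    return skills
```

Passing an open file object as `lines` keeps the read line-by-line, which matters
for an append-only log that accumulates across sessions.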

Suite: 575 passed + 1 skipped (+20 new tests). No regressions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
javiAI added a commit that referenced this pull request Apr 23, 2026
* chore(d6): kickoff — pre-compact.py + stop-policy-check.py

## Kickoff

### Scope

Sixth + seventh Python hooks. Closes Fase D before opening Fase E.

- hooks/pre-compact.py  — PreCompact  informative (shape D2)
- hooks/stop-policy-check.py — Stop    blocker-scaffold (shape D1, deferred)

D6 consumes hooks/_lib/policy.py from the first commit (loader live since
D5b). Any new policy hardcode = explicit regression.

### Files

New:
- hooks/pre-compact.py
- hooks/stop-policy-check.py
- hooks/tests/test_pre_compact.py
- hooks/tests/test_stop_policy_check.py

Modified:
- hooks/_lib/policy.py         (+ accessors pre_compact_rules, skills_allowed_list)
- hooks/tests/test_lib_policy.py (+ cases for the new accessors)
- ROADMAP.md  HANDOFF.md  MASTER_PLAN.md
- docs/ARCHITECTURE.md §7
- .claude/rules/hooks.md

### Fase -1 decisions (approved)

- (A2) pre-compact is INFORMATIVE, not a blocker.
  Reason: blocking an intentional /compact is destructive; the hook's value
  is emitting additionalContext with the policy's persist checklist so
  that the model persists state before the compact.

- (c.3) stop-policy-check is a BLOCKER-SCAFFOLD.
  Shape D1 (safe-fail deny + double log + permissionDecision available),
  but ZERO real enforcement today. policy.yaml.skills_allowed does not exist
  yet: skills_allowed_list() returns None and the hook degrades to
  log status=deferred. It can be activated without a refactor once E1a lands
  skills_allowed. Strict framing: "can block by contract, but
  today it is in deferred mode except for safe-fail + future-proof tests".

- Both in the same branch. Shared loader reuse + docs-sync.

- Canonical failure mode (c.2) reapplied: accessor None -> pass-through
  advisory + log status=policy_unavailable. Never deny blindly.

### Explicit framing (against over-representation)

In docs + PR body, stop-policy-check.py is NOT presented as enforcement
that is useful in production. It is described as:
- a hook with the blocker shape ready
- deferred mode while skills_allowed does not exist in policy.yaml
- safe-fail active (deny on malformed payload)
- tests cover the future enforcement so that E1a only has to
  declare skills_allowed in the policy

### Tests

Strict TDD. Order:
1. red commit: tests that fail because accessors + hooks are absent
2. accessors in _lib/policy.py (accessor tests green)
3. pre-compact.py (pre_compact tests green)
4. stop-policy-check.py (stop tests green)
5. docs-sync
6. simplify
7. review

Coverage target: >=80% lines / >=75% branches per hook,
>=90% on accessors. Global hooks/** suite >=500 green tests.

### Docs plan

Within the same PR (docs-sync docs_sync_conditional is active for hooks/**):
- ROADMAP.md: row D6 marked.
- HANDOFF.md: §9 next branch (E1a), §12 D6 status, §7 counter 5->7 hooks.
- MASTER_PLAN.md § Rama D6: close with adjustments vs the original plan.
- docs/ARCHITECTURE.md §7: hook counter + phase-gates events (pre_compact, stop).
- .claude/rules/hooks.md: sixth hook + seventh hook + extend Policy loader with
  pre_compact_rules + skills_allowed_list.

### Not included

- No real persistence of LLM state (pre-compact emits a prompt, it does not write).
- No active enforcement of skills_allowed (scaffold).
- templates/policy.yaml.hbs is not touched (meta-repo vs template drift
  documented since D5b; the reconciliation branch comes after D6).
- No skills, no runtime.

* test(d6): red tests - accessors + missing hooks

Fail by design (strict TDD, Phase 1 of the branch):
- _lib.policy.pre_compact_rules() does not exist
- _lib.policy.skills_allowed_list() does not exist
- hooks/pre-compact.py does not exist (collection error)
- hooks/stop-policy-check.py does not exist (collection error)

Pre-implementation contract lock-down:
- PreCompactRules frozen dataclass; persist: tuple[str, ...]
- skills_allowed_list: tuple[str, ...] | None (None=deferred, ()=deny-all)
- Pre-compact hook: informative shape (always exit 0, no permissionDecision)
- Stop hook: blocker-scaffold shape; c.3 deferred until skills_allowed is
  declared; safe-fail blocker (deny exit 2 on malformed payload)

23 failures in test_lib_policy; 2 collection errors in the hook tests.
No test is green before the implementation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(d6): impl pre-compact + stop-policy-check hooks (+ 2 accessors)

Sixth hook: hooks/pre-compact.py (shape D2 informative):
- Reads lifecycle.pre_compact.persist via pre_compact_rules() and emits
  hookSpecificOutput.additionalContext as a checklist for the model.
- Always exits 0; never emits permissionDecision. Never blocks /compact.
- Failure mode (c.2): policy None → minimal additionalContext +
  status: policy_unavailable in the hook log. Canonical pass-through advisory.
- Double log: pre-compact.jsonl (always) + phase-gates.jsonl (event:
  pre_compact on the happy path only; policy_unavailable stays in the hook log only).
- Informative safe-fail: malformed payload → additionalContext with
  "(error reading payload: ...)" + status: payload_error, exit 0.

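A minimal sketch of the informative shape described for pre-compact.py, assuming the persist checklist has already been loaded as a tuple of strings. The function name `build_pre_compact_output`, the checklist wording, and the `ok` status label are hypothetical; only the `hookSpecificOutput.additionalContext` key, the `policy_unavailable` status, and the absence of `permissionDecision` come from the commit message.

```python
from __future__ import annotations

import json


def build_pre_compact_output(persist: tuple[str, ...] | None) -> tuple[dict, str]:
    """Return (hook output, log status) for an informative pre-compact hook.

    Hypothetical helper: `persist` stands in for PreCompactRules.persist,
    or None when the policy could not be loaded.
    """
    if persist is None:
        # Failure mode (c.2): policy unavailable -> minimal advisory, never a block.
        context = "(policy unavailable: no persist checklist)"
        status = "policy_unavailable"
    else:
        # Happy path: render the checklist for the model to act on.
        items = "\n".join(f"- {item}" for item in persist)
        context = f"Before compacting, persist:\n{items}"
        status = "ok"  # assumed label for the happy path
    # No permissionDecision key is emitted, so this shape cannot block /compact.
    return {"hookSpecificOutput": {"additionalContext": context}}, status


if __name__ == "__main__":
    output, _status = build_pre_compact_output(("decisions", "open questions"))
    print(json.dumps(output))
    # A real hook would exit 0 here on every branch.
```
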
Seventh hook: hooks/stop-policy-check.py (shape D1 blocker-scaffold):
- Reads skills_allowed_list() + .claude/logs/skills.jsonl.
- c.3 scaffold: skills_allowed absent → status: deferred, pass-through.
  The meta-repo does not declare the field today, so enforcement is
  DEFERRED in prod; the whole chain exists for when E1a adds the field.
- Activatable: skills_allowed declared → _validate(invoked, allowed),
  deny exit 2 with the first violator in decisionReason; allow exit 0.
- Failure mode (c.2): policy None → status: policy_unavailable,
  pass-through. Canonical safe-fail blocker: malformed payload → deny exit 2.
- Double log only on real decisions (allow/deny). Deferred and
  policy_unavailable stay in the hook log only.
- _extract_invoked_skills and _validate are private helpers that remain
  unit-testable (assertions `sp._extract_invoked_skills(...)` and
  `sp._validate(...)` in the suite).
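
The allow/deny step can be sketched as follows. This is an illustration only: the real `_validate` is not shown in this PR body, so the name `validate`, its return shape, and the reason text are assumptions. What it takes from the commit message is that the first violator is reported and that deny maps to exit 2, allow to exit 0.

```python
from __future__ import annotations


def validate(invoked: list[str], allowed: tuple[str, ...]) -> tuple[str, str | None]:
    """Return ("allow", None) or ("deny", reason) naming the first violator."""
    allowed_set = set(allowed)
    for skill in invoked:
        if skill not in allowed_set:
            # Stop at the first skill that is not allowlisted.
            return "deny", f"skill not in skills_allowed: {skill}"
    return "allow", None


def exit_code(decision: str) -> int:
    """Map a decision to the process exit code (deny -> 2, allow -> 0)."""
    return 2 if decision == "deny" else 0
```

Note that an empty allowlist `()` denies any invoked skill, which matches the explicit deny-all reading of `()` in the loader contract.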

Loader: hooks/_lib/policy.py:
- pre_compact_rules(repo_root) → PreCompactRules | None (frozen
  dataclass with persist: tuple[str, ...]).
- skills_allowed_list(repo_root) → tuple[str, ...] | None (None=deferred,
  field absent; ()=explicit deny-all).

555 passed (+1 intentional D5 skip) in hooks/**. No regressions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(d6): sync ROADMAP + HANDOFF + MASTER_PLAN + ARCHITECTURE + hooks.md

Closes the D6 docs-sync (feat/d6-hook-pre-compact-stop). Two deliverables:
pre-compact.py (PreCompact informative, shape D2) + stop-policy-check.py
(Stop blocker scaffold, NO enforcement in production today: skills_allowed
is absent from the meta-repo policy.yaml → status: deferred, pass-through).
The None/() contract is documented as a semantic distinction of the scaffold.
Two new accessors in hooks/_lib/policy.py: pre_compact_rules +
skills_allowed_list (5 accessors in total after D5b+D6).

Anti-overstatement framing (MASTER_PLAN + ARCHITECTURE + hooks.md):
the Stop hook validates its own shape; there is no real enforcement until
E1a populates skills_allowed. Precondition ready: activation needs no code
change once the first /pos:* skill exists.

* refactor(d6): simplify pre-compact — inline _log_hook/_log_phase wrappers

The _log_hook / _log_phase wrappers in pre-compact.py were trivial (3 + 1
call sites). Inlined directly as _safe_append(cwd / HOOK_LOG, ...) /
_safe_append(cwd / PHASE_LOG, ...): -8 lines of wrappers, same contract.

Gains stylistic consistency with stop-policy-check.py (the other D6 hook),
which already used the inline shape. Does not affect tests or behavior.

555 passed + 1 skipped (no regressions).

* fix(d6): address PR #18 review — session scoping + tri-state policy contract

Three substantive fixes on top of Copilot review:

1. Scope skills.jsonl reads by session_id (review concern #1).
   _extract_invoked_skills(repo_root, session_id) now streams line-by-line
   and only counts entries whose session_id matches the Stop payload.
   Entries without session_id, with non-string session_id, or from prior
   sessions are silently ignored — the log is append-only and accumulates
   across sessions. A Stop payload without session_id -> safe-fail deny
   (enforcement cannot scope safely). Tests: new TestSessionScoping class
   (6 cases incl. 5-session mixed log) + safe-fail cases for
   missing/empty/non-string session_id.

2. Tri-state skills_allowed_list — stop collapsing absent vs invalid
   (review concern #2). Added SKILLS_ALLOWED_INVALID sentinel in
   _lib/policy.py. None = section absent (deferred, prod state),
   sentinel = present but wrong-shape (misconfigured, observable),
   () = explicit deny-all, tuple = live enforcement. Stop hook emits
   status: policy_misconfigured on sentinel with literal reason. A typo
   in policy.yaml no longer silently turns enforcement off. Tests: new
   TestMisconfiguredPolicy class + test_three_states_are_all_distinct
   + test_invalid_sentinel_distinct_from_none in loader suite.

3. Remove exact-string quotes of pre-compact output from docs (review
   concern #3). HANDOFF/MASTER_PLAN/ARCHITECTURE/hooks.md no longer pin
   the literal advisory wording — the suite validates shape + presence,
   not the string. Frees the hook to refine copy without doc drift.
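
The session scoping from fix 1 could look roughly like this. It is a sketch, not the hook's code: the function takes the log path directly instead of `repo_root`, the `skill` field name is a guess at the log schema, and skipping unparseable lines is an assumption.

```python
from __future__ import annotations

import json
from pathlib import Path


def extract_invoked_skills(log_path: Path, session_id: str) -> list[str]:
    """Stream a JSONL log and return skills logged under `session_id` only."""
    invoked: list[str] = []
    if not log_path.exists():
        return invoked
    with log_path.open(encoding="utf-8") as handle:
        for line in handle:  # line-by-line, the file is never loaded whole
            line = line.strip()
            if not line:
                continue
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue  # assumption: unparseable lines are skipped
            if not isinstance(entry, dict):
                continue
            # Entries with a missing, non-string, or foreign session_id never
            # compare equal to the string from the Stop payload.
            if entry.get("session_id") != session_id:
                continue
            skill = entry.get("skill")  # hypothetical field name
            if isinstance(skill, str):
                invoked.append(skill)
    return invoked
```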

Suite: 575 passed + 1 skipped (+20 new tests). No regressions.
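
To make the tri-state contract from fix 2 concrete, here is a small sketch that classifies an already-parsed policy dict. `SKILLS_ALLOWED_INVALID` is the sentinel name from the commit message; the helper names, the list-of-strings shape check, and the `enforcing` label are assumptions made for illustration.

```python
from __future__ import annotations

from typing import Any

# Sentinel for "section present but wrong shape"; compared by identity.
SKILLS_ALLOWED_INVALID = object()


def skills_allowed_from(policy: dict[str, Any]) -> Any:
    """Classify policy['skills_allowed'] as absent, invalid, or a tuple."""
    if "skills_allowed" not in policy:
        return None  # absent -> deferred
    raw = policy["skills_allowed"]
    if not isinstance(raw, list) or not all(isinstance(s, str) for s in raw):
        return SKILLS_ALLOWED_INVALID  # e.g. a typo turned the list into a string
    return tuple(raw)  # () means explicit deny-all


def stop_status(allowed: Any) -> str:
    """Map the accessor result to the status the Stop hook would log."""
    if allowed is None:
        return "deferred"
    if allowed is SKILLS_ALLOWED_INVALID:
        return "policy_misconfigured"
    return "enforcing"  # assumed label; covers () and non-empty tuples
```

Using a dedicated sentinel object rather than a second `None`-like value keeps the "absent" and "invalid" states distinguishable with a plain identity check.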

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(d6): strip "PR #18" from in-repo cross-refs; keep rationale

Copilot flagged 7 bullets across hooks.md / MASTER_PLAN.md / ARCHITECTURE.md
that cite "post-review PR #18" inline. For long-lived rules docs the PR
number is not a stable rendered identifier (forks/rebases lose it), while
the rationale ("post-review") carries the same meaning. Drop the number.

No contract change. Tests untouched.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Javier <javier.abril@glassnode.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
javiAI added a commit that referenced this pull request Apr 26, 2026
…icy.yaml vs .claude/logs/ (#25)

* test(f1): RED — extend ALLOWED_SKILLS 13->14 + behavior tests for audit-session

Phase 0 kickoff + Phase 1 RED-first per CLAUDE.md rule #3 and .claude/rules/tests.md.

Plan ratified by the user: decisions A1.a..A6.a + 3 mandatory adjustments.
Scope: skill /pos:audit-session, a read-only advisory main-strict skill that
compares 3 surfaces of policy.yaml against the real .claude/logs/:
  1. policy.yaml.skills_allowed vs skills.jsonl invocations.
  2. policy.yaml.lifecycle.*.hooks_required vs per-hook logs
     (existence + non-emptiness of the expected log file).
  3. policy.yaml.audit.required_logs vs existence/age/non-emptiness.

RED state confirmed: 16 expected failures.
  - 10 parametrized [audit-session] in TestStructure / TestFrontmatter
    / TestBody. .claude/skills/audit-session/SKILL.md does not exist.
  - 5 TestAuditSessionBehavior:
      * test_body_declares_three_audit_surfaces
      * test_body_declares_advisory_only
      * test_body_declares_main_strict_no_delegation
      * test_body_declares_30day_review_window
      * test_body_declares_prefix_normalization_assumption
  - 1 test_real_skills_allowed_populated_by_f1. policy.yaml still
    declares 13; ALLOWED_SKILLS has already grown to 14.

The behavior tests follow the TestPatternAuditBehavior pattern from E3a,
the closest reference: read-only advisory main-strict.

User adjustment 3 applied: the 30-day window test validates the
DECLARATION in the body, not the execution of date math.

Renames:
  - test_real_skills_allowed_populated_by_e3b -> _by_f1.
    Tuple 13 -> 14 via the shared ALLOWED_SKILLS.
  - test_all_thirteen_e1_e3b_skills_end_to_end ->
    test_all_fourteen_e1_e3b_f1_skills_end_to_end.

Next, GREEN phase: create SKILL.md + bump policy.yaml.skills_allowed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(f1): GREEN - audit-session skill + bump skills_allowed 13->14

GREEN phase per CLAUDE.md rule #3 and .claude/rules/tests.md.
RED commit (5d6091d ancestor + RED) introduced 16 failures; this commit
turns all 16 green without touching any unrelated test.

Skill body (.claude/skills/audit-session/SKILL.md ~110 lines):
  - Frontmatter minimal canonical: name=audit-session, description starts
    "Use when ...", allowed-tools list of 6 entries (Glob, Grep, Read,
    Bash(find:*), Bash(wc:*), Bash(.claude/skills/_shared/log-invocation.sh:*)).
    No Bash(git log:*) per user adjustment 2.
  - Read-only advisory main-strict: explicit MAY/MUST NOT scope.
  - Three audit surfaces declared (Fase -1 decision A1.a):
      Bucket 1: skills_allowed vs skills.jsonl invocations.
      Bucket 2: lifecycle.*.hooks_required vs per-hook log files.
      Bucket 3: audit.required_logs vs file existence/nonempty/mtime.
  - 30-day review window declared as textual guidance (A2.a +
    user adjustment 3): the skill does NOT execute date math, the
    human applies the lens when reading the report.
  - Prefix normalization assumption (A3.a): pos:<slug> stripped before
    cross-comparing with policy.yaml.skills_allowed.
  - Pre-existing drift expected (A4.a): hooks.jsonl declared in
    audit.required_logs but no such file exists. Skill reports it as
    Bucket 3 candidate, does NOT auto-fix.
  - Report structured by surface (A5.a): three sections + summary line.
  - audit.session_audit.schedule (e.g. weekly) explicitly NOT enforced
    (A6.a): documentation-only cadence, no cron/CI hook in F1.
  - Out of scope: external fork delegation (main-strict by design),
    cross-session aggregation, date arithmetic, mutating policy or logs.
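
The skill itself is prose in SKILL.md, not code, but the Bucket 1 comparison with prefix normalization (A3.a) can be illustrated with a short sketch. Both function names and the report keys are hypothetical.

```python
from __future__ import annotations


def normalize_skill(name: str) -> str:
    """Strip a leading 'pos:' prefix so logged names match policy slugs."""
    prefix = "pos:"
    return name[len(prefix):] if name.startswith(prefix) else name


def bucket_one(allowed: list[str], invoked: list[str]) -> dict[str, list[str]]:
    """Compare the allowlist with logged invocations, in both directions."""
    allowed_set = set(allowed)
    seen = {normalize_skill(name) for name in invoked}
    return {
        # Logged invocations that the policy does not allow.
        "invoked_not_allowed": sorted(seen - allowed_set),
        # Allowlisted skills that never show up in the log.
        "allowed_never_invoked": sorted(allowed_set - seen),
    }
```

Consistent with the advisory scope, the sketch only reports differences; it does not modify the policy or the logs.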

Body satisfies all 5 TestAuditSessionBehavior tests literally:
  - skills_allowed + lifecycle + hooks_required + required_logs tokens.
  - "advisory"/"read-only"/"does not modify"/"no modifica" tokens.
  - No "subagent"/"code-architect"/"agent(" tokens (uses "fork" for
    external delegation refusal).
  - "30" + "day"/"review window" tokens.
  - "pos:" + "normaliz" tokens.

policy.yaml:
  - skills_allowed: 13 -> 14 entries (audit-session appended).
  - Comment line 268 updated: "E3b 13 skills -> F1 14 skills".

Test deltas (793 passed + 1 skipped, zero regression):
  - 10 parametrized [audit-session] in TestStructure / TestFrontmatter
    / TestBody pass.
  - 5 TestAuditSessionBehavior pass.
  - test_real_skills_allowed_populated_by_f1 passes (tuple is now 14).
  - test_all_fourteen_e1_e3b_f1_skills_end_to_end passes (logger ->
    Stop hook end-to-end with all 14 skills allowlisted).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(f1): docs-sync - ROADMAP + HANDOFF + MASTER_PLAN + skills-map

Phase N+3 docs-sync per CLAUDE.md rule #2 (docs-sync within branch)
and pre-pr-gate.py canonical baseline (ROADMAP + HANDOFF mandatory)
plus conditional `.claude/rules/skills-map.md` for `skills/` paths.

ROADMAP.md:
  - Top table: Fase F status `pendiente` -> `1/4 (F1 ok, F2..F4 pending)`.
  - F1 row: status `pending` -> `done (PR pending)` with concrete scope.
  - New section "Progreso Fase F" with feat/f1 detail block:
    deliverables, allowed-tools rationale, contract locked by suite,
    A1.a..A6.a decisions, 3 mandatory user adjustments, exit criteria.

HANDOFF.md:
  - Section 1 snapshot: current branch F1 (PR pending); next branch F2;
    F1 deliverables one-liner.
  - Section 9 next branch: F2 feat/f2-agents-subagents with scope
    (3 subagent definitions, naming-conflict question, agents_allowed
    evaluation).
  - New section 19 "Estado F1": full closure block parallel to E3a/E3b,
    with deliverables + contract + 3 mandatory adjustments + YAML gotcha
    avoided + result (793 + 1 skip) + cross-references.

MASTER_PLAN.md Rama F1:
  - Replaced 1-line stub with full closing block: scope concrete (3
    surfaces), A1.a..A6.a decisions, 3 mandatory adjustments, context
    to read, exit criteria, carry-overs to F2..F4. Branch marker
    set to "PR pendiente".

.claude/rules/skills-map.md:
  - Audit + Release section: audit-session row populated with concrete
    contract (3 surfaces + main-strict + 30-day textual guidance + no
    auto-fix + allowed-tools list). Replaces the 1-line stub from F0.

Tests: 793 passed + 1 skipped (D5 intentional subprocess-no-cover).
Zero regression D1..D6 + E1a..E3b. Behavior contract for audit-session
locked across 5 tests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Javier <javier.abril@glassnode.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
javiAI added a commit that referenced this pull request Apr 27, 2026
…#28)

* test(f4): RED — marketplace.json + release.yml + plugin.json version pin

Phase 0 (kickoff) + Phase 1 (RED tests) for feat/f4-marketplace-public-repo.

Scope (per Phase -1, ratified with 8 user adjustments):
- Land the local marketplace infra + release flow without depending on
  javiAI/pos-marketplace existing yet (A1.b).
- .claude-plugin/marketplace.json with the official Claude Code schema:
  top-level {name, owner, plugins}; owner.name; plugin {name, source}
  with source.{source=github, repo, ref="v"+version}.
- .github/workflows/release.yml, trigger tag:v*, jobs version-match /
  selftest / build-bundle / publish-release / mirror-marketplace
  (mirror is conditional/skippable until the public repo exists).
- Bump plugin.json.version 0.0.1 → 0.1.0 (first public release).

Files in this commit (RED tests, expected failures):
- bin/tests/test_marketplace_json_schema.py (12 tests)
- bin/tests/test_release_workflow_smoke.py (6 tests)
- bin/tests/test_plugin_json_version_bump.py (3 tests)

The tests verify:
- minimal official marketplace.json schema (top-level + owner + plugin)
- plugin name/version/ref sync between marketplace ↔ plugin.json
- release.yml trigger v*, expected jobs, publish-release.needs ⊇
  {version-match, selftest, build-bundle}, mirror-marketplace
  conditional/skippable
- plugin.json.version pin = "0.1.0"
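
As an illustration of the marketplace ↔ plugin.json sync check listed above, here is a sketch that works on the two already-parsed JSON documents. The function name and the wording of the problems are hypothetical, and the checks are reduced to the top-level keys and the `ref = "v" + version` rule; the real test files are not shown in this PR body.

```python
from __future__ import annotations


def check_marketplace_sync(marketplace: dict, plugin: dict) -> list[str]:
    """Return a list of problems; an empty list means the two files agree."""
    missing = [key for key in ("name", "owner", "plugins") if key not in marketplace]
    if missing:
        return [f"missing top-level key: {key}" for key in missing]

    # Find the marketplace entry for this plugin by name.
    entry = next(
        (p for p in marketplace["plugins"] if p.get("name") == plugin["name"]),
        None,
    )
    if entry is None:
        return [f"plugin {plugin['name']!r} is not listed in the marketplace"]

    problems: list[str] = []
    expected_ref = "v" + plugin["version"]  # ref = "v" + version
    actual_ref = entry.get("source", {}).get("ref")
    if actual_ref != expected_ref:
        problems.append(f"source.ref is {actual_ref!r}, expected {expected_ref!r}")
    return problems
```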

Current RED state: 19 failed + 12 passed (12 = F3 baseline 9 + 3
plugin.json exists/parses). No regressions in F3.

Deferred in F4 (CLAUDE.md rule #7):
- audit.yml nightly (no consumer today; own branch after F4).
- /pos:pr-description, /pos:release skills (no demonstrated repetition).
- Enforced CHANGELOG.md (auto-generated from git log between tags).
- refactor/template-policy-d5b-migration (independent drift).
- Phase G (Knowledge Plane).

GREEN impl + docs (RELEASE.md/ARCHITECTURE/ci-cd/MASTER_PLAN/ROADMAP/
HANDOFF) land in the following commits within this branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(f4): GREEN — marketplace.json + release.yml + plugin.json 0.1.0

Phase 2 (GREEN): flips the 19 RED tests from the previous commit.

.claude-plugin/marketplace.json (NEW):
- Official Claude Code marketplace schema.
- top-level: name="pos-marketplace", owner.name="javiAI", plugins[].
- plugins[0]: name=pos, source={source:github, repo:javiAI/
  project-operating-system, ref:v0.1.0}, version=0.1.0.
- metadata.{description, version} for humans.

.claude-plugin/plugin.json:
- version 0.0.1 → 0.1.0 (first public release; pre-1.0).
- Single source of truth; the git tag must be v${version}.

.github/workflows/release.yml (NEW):
- Trigger: push tags v*.
- Jobs:
  - version-match: assert plugin.json.version == ${tag#v}.
  - selftest: pytest bin/tests -q (reuses the F3 contract).
  - build-bundle: curated plugin-only tar.gz (.claude-plugin/,
    .claude/skills/, .claude/rules/, hooks/, agents/, policy.yaml,
    bin/pos-selftest.sh, bin/_selftest.py, docs/RELEASE.md). Excludes
    generator/, tools/, templates/, questionnaire/.
  - publish-release: needs [version-match, selftest, build-bundle];
    gh release create with the bundle as an asset.
  - mirror-marketplace: conditional via vars.POS_MARKETPLACE_REPO;
    if empty, it skips without failing the release. Opens a PR against
    the public repo once it is configured.
- Actions pinned by SHA (ci-cd.md rule #2).
- permissions.contents=write for gh release create.
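
The version-match assertion, `plugin.json.version == ${tag#v}`, can be mirrored in a few lines. The workflow presumably does this in shell; the Python version below is only a sketch, and the function name is hypothetical.

```python
def version_matches_tag(plugin_version: str, tag: str) -> bool:
    """Mirror `${tag#v}`: strip exactly one leading 'v', then compare."""
    stripped = tag[1:] if tag.startswith("v") else tag
    return stripped == plugin_version
```

For example, `version_matches_tag("0.1.0", "v0.1.0")` holds, while a tag pushed before the bump (`v0.0.1`) would fail the job and, with the gating added later in this PR, stop the rest of the release.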

Tests post-GREEN: 21 passed (12 marketplace + 6 release.yml + 3 plugin
version), full suite 644 passed + 1 skipped (intentional D5 skip, F3).
No regressions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(f4): sync — RELEASE runbook + ARCH §13 + ci-cd + ROADMAP/HANDOFF/MASTER_PLAN

Phase N+3: docs-sync within the branch (CLAUDE.md rule #2,
docs.md § Docs-sync en cada rama).

docs/RELEASE.md (NEW):
- User-facing runbook covering versioning + bundle + flow + recovery.
- Versioning contract: plugin.json.version is the source of truth;
  tag = v${version}; marketplace.json.source.ref mirrors it.
- Curated plugin-only bundle scope (explicit includes/excludes).
- 5-step flow: bump → tag → workflow → verify → recovery.
- Mirror activation once the public repo exists (3 steps:
  create repo + gh variable set POS_MARKETPLACE_REPO + gh secret
  set POS_MARKETPLACE_TOKEN).
- User-facing installation (/plugin marketplace add + /plugin
  install pos).
- Deferred items enumerated.

docs/ARCHITECTURE.md § 13 (Marketplace + Release flow):
- Rewritten from a 6-line placeholder into a full subsection.
- Manifest, version source of truth, workflow jobs,
  curated bundle scope, deferral of the public repo,
  determinism of the flow, user-facing installation, deferrals.

.claude/rules/ci-cd.md:
- release.yml bullet promoted from "Diferidos" to "Aterrizado"
  (delivered in F4).
- New H3 "### Job release (entregado en F4)" with the full scope
  (5 jobs + curated bundle + source of truth).

ROADMAP.md:
- Table: F4 marked ✅ (PR pending).
- New section § feat/f4-marketplace-public-repo under Progreso
  Fase F: scope, deliverables, Phase -1 decisions (A1.b..A8),
  locked contract, carry-overs, exit criteria (665 passed
  + 1 skipped).

HANDOFF.md:
- §1 Snapshot: current branch F4 (delivery + suite update).
- §9 Next branch: Phase F closed; carry-overs (template-policy
  d5b migration, marketplace activation, deferred skills,
  audit.yml).
- New §22: F4 status with deliverables + contract + decisions
  + carry-overs (parallel to §19 F1, §20 F2, §21 F3).

MASTER_PLAN.md § Rama F4:
- Expanded from 3 lines into a full section: delivered scope,
  delivered files detailed per path, Phase -1 decisions
  (A1.b..A8), context read, exit criteria, carry-overs.

Simplify pre-commit:
- Trimmed 3 bullets from "Ajustes durante implementación"
  (heredoc syntax glitch + rtk wrapper output filter):
  debug ephemera that belongs in the commit history.
- Kept the only persistent gotcha: PyYAML (YAML 1.1) parses
  `on:` as the Python bool True (a reusable pattern for future
  workflow YAML tests).
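
A sketch of how a workflow test might cope with the `on:` gotcha. It operates on the already-parsed mapping, so it does not depend on PyYAML being installed; the helper name `workflow_triggers` is hypothetical.

```python
from __future__ import annotations

from typing import Any


def workflow_triggers(parsed: dict[Any, Any]) -> Any:
    """Return the trigger block whether the key stayed 'on' or became True.

    A YAML 1.1 loader resolves the bare key `on` to the boolean True, so a
    test that looks up parsed["on"] would miss it.
    """
    if "on" in parsed:
        return parsed["on"]
    return parsed.get(True)
```

With this helper a test can assert on `workflow_triggers(parsed)["push"]["tags"]` regardless of which loader produced the mapping.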

Untouched (per A8, ratified in Phase -1): policy.yaml, hooks/**,
.claude/skills/**, agents/**, generator/**, templates/**,
.claude/rules/skills-map.md.

Tests: unchanged (GREEN was already green with 665 passed + 1 skipped).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(f4): release.yml - version-match gating + mirror idempotency (PR #28 review)

Applies the 4 findings from the Copilot review of PR #28. All 4 are real
correctness/idempotency bugs, not style. Value/effort triage: all
high/trivial-low → FIX.

version-match gating (findings 1 + 4):
- selftest: needs [version-match].
- build-bundle: needs [version-match].

Previously they ran in parallel with version-match → CI spent time on
mismatched tags and contradicted the documented "strict order".
Now: version-match → (selftest + build-bundle) → publish-release
→ mirror-marketplace.

mirror-marketplace idempotency (findings 2 + 3):
- After `git add marketplace.json`, if `git diff --cached --quiet`
  reports no changes → exit 0. Previously the no-op `git commit`
  failed the workflow re-run.
- Before `gh pr create`, run `gh pr list --head $branch --state open`.
  If an open PR already exists → skip the create with a message.
  Previously `gh pr create` with an existing PR failed the re-run.

Tests: bin/tests 31/31 green; full explicit run 850 passed + 1
skipped (intentional D5 skip, F3). No regressions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Javier <javier.abril@glassnode.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>