feat: add methodology and evaluation pages for EDPA website by jurby · Pull Request #4 · technomaton/edpa

jurby · 2026-03-21T18:30:21Z

Summary

- Add complete Astro web scaffold (web/) with Layout, Header, Footer, global.css dark theme, and project config
- Add methodology.astro: full EDPA v2.2 specification page with all 13 sections covering the complete governance model (terminology, architecture, hierarchy, EDPA model with dual-view, cadence options, GitHub implementation, reporting pipeline, risks, and implementation plan)
- Add evaluation.astro: tests and evaluation showcase with 10/10 invariant test cards, auto-calibration (Karpathy loop) docs, CW heuristics visualization with bar charts, method comparison table, and a worked demo calculation verifying the mathematical guarantee

Test plan

- `cd web && npm install && npm run build` succeeds
- dist/methodology/index.html exists with all 13 sections
- dist/evaluation/index.html exists with 10/10 test PASS badges
- All content sourced from docs/methodology-cs.md, portal HTML, tests/test_invariants.py, docs/auto-calibration.md, and docs/evidence-detection.md
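
The two file-existence checks in the test plan could be scripted roughly as follows. This is a minimal sketch, not part of the PR: the helper name `missing_pages` is hypothetical, and it only checks that the pages were emitted, not that they contain all 13 sections or the PASS badges.

```python
from pathlib import Path

# Pages the test plan expects the build to emit, relative to the dist/ directory.
EXPECTED_PAGES = ("methodology/index.html", "evaluation/index.html")


def missing_pages(dist_dir):
    """Return the expected pages that are absent from dist_dir (hypothetical helper)."""
    dist = Path(dist_dir)
    return [page for page in EXPECTED_PAGES if not (dist / page).is_file()]
```

After `npm run build`, calling `missing_pages("web/dist")` (assuming the build output lands in web/dist) should return an empty list if both pages were generated.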

Add the complete web/ Astro project scaffold (Layout, Header, Footer, global.css with dark theme) plus two content pages:

- methodology.astro: Full EDPA v2.2 specification with all 13 sections (summary, terminology, architecture, hierarchy, model, dual-view, cadence, learning loop, GitHub implementation, reporting, risks, comparison, implementation plan)
- evaluation.astro: Tests & evaluation showcase with 10/10 invariant test cards, auto-calibration (Karpathy loop) documentation, CW heuristics visualization, method comparison, and a worked demo calculation verifying the mathematical guarantee

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(backlog): extend `backlog add --type` choices to all 7 types

The MCP server's `_handle_item_create` accepts all 7 types via TYPE_DIRS (Initiative, Epic, Feature, Story, Defect, Event, Risk), but the CLI argparse rejected Defect/Event/Risk. Users had to write YAML by hand and bump id_counters.yaml manually to seed those types, which is exactly what Wave B Unit 7 of the V2 E2E run had to work around.

Extends `choices=[...]` to mirror TYPE_DIRS and clarifies `--parent` help text (Initiative/Defect/Event/Risk have parent=None per PARENT_RULES). No behavior change in cmd_add itself: it already delegates to the MCP handler, which is type-agnostic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(close-iteration): make Stage 2b (refresh contributors[]) explicitly mandatory

Wave B Unit 10 of the 2026-05 V2 E2E run had to add a one-shot `detect_contributors.py --all-items` step before the engine produced correct output. Without it, the engine read empty `contributors[]` blocks (only `evidence[]` had been materialized by `sync_pr_contributions.py`) and silently returned 0h derived per item. No error, just zeros.

The Stage 2b text already described the step but read as a conditional ("if evidence has accumulated"). Rewrites it as REQUIRED with an explicit warning about the silent 0h failure mode and a closing "Always run, never skip." line.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(mcp): document single-project scope limitation

MCP server is a persistent process with a fixed cwd at launch. Subsequent `cd` calls from tool invocations don't change it. Tools resolve `.edpa/` from that fixed location regardless of where the assistant just navigated. The 2026-05 V2 E2E run hit this directly: workers driving a sandbox under /tmp couldn't reach it via MCP and had to fall back to direct script invocation.

Documents the constraint, lists workarounds (EDPA_ROOT pre-launch, client restart, direct script CLI), and notes that multi-project routing is intentionally unimplemented.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(e2e fixture): use portfolio gate ladder for Initiative/Epic transitions

work_plan.yaml had Initiative I-1 and Epics E-1/E-2 transitioning to status=Validating at end-of-PI. That's invalid: validate_syntax.py's PORTFOLIO_STATUSES set doesn't include Validating, which is a delivery-only gate (Feature/Story/Defect). cw_heuristics.yaml.tmpl's gate_weights mirror this: Initiative + Epic go Ready -> Implementing -> Done with no QA step.

Switches I/E transitions to Implementing (portfolio level: MVP in build). Adds a header comment in gate_transitions explaining both ladders so future fixture authors don't repeat the mistake.

Surfaced by Wave C Unit 13 of the 2026-05 V2 E2E run: backlog.py validate exited 1 because I-1/E-1/E-2 carried invalid status.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(e2e): add Skill-tool subagent gotcha + cross-link fixes

Wave B Unit 6 of the 2026-05 V2 E2E run discovered that the `Skill` tool returns the SKILL.md content (an instructions template), not the side-effects of running the skill. Subagents driving EDPA programmatically must follow the returned instructions as if they were a user prompt; they cannot treat the response as documentation.

Adds this as limitation #4. Cross-links the MCP single-project doc. Updates the "Findings from initial run" entries for gate divergence, CI gap, and backlog.py limit to point at their fixes in this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(e2e): hybrid E2E run + parameterize verify/cleanup scripts

Full 2 PI × 5 iter run on fix/e2e-v2-findings against fresh sandbox technomaton/edpa-e2e-20260527-181051-2c56a6a0 (now archived).

Verifies all three fix-branch fixes hold end-to-end:

- c1cbbc2: backlog.py add --type accepts all 7 types (Defect/Event/Risk went through CLI path in Wave B Unit 7; no manual YAML fallback needed).
- 85cd439: close-iteration Stage 2b (detect_contributors.py --all-items) produces correct derived hours on the first engine pass, with 0 _rev2 snapshots (previous run had 2 due to discovery-time recompute).
- 3cb8ff1: Initiative/Epic gate transitions use portfolio ladder (Implementing, not Validating). backlog.py validate now reports 0 errors (previously had 3).

Wave structure (per /batch):

- Wave B sequential (Units 6-10): install → seed → PI-1 (real CI) → PI-2 (synthetic) → close + engine + reports
- Wave C parallel (Units 11-13): invariants, reports, backlog state
- Wave D: cleanup (archive + rm sandbox) + this commit

Results: 14/14 PRs merged, 14/14 CI workflow runs success, 10/10 iterations closed with all_invariants_passed=true on first engine pass, 10 frozen snapshots, 50 per-person timesheets, 33 backlog items in expected end-states.

Script fixes uncovered during Wave C (verify scripts had stale hard-coded constants from the previous run):

- 10_verify_invariants.py: SANDBOX hard-coded → env-driven _resolve_sandbox() with /tmp/edpa-e2e-current-run-tag fallback.
- 11_verify_reports.py: DEFAULT_SANDBOX/REPO derived from current-run-tag; EXPECTED_MERGED_PRS now CI-mode-aware (24 real / 14 hybrid / 0 synthetic) via EDPA_E2E_CI_MODE env var.
- 12_verify_backlog.py: EXPECTED_COUNTS updated for post-3cb8ff1 portfolio ladder (Initiative/Epic at Implementing). Fixed iteration status lookup to read root data["status"] (lifecycle) instead of data["iteration"]["status"] (planning).
- 99_cleanup.sh: env-driven SANDBOX_DIR fallback; accepts either gh_repo or repo_full_name in .e2e_state.json.

8 phase run logs (01, 04, 07-12) refreshed with current run's results, timestamps, sandbox SHAs, and the script-fix findings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
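
The `--type` fix described in the first commit above, where the argparse `choices` are derived from the same table the MCP handler uses, might look roughly like this. It is only a sketch under stated assumptions: the seven type names and the parentless types come from the commit message, but the directory values, the `PARENTLESS_TYPES` name, and the `build_parser` function are hypothetical and do not reproduce the actual backlog.py.

```python
import argparse

# The seven type names come from the commit message.
# The directory values are placeholders, not the project's real layout.
TYPE_DIRS = {
    "Initiative": "initiatives",
    "Epic": "epics",
    "Feature": "features",
    "Story": "stories",
    "Defect": "defects",
    "Event": "events",
    "Risk": "risks",
}

# Types the commit message says have parent=None per PARENT_RULES.
PARENTLESS_TYPES = ("Initiative", "Defect", "Event", "Risk")


def build_parser():
    """Build a CLI parser whose --type choices mirror TYPE_DIRS (illustrative only)."""
    parser = argparse.ArgumentParser(prog="backlog")
    sub = parser.add_subparsers(dest="command", required=True)
    add = sub.add_parser("add")
    # Deriving choices from TYPE_DIRS keeps the CLI and the handler in sync.
    add.add_argument("--type", required=True, choices=list(TYPE_DIRS))
    add.add_argument(
        "--parent",
        default=None,
        help="Parent item ID; not used for " + "/".join(PARENTLESS_TYPES),
    )
    return parser
```

With this shape, `build_parser().parse_args(["add", "--type", "Risk"])` is accepted, while an unknown type makes argparse exit with a usage error. Deriving `choices` from the shared mapping avoids the drift that the commit fixes.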

jurby merged commit a78f092 into main Mar 21, 2026
1 check failed

jurby mentioned this pull request May 27, 2026

fix(v2): apply E2E findings + verify on hybrid run #47

Merged

5 tasks

jurby deleted the worktree-agent-a960c96b branch June 1, 2026 13:06

jurby merged 1 commit into main from worktree-agent-a960c96b
