feat: add methodology and evaluation pages for EDPA website#4

Merged
jurby merged 1 commit into main from worktree-agent-a960c96b
Mar 21, 2026

jurby commented Mar 21, 2026

Summary

  • Add complete Astro web scaffold (web/) with Layout, Header, Footer, global.css dark theme, and project config
  • Add methodology.astro: Full EDPA v2.2 specification page with all 13 sections covering the complete governance model (terminology, architecture, hierarchy, EDPA model with dual-view, cadence options, GitHub implementation, reporting pipeline, risks, and implementation plan)
  • Add evaluation.astro: Tests and evaluation showcase with 10/10 invariant test cards, auto-calibration (Karpathy loop) docs, CW heuristics visualization with bar charts, method comparison table, and a worked demo calculation verifying the mathematical guarantee

Test plan

  • cd web && npm install && npm run build succeeds
  • dist/methodology/index.html exists with all 13 sections
  • dist/evaluation/index.html exists with 10/10 test PASS badges
  • All content sourced from docs/methodology-cs.md, portal HTML, tests/test_invariants.py, docs/auto-calibration.md, and docs/evidence-detection.md

Generated with Claude Code

Add the complete web/ Astro project scaffold (Layout, Header, Footer,
global.css with dark theme) plus two content pages:

- methodology.astro: Full EDPA v2.2 specification with all 13 sections
  (summary, terminology, architecture, hierarchy, model, dual-view,
  cadence, learning loop, GitHub implementation, reporting, risks,
  comparison, implementation plan)

- evaluation.astro: Tests & evaluation showcase with 10/10 invariant
  test cards, auto-calibration (Karpathy loop) documentation, CW
  heuristics visualization, method comparison, and a worked demo
  calculation verifying the mathematical guarantee

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jurby jurby merged commit a78f092 into main Mar 21, 2026
1 check failed
jurby added a commit that referenced this pull request May 27, 2026
* fix(backlog): extend `backlog add --type` choices to all 7 types

The MCP server's `_handle_item_create` accepts all 7 types via TYPE_DIRS
(Initiative, Epic, Feature, Story, Defect, Event, Risk), but the CLI
argparse rejected Defect/Event/Risk. Users had to write YAML by hand and
bump id_counters.yaml manually to seed those types — exactly what Wave B
Unit 7 of the V2 E2E run had to work around.

Extends `choices=[...]` to mirror TYPE_DIRS and clarifies `--parent`
help text (Initiative/Defect/Event/Risk have parent=None per PARENT_RULES).
No behavior change in cmd_add itself — it already delegates to the MCP
handler which is type-agnostic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
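
As a rough illustration of the `--type` fix above, the sketch below derives the argparse `choices` from a `TYPE_DIRS` mapping so the CLI cannot drift from what the handler accepts. The directory names, the `build_parser` helper, and the parser layout are assumptions made for illustration; they are not taken from the actual `backlog.py`.

```python
import argparse

# Assumption: the real TYPE_DIRS maps each item type to a backlog directory.
# The directory names below are placeholders, not the project's actual values.
TYPE_DIRS = {
    "Initiative": "initiatives",
    "Epic": "epics",
    "Feature": "features",
    "Story": "stories",
    "Defect": "defects",
    "Event": "events",
    "Risk": "risks",
}


def build_parser() -> argparse.ArgumentParser:
    """Build a minimal `backlog add` parser (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="backlog")
    sub = parser.add_subparsers(dest="command", required=True)
    add = sub.add_parser("add")
    # Deriving choices from TYPE_DIRS keeps the CLI and the handler in sync.
    add.add_argument("--type", choices=sorted(TYPE_DIRS), required=True)
    add.add_argument(
        "--parent",
        default=None,
        help="Parent item id (Initiative/Defect/Event/Risk take no parent)",
    )
    return parser
```

With this shape, adding an eighth type to `TYPE_DIRS` would extend the CLI automatically, which is the drift the commit fixes by hand.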

* fix(close-iteration): make Stage 2b (refresh contributors[]) explicitly mandatory

Wave B Unit 10 of the 2026-05 V2 E2E run had to add a one-shot
`detect_contributors.py --all-items` step before the engine produced
correct output. Without it, the engine read empty `contributors[]` blocks
(only `evidence[]` had been materialized by `sync_pr_contributions.py`)
and silently returned 0h derived per item. No error, just zeros.

The Stage 2b text already described the step but read as a conditional
("if evidence has accumulated"). Rewrites it as REQUIRED with an
explicit warning about the silent 0h failure mode and a closing
"Always run, never skip." line.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
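
The silent 0h failure mode described above could also be caught mechanically before the engine runs. The following is a minimal sketch of such a pre-flight check, assuming each backlog item is a dict with `id`, `evidence`, and `contributors` keys; the `find_unrefreshed_items` helper and the item ids are hypothetical and not part of the EDPA scripts.

```python
def find_unrefreshed_items(items):
    """Return ids of items that have evidence[] but an empty contributors[].

    Such items are the ones that would yield 0h derived without any error.
    """
    return [
        item["id"]
        for item in items
        if item.get("evidence") and not item.get("contributors")
    ]


def assert_contributors_refreshed(items):
    """Fail loudly instead of letting the engine return zeros."""
    stale = find_unrefreshed_items(items)
    if stale:
        raise RuntimeError(
            "contributors[] not refreshed for: " + ", ".join(stale)
        )
```

A check like this would turn the "no error, just zeros" case into an explicit failure that names the affected items.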

* docs(mcp): document single-project scope limitation

The MCP server is a persistent process whose cwd is fixed at launch. Subsequent
`cd` calls from tool invocations don't change it. Tools resolve `.edpa/`
from that fixed location regardless of where the assistant just navigated.

The 2026-05 V2 E2E run hit this directly — workers driving a sandbox
under /tmp couldn't reach it via MCP and had to fall back to direct
script invocation. Documents the constraint, lists workarounds
(EDPA_ROOT pre-launch, client restart, direct script CLI), and notes
that multi-project routing is intentionally unimplemented.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
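
To make the single-project constraint above concrete, here is a small sketch of how a long-lived process might resolve `.edpa/` once at launch. It assumes `EDPA_ROOT` names the project root that contains `.edpa/`; that interpretation and the `resolve_edpa_dir` helper are assumptions, not the server's documented behavior.

```python
import os
from pathlib import Path


def resolve_edpa_dir(env=None, cwd=None) -> Path:
    """Resolve the .edpa directory from EDPA_ROOT, else from the launch cwd.

    Assumption: EDPA_ROOT, when set, points at the project root.
    """
    env = os.environ if env is None else env
    root = env.get("EDPA_ROOT")
    base = Path(root) if root else Path(cwd if cwd is not None else os.getcwd())
    return base / ".edpa"


# Resolved once at startup: a later `cd` inside a tool call does not change it.
EDPA_DIR = resolve_edpa_dir()
```

Because the value is computed once, pointing the server at a different project requires setting the variable before launch or restarting the client, which matches the workarounds listed above.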

* fix(e2e fixture): use portfolio gate ladder for Initiative/Epic transitions

work_plan.yaml had Initiative I-1 and Epics E-1/E-2 transitioning to
status=Validating at end-of-PI. That's invalid: validate_syntax.py's
PORTFOLIO_STATUSES set doesn't include Validating — it's a delivery-only
gate (Feature/Story/Defect). cw_heuristics.yaml.tmpl's gate_weights
mirror this: Initiative + Epic go Ready -> Implementing -> Done with no
QA step.

Switches I/E transitions to Implementing (portfolio level: MVP in
build). Adds a header comment in gate_transitions explaining both
ladders so future fixture authors don't repeat the mistake.

Surfaced by Wave C Unit 13 of the 2026-05 V2 E2E run — backlog.py
validate exited 1 because I-1/E-1/E-2 carried invalid status.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
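
The two gate ladders described above can be sketched as a per-type status check. The portfolio set follows the Ready -> Implementing -> Done ladder from the commit text; the delivery set is an assumption (the real one may contain more statuses), and `invalid_status_ids` is a hypothetical helper, not the code in `validate_syntax.py`.

```python
PORTFOLIO_TYPES = {"Initiative", "Epic"}

# From the commit text: portfolio items go Ready -> Implementing -> Done.
PORTFOLIO_STATUSES = {"Ready", "Implementing", "Done"}

# Assumption: delivery items (Feature/Story/Defect) add a Validating gate.
DELIVERY_STATUSES = PORTFOLIO_STATUSES | {"Validating"}


def invalid_status_ids(items):
    """Return ids whose status is not on the ladder for their type.

    `items` is an iterable of (item_id, item_type, status) tuples.
    """
    errors = []
    for item_id, item_type, status in items:
        allowed = (
            PORTFOLIO_STATUSES if item_type in PORTFOLIO_TYPES else DELIVERY_STATUSES
        )
        if status not in allowed:
            errors.append(item_id)
    return errors
```

Run against the pre-fix fixture state, a check of this kind would flag I-1, E-1, and E-2 at Validating while accepting a Story at the same status.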

* docs(e2e): add Skill-tool subagent gotcha + cross-link fixes

Wave B Unit 6 of the 2026-05 V2 E2E run discovered that the `Skill` tool
returns the SKILL.md content (an instructions template), not the
side-effects of running the skill. Subagents driving EDPA programmatically
must follow the returned instructions as if they were a user prompt; they
cannot treat the response as documentation.

Adds this as limitation #4. Cross-links the MCP single-project doc.
Updates the "Findings from initial run" entries for gate divergence,
CI gap, and backlog.py limit to point at their fixes in this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(e2e): hybrid E2E run + parameterize verify/cleanup scripts

Full 2 PI × 5 iter run on fix/e2e-v2-findings against fresh sandbox
technomaton/edpa-e2e-20260527-181051-2c56a6a0 (now archived).

Verifies all three fix-branch fixes hold end-to-end:
- c1cbbc2: backlog.py add --type accepts all 7 types (Defect/Event/Risk
  went through CLI path in Wave B Unit 7; no manual YAML fallback needed).
- 85cd439: close-iteration Stage 2b (detect_contributors.py --all-items)
  produces correct derived hours on the first engine pass — 0 _rev2
  snapshots (previous run had 2 due to discovery-time recompute).
- 3cb8ff1: Initiative/Epic gate transitions use portfolio ladder
  (Implementing, not Validating). backlog.py validate now reports 0
  errors (previously had 3).

Wave structure (per /batch):
- Wave B sequential (Units 6-10): install → seed → PI-1 (real CI) →
  PI-2 (synthetic) → close + engine + reports
- Wave C parallel (Units 11-13): invariants, reports, backlog state
- Wave D: cleanup (archive + rm sandbox) + this commit

Results: 14/14 PRs merged, 14/14 CI workflow runs success, 10/10
iterations closed with all_invariants_passed=true on first engine pass,
10 frozen snapshots, 50 per-person timesheets, 33 backlog items in
expected end-states.

Script fixes uncovered during Wave C (verify scripts had stale
hard-coded constants from the previous run):
- 10_verify_invariants.py: SANDBOX hard-coded → env-driven
  _resolve_sandbox() with /tmp/edpa-e2e-current-run-tag fallback.
- 11_verify_reports.py: DEFAULT_SANDBOX/REPO derived from
  current-run-tag; EXPECTED_MERGED_PRS now CI-mode-aware (24 real /
  14 hybrid / 0 synthetic) via EDPA_E2E_CI_MODE env var.
- 12_verify_backlog.py: EXPECTED_COUNTS updated for post-3cb8ff1
  portfolio ladder (Initiative/Epic at Implementing). Fixed iteration
  status lookup to read root data["status"] (lifecycle) instead of
  data["iteration"]["status"] (planning).
- 99_cleanup.sh: env-driven SANDBOX_DIR fallback; accepts either
  gh_repo or repo_full_name in .e2e_state.json.

8 phase run logs (01, 04, 07-12) refreshed with current run's results,
timestamps, sandbox SHAs, and the script-fix findings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
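
The parameterization described for the verify scripts can be sketched roughly as follows. The `EDPA_E2E_CI_MODE` variable, the `/tmp/edpa-e2e-current-run-tag` path, and the 24/14/0 counts come from the commit message; the `EDPA_E2E_SANDBOX` variable name, the `/tmp/<tag>` layout, and the `hybrid` default are assumptions made for illustration only.

```python
import os
from pathlib import Path

RUN_TAG_FILE = Path("/tmp/edpa-e2e-current-run-tag")

# Expected merged-PR counts per CI mode, as listed in the commit message.
EXPECTED_MERGED_PRS = {"real": 24, "hybrid": 14, "synthetic": 0}


def resolve_sandbox(env=None, tag_file=RUN_TAG_FILE) -> Path:
    """Prefer an explicit env override, then fall back to the run-tag file."""
    env = os.environ if env is None else env
    override = env.get("EDPA_E2E_SANDBOX")  # hypothetical variable name
    if override:
        return Path(override)
    if tag_file.exists():
        tag = tag_file.read_text().strip()
        return Path("/tmp") / tag  # assumption: sandbox lives at /tmp/<run tag>
    raise RuntimeError("no sandbox override and no current-run-tag file found")


def expected_merged_prs(env=None) -> int:
    """Pick the expected PR count for the current CI mode."""
    env = os.environ if env is None else env
    mode = env.get("EDPA_E2E_CI_MODE", "hybrid")  # default is an assumption
    return EXPECTED_MERGED_PRS[mode]
```

Resolving these values at call time, instead of hard-coding them, is what keeps the verify scripts from carrying stale constants between runs.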

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jurby jurby deleted the worktree-agent-a960c96b branch June 1, 2026 13:06