feat: add methodology and evaluation pages for EDPA website #4
Merged
Add the complete web/ Astro project scaffold (Layout, Header, Footer, global.css with dark theme) plus two content pages:
- methodology.astro: Full EDPA v2.2 specification with all 13 sections (summary, terminology, architecture, hierarchy, model, dual-view, cadence, learning loop, GitHub implementation, reporting, risks, comparison, implementation plan)
- evaluation.astro: Tests & evaluation showcase with 10/10 invariant test cards, auto-calibration (Karpathy loop) documentation, CW heuristics visualization, method comparison, and a worked demo calculation verifying the mathematical guarantee

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jurby added a commit that referenced this pull request on May 27, 2026
* fix(backlog): extend `backlog add --type` choices to all 7 types
The MCP server's `_handle_item_create` accepts all 7 types via TYPE_DIRS
(Initiative, Epic, Feature, Story, Defect, Event, Risk), but the CLI
argparse rejected Defect/Event/Risk. Users had to write YAML by hand and
bump id_counters.yaml manually to seed those types — exactly what Wave B
Unit 7 of the V2 E2E run had to work around.
Extends `choices=[...]` to mirror TYPE_DIRS and clarifies `--parent`
help text (Initiative/Defect/Event/Risk have parent=None per PARENT_RULES).
No behavior change in cmd_add itself — it already delegates to the MCP
handler which is type-agnostic.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
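The `--type` fix above amounts to deriving the argparse choices from the same mapping the handler uses. A minimal sketch, assuming `TYPE_DIRS` is a dict keyed by the seven type names; the directory values and the `build_parser` helper are hypothetical and not taken from the project's code:

```python
import argparse

# Hypothetical stand-in for the MCP server's TYPE_DIRS. Only the seven keys
# come from the commit message; the directory names are assumed.
TYPE_DIRS = {
    "Initiative": "initiatives",
    "Epic": "epics",
    "Feature": "features",
    "Story": "stories",
    "Defect": "defects",
    "Event": "events",
    "Risk": "risks",
}


def build_parser() -> argparse.ArgumentParser:
    """Build a reduced `backlog add` parser (illustrative only)."""
    parser = argparse.ArgumentParser(prog="backlog")
    sub = parser.add_subparsers(dest="command", required=True)
    add = sub.add_parser("add")
    # Derive choices from TYPE_DIRS so the CLI cannot drift from the handler.
    add.add_argument("--type", required=True, choices=sorted(TYPE_DIRS))
    add.add_argument(
        "--parent",
        default=None,
        help="Parent item ID (Initiative/Defect/Event/Risk take no parent)",
    )
    return parser
```

With choices derived this way, `--type Risk` parses, while an unknown type still exits with argparse's usual usage error.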
* fix(close-iteration): make Stage 2b (refresh contributors[]) explicitly mandatory
Wave B Unit 10 of the 2026-05 V2 E2E run had to add a one-shot
`detect_contributors.py --all-items` step before the engine produced
correct output — without it, engine read empty `contributors[]` blocks
(only `evidence[]` had been materialized by `sync_pr_contributions.py`)
and silently returned 0h derived per item. No error, just zeros.
The Stage 2b text already described the step but read as a conditional
("if evidence has accumulated"). Rewrites it as REQUIRED with an
explicit warning about the silent 0h failure mode and a closing
"Always run, never skip." line.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
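The silent 0h failure described above could also be made loud with a pre-engine guard. This is a sketch only and not part of this change: the item schema (`id`, `evidence`, `contributors` keys) and both function names are assumptions for illustration.

```python
def stale_items(items):
    """Return IDs of items that have evidence[] but an empty contributors[]."""
    return [
        item["id"]
        for item in items
        if item.get("evidence") and not item.get("contributors")
    ]


def require_contributors(items):
    """Raise instead of letting the engine derive 0h for stale items."""
    stale = stale_items(items)
    if stale:
        raise RuntimeError(
            "contributors[] not refreshed for: " + ", ".join(stale)
        )
```

Items with neither evidence nor contributors pass the check, since there is nothing to derive hours from.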
* docs(mcp): document single-project scope limitation
MCP server is a persistent process with a fixed cwd at launch. Subsequent
`cd` calls from tool invocations don't change it. Tools resolve `.edpa/`
from that fixed location regardless of where the assistant just navigated.
The 2026-05 V2 E2E run hit this directly — workers driving a sandbox
under /tmp couldn't reach it via MCP and had to fall back to direct
script invocation. Documents the constraint, lists workarounds
(EDPA_ROOT pre-launch, client restart, direct script CLI), and notes
that multi-project routing is intentionally unimplemented.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
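The scope limitation above follows from resolving `.edpa/` once, from a location fixed at launch. A minimal sketch of that lookup, assuming `EDPA_ROOT` names the project directory that contains `.edpa/`; the function name and exact semantics are hypothetical:

```python
import os
from pathlib import Path


def resolve_edpa_dir(env=None, cwd=None) -> Path:
    """Pick the .edpa/ directory from EDPA_ROOT, else from the launch cwd."""
    env = os.environ if env is None else env
    root = env.get("EDPA_ROOT")
    # Assumption: EDPA_ROOT, when set before launch, overrides the cwd.
    base = Path(root) if root else Path(cwd or os.getcwd())
    return base / ".edpa"
```

Because a server would call something like this once at startup, a later `cd` inside a tool invocation has no effect on the result.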
* fix(e2e fixture): use portfolio gate ladder for Initiative/Epic transitions
work_plan.yaml had Initiative I-1 and Epics E-1/E-2 transitioning to
status=Validating at end-of-PI. That's invalid: validate_syntax.py's
PORTFOLIO_STATUSES set doesn't include Validating — it's a delivery-only
gate (Feature/Story/Defect). cw_heuristics.yaml.tmpl's gate_weights
mirror this: Initiative + Epic go Ready -> Implementing -> Done with no
QA step.
Switches I/E transitions to Implementing (portfolio level: MVP in
build). Adds a header comment in gate_transitions explaining both
ladders so future fixture authors don't repeat the mistake.
Surfaced by Wave C Unit 13 of the 2026-05 V2 E2E run — backlog.py
validate exited 1 because I-1/E-1/E-2 carried invalid status.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
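The two ladders above can be expressed as a per-type status check. A minimal sketch, assuming status sets reduced to the gates named in this commit message; the real sets in validate_syntax.py may contain more statuses, and `invalid_items` is a hypothetical helper:

```python
# Assumed, reduced status sets: only the gates named in the commit message.
PORTFOLIO_STATUSES = {"Ready", "Implementing", "Done"}
DELIVERY_STATUSES = PORTFOLIO_STATUSES | {"Validating"}

# Validating is a delivery-only gate (Feature/Story/Defect).
ALLOWED_STATUSES = {
    "Initiative": PORTFOLIO_STATUSES,
    "Epic": PORTFOLIO_STATUSES,
    "Feature": DELIVERY_STATUSES,
    "Story": DELIVERY_STATUSES,
    "Defect": DELIVERY_STATUSES,
}


def invalid_items(items):
    """Return IDs of items whose status is not on their type's ladder."""
    return [
        item["id"]
        for item in items
        if item["status"] not in ALLOWED_STATUSES[item["type"]]
    ]
```

Under these assumptions an Initiative at Validating is flagged, while a Story at Validating is accepted.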
* docs(e2e): add Skill-tool subagent gotcha + cross-link fixes
Wave B Unit 6 of the 2026-05 V2 E2E run discovered that the `Skill` tool
returns the SKILL.md content (an instructions template), not the
side-effects of running the skill. Subagents driving EDPA programmatically
must follow the returned instructions as if they were a user prompt; they
cannot treat the response as documentation.
Adds this as limitation #4. Cross-links the MCP single-project doc.
Updates the "Findings from initial run" entries for gate divergence,
CI gap, and backlog.py limit to point at their fixes in this PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(e2e): hybrid E2E run + parameterize verify/cleanup scripts
Full 2 PI × 5 iter run on fix/e2e-v2-findings against fresh sandbox
technomaton/edpa-e2e-20260527-181051-2c56a6a0 (now archived).
Verifies all three fix-branch fixes hold end-to-end:
- c1cbbc2: backlog.py add --type accepts all 7 types (Defect/Event/Risk
went through CLI path in Wave B Unit 7; no manual YAML fallback needed).
- 85cd439: close-iteration Stage 2b (detect_contributors.py --all-items)
produces correct derived hours on the first engine pass — 0 _rev2
snapshots (previous run had 2 due to discovery-time recompute).
- 3cb8ff1: Initiative/Epic gate transitions use portfolio ladder
(Implementing, not Validating). backlog.py validate now reports 0
errors (previously had 3).
Wave structure (per /batch):
- Wave B sequential (Units 6-10): install → seed → PI-1 (real CI) →
PI-2 (synthetic) → close + engine + reports
- Wave C parallel (Units 11-13): invariants, reports, backlog state
- Wave D: cleanup (archive + rm sandbox) + this commit
Results: 14/14 PRs merged, 14/14 CI workflow runs success, 10/10
iterations closed with all_invariants_passed=true on first engine pass,
10 frozen snapshots, 50 per-person timesheets, 33 backlog items in
expected end-states.
Script fixes uncovered during Wave C (verify scripts had stale
hard-coded constants from the previous run):
- 10_verify_invariants.py: SANDBOX hard-coded → env-driven
_resolve_sandbox() with /tmp/edpa-e2e-current-run-tag fallback.
- 11_verify_reports.py: DEFAULT_SANDBOX/REPO derived from
current-run-tag; EXPECTED_MERGED_PRS now CI-mode-aware (24 real /
14 hybrid / 0 synthetic) via EDPA_E2E_CI_MODE env var.
- 12_verify_backlog.py: EXPECTED_COUNTS updated for post-3cb8ff1
portfolio ladder (Initiative/Epic at Implementing). Fixed iteration
status lookup to read root data["status"] (lifecycle) instead of
data["iteration"]["status"] (planning).
- 99_cleanup.sh: env-driven SANDBOX_DIR fallback; accepts either
gh_repo or repo_full_name in .e2e_state.json.
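The env-driven pattern in the script fixes above can be sketched as follows. Only `/tmp/edpa-e2e-current-run-tag`, `EDPA_E2E_CI_MODE`, and the 24/14/0 counts come from the commit message; the `EDPA_E2E_SANDBOX` variable, the mode strings, the default mode, and the `/tmp/<tag>` layout are assumptions for illustration:

```python
import os
from pathlib import Path

RUN_TAG_FILE = Path("/tmp/edpa-e2e-current-run-tag")

# Counts from the commit message; the mode strings are assumed spellings.
EXPECTED_MERGED_PRS_BY_MODE = {"real": 24, "hybrid": 14, "synthetic": 0}


def resolve_sandbox(env=None, tag_file=RUN_TAG_FILE) -> Path:
    """Prefer an explicit env override, else fall back to the run-tag file."""
    env = os.environ if env is None else env
    explicit = env.get("EDPA_E2E_SANDBOX")  # hypothetical variable name
    if explicit:
        return Path(explicit)
    tag = Path(tag_file).read_text().strip()
    return Path("/tmp") / tag  # assumed sandbox layout


def expected_merged_prs(env=None) -> int:
    """Look up the expected merged-PR count for the current CI mode."""
    env = os.environ if env is None else env
    mode = env.get("EDPA_E2E_CI_MODE", "hybrid")  # default is an assumption
    return EXPECTED_MERGED_PRS_BY_MODE[mode]
```

Reading these values at call time rather than hard-coding them is what keeps the verify scripts from carrying stale constants between runs.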
8 phase run logs (01, 04, 07-12) refreshed with current run's results,
timestamps, sandbox SHAs, and the script-fix findings.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- web/) with Layout, Header, Footer, global.css dark theme, and project config

Test plan
- `cd web && npm install && npm run build` succeeds
- `dist/methodology/index.html` exists with all 13 sections
- `dist/evaluation/index.html` exists with 10/10 test PASS badges
- `docs/methodology-cs.md`, portal HTML, `tests/test_invariants.py`, `docs/auto-calibration.md`, and `docs/evidence-detection.md`

Generated with Claude Code