
feat: Phase 4 capstone - eval summary + final report + demo (Feature/phase4 demo #24)

Merged: AD2000X merged 3 commits into main from feature/phase4-demo on Jun 3, 2026

AD2000X (Owner) commented Jun 3, 2026

Phase 4 - capstone (eval summary + final report + demo)

Aggregates the per-phase evaluation artifacts into one summary, a written report,
and an artifact-backed Gradio demo. Assembles existing metrics - no new research
(GriTS / Ragas / DeepEval are future work).

Contents (12 files)

  • src/phase4_summary.py, scripts/build_phase4_summary.py, tests/test_phase4_summary.py
    • pure summary backbone + inline layout-CSV aggregation; writes
      outputs/evaluation/phase4_summary.json (gitignored) + reports/phase4_metrics.md
      (committed, no-drift)
  • reports/final_report.md, notebooks/07_final_report.ipynb - report (numbers read
    from the generated table)
  • scripts/run_demo.py, notebooks/06_demo.ipynb - key-optional 6-tab Gradio demo
    (BM25 always-on; dense/RRF if embedding stack; answer-gen if OPENROUTER_API_KEY)
  • docs/phase4_brief.md - implementation brief
  • README/DEVLOG/PLAN - Phase 4 status; removed stale "Phase 2 active" README wording
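
The key-optional behaviour described for the demo (BM25 always on, dense/RRF only with the embedding stack, answer generation only with `OPENROUTER_API_KEY`) could be gated roughly as below. This is a minimal sketch, not the code in `scripts/run_demo.py`: the function name `detect_capabilities`, the returned keys, and the `sentence_transformers` module name are assumptions for illustration.

```python
import importlib.util
import os


def detect_capabilities(env=None, embedding_module="sentence_transformers"):
    """Report which optional demo features can be enabled.

    `embedding_module` is a hypothetical stand-in for whatever top-level
    package the embedding stack actually imports.
    """
    env = os.environ if env is None else env
    # find_spec returns None for a missing top-level module without importing it.
    has_embeddings = importlib.util.find_spec(embedding_module) is not None
    # Treat an empty or whitespace-only key as "no key".
    has_key = bool(env.get("OPENROUTER_API_KEY", "").strip())
    return {
        "bm25": True,               # always available
        "dense_rrf": has_embeddings,
        "answer_gen": has_key,
    }
```

Passing `env` explicitly keeps the check testable without touching the real environment; a demo could then hide or disable tabs based on the returned flags.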

Verification

  • Full pytest 246 green (+10)
  • build_phase4_summary reproduces the DEVLOG layout numbers exactly; no-drift on
    reports/phase4_metrics.md
  • Demo imports without gradio; degraded path (no key, no embedding stack) works

Artifact policy: outputs/ stays gitignored; only reports/*.md are committed.

AD2000X added 3 commits June 3, 2026 16:14
The FUNSD-headline print had a literal newline inside an f-string (invalid before Python 3.12); use a separate print() for the blank line.
- scripts/run_demo.py: add gradio_allowed_paths() and pass allowed_paths to
  demo.launch() so the Layout tab serves crop images when outputs live outside
  the repo (e.g. Colab Drive)
- notebooks/07_final_report.ipynb: build the FUNSD headline into a variable
  before print() (clearer; identical output)
@AD2000X AD2000X merged commit c363cb1 into main Jun 3, 2026