Pipeline, eval/checkers, experiments, and execution runner for an agentic OCR + semantic-segmentation system targeting humanities/philosophy libraries.
This is the system-under-test repo of a three-repo topology (see
PLAN.md §11.1); it pins the schema and corpus-generator repos and
records their pins on every eval result.
This README holds pointers only — no claims that can go stale (PLAN §11.2).
| Document | Authority |
|---|---|
| `PLAN.md` | Strategy. Edited only at phase gates. |
| `STATE.md` | What is true now. Read this first. |
| `ledger.md` | Append-only predict→verdict log. |
| `experiments/` | Pre-registrations + results, immutable once verdict-labeled. |
| `docs/prior-findings.md` | Distilled empirical record carried from scholardoc. |
- `eval/`: checker suite + `eval/lib/` scoring core (ported from scholardoc).
- `eval/fixtures/`: JSON eval fixtures (no PDFs ever; PLAN §11.5).
- `tests/`: pytest suite for `eval/`.
- `experiments/E1…E7/`: one pre-registered experiment each (PLAN §9).
- `runner/`: SSH-over-Tailscale + rsync execution skeleton (PLAN §7.1).
- `docs/`: prior findings + ADRs.
```bash
uv sync
uv run pytest
uv run ruff check .
uv run mypy
```

Phase 0 status (apparatus, no models yet) is tracked in `STATE.md`.