Add manual verifier UI with jobs.db persistence and request-o-matic check #53
Merged
Adds a serverless verifier UI that shows each row's cropped image strip next to its model-detected text in editable fields. Three new pieces wired together:
- scripts/make_verifier_bundle.py: pre-processor that turns PageResult JSON + page PNG into a bundle.json with per-quadrant and per-row pixel bboxes. Continuation-merged entries get a physical-row span so their crops cover the wrapped lines instead of nudging subsequent rows out of alignment; double_height entries get span=2 inherently.
- scripts/derive_truth.py: turns verified.json into tests/golden/<stem>.truth.json by extracting short uppercased substrings (whitespace-tokenized date, 4-char jock prefix, 24-char artist portion via parse_artist_track). Substring rules live in Python so they're testable in one place.
- verifier/: static HTML/JS/CSS SPA (no build step). Loads a bundle via ?bundle= URL param or file picker, canvas-crops each row, lets the user edit raw_text / type_raw / notes / hour_raw / jock_raw / page meta, mark hallucinations (x), and add missed rows (+). Export emits two files: <stem>.verified.json (PageResult-shaped, plugs back into the pipeline) and <stem>.corrections.json (delta vs the immutable bundle snapshot: page/quadrant/row corrections, added_rows, deleted_rows).
Also lifts a public partition_row_lines_by_quadrant helper out of core/page_layout's private detection internals: same row-line detector, partitioned per quadrant by body_mid_y and column-side ink density.
Bundle layout: data/verifier/<stem>.bundle.json with image_path computed as os.path.relpath to data/pages/<rel-pdf>/<stem>.png so bundles are portable. SCHEMA_VERSION = 1 hardcoded; UI rejects unknown versions.
465 tests pass, ruff/mypy clean.
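The portability point can be illustrated with a minimal sketch, assuming image_path is stored relative to the directory that holds the bundle. The function name and the box1 directory are hypothetical and are not taken from scripts/make_verifier_bundle.py.

```python
import os


def bundle_image_path(bundle_path: str, png_path: str) -> str:
    """Return png_path expressed relative to the bundle's own directory.

    Hypothetical helper: the real script may compute the path differently.
    """
    # Fall back to the current directory when the bundle path has no parent.
    bundle_dir = os.path.dirname(bundle_path) or "."
    return os.path.relpath(png_path, start=bundle_dir)


if __name__ == "__main__":
    # "box1" stands in for <rel-pdf>; it is an illustrative value only.
    print(
        bundle_image_path(
            "data/verifier/page25.bundle.json",
            "data/pages/box1/page25.png",
        )
    )
```

Because the stored path never mentions an absolute location, moving the data/ tree as a whole keeps the bundle's image reference valid.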
…fixes
DB-backed Save replaces file-download Export. The UI now POSTs to /api/save (verifier/serve.py, a small FastAPI server that also same-origin-proxies request-o-matic and serves the static SPA + data + tests dirs). Save writes <stem>.verified.json and <stem>.corrections.json into data/verifier/ and, when the bundle carries pdf_path + page_number, records the verification in jobs.db via the new JobStore.mark_verified method.
jobs.db gains verified_at, verified_path, corrections_path columns plus a partial index on verified_at; init() runs ALTER TABLE for legacy DBs so existing data is preserved.
Bundles bump to SCHEMA_VERSION=2 with optional pdf_path/page_number auto-detected when the result JSON lives under data/results/<rel-pdf>/page-NN.json (null for test fixtures, where Save falls back to file-only persistence).
Check artists button looks each row up via request-o-matic's /request endpoint through the same-origin proxy. Per-row badges show resolved artist + matched release with three gating signals stacked on top of the raw confidence:
- postdates (release_year > the year parsed from bundle.page_date_raw)
- artist-only fallback (search_type=song_as_artist or song_not_found=true: request-o-matic found the artist but not the played track, and the release shown is one of theirs picked arbitrarily)
- disjoint-artist tokens (no shared tokens after stop-word and trailing-s normalization; catches request-o-matic fuzzy-matching on a track word and returning an unrelated artist, e.g. "Pure Joy - Pieces" -> "Coldcut - More Beats & Pieces")
The badge labels resolved release names with "album:" or "sample release:" prefix so they can't be mistaken for a track-level match (the flowsheet records artist - track, but the library matches at release level).
partition_row_lines_by_quadrant gets a correction pass for the bottom-block hour-jock-cell baseline. _detect_body_mid_y's gap-by-anchor heuristic sometimes lands body_mid_y BELOW the bottom block's hour-jock baseline, which misattributes that line to the top quadrant, shifting every bottom-quadrant row crop up by one (a quadrant's row 0 crop showed row 1 content). Fix: when the top quadrant's last spacing exceeds 1.3x the median row spacing, that line moves to the corresponding bottom quadrant.
_merge_with_spans propagates notes="double_height" to entries that absorbed a continuation, so the notes dropdown reflects multi-physical-row entries instead of showing (none).
UI polish: page-view side panel opens by default on bundle load (verifiers need the full-page reference); notes select shows "(none)" instead of blank and the row gets a tinted background when notes is non-null; type_raw is a free-text input matching the schema's str | None (covers doodles like "hand-drawn smiley" and compound values like "O/std"); each row stacks crop above editable field so the layout reflows cleanly when the page-view panel is open.
474 tests, ruff/mypy clean.
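A minimal sketch of the reattribution rule, assuming each quadrant's row lines are a sorted list of y coordinates and that "median row spacing" means the median gap within the top quadrant. The function name and input shape are illustrative; the real partition_row_lines_by_quadrant may measure spacing differently.

```python
from statistics import median
from typing import List, Tuple

# Threshold quoted in the commit message above.
REATTRIBUTION_RATIO = 1.3


def reattribute_trailing_line(
    top: List[int], bottom: List[int], ratio: float = REATTRIBUTION_RATIO
) -> Tuple[List[int], List[int]]:
    """Move the top quadrant's last row line to the bottom quadrant when the
    gap in front of it is unusually large (hypothetical helper)."""
    if len(top) < 3:
        # Fewer than two gaps: no meaningful median to compare against.
        return list(top), list(bottom)
    gaps = [later - earlier for earlier, later in zip(top, top[1:])]
    if gaps[-1] > ratio * median(gaps):
        # The trailing line is really the bottom block's baseline.
        return list(top[:-1]), [top[-1], *bottom]
    return list(top), list(bottom)


if __name__ == "__main__":
    # Rows 40 px apart, then an 80 px jump before the last "top" line.
    print(reattribute_trailing_line([100, 140, 180, 220, 300], [340, 380]))
```

Evenly spaced input is returned unchanged, so the pass only fires on pages where the mid-line estimate landed too low.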
Pre-PR review caught three issues. Fixed:
- verifier/README.md bundle-schema example showed schema_version=1 — code requires 2, and the UI rejects any other value. Updated the example to v2, added the new pdf_path and page_number fields, and revised the version-history paragraph.
- README documented a verified_rows key in corrections.json plus a per-row "verified" checkbox with auto-set semantics that were removed in an earlier UX pass. The actual buildCorrectionsExport emits {page_corrections, quadrant_corrections, row_corrections, added_rows, deleted_rows} with no verified_rows. Stripped the stale docs.
- Added tests/unit/test_verifier_serve.py exercising /api/save: missing-field rejection, PageResult validation rejection, path-traversal guard on stem, both files written on success, overwrite semantics, jobs.db updated when a row matches, db_updated=false when no row matches, db_updated=false when jobs.db doesn't exist. Uses httpx ASGITransport for in-process testing, no live server.
Drive-by: response paths now use relative_to(DATA_ROOT.parent) instead of relative_to(REPO_ROOT) so the response works whether DATA_ROOT lives under the repo (production) or a tmp dir (tests).
482 tests, ruff/mypy clean.
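For illustration, a stem guard of the kind the path-traversal test exercises could look like the sketch below. safe_stem is a hypothetical stand-in, not the guard in verifier/serve.py, and its exact rejection rules are assumptions.

```python
from typing import Optional


def safe_stem(stem: object) -> Optional[str]:
    """Return the stem if it is safe to use in a filename, else None.

    Assumed rules: must be a non-blank string with no path separators,
    no NUL byte, and must not be a bare "." or "..".
    """
    if not isinstance(stem, str) or not stem.strip():
        return None
    if "/" in stem or "\\" in stem or "\x00" in stem:
        return None
    if stem in {".", ".."}:
        return None
    return stem


if __name__ == "__main__":
    for candidate in ("page25", "../etc/passwd", "   ", ""):
        print(repr(candidate), "->", safe_stem(candidate))
```

Rejecting separators outright, rather than normalizing the path afterwards, keeps the check independent of where the data directory lives.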
…ic writes
Addresses code review feedback on #53.
HIGH:
- /api/save now writes the Pydantic-validated round-trip (validated.model_dump_json) instead of the raw client dict. A client that leaks bundle-only fields (schema_version, stem, image_path, per-quadrant bbox) no longer pollutes the on-disk verified.json: Pydantic's default extra='ignore' strips them. The on-disk file becomes a canonical representation that bit-matches what core/pipeline.py writes. New test exercises this with deliberately-polluted input.
- /data static mount now honors DATA_ROOT (matching the write side), so a moved DATA_ROOT doesn't cause the UI to read from REPO_ROOT/data while the save endpoint writes elsewhere. StaticFiles uses check_dir=False so an empty fresh DATA_ROOT doesn't blow up at server start.
MEDIUM:
- Atomic file writes via .tmp + os.replace. A failed second write or process kill leaves either both files at their pre-save state or both at the new state, never a half-updated state where verified.json reflects the edit but corrections.json doesn't. New test asserts no .tmp files leak after successful saves.
- page_number now rejects bool. isinstance(x, int) is True for bool in Python, so a malformed page_number: true previously coerced to 1 and looked up the wrong job row. Defensive `not isinstance(page_number, bool)` guard; new test confirms files still write but db_updated stays False on bool input.
- JobStore.init() result cached per (db_path, process). _open_jobs_store re-checks is_file() each call so a DB created mid-session is picked up without restart, but the migration round trip only runs once.
LOW:
- _safe_stem rejects whitespace-only stems (would produce confusing " .verified.json" files). Test extended with empty-string, all-spaces, tab.
- 1.3 magic threshold in partition_row_lines_by_quadrant promoted to module-level _BOTTOM_BASELINE_REATTRIBUTION_RATIO alongside the other detection constants.
- app.js header comment refreshed: references Save not Export, drops the removed _verified flag.
485 tests pass; ruff/mypy clean.
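The .tmp + os.replace pattern can be sketched as follows, assuming both files are staged before either is swapped in. The function name and signature are hypothetical, not the code in verifier/serve.py. Each os.replace is atomic on its own; staging everything first narrows the window between the two swaps but does not make the pair a single transaction.

```python
import os
from pathlib import Path
from typing import Mapping


def write_files_atomically(files: Mapping[Path, str]) -> None:
    """Stage every file as <name>.tmp, then swap each into place.

    Hypothetical helper illustrating the pattern described above.
    """
    staged = []
    try:
        # Phase 1: write all temp files. A failure here touches no target.
        for target, text in files.items():
            tmp = target.with_name(target.name + ".tmp")
            staged.append((tmp, target))
            tmp.write_text(text, encoding="utf-8")
        # Phase 2: swap each temp file over its target.
        for tmp, target in staged:
            os.replace(tmp, target)
    finally:
        # Remove leftovers so no .tmp files leak after success or failure.
        for tmp, _ in staged:
            tmp.unlink(missing_ok=True)
```

A caller would pass both output paths in one mapping, for example the verified and corrections files for a single stem, so that a failed write of the second file leaves the first untouched.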
CI install path (.[dev]) didn't pick these up — they were transitive in my local venv but not declared as project dependencies. Result: ModuleNotFoundError on `import uvicorn` at import-time in verifier/serve.py, cascading to every tests/unit/test_verifier_serve.py case. These are runtime deps of the verifier feature now that verifier/ is in the repo: fastapi + uvicorn run the dev server, httpx powers the /api/lookup proxy and httpx.ASGITransport in the test suite. Putting them under main `dependencies` mirrors how library-metadata-lookup declares the same trio. Versions pinned to floors compatible with current pins of pydantic v2 and starlette.
jakebromberg added a commit that referenced this pull request May 12, 2026
Summary
- `verifier/` for row-by-row manual verification of pipeline output; each row shows a cropped image strip beside an editable text field, with mark-as-deleted, add-row, and per-page meta editing.
- `POST /api/save` persists the session: writes `<stem>.verified.json` and `<stem>.corrections.json` to `data/verifier/`, and (when the bundle carries `pdf_path`/`page_number`) records the verification in `jobs.db` via the new `JobStore.mark_verified` method. Schema migration adds `verified_at`, `verified_path`, `corrections_path` columns with an idempotent `ALTER TABLE` pass.
- `request-o-matic /request` (proxied same-origin). Per-row badges layer three gating signals on top of confidence: `postdates` (release year > page year), `artist-only` fallback (`search_type=song_as_artist` or `song_not_found=true`), and `different artist` (zero shared tokens between input and resolved artist). Badge labels distinguish `album:` / `sample release:` from the flowsheet's `Artist - Track` shape.
- `partition_row_lines_by_quadrant` gains a correction pass for the bottom-block hour-jock-cell baseline (`_detect_body_mid_y` sometimes lands `body_mid_y` below it, misattributing the line to the top quadrant and shifting bottom-quadrant row crops up by one).
- `_merge_with_spans` propagates `notes="double_height"` to continuation-merged entries.
- `scripts/derive_truth.py` produces `tests/golden/<stem>.truth.json` from a verified `PageResult` with substring rules pinned by parametrized tests.
Closes #52.
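The `different artist` signal can be sketched in Python as below, assuming a small stop-word list and a plain "strip one trailing s" rule. Both are assumptions, as are the function names; the badge logic itself lives in the SPA and may normalize tokens differently.

```python
import re
from typing import Set

# Assumed stop words; the real list is not shown in this PR text.
STOP_WORDS = frozenset({"the", "a", "an", "and", "of"})


def artist_tokens(name: str) -> Set[str]:
    """Lowercase, split on non-alphanumerics, drop stop words,
    and strip a single trailing 's' from each remaining token."""
    tokens = set()
    for word in re.findall(r"[a-z0-9]+", name.lower()):
        if word in STOP_WORDS:
            continue
        if len(word) > 1 and word.endswith("s"):
            word = word[:-1]
        tokens.add(word)
    return tokens


def is_different_artist(played: str, resolved: str) -> bool:
    """True when both names have tokens and none are shared."""
    a, b = artist_tokens(played), artist_tokens(resolved)
    return bool(a) and bool(b) and a.isdisjoint(b)


if __name__ == "__main__":
    print(is_different_artist("Pure Joy", "Coldcut"))
    print(is_different_artist("Beastie Boy", "Beastie Boys"))
```

Under these assumptions a near-miss such as Little John against Little Joy still shares a token, so it would be left to the other signals rather than flagged here.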
Test plan
- `pytest -q`: 482 pass, ruff/mypy clean.
- `scripts/make_verifier_bundle.py`; load each in the SPA via `python verifier/serve.py`; confirm row crops align (the page25 bottom_left row 0 regression was the original symptom), `notes="double_height"` pre-selects on continuation-merged rows, page-view panel opens by default.
- `POST /api/save` round-trip: file lands in `data/verifier/`; `db_updated: false` for test goldens (no job row), payload validates as `PageResult` server-side before write.
- `Beatnigs` lands strong (model right, library hit); `Pure Joy → Coldcut` flags `⚠ different artist`; `Beastie Boy` flags `⚠ artist-only` + `⚠ postdates`; `Little John → Little Joy` flags `⚠ postdates`.
- `derive_truth` on a generated verified.json parses back via `GoldenTruth.load`.
Notes for review
- `SCHEMA_VERSION` is 2; the UI rejects unknown versions. Re-generate any in-flight bundles after merge.
- `data/jobs.db` migration is idempotent: first `init()` against a pre-verification DB picks up the columns; no manual intervention.
- `python verifier/serve.py` (FastAPI), not bare `python -m http.server`. The `Check artists` button is the load-bearing reason: request-o-matic doesn't emit CORS.
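An idempotent migration of the kind described in the notes can be sketched with the standard sqlite3 module. The table name jobs, the TEXT column type, the index name, and the function name are all assumptions; JobStore.init() may be implemented differently.

```python
import sqlite3
from typing import List

VERIFICATION_COLUMNS = ("verified_at", "verified_path", "corrections_path")


def add_verification_columns(
    conn: sqlite3.Connection, table: str = "jobs"
) -> List[str]:
    """Add any missing verification columns and return the ones added.

    Safe to call repeatedly: existing columns are detected and skipped.
    """
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for column in VERIFICATION_COLUMNS:
        if column not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} TEXT")
            added.append(column)
    # Partial index: only rows that have actually been verified are indexed.
    conn.execute(
        f"CREATE INDEX IF NOT EXISTS idx_{table}_verified_at "
        f"ON {table}(verified_at) WHERE verified_at IS NOT NULL"
    )
    conn.commit()
    return added


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (pdf_path TEXT, page_number INTEGER)")
    print(add_verification_columns(conn))  # first run adds the columns
    print(add_verification_columns(conn))  # second run adds nothing
```

Checking the existing columns first is what lets a legacy database and a freshly created one go through the same code path without errors.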