v0.10.0
Single-source surname keys + self-checking oracles to prevent comparison asymmetry
The recurring "subtle wrong verdict" bugs (particle surnames, DBLP homonym suffix, venue collapsing) shared one root cause, comparison asymmetry: the BibTeX-entry side and the API-record side reduced the same surname or venue through different normalization, producing false AUTHOR_MISMATCH / HALLUCINATED verdicts. This release makes that asymmetry structurally impossible and adds self-checking oracles.
Changed
PublishedRecord.surname_keys() is now the single source of truth for record-side surname keys, routing each family through the same last_name_from_person the entry side uses. All 7 comparison sites consume it (including FieldFiller and WorkingPaperVerifier, which were still keying the record side raw). The drift-prone _record_surnames helper was removed. No thresholds, weights, or verdict logic changed.
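The idea can be sketched as follows. This is a minimal illustration, not the project's code: the body of last_name_from_person, the shape of the authors field, and the sample names are all assumptions made for the example. Only the names PublishedRecord, surname_keys, and last_name_from_person come from the notes above.

```python
import unicodedata
from dataclasses import dataclass
from typing import Dict, List


def last_name_from_person(person: str) -> str:
    """Illustrative stand-in for the shared normalizer (not the real one)."""
    family = person.split(",", 1)[0]  # "Family, Given" -> "Family"
    tokens = family.split()
    if not tokens:
        return ""
    # Keep the final token, fold diacritics, lowercase.
    folded = unicodedata.normalize("NFKD", tokens[-1])
    return "".join(ch for ch in folded if not unicodedata.combining(ch)).lower()


@dataclass
class PublishedRecord:
    # Assumed shape: one dict per author with a "family" field.
    authors: List[Dict[str, str]]

    def surname_keys(self) -> List[str]:
        # Record side goes through the SAME function as the entry side,
        # so the two sides cannot drift apart.
        return [last_name_from_person(a.get("family", "")) for a in self.authors]


record = PublishedRecord(authors=[{"family": "van der Berg"}, {"family": "Müller"}])
entry_authors = ["Jan van der Berg", "Müller, Anna"]
entry_keys = [last_name_from_person(name) for name in entry_authors]
assert record.surname_keys() == entry_keys
```

Because both sides call one function, any future change to the normalization rules applies to entry and record keys at once, which is the property the removed helper could not guarantee.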
Added
PublishedRecord.canonical_venue: single record-side venue accessor mirroring surname_keys.
Fixed
last_name_from_person strips a trailing 4-digit DBLP homonym suffix ("Sun 0020" → sun) at the key level. This is defense-in-depth alongside the existing ingestion strip, guarded so an all-digits name is never emptied.
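A guarded suffix strip of this kind might look like the sketch below. The helper names strip_dblp_suffix and surname_key and the exact regular expression are hypothetical; the release notes only state the behavior (trailing 4-digit suffix removed, all-digits names left intact).

```python
import re

# Whitespace followed by exactly four digits at the end of the name.
_DBLP_SUFFIX = re.compile(r"\s+\d{4}$")


def strip_dblp_suffix(name: str) -> str:
    """Drop a trailing DBLP homonym suffix such as ' 0020' (hypothetical helper)."""
    cleaned = name.strip()
    stripped = _DBLP_SUFFIX.sub("", cleaned)
    # Guard: never return an empty name. The required whitespace in the
    # pattern already keeps a bare "0020" from matching; this fallback is
    # an extra safety net.
    return stripped if stripped else cleaned


def surname_key(name: str) -> str:
    return strip_dblp_suffix(name).lower()


assert surname_key("Sun 0020") == "sun"
assert surname_key("0020") == "0020"  # all-digits name is left alone
```

Applying the strip at the key level means a suffix that slips past ingestion still cannot cause a mismatch at comparison time.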
Tests
- +54 tests across two new oracles: tests/test_record_roundtrip.py (a record→entry must verify against itself with zero field mismatches and clear MATCH_THRESHOLD) and tests/test_metamorphic_symmetry.py (each past bug stated as an invariance: name-order, diacritics, DBLP suffix, particle placement, score symmetry, sibling-journal non-collapse).
- Full suite: 839 passed, 1 skipped (785 baseline + 54 new, zero regressions).
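For readers unfamiliar with metamorphic testing, a test in that style could be sketched as below. The functions surname_key and author_score are hypothetical stand-ins written for this example (a Jaccard overlap is used as the score), not the project's verifier or its actual test code.

```python
import re
import unicodedata
from typing import List


def surname_key(person: str) -> str:
    """Hypothetical normalizer used only to illustrate the invariances."""
    person = re.sub(r"\s+\d{4}$", "", person.strip())  # drop DBLP suffix
    tokens = person.split(",", 1)[0].split()           # family part, last token
    if not tokens:
        return ""
    folded = unicodedata.normalize("NFKD", tokens[-1])
    return "".join(c for c in folded if not unicodedata.combining(c)).lower()


def author_score(a: List[str], b: List[str]) -> float:
    """Jaccard overlap of surname keys; symmetric by construction."""
    ka, kb = {surname_key(n) for n in a}, {surname_key(n) for n in b}
    if not ka and not kb:
        return 1.0
    return len(ka & kb) / len(ka | kb)


def test_key_is_invariant_under_presentation() -> None:
    # Name order, diacritics, and DBLP suffix must not change the key.
    variants = ["Jan Müller", "Müller, Jan", "Jan Muller", "Jan Müller 0020"]
    assert {surname_key(v) for v in variants} == {"muller"}


def test_score_is_symmetric() -> None:
    a = ["Jan Müller", "Wei Sun 0020"]
    b = ["Sun, Wei", "Ada Lovelace"]
    assert author_score(a, b) == author_score(b, a)


test_key_is_invariant_under_presentation()
test_score_is_symmetric()
```

Each test states a transformation that must not change the outcome, so a regression in any one normalization path shows up as a broken invariant without anyone having to predict the exact wrong verdict.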
Full changelog: v0.9.2...v0.10.0