v0.9.0 — Cascading source verification + non-generative-AI mode
Ports two recent citation-verification papers to bibtex-check:
- CheckIfExist (Abbonato 2026, arXiv:2602.15871) Algorithm 1 — cascading multi-source verification
- HalluCiteChecker (Sakai et al. 2026, arXiv:2604.26835) — venue-policy-compliant non-generative-AI flag
Highlights
Cascading verification (--cascade)
Explicit CrossRef → Semantic Scholar → OpenAlex order with high-confidence short-circuit. New OpenAlexClient adds a fourth source. Top-K candidate retrieval (--top-k N, default 3) re-ranks per source by RapidFuzz Levenshtein title similarity before expensive cross-checks.
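As a rough illustration of the cascade, the sketch below walks the sources in order, re-ranks each source's candidates by title similarity, keeps the top K, and stops as soon as one match clears a high-confidence threshold. It is not the bibtex-check implementation: `cascade_verify`, `rerank`, the `(name, search_fn)` source shape, and the 95.0 threshold are all hypothetical, and `difflib.SequenceMatcher` from the standard library stands in for the RapidFuzz scorer so the example runs without extra dependencies.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Optional, Tuple

Candidate = Dict[str, str]
Source = Tuple[str, Callable[[str], List[Candidate]]]


def title_similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score (stdlib stand-in for RapidFuzz)."""
    return 100.0 * SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def rerank(query_title: str, candidates: List[Candidate], top_k: int = 3):
    """Score every candidate by title similarity and keep the best top_k."""
    scored = [(title_similarity(query_title, c["title"]), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def cascade_verify(
    query_title: str,
    sources: List[Source],
    top_k: int = 3,
    high_confidence: float = 95.0,  # assumed threshold, not taken from the release
) -> Dict[str, object]:
    """Query sources in order; stop early once a match is good enough."""
    consulted: List[str] = []
    best: Optional[Tuple[float, str, Candidate]] = None
    for name, search in sources:
        consulted.append(name)
        ranked = rerank(query_title, search(query_title), top_k)
        if ranked and (best is None or ranked[0][0] > best[0]):
            best = (ranked[0][0], name, ranked[0][1])
        if best is not None and best[0] >= high_confidence:
            break  # short-circuit: later sources are never contacted
    return {"consulted": consulted, "best": best}
```

Passing the sources as an ordered list keeps the CrossRef → Semantic Scholar → OpenAlex order explicit at the call site, and the early `break` is what saves the later network calls when the first source already returns a near-exact title.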
Cross-source author intersection
cross_source_author_intersection() validates author family names across sources: confirmed = ∩ of normalized names, suspect = union \ confirmed. Multi-source bonus β_ms ∈ [0, 10] when ≥2 sources confirm. Catches swapped_authors / chimeric citation hallucinations that single-source verification misses.
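The set logic described above can be sketched as follows. This is a minimal illustration under assumptions, not the body of `cross_source_author_intersection()`: the function name `author_intersection_sketch`, the accent-stripping normalization, and the input shape (source name mapped to a list of family names) are hypothetical, and the multi-source bonus β_ms is left out.

```python
import unicodedata
from typing import Dict, List, Set


def normalize_family_name(name: str) -> str:
    """Strip accents, case, and punctuation (assumed normalization rules)."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return "".join(ch for ch in without_marks.casefold() if ch.isalnum())


def author_intersection_sketch(per_source: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """confirmed = names present in every source; suspect = union minus confirmed."""
    name_sets = [
        {normalize_family_name(n) for n in names}
        for names in per_source.values()
        if names  # a source that returned no authors does not veto the others
    ]
    if not name_sets:
        return {"confirmed": set(), "suspect": set()}
    confirmed = set.intersection(*name_sets)
    suspect = set.union(*name_sets) - confirmed
    return {"confirmed": confirmed, "suspect": suspect}
```

For example, if one source lists Müller and Sakai while another lists Muller and Smith, only `muller` is confirmed, and `sakai` and `smith` are flagged as suspect. That disagreement pattern is what a swapped-author or chimeric citation tends to produce.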
Numeric confidence_score (0–100)
Additive in JSONL output. Two formulas:
- Case A, asymmetric (high-title-low-author chimeric detector):
S_title − 0.5 × (100 − S_author)
- Case B: average + bonus
- Explicit penalty constants at module level:
PENALTY_TITLE_MISMATCH = PENALTY_AUTHOR_MISMATCH = 20, PENALTY_JOURNAL_MISMATCH = 15, fabricated-author −10, capped at −20
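To make the two formulas concrete, here is a small sketch of the arithmetic. Several details are assumptions rather than facts from this release: Case B is read as the mean of the title and author scores plus the bonus, the result is clamped to the stated 0–100 range, and the function names are hypothetical. The rule that chooses between Case A and Case B, and how the penalty constants are applied, are not modelled.

```python
def clamp_score(value: float) -> float:
    """Keep a score inside the documented 0-100 range (assumed clamping)."""
    return max(0.0, min(100.0, value))


def case_a_score(s_title: float, s_author: float) -> float:
    """Asymmetric form: S_title - 0.5 * (100 - S_author)."""
    return clamp_score(s_title - 0.5 * (100.0 - s_author))


def case_b_score(s_title: float, s_author: float, bonus: float = 0.0) -> float:
    """Average-plus-bonus form; averaging title and author is an assumption."""
    return clamp_score((s_title + s_author) / 2.0 + bonus)
```

With a near-perfect title match and a weak author match, `case_a_score(95, 40)` gives 95 − 0.5 × 60 = 65, so a citation that borrows a real title but carries the wrong authors is pulled well below its title score. With the same reading of Case B, `case_b_score(90, 80, bonus=5)` gives 85 + 5 = 90.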
Rich VerificationResult
New struct with similarity_breakdown, confirmed_authors, suspect_authors, sources_consulted, sources_confirmed, issues, matched_metadata. Built via build_verification_result() from a classic FactCheckResult — purely additive, no schema break.
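One way to picture the additive design is a dataclass whose fields are merged into an existing record without overwriting any key. The field names below come from this release; the field types, the class name `VerificationResultSketch`, the helper `to_jsonl_record`, and the `key` entry in the example record are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class VerificationResultSketch:
    # Field names follow the release notes; the types are assumed.
    similarity_breakdown: Dict[str, float] = field(default_factory=dict)
    confirmed_authors: List[str] = field(default_factory=list)
    suspect_authors: List[str] = field(default_factory=list)
    sources_consulted: List[str] = field(default_factory=list)
    sources_confirmed: List[str] = field(default_factory=list)
    issues: List[str] = field(default_factory=list)
    matched_metadata: Dict[str, str] = field(default_factory=dict)


def to_jsonl_record(legacy: Dict[str, object], result: VerificationResultSketch) -> str:
    """Merge new fields into a legacy record without touching existing keys."""
    record = dict(legacy)
    for key, value in asdict(result).items():
        record.setdefault(key, value)  # existing keys always win
    return json.dumps(record, sort_keys=True)
```

Because `setdefault` never replaces a key that is already present, a consumer that only reads the old keys sees exactly the same values as before.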
Non-generative-AI mode
--non-generative CLI flag and BIBTEX_CHECK_NON_GENERATIVE=1 env var refuse to load any LLM backend at runtime. Forward-compat guard for ACL ARR and ICML 2026 LLM-in-review policy compliance.
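A guard of this kind can be sketched as a check that runs before any backend is loaded. Only the `BIBTEX_CHECK_NON_GENERATIVE=1` variable and the `--non-generative` flag come from this release; the exception class, the function names, the injectable `environ` parameter, and the placeholder return value are hypothetical.

```python
import os
from typing import Mapping, Optional


class NonGenerativeModeError(RuntimeError):
    """Raised when an LLM backend is requested in non-generative mode."""


def non_generative_enabled(
    cli_flag: bool = False, environ: Optional[Mapping[str, str]] = None
) -> bool:
    """True if either the CLI flag or the environment variable is set."""
    env = os.environ if environ is None else environ
    return cli_flag or env.get("BIBTEX_CHECK_NON_GENERATIVE") == "1"


def load_llm_backend_sketch(
    name: str, cli_flag: bool = False, environ: Optional[Mapping[str, str]] = None
) -> str:
    """Refuse before doing any work if non-generative mode is active."""
    if non_generative_enabled(cli_flag, environ):
        raise NonGenerativeModeError(
            f"refusing to load LLM backend {name!r}: non-generative mode is on"
        )
    return f"backend:{name}"  # placeholder for a real backend object
```

Placing the check at the single point where backends are loaded means no code path can reach an LLM by accident, which is easier to audit than scattering the check across callers.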
New CLI flags
`--cascade`, `--top-k N`, `--openalex-mailto EMAIL`, `--non-generative`.
Backward compatibility
All existing JSONL keys retained; new fields are additive. Default behavior unchanged unless --cascade is set. All 673 pre-existing tests still pass; +35 new tests (708 total).
Known issues
- The numeric confidence_score produces conservative values for some venue-mismatch / author-mismatch combinations. The categorical status field remains the authoritative verdict; the numeric score is a tunable additive layer.
See CHANGELOG.md for the full entry.
🤖 Release prepared with Claude Code