
v0.9.0 — Cascading source verification + non-generative-AI mode


rpatrik96 released this 08 May 12:01

This release ports methods from two recent citation-verification papers to bibtex-check.

Highlights

Cascading verification (--cascade)

With --cascade, sources are queried in an explicit order, CrossRef → Semantic Scholar → OpenAlex, and the cascade short-circuits as soon as a high-confidence match is found. A new OpenAlexClient adds a fourth source. Top-K candidate retrieval (--top-k N, default 3) re-ranks each source's candidates by RapidFuzz Levenshtein title similarity before the expensive cross-checks run.
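The cascade described above can be sketched roughly as follows. This is a minimal illustration, not the shipped implementation: the function names, the candidate shape (`{"title": ...}`), and the 95-point short-circuit threshold are all assumptions, and a pure-Python Levenshtein stands in for RapidFuzz so the block runs without third-party packages.

```python
from typing import Callable, Dict, List, Optional, Tuple

Candidate = Dict[str, str]
Search = Callable[[str], List[Candidate]]


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (stand-in for RapidFuzz)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution
            ))
        previous = current
    return previous[-1]


def title_similarity(a: str, b: str) -> float:
    """Normalized Levenshtein similarity on a 0-100 scale, case-insensitive."""
    a, b = a.casefold().strip(), b.casefold().strip()
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(a, b) / longest)


def rerank_top_k(query_title: str, candidates: List[Candidate],
                 k: int = 3) -> List[Tuple[float, Candidate]]:
    """Score every candidate by title similarity and keep the best k."""
    scored = [(title_similarity(query_title, c["title"]), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def cascade(query_title: str, sources: List[Tuple[str, Search]], k: int = 3,
            threshold: float = 95.0):
    """Query sources in order; stop once the best match reaches the threshold.

    The threshold value is a hypothetical choice for this sketch.
    """
    best: Optional[Tuple[float, str, Candidate]] = None
    consulted: List[str] = []
    for name, search in sources:
        consulted.append(name)
        ranked = rerank_top_k(query_title, search(query_title), k)
        if ranked and (best is None or ranked[0][0] > best[0]):
            best = (ranked[0][0], name, ranked[0][1])
        if best is not None and best[0] >= threshold:
            break  # high-confidence short-circuit: skip the remaining sources
    return best, consulted
```

Passing the sources as an ordered list of `(name, search)` pairs keeps the query order explicit and makes it easy to stub out network clients in tests.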

Cross-source author intersection

cross_source_author_intersection() validates author family names across sources: confirmed is the intersection of the normalized names, and suspect is the union minus confirmed. A multi-source bonus β_ms ∈ [0, 10] applies when ≥2 sources confirm. This catches swapped_authors and chimeric-citation hallucinations that single-source verification misses.
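The set logic behind the confirmed/suspect split can be illustrated with a short sketch. The function name, its signature, and the normalization rule (strip diacritics, casefold) are assumptions made for illustration; the actual cross_source_author_intersection() may differ, and the β_ms bonus is omitted because its formula is not given here.

```python
import unicodedata
from typing import Dict, List, Set, Tuple


def normalize_family_name(name: str) -> str:
    """Strip diacritics and case so 'Müller' and 'MULLER' compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold().strip()


def author_intersection(
    names_by_source: Dict[str, List[str]],
) -> Tuple[Set[str], Set[str]]:
    """Return (confirmed, suspect) family names across all non-empty sources."""
    normalized = [
        {normalize_family_name(n) for n in names}
        for names in names_by_source.values()
        if names  # ignore sources that returned no authors
    ]
    if not normalized:
        return set(), set()
    confirmed = set.intersection(*normalized)        # present in every source
    suspect = set.union(*normalized) - confirmed     # present in only some
    return confirmed, suspect


confirmed, suspect = author_intersection({
    "crossref": ["Müller", "Smith"],
    "openalex": ["Muller", "SMITH", "Doe"],
})
print(sorted(confirmed), sorted(suspect))
```

A name that only one source reports ends up in the suspect set, which is the signal a chimeric citation (real title, partly invented author list) would produce.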

Numeric confidence_score (0–100)

Added to the JSONL output as a new field. Two formulas:

  • Case A, asymmetric (detects chimeric citations with high title similarity but low author similarity): S_title − 0.5 × (100 − S_author)
  • Case B: average + bonus

Penalty constants are explicit at module level: PENALTY_TITLE_MISMATCH = PENALTY_AUTHOR_MISMATCH = 20, PENALTY_JOURNAL_MISMATCH = 15, and a fabricated-author penalty of −10, capped at −20.
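As a worked illustration of the two formulas, the sketch below computes both cases. Several details are assumptions rather than confirmed behavior: that Case B averages the title and author scores, that results are clamped to 0–100, and the names of the two fabricated-author constants; how the release chooses between Case A and Case B is not shown here.

```python
# Constants named in the release notes.
PENALTY_TITLE_MISMATCH = 20
PENALTY_AUTHOR_MISMATCH = 20
PENALTY_JOURNAL_MISMATCH = 15
# Hypothetical names for the fabricated-author penalty and its cap.
PENALTY_FABRICATED_AUTHOR = 10
PENALTY_FABRICATED_AUTHOR_CAP = 20


def clamp(score: float) -> float:
    """Keep a score inside the 0-100 range (assumed behavior)."""
    return max(0.0, min(100.0, score))


def case_a(s_title: float, s_author: float) -> float:
    """Asymmetric score: S_title - 0.5 * (100 - S_author)."""
    return clamp(s_title - 0.5 * (100 - s_author))


def case_b(s_title: float, s_author: float, bonus: float = 0.0) -> float:
    """Average plus bonus; averaging title and author is an assumption."""
    return clamp((s_title + s_author) / 2 + bonus)


def fabricated_author_penalty(n_fabricated: int) -> int:
    """10 points per fabricated author, capped at 20 (magnitude only)."""
    return min(PENALTY_FABRICATED_AUTHOR * n_fabricated,
               PENALTY_FABRICATED_AUTHOR_CAP)


print(case_a(95, 40))                # 95 - 0.5 * 60 = 65.0
print(case_b(90, 80, bonus=5))       # 85 + 5 = 90.0
print(fabricated_author_penalty(3))  # min(30, 20) = 20
```

In Case A, a near-perfect title match with a poor author match is pulled down by half of the author shortfall, so a citation with the right title but the wrong authors cannot score as high as its title similarity alone would suggest.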

Rich VerificationResult

New struct with similarity_breakdown, confirmed_authors, suspect_authors, sources_consulted, sources_confirmed, issues, matched_metadata. Built via build_verification_result() from a classic FactCheckResult — purely additive, no schema break.
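To make the "purely additive" claim concrete, here is a hypothetical sketch of such a struct and a merge step that never overwrites classic keys. The field names come from the list above, but the field types, the class name suffix, and the `extend_record` helper are assumptions; the real build_verification_result() signature is not shown in these notes.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class VerificationResultSketch:
    # Field names follow the release notes; all types are assumed.
    similarity_breakdown: Dict[str, float] = field(default_factory=dict)
    confirmed_authors: List[str] = field(default_factory=list)
    suspect_authors: List[str] = field(default_factory=list)
    sources_consulted: List[str] = field(default_factory=list)
    sources_confirmed: List[str] = field(default_factory=list)
    issues: List[str] = field(default_factory=list)
    matched_metadata: Dict[str, Any] = field(default_factory=dict)


def extend_record(classic: Dict[str, Any],
                  rich: VerificationResultSketch) -> Dict[str, Any]:
    """Add the rich fields to a classic record without touching existing keys."""
    merged = dict(classic)
    for key, value in asdict(rich).items():
        merged.setdefault(key, value)  # existing keys always win
    return merged


rich = VerificationResultSketch(
    similarity_breakdown={"title": 97.0, "author": 80.0},
    confirmed_authors=["smith"],
    sources_consulted=["crossref", "openalex"],
)
record = extend_record({"status": "example-status"}, rich)
print(sorted(record))
```

Using `setdefault` means a consumer written against the old schema sees exactly the keys and values it saw before, plus extra fields it can ignore.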

Non-generative-AI mode

The --non-generative CLI flag and the BIBTEX_CHECK_NON_GENERATIVE=1 environment variable make bibtex-check refuse to load any LLM backend at runtime. This is a forward-compatibility guard for complying with the ACL ARR and ICML 2026 LLM-in-review policies.
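A guard of this kind might look like the sketch below. Only the environment variable name and its value `1` come from the notes above; the function names, the exception class, and the placeholder backend are hypothetical.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "BIBTEX_CHECK_NON_GENERATIVE"


class NonGenerativeModeError(RuntimeError):
    """Raised when an LLM backend is requested in non-generative mode."""


def non_generative_enabled(cli_flag: bool = False,
                           environ: Optional[Mapping[str, str]] = None) -> bool:
    """True if either the CLI flag or the environment variable is set."""
    env = os.environ if environ is None else environ
    return cli_flag or env.get(ENV_VAR) == "1"


def load_llm_backend(name: str, cli_flag: bool = False,
                     environ: Optional[Mapping[str, str]] = None) -> str:
    """Hypothetical loader: refuses before any backend code is imported."""
    if non_generative_enabled(cli_flag, environ):
        raise NonGenerativeModeError(
            f"refusing to load LLM backend {name!r}: non-generative mode is on"
        )
    return f"<backend {name}>"  # placeholder for a real backend object


try:
    load_llm_backend("example-backend", environ={ENV_VAR: "1"})
except NonGenerativeModeError as exc:
    print(exc)
```

Checking the guard at the single point where backends are loaded, rather than at each call site, is what makes the refusal hold at runtime regardless of which code path asks for an LLM.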

New CLI flags

`--cascade`, `--top-k N`, `--openalex-mailto EMAIL`, `--non-generative`.

Backward compatibility

All existing JSONL keys retained; new fields are additive. Default behavior unchanged unless --cascade is set. All 673 pre-existing tests still pass; +35 new tests (708 total).

Known issues

  • The numeric confidence_score produces conservative values for some venue-mismatch / author-mismatch combinations. The categorical status field remains the authoritative verdict; the numeric score is a tunable additive layer.
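For downstream consumers, one way to respect this guidance is to branch on status and treat confidence_score as optional. The sketch below assumes one JSON object per line with `status` and `confidence_score` keys, as described above; the status values and scores in the sample data are made up for illustration.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def read_verdicts(
    lines: Iterable[str],
) -> Iterator[Tuple[Optional[str], Optional[float]]]:
    """Yield (status, confidence_score) per JSONL record.

    confidence_score is read with .get() so older output, which lacks
    the field, still parses.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        yield record.get("status"), record.get("confidence_score")


# Sample data with made-up status values.
sample = [
    '{"status": "example-ok", "confidence_score": 62}',
    '{"status": "example-problem"}',
]
for status, score in read_verdicts(sample):
    # Decide on the categorical status; show the score only as a hint.
    print(status, "score:", "n/a" if score is None else score)
```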

See CHANGELOG.md for the full entry.

🤖 Release prepared with Claude Code