Release v0.9.0 — Cascading source verification + non-generative-AI mode · rpatrik96/bibtexupdater

Ports two recent citation-verification papers to bibtex-check:

CheckIfExist (Abbonato 2026, arXiv:2602.15871) Algorithm 1 — cascading multi-source verification
HalluCiteChecker (Sakai et al. 2026, arXiv:2604.26835) — venue-policy-compliant non-generative-AI flag

Highlights

Cascading verification (`--cascade`)

With `--cascade`, sources are queried in an explicit order, CrossRef → Semantic Scholar → OpenAlex, and the cascade short-circuits as soon as a high-confidence match is found. The new `OpenAlexClient` adds a fourth source. Top-K candidate retrieval (`--top-k N`, default 3) re-ranks each source's candidates by RapidFuzz Levenshtein title similarity before the expensive cross-checks run.
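The cascade described above can be sketched roughly as follows. This is an illustration, not the code shipped in bibtex-check: `cascade_verify`, `Candidate`, `CascadeOutcome`, and the 90-point short-circuit threshold are hypothetical, sources are modelled as plain callables, and `difflib.SequenceMatcher` stands in for RapidFuzz so the snippet runs on the standard library alone.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher
from typing import Callable, List, Optional, Tuple


@dataclass
class Candidate:
    source: str
    title: str
    score: float  # 0-100 title similarity against the queried title


@dataclass
class CascadeOutcome:
    best: Optional[Candidate] = None
    consulted: List[str] = field(default_factory=list)


def title_similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score (difflib stand-in for RapidFuzz)."""
    return 100.0 * SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def cascade_verify(
    title: str,
    sources: List[Tuple[str, Callable[[str], List[str]]]],
    top_k: int = 3,
    threshold: float = 90.0,  # assumed short-circuit cutoff, not the tool's value
) -> CascadeOutcome:
    outcome = CascadeOutcome()
    for name, search in sources:  # fixed order, e.g. CrossRef first
        outcome.consulted.append(name)
        candidates = search(title)[:top_k]  # keep only the top-K hits
        if not candidates:
            continue
        # Re-rank this source's candidates by title similarity.
        scored = sorted(
            (Candidate(name, t, title_similarity(title, t)) for t in candidates),
            key=lambda c: c.score,
            reverse=True,
        )
        if outcome.best is None or scored[0].score > outcome.best.score:
            outcome.best = scored[0]
        if scored[0].score >= threshold:
            break  # high-confidence match: skip the remaining sources
    return outcome
```

Passing sources as callables keeps the ordering explicit at the call site and makes the short-circuit easy to test with stubs in place of real API clients.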

Cross-source author intersection

`cross_source_author_intersection()` validates author family names across sources: confirmed is the intersection (∩) of the normalized names, and suspect is union \ confirmed. A multi-source bonus β_ms ∈ [0, 10] applies when ≥2 sources confirm. This catches swapped_authors and chimeric-citation hallucinations that single-source verification misses.
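The set arithmetic behind confirmed and suspect can be illustrated with a small sketch. The function below is a hypothetical stand-in, not the signature of `cross_source_author_intersection()`; the normalization (accent stripping plus case folding) is also an assumption about what "normalized names" means.

```python
import unicodedata
from typing import Dict, List, Set, Tuple


def normalize_family_name(name: str) -> str:
    """Strip accents and fold case so 'Müller' and 'Muller' compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.casefold().strip()


def author_intersection_sketch(
    names_by_source: Dict[str, List[str]],
) -> Tuple[Set[str], Set[str]]:
    """Return (confirmed, suspect) family names across sources."""
    name_sets = [
        {normalize_family_name(n) for n in names}
        for names in names_by_source.values()
        if names  # ignore sources that returned no authors
    ]
    if not name_sets:
        return set(), set()
    confirmed = set.intersection(*name_sets)    # present in every source
    suspect = set.union(*name_sets) - confirmed  # union \ confirmed
    return confirmed, suspect


confirmed, suspect = author_intersection_sketch(
    {"crossref": ["Müller", "Smith"], "openalex": ["Muller", "Jones"]}
)
# confirmed == {"muller"}; suspect == {"smith", "jones"}
```

With a single source the intersection trivially confirms every name, which is why the multi-source bonus is only meaningful once at least two sources have responded.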

Numeric `confidence_score` (0–100)

Additive in JSONL output. Two formulas:

Case A asymmetric (high-title-low-author chimeric detector): S_title − 0.5 × (100 − S_author)
Case B average + bonus
Explicit penalty constants at module level: PENALTY_TITLE_MISMATCH = PENALTY_AUTHOR_MISMATCH = 20, PENALTY_JOURNAL_MISMATCH = 15, fabricated-author −10 capped at −20
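A worked sketch of the two formulas and the penalty constants follows. The function names are hypothetical, clamping to the 0–100 range is assumed from the heading, and reading "−10 capped at −20" as −10 per fabricated author is an interpretation rather than something these notes state. How the tool chooses between Case A and Case B is not specified here, so the sketch leaves that choice to the caller.

```python
PENALTY_TITLE_MISMATCH = 20
PENALTY_AUTHOR_MISMATCH = 20
PENALTY_JOURNAL_MISMATCH = 15


def clamp(value: float, low: float = 0.0, high: float = 100.0) -> float:
    """Keep a score inside the 0-100 range (assumed behaviour)."""
    return max(low, min(high, value))


def score_case_a(s_title: float, s_author: float) -> float:
    """Case A: S_title - 0.5 * (100 - S_author)."""
    return clamp(s_title - 0.5 * (100.0 - s_author))


def score_case_b(s_title: float, s_author: float, bonus: float = 0.0) -> float:
    """Case B: average of the two similarities plus a bonus (e.g. beta_ms)."""
    return clamp((s_title + s_author) / 2.0 + bonus)


def fabricated_author_penalty(n_fabricated: int) -> int:
    """Assumed reading: -10 per fabricated author, never below -20."""
    return max(-20, -10 * n_fabricated)


# High title match, weak author match (the chimeric pattern):
# 95 - 0.5 * (100 - 40) = 65.0
print(score_case_a(95, 40))
# Good match on both, with a multi-source bonus of 5: (90 + 80) / 2 + 5 = 90.0
print(score_case_b(90, 80, bonus=5))
```

The asymmetry in Case A means a near-perfect title cannot mask a poor author match: every point of author dissimilarity costs half a point of score.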

Rich `VerificationResult`

New struct with the fields `similarity_breakdown`, `confirmed_authors`, `suspect_authors`, `sources_consulted`, `sources_confirmed`, `issues`, and `matched_metadata`. It is built via `build_verification_result()` from a classic `FactCheckResult`, so the change is purely additive with no schema break.
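For orientation, the shape of such a record might look like the dataclass below. Only the field names come from these notes; the class name `VerificationResultSketch`, every type annotation, and the empty defaults are assumptions, and the real `VerificationResult` may differ.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class VerificationResultSketch:
    # Field names follow the release notes; all types are guesses.
    similarity_breakdown: Dict[str, float] = field(default_factory=dict)
    confirmed_authors: List[str] = field(default_factory=list)
    suspect_authors: List[str] = field(default_factory=list)
    sources_consulted: List[str] = field(default_factory=list)
    sources_confirmed: List[str] = field(default_factory=list)
    issues: List[str] = field(default_factory=list)
    matched_metadata: Dict[str, Any] = field(default_factory=dict)


result = VerificationResultSketch(
    similarity_breakdown={"title": 95.0, "author": 40.0},
    confirmed_authors=["muller"],
    suspect_authors=["smith", "jones"],
    sources_consulted=["crossref", "openalex"],
    sources_confirmed=["crossref"],
    issues=["author_mismatch"],  # illustrative label, not a documented value
)

# asdict() gives a plain dict that could be merged into a JSONL record
# alongside the existing keys without replacing any of them.
record = asdict(result)
```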

Non-generative-AI mode

The `--non-generative` CLI flag and the `BIBTEX_CHECK_NON_GENERATIVE=1` environment variable make the tool refuse to load any LLM backend at runtime. This is a forward-compatibility guard for complying with the ACL ARR and ICML 2026 LLM-in-review policies.
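One way such a guard could work is sketched below. Only the flag and the environment variable name come from these notes; `NonGenerativeModeError`, `non_generative_enabled`, and `load_llm_backend` are hypothetical names, and the placeholder return value stands in for whatever a real backend loader would produce.

```python
import os
from typing import Mapping, Optional


class NonGenerativeModeError(RuntimeError):
    """Raised when an LLM backend is requested in non-generative mode."""


def non_generative_enabled(
    cli_flag: bool = False, environ: Optional[Mapping[str, str]] = None
) -> bool:
    """True if either the CLI flag or the environment variable is set."""
    env = os.environ if environ is None else environ
    return cli_flag or env.get("BIBTEX_CHECK_NON_GENERATIVE") == "1"


def load_llm_backend(
    name: str, cli_flag: bool = False, environ: Optional[Mapping[str, str]] = None
) -> str:
    """Refuse to load any backend when non-generative mode is active."""
    if non_generative_enabled(cli_flag, environ):
        raise NonGenerativeModeError(
            f"refusing to load LLM backend {name!r}: non-generative mode is on"
        )
    return f"backend:{name}"  # placeholder for a real backend object
```

Checking at load time, rather than at each call site, means a single choke point enforces the policy even if new LLM-backed features are added later.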

New CLI flags

`--cascade`, `--top-k N`, `--openalex-mailto EMAIL`, `--non-generative`.

Backward compatibility

All existing JSONL keys retained; new fields are additive. Default behavior unchanged unless --cascade is set. All 673 pre-existing tests still pass; +35 new tests (708 total).

Known issues

The numeric confidence_score produces conservative values for some venue-mismatch / author-mismatch combinations. The categorical status field remains the authoritative verdict; the numeric score is a tunable additive layer.

See CHANGELOG.md for the full entry.

🤖 Release prepared with Claude Code
