Vet your bibliography. Catches hallucinated citations, wrong years, and fabricated DOIs.
bibvet reads .bib files and verifies every entry against CrossRef, Semantic Scholar, and arXiv. It cross-checks the DOI a paper claims against the title it claims to have — catching the most common LLM citation hallucinations — and can write a cleaned .bib with canonical fields.
# uv (recommended) — installs the `bibvet` command
uv tool install git+https://github.com/dipta007/bibvet.git
# or run without installing
uvx --from git+https://github.com/dipta007/bibvet.git bibvet refs.bib
# or pipx
pipx install git+https://github.com/dipta007/bibvet.git
# or plain pip
pip install git+https://github.com/dipta007/bibvet.gitNo API keys, no config, no setup needed.
bibvet refs.bib # check one file
bibvet refs.bib --strict # high-recall sweep (recommended pre-submission)
bibvet refs.bib --fix # also write refs.fixed.bib with canonical fields- Hallucinated DOIs — fabricated
doi:strings that don't exist - Wrong year — paper exists, but the cited year is off
- Wrong / swapped authors — first author is missing, or author order is wrong
- Mismatched title and DOI — DOI resolves to one paper, title matches another (the LLM mixed two real papers)
- Fully hallucinated entries — paper doesn't exist at any source
- Venue mismatches — softer signal, often cosmetic
- Cosmetic differences —
Y. BengiovsYoshua Bengio(informational only)
bibvet refs.bib # check one file
bibvet refs1.bib refs2.bib refs3.bib # multiple files
bibvet ./papers/ # all .bib files in a directory
bibvet ./papers/ --recursive # also search subdirectories
bibvet refs.bib --fix # additionally write refs.fixed.bib
bibvet refs.bib --explain mycite2024 # detail for one entry
bibvet refs.bib --format md # write report.md (clickable links)
bibvet refs.bib --format json # machine-readable
bibvet refs.bib --skip-source arxiv # disable a source if it's rate-limiting
cat refs.bib | bibvet - # read from stdin0— everything verified or skipped1— warnings only2— at least one entry needs attention
CI-friendly: drop bibvet refs.bib into a pre-submission check and fail the build on 2.
refs.bib · 23 entries · 20 verified · 2 fixable · 1 warning
bad-cite (refs.bib:42)
https://doi.org/10.5555/3295222.3295349
venue: Advances in Neural Information Processing Systems (2017)
error doi: '10.1234/fake' -> '10.5555/3295222.3295349' (DOI does not match canonical record)
error year: '2018' -> '2017' (year mismatch)
--fix writes <name>.fixed.bib next to each input with only the fields bibvet verifies: author, title, year, journal/booktitle, doi, eprint. Unverified fields like pages, volume, number are dropped — they're commonly hallucinated and we can't confirm them. Entries that can't be auto-fixed (cross-check failures, unverified) are left byte-identical with a % bibvet: comment above them — grep '^% bibvet:' refs.fixed.bib to find what needs review.
bibvet refs.bib --strictFor high-recall hallucination sweeps. Promotes warnings to errors:
- Any title difference flagged (was: cosmetic differences silenced)
- Venue mismatches treated as errors (was: warnings)
- Books, theses, and
@miscentries with no canonical match flagged (was: silently skipped) - Single-source matches noted for manual review
Slightly more false positives, but won't miss subtly altered citations.
bibvet works with no config. Optional environment variables:
SEMANTIC_SCHOLAR_API_KEY— higher quota for Semantic Scholar (1 req / 50ms vs. 1 req / second without)CROSSREF_MAILTO— opts into CrossRef's polite pool
If you're verifying more than a few dozen entries, the unauthenticated rate limit (~100 requests / 5 min) will throttle you. The key is free.
-
Go to https://www.semanticscholar.org/product/api#api-key-form and request a key.
-
Approval typically takes a few business days; they email you the key.
-
Set it in your shell:
export SEMANTIC_SCHOLAR_API_KEY=sk-...Or persist it in
~/.zshrc/~/.bashrc. bibvet picks it up automatically — no flags needed.
Cache lives at the platform-default user cache directory (~/Library/Caches/bibvet/ on macOS, ~/.cache/bibvet/ on Linux). 30-day TTL. bibvet cache clear to purge.
MIT.