ref-checker

Extract references from academic PDFs and verify them against the CrossRef API.

Features

PDF reference extraction — automatically finds the References section and parses numbered [n] entries
DOI extraction from hyperlinks — uses embedded PDF links (most reliable) with regex fallback for line-wrapped DOIs
CrossRef verification — checks each reference by DOI resolution or strict title matching
Issue detection — flags year mismatches, broken DOIs, and title discrepancies
Missing DOI suggestions — high-confidence only (near-exact title + year match)
Rich console output — color-coded tables with progress bar
Export — CSV and/or JSON output

Installation

Requires Python 3.10+ and uv.

git clone git@github.com:ademasi/ref-checker.git
cd ref-checker
uv sync

Usage

# Verify references in a PDF
uv run check_references.py paper.pdf

# Export results to CSV
uv run check_references.py paper.pdf --export csv

# Export results to JSON
uv run check_references.py paper.pdf --export json

# Export both CSV and JSON
uv run check_references.py paper.pdf --export both

Output files are saved next to the input PDF as <name>.refs.csv / <name>.refs.json.
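
As an illustration of that naming rule, the sketch below derives the output paths from the input path. It assumes `<name>` is the PDF's file name without its `.pdf` extension; `export_paths` is a hypothetical helper, not a function taken from `check_references.py`.

```python
from pathlib import Path


def export_paths(pdf_path: str, export: str) -> list[Path]:
    """Return the export file paths for a given --export value."""
    pdf = Path(pdf_path)
    # Map each --export choice to the file formats it produces.
    formats = {"csv": ["csv"], "json": ["json"], "both": ["csv", "json"]}[export]
    # Keep the PDF's directory; swap the file name for <stem>.refs.<format>.
    return [pdf.with_name(f"{pdf.stem}.refs.{fmt}") for fmt in formats]


for path in export_paths("papers/paper.pdf", "both"):
    print(path)
```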

Output

The tool displays:

Summary panel — total count with OK / Issues / Not Found / Missing DOI breakdown
All References table — status, year, first author, title, DOI presence, verification method (with similarity %)
Issues table — details on DOI resolution failures, year/title mismatches
Not Found table — references that couldn't be matched on CrossRef (may be correct but not indexed)
Missing DOIs table — high-confidence DOI suggestions for references that omit them

How it works

1. Extract text and DOI hyperlinks from the PDF using PyMuPDF
2. Parse the References section into individual entries (authors, year, title, DOI)
3. Verify each reference:
   - If DOI present → resolve via CrossRef API
   - Otherwise → search CrossRef by title, rank candidates by normalized title similarity + year match
4. Report results with color-coded status

Development

# Lint and format
uvx ruff check .
uvx ruff format .

# Run tests
uv run pytest

License

MIT

