A Model Context Protocol (MCP) server that enables Claude to read and analyze local PDF files. Supports text extraction, figure extraction, and intelligent navigation for large documents.
- Full Text Extraction - Extract all text content from PDFs
- Figure Extraction - Extract images and vector graphics with captions
- Smart Navigation - Get PDF structure/TOC and read specific sections
- Large PDF Support - Auto-chunking for sections over 10k words
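The auto-chunking idea can be illustrated with a minimal sketch. This is not the server's actual implementation: the function name `chunk_words` and the choice to split on whitespace are assumptions made for illustration, and only the 10k-word threshold comes from the feature list above.

```python
def chunk_words(text: str, max_words: int = 10_000) -> list[str]:
    """Split text into consecutive chunks of at most max_words words.

    Hypothetical helper: words are defined by whitespace splitting,
    which is an assumption, not the server's documented behavior.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    # Step through the word list in strides of max_words.
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]
```

A section under the threshold comes back as a single chunk, so callers could treat chunked and unchunked sections uniformly.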
pip install pymupdf pypdf mcp
From local file:
claude mcp add /path/to/local-pdf-reader.mcpb
From GitHub release:
claude mcp add https://github.com/YOUR_USERNAME/local-pdf-reader-mcp/releases/download/v1.0/local-pdf-reader.mcpb
Extract all text from a PDF file.
read_pdf_text(file_path="/path/to/document.pdf")
Extract figures and images with captions.
read_pdf_figures(file_path="/path/to/document.pdf")
Get the table of contents with page ranges and word counts. Recommended for large PDFs.
get_pdf_structure(file_path="/path/to/document.pdf")
Returns:
- Total pages and word count
- Section hierarchy with IDs, titles, page ranges
- Chunking info for large sections (>10k words)
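To show how a section hierarchy with page ranges might be derived, here is a sketch that works from a table of contents in the `[level, title, page]` form returned by PyMuPDF's `Document.get_toc()`. The function name `build_sections` and the output field names are hypothetical; the server's real output format may differ.

```python
def build_sections(toc: list[list], total_pages: int) -> list[dict]:
    """Turn TOC entries into sections with inclusive page ranges.

    toc: entries of [level, title, start_page] with 1-based pages,
    in document order. Field names in the result are assumptions.
    """
    sections = []
    for idx, (level, title, start) in enumerate(toc):
        # By default a section runs to the end of the document.
        end = total_pages
        # It ends just before the next entry at the same or a shallower level.
        for next_level, _, next_start in toc[idx + 1:]:
            if next_level <= level:
                end = max(start, next_start - 1)
                break
        sections.append({
            "id": idx,
            "level": level,
            "title": title,
            "start_page": start,
            "end_page": end,
        })
    return sections
```

With this rule a parent section's range covers its subsections, so reading a top-level section would include everything nested under it.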
Read a specific section by ID or title.
# By section ID
read_pdf_section(file_path="/path/to/doc.pdf", section_id=0)
# By title (fuzzy match)
read_pdf_section(file_path="/path/to/doc.pdf", section_title="Introduction")
# Read chunk 2 of a large section
read_pdf_section(file_path="/path/to/doc.pdf", section_id=5, chunk=2)
Small PDFs (< 20 pages): Use read_pdf_text directly.
Large PDFs (> 20 pages):
- Call get_pdf_structure to get the TOC
- Use read_pdf_section to read relevant sections
- For sections marked needs_chunking: true, use the chunk parameter
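The large-PDF steps above can be sketched as a small planning function that turns a structure listing into the arguments for each `read_pdf_section` call. Several details here are assumptions, not documented behavior: the `id` and `num_chunks` field names, and chunk numbers starting at 1. Only `needs_chunking`, `section_id`, and `chunk` appear in this README.

```python
def plan_section_reads(sections: list[dict], wanted_ids: set[int]) -> list[dict]:
    """Build keyword arguments for read_pdf_section calls.

    sections: hypothetical entries such as
      {"id": 5, "needs_chunking": True, "num_chunks": 3}
    Assumes chunks are numbered from 1 (not confirmed by the README).
    """
    calls = []
    for section in sections:
        if section["id"] not in wanted_ids:
            continue
        if section.get("needs_chunking"):
            # Large section: request every chunk in order.
            for chunk in range(1, section["num_chunks"] + 1):
                calls.append({"section_id": section["id"], "chunk": chunk})
        else:
            # Small section: a single call without the chunk parameter.
            calls.append({"section_id": section["id"]})
    return calls
```

Each resulting dictionary could be passed along with `file_path` to `read_pdf_section`, keeping the decision about what to read separate from the reading itself.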
Extract figures: Use read_pdf_figures to get all images and charts.
MIT