A Model Context Protocol (MCP) server built with Python. This server exposes OCR (Optical Character Recognition) tools powered by a local PaddleOCR service, capable of extracting text and layout from complex images and PDFs, including Japanese text.
The server provides tools to analyze documents via a dedicated OCR backend service.
- ocr_document(file_url: str): Analyzes a single document (image or PDF) from its URL.
  - Preserves layout and returns Markdown.
  - Example: ocr_document("https://example.com/invoice.pdf")
- ocr_batch_documents(file_urls: list[str]): Analyzes multiple documents from their URLs in parallel.
  - Efficient for processing several files at once (max 10).
  - Example: ocr_batch_documents(["https://example.com/a.jpg", "https://example.com/b.pdf"])
- ocr_uploaded_document(file_path: str): Analyzes a local file by uploading it to the OCR service.
  - Note: The file must be accessible on the local filesystem.
  - Example: ocr_uploaded_document("/home/user/docs/scan.png")
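The batch behavior described above (parallel requests, at most 10 documents) could be sketched roughly as follows. This is an illustration only, not the code in tools.py: `ocr_one` is a hypothetical stand-in for a single OCR request, and the names `MAX_BATCH_SIZE` and `ocr_batch` are assumptions.

```python
import asyncio

MAX_BATCH_SIZE = 10  # limit taken from the tool description


async def ocr_one(file_url: str) -> str:
    """Hypothetical stand-in for one OCR request; returns placeholder Markdown."""
    await asyncio.sleep(0)  # placeholder for the real network call
    return f"# OCR result for {file_url}"


async def ocr_batch(file_urls: list[str]) -> list[str]:
    """Run one OCR request per URL concurrently, preserving input order."""
    if len(file_urls) > MAX_BATCH_SIZE:
        raise ValueError(f"at most {MAX_BATCH_SIZE} documents per batch")
    # gather() schedules all requests at once and returns results in input order.
    return await asyncio.gather(*(ocr_one(url) for url in file_urls))


if __name__ == "__main__":
    urls = ["https://example.com/a.jpg", "https://example.com/b.pdf"]
    for markdown in asyncio.run(ocr_batch(urls)):
        print(markdown)
```

Rejecting oversized batches before any request is sent keeps a single call from overloading the OCR backend.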
- Python 3.12+
- uv (for package management):
curl -LsSf https://astral.sh/uv/install.sh | sh
- npm (optional, for Inspector)
- CUDA-compatible GPU (recommended for faster OCR performance)
- Clone the repository.
- Install dependencies for the MCP server:
make install
Running this system requires two components: the OCR Backend Service and the MCP Server.
- Start the OCR Service (controls the PaddleOCR model):
make ocr
This runs on port 8866.
- Start the MCP Server (in a new terminal):
make mcp
This runs on port 8001.
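To confirm that both components are up before connecting a client, a quick TCP check along these lines may help. It is a minimal sketch: the port numbers come from the text above, while the host `127.0.0.1` and the helper names are assumptions.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str = "127.0.0.1") -> dict[str, bool]:
    """Report whether each service port accepts connections."""
    # Ports as documented above; the host is an assumption.
    services = {"ocr_service": 8866, "mcp_server": 8001}
    return {name: port_is_open(host, port) for name, port in services.items()}


if __name__ == "__main__":
    for name, is_up in check_services().items():
        print(f"{name}: {'up' if is_up else 'down'}")
```

This only verifies that something is listening on each port; it does not check that the service behind it is healthy.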
Use the MCP Inspector to interactively test the tools:
make inspect
This command starts the Inspector UI, where you can list tools and simulate client requests.
- Format code:
make format (uses ruff)
- Type check:
make mypy
├── main.py           # MCP Server entry point
├── mcp_server/       # MCP Server Implementation
│   ├── tools.py      # Tool logic (OCR bridge)
├── ocr_service/      # OCR Backend Service (FastAPI + PaddleOCR)
│   ├── main.py       # FastAPI app
│   └── ocr.py        # PaddleOCR logic
├── settings.py       # Configuration
├── Makefile          # Command shortcuts
└── pyproject.toml    # MCP Server Dependencies
- The OCR service runs locally; no data is sent to external cloud providers for OCR.
- Tools validate URL schemes and file paths.
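As an illustration of the kind of validation mentioned above, the checks might look something like this. The actual rules in tools.py may differ: the `http`/`https` allow-list and the function names `validate_url` and `validate_path` are assumptions.

```python
from pathlib import Path
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}  # assumed allow-list


def validate_url(file_url: str) -> str:
    """Accept only URLs with an allowed scheme and a host component."""
    parsed = urlparse(file_url)
    if parsed.scheme not in ALLOWED_SCHEMES or not parsed.netloc:
        raise ValueError(f"unsupported or malformed URL: {file_url!r}")
    return file_url


def validate_path(file_path: str) -> Path:
    """Resolve the path and require that it points to an existing regular file."""
    path = Path(file_path).expanduser().resolve()
    if not path.is_file():
        raise ValueError(f"not an existing file: {file_path!r}")
    return path
```

Restricting schemes this way rejects inputs such as `file:///etc/passwd`, which would otherwise let a URL-based tool read local files.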