An MCP server for the PDF.co API. It provides PDF manipulation, conversion, OCR, text extraction, and document automation, with support for barcodes, watermarks, and security features.
- Full API Coverage: Complete implementation of PDF.co API endpoints
- Strongly Typed: All responses use Pydantic models for type safety
- S-Tier Architecture: Production-ready with separated concerns (API client, models, server)
- HTTP Transport: Supports streamable-http with health endpoint
- Async/Await: Built on aiohttp for high performance
- Type Safe: Full mypy strict mode compliance
- Comprehensive Testing: Unit tests with pytest and AsyncMock
- Docker Ready: Production Dockerfile included
- pdf_to_text: Extract text content from PDF documents
- pdf_to_json: Extract structured data from PDFs
- pdf_to_html: Convert PDF to HTML format
- pdf_to_csv: Extract tables from PDF to CSV
- pdf_merge: Combine multiple PDFs into one
- pdf_split: Split PDF into separate pages or ranges
- pdf_rotate: Rotate pages in a PDF document
- pdf_compress: Reduce PDF file size with configurable compression
- pdf_add_watermark: Add text watermarks to PDFs
- pdf_protect: Add password protection to PDFs
- pdf_unlock: Remove password protection from PDFs
- pdf_info: Get PDF metadata (pages, size, dimensions, etc.)
- html_to_pdf: Convert HTML content to PDF
- url_to_pdf: Convert web pages to PDF
- image_to_pdf: Convert images to PDF documents
- barcode_generate: Generate QR codes and barcodes
- barcode_read: Read and decode barcodes from images
- ocr_pdf: OCR scanned PDFs to make them searchable
# Clone the repository
git clone <repository-url>
cd mcp-pdfco
# Install with uv
uv pip install -e .
# Install with development dependencies
uv pip install -e ".[dev]"
# Alternatively, install with plain pip
pip install -e .
Get your free API key from PDF.co Dashboard and set it as an environment variable:
export PDFCO_API_KEY=your_api_key_here
Or create a .env file:
PDFCO_API_KEY=your_api_key_here
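To make the lookup order concrete, here is a minimal sketch of reading the key from the environment first and from a .env file second. The helper name `load_api_key` is hypothetical, and the server's own loading logic may differ.

```python
import os
from pathlib import Path
from typing import Optional


def load_api_key(env_file: str = ".env") -> Optional[str]:
    """Return PDFCO_API_KEY from the environment, else from a .env file.

    Hypothetical helper for illustration; not part of mcp-pdfco.
    """
    key = os.environ.get("PDFCO_API_KEY")
    if key:
        return key

    path = Path(env_file)
    if not path.is_file():
        return None

    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not NAME=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == "PDFCO_API_KEY":
            # Drop optional surrounding quotes; treat an empty value as unset.
            return value.strip().strip("'\"") or None
    return None
```

The environment variable wins over the file here, which keeps container and CI overrides simple.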
Add to your Claude Desktop configuration file:
MacOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
"mcpServers": {
"pdfco": {
"command": "uvx",
"args": ["mcp-pdfco"],
"env": {
"PDFCO_API_KEY": "your_api_key_here"
}
}
}
}
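If the configuration file already lists other MCP servers, the entry above needs to be merged in rather than pasted over them. The following is a sketch of one way to do that with the standard library; the function name `add_pdfco_server` is an assumption made for this example.

```python
import json
from pathlib import Path


def add_pdfco_server(config_path: Path, api_key: str) -> dict:
    """Merge the pdfco entry into a Claude Desktop config, keeping other servers.

    Hypothetical helper for illustration; not shipped with mcp-pdfco.
    """
    config: dict = {}
    if config_path.is_file():
        text = config_path.read_text(encoding="utf-8")
        # An empty file is treated like a missing one.
        config = json.loads(text) if text.strip() else {}

    servers = config.setdefault("mcpServers", {})
    servers["pdfco"] = {
        "command": "uvx",
        "args": ["mcp-pdfco"],
        "env": {"PDFCO_API_KEY": api_key},
    }

    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

Restart Claude Desktop after changing the file so the new server entry is picked up.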
# Using Python module
uv run python -m mcp_pdfco.server
# Using the Makefile
make run
# Build the Docker image
docker build -t mcp-pdfco .
# Run with Docker
docker run -e PDFCO_API_KEY=your_key -p 8000:8000 mcp-pdfco
# Run with Docker Compose
docker-compose up
The server supports HTTP transport with a health check endpoint:
# Start with uvicorn
uvicorn mcp_pdfco.server:app --host 0.0.0.0 --port 8000
# Check health
curl http://localhost:8000/health
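The same check can be scripted, for example as a readiness probe in a deployment script. This sketch only inspects the HTTP status, since the body returned by /health is not documented here; `is_healthy` is a hypothetical helper name.

```python
import http.client
from urllib.parse import urlsplit


def is_healthy(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> bool:
    """Return True if GET <base_url>/health answers with a 2xx status.

    Assumes a plain-HTTP server, as in the uvicorn command above.
    """
    parts = urlsplit(base_url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=timeout)
    try:
        conn.request("GET", "/health")
        return 200 <= conn.getresponse().status < 300
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or a malformed response all count as unhealthy.
        return False
    finally:
        conn.close()
```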
# Extract text from a page range of a PDF
result = await pdf_to_text(
url="https://example.com/document.pdf",
pages="1-5"
)
print(result.text)
# Merge several PDFs into one
result = await pdf_merge(
urls=[
"https://example.com/doc1.pdf",
"https://example.com/doc2.pdf"
],
name="merged_document.pdf"
)
print(f"Merged PDF: {result.url}")
# Convert an HTML string to PDF
result = await html_to_pdf(
html="<h1>Hello World</h1><p>This is a PDF</p>",
name="hello.pdf",
page_size="A4",
orientation="Portrait"
)
print(f"Generated PDF: {result.url}")
# Add a text watermark
result = await pdf_add_watermark(
url="https://example.com/document.pdf",
text="CONFIDENTIAL",
x=200,
y=400,
font_size=48,
color="FF0000",
opacity=0.3,
pages="0-", # Apply to all pages
name="watermarked_document.pdf"
)
print(f"Watermarked PDF: {result.url}")
# Generate a QR code
result = await barcode_generate(
value="https://example.com",
barcode_type="QRCode",
format="png"
)
print(f"QR Code: {result.url}")
# OCR a scanned PDF
result = await ocr_pdf(
url="https://example.com/scanned.pdf",
pages="1-10",
lang="eng"
)
print(f"OCR'd PDF: {result.url}")
print(f"Extracted text: {result.text}")
make help # Show all available commands
make install # Install dependencies
make dev-install # Install with dev dependencies
make format # Format code with ruff
make lint # Lint code with ruff
make typecheck # Type check with mypy
make test # Run tests with pytest
make test-cov # Run tests with coverage
make check # Run all checks (lint + typecheck + test)
make clean # Clean up artifacts
.
├── src/
│   └── mcp_pdfco/
│       ├── __init__.py
│       ├── server.py # FastMCP server with tool definitions
│       ├── api_client.py # Async PDF.co API client
│       └── api_models.py # Pydantic models for type safety
├── tests/
│   ├── __init__.py
│   ├── test_server.py # Server tool tests
│   └── test_api_client.py # API client tests
├── pyproject.toml # Project configuration
├── Makefile # Development commands
├── Dockerfile # Container deployment
└── README.md # This file
# Run all tests
pytest
# Run with coverage
pytest --cov=src/mcp_pdfco --cov-report=term-missing
# Run specific test file
pytest tests/test_server.py -v
This project uses:
- ruff: Fast Python linter and formatter
- mypy: Static type checker (strict mode)
- pytest: Testing framework with async support
All code must pass:
make check # Runs lint + typecheck + test
This server follows S-Tier MCP architecture principles:
- Separation of Concerns
  - api_client.py: HTTP communication layer
  - api_models.py: Data models and type definitions
  - server.py: MCP tool definitions and routing
- Type Safety
  - Full type hints on all functions
  - Pydantic models for API responses
  - Mypy strict mode compliance
- Async All the Way
  - aiohttp for HTTP requests
  - Async/await throughout
  - Context managers for resource cleanup
- Error Handling
  - Custom PDFcoAPIError exception
  - Context logging via ctx.error() and ctx.warning()
  - Graceful error messages
- Production Ready
  - Docker support
  - Health check endpoint
  - Environment-based configuration
  - Comprehensive logging
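The typed-response and error-handling principles above can be sketched in a few lines. This is an illustration only: it uses a standard-library dataclass as a stand-in for a Pydantic model, and the payload field names (`error`, `status`, `message`, `url`), the `ConvertResult` class, and the `parse_response` function are assumptions, not the project's actual code. Only the `PDFcoAPIError` name comes from this README.

```python
from dataclasses import dataclass
from typing import Any, Mapping


class PDFcoAPIError(Exception):
    """Raised when the API reports a failure (constructor shape is assumed)."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(f"API Error {status}: {message}")
        self.status = status
        self.message = message


@dataclass(frozen=True)
class ConvertResult:
    """Dataclass stand-in for a Pydantic response model."""

    url: str


def parse_response(payload: Mapping[str, Any]) -> ConvertResult:
    """Turn a raw JSON payload into a typed result or a typed error."""
    if payload.get("error"):
        raise PDFcoAPIError(
            int(payload.get("status", 500)),
            str(payload.get("message", "unknown error")),
        )
    return ConvertResult(url=str(payload["url"]))
```

Keeping this conversion in the client layer means the tool functions in the server layer only ever see typed results or one exception type, which they can report through `ctx.error()`.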
- Python 3.13+
- aiohttp >= 3.12.15
- fastapi >= 0.117.1
- fastmcp >= 2.12.4
- pydantic >= 2.0.0
- uvicorn >= 0.34.0
For detailed API documentation, visit PDF.co API Documentation.
Supported input formats:
- PDF: URL or base64 encoded
- Images: PNG, JPG, GIF, BMP, TIFF
- HTML: Raw HTML string or URL
Supported output formats:
- PDF: High-quality PDF generation
- Text: Plain text extraction
- JSON: Structured data extraction
- HTML: Formatted HTML output
- CSV: Table data extraction
- Images: PNG, JPG, SVG for barcodes
PDF.co has rate limits based on your subscription plan. Free plans include:
- 100 API calls per month
- 10 API calls per minute
Check your dashboard for current usage.
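To stay under a per-minute limit on the client side, calls can be spaced with a sliding-window limiter. The sketch below is one possible approach and not part of this server; `SlidingWindowLimiter` and `throttled` are hypothetical names, and the defaults mirror the 10-calls-per-minute figure above.

```python
import asyncio
import time
from collections import deque
from typing import Awaitable, Callable, Deque, TypeVar

T = TypeVar("T")


class SlidingWindowLimiter:
    """Allow at most max_calls within any rolling window of `period` seconds."""

    def __init__(
        self,
        max_calls: int = 10,
        period: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._calls: Deque[float] = deque()

    def wait_time(self) -> float:
        """Seconds until the next call is allowed (0.0 if allowed now)."""
        now = self._clock()
        # Forget calls that have left the rolling window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.period - (now - self._calls[0])

    def record(self) -> None:
        """Note that a call is being made now."""
        self._calls.append(self._clock())


async def throttled(limiter: SlidingWindowLimiter, call: Callable[[], Awaitable[T]]) -> T:
    """Sleep until the limiter allows another call, then run it."""
    delay = limiter.wait_time()
    if delay > 0:
        await asyncio.sleep(delay)
    limiter.record()
    return await call()
```

This version is meant for sequential use; if several tasks share one limiter, wrap the wait-and-record step in an `asyncio.Lock` so they cannot all wake up at once.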
Issue: "PDFCO_API_KEY is not set" warning
Solution: Set the environment variable:
export PDFCO_API_KEY=your_key_here
Issue: Network error or timeout
Solution: Check your internet connection and, if requests still time out, increase the client timeout:
client = PDFcoClient(timeout=180.0) # 3 minutes
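For transient failures, a longer timeout can be combined with a small retry wrapper. The sketch below is generic: `with_retries` is a hypothetical helper, and the exception types the real client raises on network errors are not stated here, so the caught exceptions are an assumption to adjust.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def with_retries(
    call: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 1.0,
) -> T:
    """Retry an async call on timeout or connection errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return await call()
        except (asyncio.TimeoutError, ConnectionError):
            # Assumed exception types; re-raise once the last attempt has failed.
            if attempt == attempts - 1:
                raise
            await asyncio.sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

With the defaults, the wrapper waits 1 second after the first failure and 2 seconds after the second before giving up on the third.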
Issue: API Error 401: Unauthorized
Solution: Verify your API key is valid at https://app.pdf.co/dashboard
Issue: Docker container won't start
Solution: Ensure the API key is passed correctly:
docker run -e PDFCO_API_KEY=your_key_here -p 8000:8000 mcp-pdfco
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
- Fork the repository
- Create a feature branch
- Make your changes
- Run the checks: make check
- Submit a pull request
Issue Tracker: GitHub Issues
License: MIT
Part of the NimbleTools Registry - an open source collection of production-ready MCP servers. For enterprise deployment, check out NimbleBrain.