A Model Context Protocol (MCP) server for extracting text content from office documents. Pure TypeScript, zero system dependencies.
| Extension | Format |
|---|---|
| .docx | Microsoft Word |
| .xlsx | Microsoft Excel |
| .pptx | Microsoft PowerPoint |
| .odt | OpenDocument Text |
| .ods | OpenDocument Spreadsheet |
| .odp | OpenDocument Presentation |
| .pdf | PDF Document |
| .rtf | Rich Text Format |
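As an illustration of how the table above maps file names to formats, here is a minimal sketch of an extension lookup. The names `SUPPORTED_FORMATS` and `detectFormat` are hypothetical and are not taken from the server's source; the actual implementation may detect formats differently.

```typescript
// Lookup table mirroring the supported-format table above.
// NOTE: illustrative only; these names are assumptions, not the server's API.
const SUPPORTED_FORMATS: Record<string, string> = {
  ".docx": "Microsoft Word",
  ".xlsx": "Microsoft Excel",
  ".pptx": "Microsoft PowerPoint",
  ".odt": "OpenDocument Text",
  ".ods": "OpenDocument Spreadsheet",
  ".odp": "OpenDocument Presentation",
  ".pdf": "PDF Document",
  ".rtf": "Rich Text Format",
};

// Returns the format name for a path, or undefined if the extension
// is missing or not in the table. Matching is case-insensitive.
function detectFormat(filePath: string): string | undefined {
  const dot = filePath.lastIndexOf(".");
  if (dot < 0) return undefined;
  return SUPPORTED_FORMATS[filePath.slice(dot).toLowerCase()];
}
```

A client could run a check like this before calling the server, to avoid sending files it cannot read.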
Add the following to your `.mcp.json` or to your Claude Code MCP settings:

    {
      "mcpServers": {
        "document_reader": {
          "command": "npx",
          "args": ["-y", "@klpanagi/mcp-document-reader"]
        }
      }
    }

Use the stdio transport with `npx -y @klpanagi/mcp-document-reader` as the command.
Extract text content from a document file.
| Parameter | Type | Required | Description |
|---|---|---|---|
| file_path | string | Yes | Absolute path to the document |
| include_metadata | boolean | No | Include format/size header (default: false) |
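To show how these parameters travel over the wire, here is a rough sketch of an MCP `tools/call` request built from them. The tool name `extract_text` is a placeholder, since the real tool name is not given here, and `buildToolCall` is a hypothetical helper. The absolute-path check is a client-side convenience assumed for illustration, not documented server behavior.

```typescript
// Arguments as described in the parameter table above.
interface ExtractArgs {
  file_path: string;
  include_metadata?: boolean;
}

// Builds a JSON-RPC 2.0 "tools/call" request as used by MCP.
// NOTE: toolName is supplied by the caller; "extract_text" below is a placeholder.
function buildToolCall(id: number, toolName: string, args: ExtractArgs) {
  // Accept POSIX absolute paths ("/...") and Windows drive paths ("C:\..." or "C:/...").
  const isAbsolute =
    args.file_path.startsWith("/") || /^[A-Za-z]:[\\/]/.test(args.file_path);
  if (!isAbsolute) {
    throw new Error(`file_path must be absolute: ${args.file_path}`);
  }
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: toolName, arguments: args },
  };
}

// Example request with the optional metadata header enabled.
const request = buildToolCall(1, "extract_text", {
  file_path: "/home/user/report.docx",
  include_metadata: true,
});
```

In practice an MCP client library usually builds this envelope for you; the sketch only makes the argument shape explicit.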
Get metadata about a document without extracting content.
| Parameter | Type | Required | Description |
|---|---|---|---|
| file_path | string | Yes | Absolute path to the document |
List all supported document formats. No parameters.
Environment variables for tuning:
| Variable | Default | Description |
|---|---|---|
| MAX_FILE_SIZE_MB | 50 | Maximum file size to process |
| MAX_OUTPUT_CHARS | 500000 | Truncation limit for extracted text |
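The two limits above could be applied roughly as follows. This is a sketch under assumptions, not the server's actual code: `readLimits` and `truncate` are hypothetical names, one megabyte is taken to be 1024 × 1024 bytes, and invalid or non-positive values are assumed to fall back to the documented defaults.

```typescript
interface Limits {
  maxFileSizeBytes: number;
  maxOutputChars: number;
}

// Reads the two tuning variables from an environment-like object,
// falling back to the documented defaults (50 MB, 500000 characters).
function readLimits(env: Record<string, string | undefined>): Limits {
  const parse = (raw: string | undefined, fallback: number): number => {
    const n = Number(raw);
    return raw !== undefined && Number.isFinite(n) && n > 0 ? n : fallback;
  };
  return {
    // Assumption: 1 MB = 1024 * 1024 bytes.
    maxFileSizeBytes: parse(env.MAX_FILE_SIZE_MB, 50) * 1024 * 1024,
    maxOutputChars: parse(env.MAX_OUTPUT_CHARS, 500000),
  };
}

// Cuts extracted text down to the configured character limit.
function truncate(text: string, limits: Limits): string {
  return text.length > limits.maxOutputChars
    ? text.slice(0, limits.maxOutputChars)
    : text;
}
```

In a Node or Bun process, `readLimits(process.env)` would pick up values set in the `env` block of the MCP client configuration, if the client supports one.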
    bun install
    bun test
    bun run build

License: MIT