Just another web fetching MCP. But mine.
MCP server that reads web pages through a headless Firefox browser and returns clean, structured Markdown content.
Given a URL, the server navigates to the target page in a background Firefox instance, waits for the page content to stabilize, extracts the main article body using Mozilla's Readability API, and converts it into a token-efficient Markdown document. It uses a real browser rather than a plain HTTP client, which helps it get past anti-bot checks that block simple scripted requests.
Note: This project was created during experiments with local inference: Qwen3.6 27B (Qwen3.6-27B-UD-Q4_K_XL-MTP.gguf) + llama.cpp + 7900XTX (ROCm) + VS Code Insiders 1.124 + Copilot.
Note: Tested on Linux only. There is no guarantee it works on other systems (especially Windows).
- Headless Firefox - renders pages with a real browser engine, bypassing bot detection
- Content stability detection - waits for dynamic content to fully load before extracting
- Readability extraction - uses Mozilla's Readability API to isolate the main article body
- Markdown output - converts HTML to clean, structured Markdown with links preserved
- Dual mode - runs as an MCP server (stdio) or as a CLI tool for direct Markdown output
- Extensible tool architecture - abstract McpTool base class for adding new tools
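
The feature list mentions content stability detection without saying how it is done. One common approach, shown here as a minimal sketch, is to sample some measurement of the page at intervals and treat the page as stable once the value stops changing for a few consecutive samples. `StabilityTracker`, `waitForStable`, and every parameter name below are hypothetical and are not taken from this project's source.

```typescript
// Hypothetical sketch: not the project's actual implementation.
// Tracks successive measurements of a page (e.g. text length) and reports
// when the value has stayed the same for `requiredStable` consecutive samples.
class StabilityTracker {
  private readonly requiredStable: number;
  private last: number | undefined = undefined;
  private unchanged = 0;

  constructor(requiredStable: number) {
    this.requiredStable = requiredStable;
  }

  // Feed one sample; returns true once the value is considered stable.
  push(sample: number): boolean {
    if (this.last !== undefined && sample === this.last) {
      this.unchanged += 1;
    } else {
      this.unchanged = 0; // value changed (or first sample): start over
    }
    this.last = sample;
    return this.unchanged >= this.requiredStable;
  }
}

// Polls `measure` every `intervalMs` until the tracker reports stability or
// `timeoutMs` elapses. Resolves to true if stable, false on timeout.
async function waitForStable(
  measure: () => Promise<number>,
  { requiredStable = 2, intervalMs = 250, timeoutMs = 10000 } = {},
): Promise<boolean> {
  const tracker = new StabilityTracker(requiredStable);
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (tracker.push(await measure())) return true;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}
```

With Playwright, `measure` could be something like `() => page.evaluate(() => document.body.innerText.length)`. Whether this project samples text length, DOM size, or something else is an assumption.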
npm install
npx playwright install firefox

Read a webpage and output Markdown to stdout:
npm run build
node dist/main.js https://example.com

| Option | Default | Description |
|---|---|---|
| --headless | true | Run browser in headless mode |
| --no-headless | - | Run browser in visible (headed) mode |
| --no-close | - | Keep browser open after reading (useful for debugging) |
| --user-agent | Firefox 128 UA | Custom User-Agent header |
| --viewport-width | 1280 | Viewport width in pixels |
| --viewport-height | 720 | Viewport height in pixels |
| --user-data-dir | ~/.config/playwright-reader-mcp/firefox-profile | Persistent browser profile directory |
| --no-links | false | Disable the "Links on this page" section |
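
The options above map naturally onto a configuration object with defaults and per-run overrides. The following is a sketch only: the field names are derived from the flag names, and `DEFAULT_BROWSER_CONFIG` and `mergeBrowserConfig` are hypothetical names, so the project's real `BrowserConfig` may look different. The default User-Agent string is not given in this README, so it is left undefined here.

```typescript
// Hypothetical shape: field names are derived from the CLI flags and may not
// match the project's real BrowserConfig interface.
interface BrowserConfig {
  headless: boolean;
  closeAfterRead: boolean; // false corresponds to --no-close
  userAgent: string | undefined; // undefined = fall back to the built-in Firefox 128 UA
  viewportWidth: number;
  viewportHeight: number;
  userDataDir: string;
  includeLinks: boolean; // false corresponds to --no-links
}

// Defaults copied from the options table.
const DEFAULT_BROWSER_CONFIG: BrowserConfig = {
  headless: true,
  closeAfterRead: true,
  userAgent: undefined,
  viewportWidth: 1280,
  viewportHeight: 720,
  userDataDir: "~/.config/playwright-reader-mcp/firefox-profile",
  includeLinks: true,
};

// Overlay user-supplied values on the defaults. Using `??` means an override
// that is undefined (flag not passed) never erases a default.
function mergeBrowserConfig(overrides: Partial<BrowserConfig> = {}): BrowserConfig {
  const d = DEFAULT_BROWSER_CONFIG;
  return {
    headless: overrides.headless ?? d.headless,
    closeAfterRead: overrides.closeAfterRead ?? d.closeAfterRead,
    userAgent: overrides.userAgent ?? d.userAgent,
    viewportWidth: overrides.viewportWidth ?? d.viewportWidth,
    viewportHeight: overrides.viewportHeight ?? d.viewportHeight,
    userDataDir: overrides.userDataDir ?? d.userDataDir,
    includeLinks: overrides.includeLinks ?? d.includeLinks,
  };
}
```

Building a fresh object on every call keeps the defaults immutable, which matters if a long-running MCP server serves requests with different settings.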
# Read a webpage
node dist/main.js https://example.com
# Read without links section
node dist/main.js https://example.com --no-links
# Run in visible (headed) mode for debugging
node dist/main.js https://example.com --no-headless

Start the MCP server on stdio (no URL argument):

node dist/main.js

Configure your MCP client to connect via stdio transport. For example:
{
  "mcpServers": {
    "playwright-reader": {
      "command": "node",
      "args": ["dist/main.js"]
    }
  }
}

read_webpage - Navigate to a URL, extract the main article, and return clean Markdown.
- Input: { url: string }
- Output: Markdown text with article title, source URL, body content, and page links
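
To make the output description concrete, here is a sketch of how those four parts might be assembled into one Markdown document. The README does not specify the exact layout, so the heading levels, the `Source:` line, and the names `PageLink`, `ExtractedArticle`, and `renderArticle` are all assumptions. Only the "Links on this page" section title comes from the options table.

```typescript
// Hypothetical types and layout: the README names the parts of the output
// (title, source URL, body, links) but not their exact formatting.
interface PageLink {
  text: string;
  href: string;
}

interface ExtractedArticle {
  title: string;
  url: string;
  bodyMarkdown: string;
  links: PageLink[];
}

// Joins the parts with blank lines; the links section is optional, mirroring
// the --no-links flag.
function renderArticle(article: ExtractedArticle, includeLinks = true): string {
  const parts: string[] = [
    `# ${article.title}`,
    `Source: ${article.url}`,
    article.bodyMarkdown.trim(),
  ];
  if (includeLinks && article.links.length > 0) {
    const items = article.links.map((link) => `- [${link.text}](${link.href})`);
    parts.push(["## Links on this page", ...items].join("\n"));
  }
  return parts.join("\n\n") + "\n";
}
```

Passing `false` as the second argument omits the links section entirely, which is what a caller would want when the link list would only add tokens.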
You can test the MCP server interactively using the MCP Inspector:
npx @modelcontextprotocol/inspector node dist/main.js

This opens a web UI where you can browse and invoke tools.
Alternatively, you can send JSON-RPC messages directly to stdin:
node dist/main.js

Then paste the following JSON-RPC messages (press Enter after each):
{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0.0"}}}
{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"read_webpage","arguments":{"url":"https://example.com"}}}

src/
  main.ts # Server entry point (MCP server + CLI mode)
  cli_args.ts # Command-line argument parsing with yargs
  browser_config.ts # BrowserConfig interface, defaults, and merge helper
  browser_manager.ts # Firefox browser lifecycle management
  tools.ts # Abstract McpTool base class
  stealth_reader.ts # read_webpage tool implementation
  content_extractor.ts # HTML-to-Markdown conversion with Turndown
  logger.ts # Structured logging with Pino
tests/
  content_extractor.test.ts # Unit tests for content extraction
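
The abstract tool base class in tools.ts is what makes new tools pluggable. Its real definition is not shown in this README, so the following is only a sketch of the general pattern: a base class fixes the contract (name, description, argument validation, execution), and each tool fills it in. The generic parameters, the `parse` and `call` methods, and `EchoUrlTool` are all hypothetical.

```typescript
// Hypothetical sketch of an abstract tool base class; the real McpTool in
// tools.ts may have a different shape.
abstract class McpTool<Input, Output> {
  abstract readonly name: string;
  abstract readonly description: string;

  // Validate untrusted arguments and narrow them to the tool's input type.
  abstract parse(args: unknown): Input;

  // Perform the tool's actual work.
  abstract execute(input: Input): Promise<Output>;

  // Shared entry point: every tool gets validation before execution.
  async call(args: unknown): Promise<Output> {
    return this.execute(this.parse(args));
  }
}

// Example subclass standing in for a new tool (not part of the project).
class EchoUrlTool extends McpTool<{ url: string }, string> {
  readonly name = "echo_url";
  readonly description = "Returns the normalized form of a URL.";

  parse(args: unknown): { url: string } {
    const url = (args as { url?: unknown } | null)?.url;
    if (typeof url !== "string") throw new TypeError("url must be a string");
    return { url: new URL(url).href }; // new URL() throws on malformed input
  }

  async execute(input: { url: string }): Promise<string> {
    return input.url;
  }
}
```

Keeping validation in the base class's `call` path means a new tool cannot accidentally skip it. The project lists zod as a dependency for schema validation, so its real tools likely validate with a schema instead of the hand-written check used here.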
# Build
npm run build
# Watch mode
npm run dev
# Run tests
npm test
# Watch tests
npm run test:watch

- @modelcontextprotocol/sdk - MCP server framework
- playwright - Browser automation (Firefox)
- @mozilla/readability - Article extraction
- turndown - HTML-to-Markdown conversion
- zod - Schema validation
- pino - Structured logging
- yargs - CLI argument parsing
MIT