v2.0.0 — Feature-complete technical SEO audit MCP server
librecrawl-mcp v2.0.0 marks the project as feature-complete for technical SEO auditing. It is a self-hosted SEO crawler for Claude / Cursor / Codex / Windsurf / Continue.dev, with 50+ technical checks, chunked-progressive crawling, ephemeral-by-default storage, and a PDF + 7 CSV sidecars per audit.
Highlights
- Chunked-progressive crawling that never hits the MCP client timeout: a background runner thread, SQLite WAL state, and a polling API. Enterprise sites with 10,000+ pages work the same as 50-page blogs.
- WAF / bot-block detection during the crawl: Cloudflare · Akamai · DataDome · Imperva · PerimeterX challenge pages are fingerprinted as `bot_block_challenge_detected`. No other open-source SEO crawler does this.
- Ephemeral by default: after the client downloads the zip bundle, the server deletes the session row, all artifact files on disk, AND the upstream LibreCrawl crawl record. The local client is the only memory of any audited site.
- AIMD adaptive crawl-delay: an additive-increase / multiplicative-decrease controller tunes the delay live from the target's p95 latency and 5xx rate. Polite by construction. Honours the robots.txt Crawl-Delay floor.
- Sitemap-orphan fill: URLs in the sitemap that are not reachable via internal-link traversal get a lightweight HTTP fetch and join the inbound-link graph. Closes the LibreCrawl `maxDepth` coverage gap.
- PDF + 7 CSV sidecars per audit: branded WeasyPrint PDF, Markdown source, per-page CSV with 30 columns, sitemap-recon CSV, external-links CSV, content-audit CSV, extended-checks CSV.
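
To make the AIMD crawl-delay highlight concrete, here is a minimal Python sketch of how such a controller could work. It is not the code shipped in librecrawl-mcp: the class name `AimdDelayController`, every threshold, and the step sizes are assumptions chosen for illustration. The sketch applies AIMD to the request rate (add a fixed step while the target is healthy, halve it when p95 latency or the 5xx rate crosses a limit) and derives the delay as the inverse of that rate, never going below the robots.txt Crawl-Delay.

```python
from dataclasses import dataclass


@dataclass
class AimdDelayController:
    """Hypothetical AIMD crawl-delay controller; all defaults are illustrative."""

    robots_crawl_delay: float = 0.0  # seconds; floor taken from robots.txt Crawl-Delay
    min_delay: float = 0.25          # assumed lower bound on the delay (seconds)
    max_delay: float = 30.0          # assumed upper bound on the delay (seconds)
    rate_step: float = 0.5           # additive increase, in requests per second
    backoff: float = 0.5             # multiplicative decrease factor for the rate
    p95_limit: float = 1.5           # assumed p95 latency threshold (seconds)
    error_limit: float = 0.05        # assumed 5xx-rate threshold (fraction of responses)
    rate: float = 1.0                # current request rate, in requests per second

    @property
    def delay(self) -> float:
        """Delay between requests, derived from the current rate."""
        return 1.0 / self.rate

    def update(self, p95_latency: float, error_rate: float) -> float:
        """Feed the latest window of measurements and return the new delay."""
        stressed = p95_latency > self.p95_limit or error_rate > self.error_limit
        if stressed:
            self.rate *= self.backoff    # multiplicative decrease: back off quickly
        else:
            self.rate += self.rate_step  # additive increase: probe gently

        # The delay may never drop below the robots.txt floor (or min_delay),
        # and is capped at max_delay unless the robots.txt floor is higher.
        floor = max(self.min_delay, self.robots_crawl_delay)
        max_rate = 1.0 / floor
        min_rate = 1.0 / self.max_delay
        self.rate = min(max(self.rate, min_rate), max_rate)
        return self.delay
```

A crawler loop would call `update()` once per measurement window and sleep for the returned delay between requests. Controlling the rate rather than the delay directly keeps the classic AIMD shape: recovery is slow and linear, while back-off after a latency spike or a burst of 5xx responses is immediate.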
Compatible AI agents
Claude Code · Claude Desktop · Cursor · OpenAI Codex CLI · Windsurf · Continue.dev · any MCP-compatible client over stdio or streamable-HTTP transport.
What it checks (50+ technical SEO checks)
Security headers · mixed content · WAF detection · sitemap cross-checks · hreflang full audit · canonicals (chain depth, relative URLs, targets that return 3xx) · redirect chains with destination · meta-refresh · JS-redirect · http-refresh · schema.org validation (16 types, schema.org spec + Google Rich Results) · URL quality · anchor text quality · broken bookmarks · internal nofollow patterns · image performance + CLS · HTML structure pathologies · accessibility / metadata · crawl-budget killers · dev leaks · content quality (Flesch · AI-tell tokens · missing punctuation · boilerplate) · external link validation (17 status classes).
Full check inventory in README.md.
Install
```bash
curl -fsSL https://raw.githubusercontent.com/adityaarsharma/librecrawl-mcp/main/install.sh | bash
```
Release-gate smoke test
Full audit on theculinarypeace.com: 460 pages crawled (200 LibreCrawl + 260 sitemap-fill), 8-file zip bundle (320 KB, sha256 verified), 620 findings across 15 distinct check classes. The server returned to its zero-memory baseline after the client downloaded the bundle. Every check class was verified either to fire on real production HTML or to be correctly absent on clean pages.
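
The smoke test mentions a sha256-verified zip bundle. As a rough sketch of what that check could look like on the client side, the Python below hashes a downloaded bundle and compares it with an expected digest. The helper names `sha256_of` and `verify_bundle` are hypothetical, and how the expected digest reaches the client is an assumption; the release notes do not describe that mechanism.

```python
import hashlib
import zipfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(path: Path, expected_sha256: str) -> list[str]:
    """Check the bundle digest and archive integrity, then list its members.

    `expected_sha256` is assumed to be supplied alongside the download.
    """
    actual = sha256_of(path)
    if actual != expected_sha256.lower():
        raise ValueError(f"sha256 mismatch: expected {expected_sha256}, got {actual}")
    with zipfile.ZipFile(path) as bundle:
        corrupt = bundle.testzip()  # name of the first member with a bad CRC, or None
        if corrupt is not None:
            raise ValueError(f"corrupt archive member: {corrupt}")
        return bundle.namelist()
```

For a release-gate run like the one above, the returned member list could then be compared against the expected file count of the bundle.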
Full CHANGELOG
See CHANGELOG.md for the v1.2.0 → v2.0.0 arc:
v1.2.0 Screaming-Frog parity · v1.4.0 chunked engine · v1.4.1 external-link validator · v1.5.0 PDF + content audit + extended checks + GSC + schema validation · v1.5.1 audit_complete respects sitemap coverage · v1.6.0 sitemap-orphan fill · v1.6.1 false-positive orphans fix · v1.6.2 in-content link extraction · v1.7.0 Tier 1 "fix broken" · v1.8.0 Tier 2 30+ technical checks · v1.9.0 ephemeral mode · v1.9.1 polish.
Reusable Claude Code skill
.claude/skills/librecrawl-audit/SKILL.md — drops into any project's .claude/skills/ directory (or ~/.claude/skills/ globally). Auto-loads when the user asks for site audit / SEO check / broken link / schema validation work.
License
MIT. Built on top of LibreCrawl (MIT).
By Aditya Sharma — github.com/adityaarsharma/librecrawl-mcp