Clone, scrape, and crawl any website. Free Firecrawl alternative. No API keys, no rate limits, no subscription.
Cursor (one-click): Click the "Install in Cursor" button above.
Manual MCP config:
{
  "mcpServers": {
    "deepcrawl-mcp": {
      "command": "npx",
      "args": ["-y", "deepcrawl-mcp@latest"]
    }
  }
}
CLI:
npx deepcrawl-mcp@latest
Scrape a single page and return clean markdown. Extracts title, description, links, images, and metadata. Strips navigation, footer, ads, and tracking.
"Scrape https://example.com and give me the main content"
| Parameter | Default | Description |
|---|---|---|
| url | required | Page URL to scrape |
| mainContentOnly | true | Extract only main content (skip nav/footer) |
| includeLinks | true | Include discovered links |
| includeImages | true | Include image URLs |
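To make the defaults in the table concrete, here is a minimal TypeScript sketch of how a client might fill them in before calling the tool. The `ScrapeArgs` interface and the `withScrapeDefaults` helper are hypothetical names used for illustration and are not part of deepcrawl; only the parameter names and default values come from the table.

```typescript
// Hypothetical argument shape, mirroring the parameter table above.
interface ScrapeArgs {
  url: string;
  mainContentOnly?: boolean;
  includeLinks?: boolean;
  includeImages?: boolean;
}

// Fill in the documented default for every option the caller omits.
function withScrapeDefaults(args: ScrapeArgs): Required<ScrapeArgs> {
  return {
    url: args.url,
    mainContentOnly: args.mainContentOnly ?? true,
    includeLinks: args.includeLinks ?? true,
    includeImages: args.includeImages ?? true,
  };
}

// Only includeImages is overridden; the other options keep their defaults.
const resolvedScrapeArgs = withScrapeDefaults({
  url: "https://example.com",
  includeImages: false,
});
console.log(resolvedScrapeArgs);
```

In practice the MCP client sends these arguments for you; the sketch only shows which values apply when an option is left out.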
Clone a full page with all assets: HTML, CSS, JS, images, fonts, favicons. Downloads everything into a local folder, rewrites URLs to relative paths. Open index.html in a browser and it works.
"Clone https://competitor.com into a local folder"
| Parameter | Default | Description |
|---|---|---|
| url | required | Page URL to clone |
| outputDir | ~/deepcrawl-clones/<domain> | Output folder |
| depth | 0 | Link depth: 0 = single page, 1+ = follow links |
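The default output folder depends on the domain of the page being cloned. The sketch below shows one way that path could be derived. It is an assumption about the behavior, not deepcrawl's actual code: `defaultCloneDir` is a hypothetical helper, and the real tool may normalize the domain differently (for example by stripping `www.`).

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: build ~/deepcrawl-clones/<domain> from a page URL.
// The home directory is a parameter so the function is easy to test.
function defaultCloneDir(pageUrl: string, home: string = os.homedir()): string {
  const domain = new URL(pageUrl).hostname; // e.g. "competitor.com"
  return path.join(home, "deepcrawl-clones", domain);
}

// The URL path is ignored; only the hostname decides the folder name.
const cloneDir = defaultCloneDir("https://competitor.com/pricing", "/home/me");
console.log(cloneDir);
```

Passing an explicit `outputDir` bypasses this default entirely.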
Crawl an entire site following internal links. Returns every page as clean markdown. Great for content analysis, SEO audit, or feeding a RAG pipeline.
"Crawl https://docs.example.com and return all pages as markdown"
| Parameter | Default | Description |
|---|---|---|
| url | required | Starting URL |
| maxPages | 20 | Max pages to crawl (max: 100) |
| includeImages | false | Include image URLs per page |
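The crawl behavior described above (follow internal links only, stop at `maxPages`, never exceed 100) can be sketched as a breadth-first traversal. This is an illustrative model rather than deepcrawl's implementation: the `LinkGraph` type and `crawlOrder` function are hypothetical, the traversal order is an assumption, and an in-memory link map stands in for real HTTP fetches.

```typescript
// Hypothetical stand-in for the web: each page URL maps to the links found on it.
type LinkGraph = Record<string, string[]>;

// Return the pages a breadth-first crawl would visit, in order.
function crawlOrder(start: string, graph: LinkGraph, maxPages: number = 20): string[] {
  // Clamp to the documented hard cap of 100 pages (and at least 1).
  const limit = Math.min(Math.max(1, Math.floor(maxPages)), 100);
  const origin = new URL(start).origin;
  const seen = new Set<string>([start]);
  const queue: string[] = [start];
  const visited: string[] = [];

  while (queue.length > 0 && visited.length < limit) {
    const page = queue.shift() as string;
    visited.push(page);
    for (const link of graph[page] ?? []) {
      // Follow only internal links, and queue each page at most once.
      if (new URL(link).origin === origin && !seen.has(link)) {
        seen.add(link);
        queue.push(link);
      }
    }
  }
  return visited;
}

const demoGraph: LinkGraph = {
  "https://docs.example.com/": [
    "https://docs.example.com/a",
    "https://other.com/x",
    "https://docs.example.com/b",
  ],
  "https://docs.example.com/a": ["https://docs.example.com/b", "https://docs.example.com/"],
};
const crawled = crawlOrder("https://docs.example.com/", demoGraph);
console.log(crawled);
```

The external link to `other.com` is skipped, and the cap applies to the number of pages visited, not the number of links discovered.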
Discover all URLs from a site via sitemap.xml parsing and homepage link crawling. Run this before a crawl to see the site's scope.
"Map all pages on https://example.com"
| Parameter | Default | Description |
|---|---|---|
| url | required | Site URL |
| maxUrls | 200 | Max URLs to discover |
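For a sense of what sitemap-based discovery involves, the sketch below pulls `<loc>` entries out of a sitemap document, removes duplicates, and stops at `maxUrls`. It covers only the sitemap half of the description (not homepage link crawling), uses a simple regular expression instead of a full XML parser, and does not decode XML entities. `urlsFromSitemap` is a hypothetical name, not a deepcrawl function.

```typescript
// Hypothetical helper: collect unique <loc> URLs from sitemap XML, up to maxUrls.
function urlsFromSitemap(xml: string, maxUrls: number = 200): string[] {
  const pattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  const seen = new Set<string>();
  const urls: string[] = [];
  let match: RegExpExecArray | null;

  while (urls.length < maxUrls && (match = pattern.exec(xml)) !== null) {
    const loc = match[1];
    if (loc !== undefined && !seen.has(loc)) {
      seen.add(loc);
      urls.push(loc);
    }
  }
  return urls;
}

const sampleSitemap = `
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
  <url><loc>https://example.com/</loc></url>
</urlset>`;
const discovered = urlsFromSitemap(sampleSitemap);
console.log(discovered);
```

Running the mapping step first gives a URL count you can compare against the crawl's page limit before starting a full crawl.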
| | deepcrawl | Firecrawl |
|---|---|---|
| Price | Free | $19+/mo |
| API key | None | Required |
| Rate limits | None | Yes |
| Scrape to markdown | Yes | Yes |
| Full site crawl | Yes | Yes |
| Site map | Yes | Yes |
| Clone with assets | Yes | No |
| JS rendering | Yes (via Playwright) | Yes |
| Anti-bot bypass | Partial (UA rotation, headers, delays) | Yes |
deepcrawl handles static sites out of the box. For JS-heavy SPAs (React, Next.js, SvelteKit), install Playwright for full rendering:
npm install -g playwright
npx playwright install chromium
Once installed, deepcrawl auto-detects Playwright and enables JS rendering. Use jsRender: true on any tool to activate it. All tools also include UA rotation, realistic browser headers, and random delays to avoid basic bot detection.
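The basic anti-bot measures mentioned above (UA rotation, browser-like headers, random delays) follow a common pattern that can be sketched in a few lines. Everything here is illustrative: the user-agent pool, header values, delay range, and function names are assumptions, not the values or code deepcrawl uses.

```typescript
// Illustrative pool of user-agent strings (assumed, not deepcrawl's list).
const USER_AGENTS: string[] = [
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4 Safari/605.1.15",
  "Mozilla/5.0 (X11; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0",
];

// Round-robin rotation: request N uses entry N modulo the pool size.
function pickUserAgent(requestIndex: number): string {
  return USER_AGENTS[requestIndex % USER_AGENTS.length] as string;
}

// Headers resembling what a desktop browser sends with a page request.
function requestHeaders(requestIndex: number): Record<string, string> {
  return {
    "User-Agent": pickUserAgent(requestIndex),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
  };
}

// Delay in [minMs, maxMs); the random source is injectable for testing.
function randomDelayMs(minMs: number, maxMs: number, random: () => number = Math.random): number {
  return Math.floor(minMs + random() * (maxMs - minMs));
}

console.log(requestHeaders(0), randomDelayMs(500, 1500));
```

Measures like these only help against simple checks, which matches the "Partial" rating in the comparison table.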
- brandcheck - Check brand name availability across 27 platforms
- depsonar - Dependency audit, security scan, license check
MIT