DataForge MCP

A Model Context Protocol (MCP) server for DataForge, a hosted web-scraping API. Plug it into Claude Desktop, Cursor, Cline, Windsurf, or n8n and your agent gains three new tools: scrape, scrape_structured, and scrape_markdown. It returns markdown out of the box, supports structured extraction via CSS selectors, and can optionally render JavaScript in Chromium for SPAs.

Install

One-line install via Smithery (recommended):

npx -y @smithery/cli@latest install @adametherzlab/dataforge-mcp --client claude

Or run directly with npx:

npx -y github:AdametherzLab/dataforge-mcp

Or install globally:

npm install -g @adametherzlab/dataforge-mcp
dataforge-mcp

Get an API key

Sign up for free at dataforge.adametherzlab.com (100 scrapes/mo, no credit card). The Pro plan is $29/mo for 25K scrapes. After signing up you'll get a key that looks like df_live_...

A zero-signup x402 mode (pay-per-call in USDC on Base, $0.002/call) is on the roadmap. See "x402 mode" below.

Configure your MCP host

Claude Desktop

~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "dataforge": {
      "command": "npx",
      "args": ["-y", "@adametherzlab/dataforge-mcp"],
      "env": {
        "DATAFORGE_API_KEY": "df_live_..."
      }
    }
  }
}
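
If claude_desktop_config.json already lists other servers, the dataforge entry has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge, not part of this package: addDataforgeServer is a hypothetical helper name, and the "other" server in the example is made up.

```javascript
// Hypothetical helper (not shipped with dataforge-mcp): returns a new config
// object with a "dataforge" entry under mcpServers, keeping existing servers.
function addDataforgeServer(config, apiKey) {
  const existing = (config && config.mcpServers) || {};
  return {
    ...config,
    mcpServers: {
      ...existing,
      dataforge: {
        command: "npx",
        args: ["-y", "@adametherzlab/dataforge-mcp"],
        env: { DATAFORGE_API_KEY: apiKey },
      },
    },
  };
}

// Example: a config that already contains another (made-up) server entry.
const before = { mcpServers: { other: { command: "node", args: ["server.js"] } } };
const after = addDataforgeServer(before, "df_live_...");
console.log(JSON.stringify(after, null, 2));
```

The helper copies rather than mutates, so the original object can still be written back unchanged if something goes wrong.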

Cursor

Settings -> MCP -> Add new MCP server:

{
  "dataforge": {
    "command": "npx",
    "args": ["-y", "@adametherzlab/dataforge-mcp"],
    "env": { "DATAFORGE_API_KEY": "df_live_..." }
  }
}

Cline (VS Code)

cline_mcp_settings.json:

{
  "mcpServers": {
    "dataforge": {
      "command": "npx",
      "args": ["-y", "@adametherzlab/dataforge-mcp"],
      "env": { "DATAFORGE_API_KEY": "df_live_..." },
      "disabled": false,
      "autoApprove": []
    }
  }
}

Windsurf

~/.codeium/windsurf/mcp_config.json:

{
  "mcpServers": {
    "dataforge": {
      "command": "npx",
      "args": ["-y", "@adametherzlab/dataforge-mcp"],
      "env": { "DATAFORGE_API_KEY": "df_live_..." }
    }
  }
}

n8n (community MCP node)

Same shape, supplied via the node's "Server config" JSON field.

Tools

scrape(url, format?, js_render?)

Fetch a URL and return its content. format is one of markdown (default, LLM-friendly), html, or text. Set js_render=true for SPAs (slower, uses Playwright Chromium).

Example invocation by an agent:

{ "url": "https://news.ycombinator.com", "format": "markdown" }

Returns:

{ "status": 200, "content": "# Hacker News...", "duration_ms": 115, "remaining": 999977 }
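
As an illustration of the argument shape above, here is a small sketch that builds and checks scrape arguments before they are handed to the tool. buildScrapeArgs and its jsRender option are hypothetical names used for this example only; the server's own validation may differ.

```javascript
// Formats listed in this README for the scrape tool.
const FORMATS = ["markdown", "html", "text"];

// Hypothetical helper: builds the argument object for a scrape call.
function buildScrapeArgs(url, { format = "markdown", jsRender = false } = {}) {
  new URL(url); // throws a TypeError if the URL is malformed
  if (!FORMATS.includes(format)) {
    throw new RangeError(`format must be one of: ${FORMATS.join(", ")}`);
  }
  const args = { url, format };
  if (jsRender) args.js_render = true; // only sent when rendering is requested
  return args;
}

console.log(buildScrapeArgs("https://news.ycombinator.com"));
console.log(buildScrapeArgs("https://example.com", { format: "text", jsRender: true }));
```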

scrape_structured(url, schema, js_render?)

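Extract structured data by passing a JSON object that maps field names to CSS selectors. In the example below, each top-level key names a list with a container selector, and each entry in fields is either a bare selector string or an object with selector and attr to read an attribute instead.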

{
  "url": "https://news.ycombinator.com",
  "schema": {
    "stories": {
      "selector": ".athing",
      "fields": {
        "title": ".titleline a",
        "link":  { "selector": ".titleline a", "attr": "href" }
      }
    }
  }
}

Returns:

{ "status": 200, "extracted": { "stories": [ {"title": "...", "link": "..."}, ... ] }, "duration_ms": 412 }
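
To make the schema shape easier to check before sending it, the sketch below flattens a schema into one row per field. It only handles the two field forms shown in the example above (a selector string, or an object with selector and attr); whether the API accepts other forms is not stated here, and describeSchema is a hypothetical name.

```javascript
// Hypothetical helper: lists every selector a schema would use, one row per field.
function describeSchema(schema) {
  const rows = [];
  for (const [group, spec] of Object.entries(schema)) {
    if (!spec || typeof spec.selector !== "string" || !spec.fields || typeof spec.fields !== "object") {
      throw new TypeError(`"${group}" needs a selector string and a fields object`);
    }
    for (const [field, rule] of Object.entries(spec.fields)) {
      // A field is either a selector string or { selector, attr }.
      const selector = typeof rule === "string" ? rule : rule && rule.selector;
      const attr = typeof rule === "string" ? null : (rule && rule.attr) || null;
      if (typeof selector !== "string") {
        throw new TypeError(`"${group}.${field}" has no selector`);
      }
      rows.push({ group, container: spec.selector, field, selector, attr });
    }
  }
  return rows;
}

const schema = {
  stories: {
    selector: ".athing",
    fields: {
      title: ".titleline a",
      link: { selector: ".titleline a", attr: "href" },
    },
  },
};
console.table(describeSchema(schema));
```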

scrape_markdown(url, js_render?)

Convenience wrapper around scrape with format=markdown. Returns a markdown field instead of content.
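
One way such a wrapper could look is sketched below. This is an assumption about the shape, not the server's actual code: scrapeFn stands in for whatever performs the scrape call, and fakeScrape is a stub used only so the example runs offline.

```javascript
// Sketch: call scrape with format=markdown and rename content -> markdown.
// scrapeFn is a hypothetical async function taking the scrape tool's arguments.
async function scrapeMarkdown(scrapeFn, url, jsRender = false) {
  const { content, ...rest } = await scrapeFn({
    url,
    format: "markdown",
    js_render: jsRender,
  });
  return { ...rest, markdown: content };
}

// Offline stand-in for the real tool, for demonstration only.
const fakeScrape = async (args) => ({
  status: 200,
  content: `# Page at ${args.url}`,
  duration_ms: 0,
});

scrapeMarkdown(fakeScrape, "https://example.com").then((result) => {
  console.log(result);
});
```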

Why DataForge over Firecrawl MCP?

  • Zero signup (coming soon): pay $0.002 per scrape in USDC on Base via x402. Your agent can pay its own way without an account.
  • Built-in markdown: format=markdown is a first-class output; no separate "convert" step.
  • Structured extraction via plain CSS selectors: no LLM call needed for HN/Reddit/listing pages.
  • Generous free tier: 100 scrapes/mo, no credit card. Pro $29/mo for 25K.
  • JS rendering on demand: flip js_render=true for SPAs; cheap pages stay cheap.
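
For a rough sense of how the listed prices compare, the arithmetic can be sketched as follows. It uses only the figures quoted in this README ($29/mo for 25K scrapes, $0.002 per x402 scrape), works in integer micro-dollars to avoid floating-point rounding, and ignores the free tier and the higher js_render price.

```javascript
const MICRO = 1_000_000;           // micro-dollars per dollar
const PRO_MONTHLY = 29 * MICRO;    // $29/mo
const PRO_SCRAPES = 25_000;        // scrapes included in Pro
const X402_PER_SCRAPE = 2_000;     // $0.002 per call, in micro-dollars

// Effective per-scrape price if the whole Pro quota is used.
const proPerScrape = PRO_MONTHLY / PRO_SCRAPES;

// Monthly volume at which pay-per-call costs the same as Pro.
const breakEven = PRO_MONTHLY / X402_PER_SCRAPE;

console.log(`Pro at full quota: $${proPerScrape / MICRO} per scrape`);
console.log(`x402 matches the Pro price at ${breakEven} scrapes per month`);
```

Below that break-even volume, pay-per-call is the cheaper of the two at these list prices; above it, the Pro plan is.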

x402 mode (zero-signup, roadmap)

Once this mode ships, set both env vars to skip the signup flow and pay per scrape:

{
  "env": {
    "DATAFORGE_X402_WALLET": "0x...",
    "DATAFORGE_X402_PRIVATE_KEY": "0x..."
  }
}

When enabled, the server will route to https://x402.adametherzlab.com/api/scrape ($0.002/call) or /api/scrape-js ($0.005/call when js_render=true), perform the x402 challenge/sign/resubmit flow, and pay automatically.

In v0.1.0 this mode is stubbed; it returns a clear "coming soon" error so you know to use DATAFORGE_API_KEY for now. The x402 client integration ships in v0.2.0.
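
The routing described above can be sketched as a pair of small functions. This is an illustration under assumptions, not the server's code: pickMode and x402Route are hypothetical names, and giving DATAFORGE_API_KEY precedence when both modes are configured is a guess.

```javascript
const X402_BASE = "https://x402.adametherzlab.com";

// Decide which auth mode the environment selects.
// Assumption: an API key wins if both kinds of credentials are present.
function pickMode(env) {
  if (env.DATAFORGE_API_KEY) return "api_key";
  if (env.DATAFORGE_X402_WALLET && env.DATAFORGE_X402_PRIVATE_KEY) return "x402";
  return "unconfigured";
}

// Map a request to the x402 endpoint and price quoted in this README.
function x402Route(jsRender) {
  return jsRender
    ? { url: `${X402_BASE}/api/scrape-js`, priceUsd: "0.005" }
    : { url: `${X402_BASE}/api/scrape`, priceUsd: "0.002" };
}

console.log(pickMode({ DATAFORGE_API_KEY: "df_live_..." }));
console.log(pickMode({ DATAFORGE_X402_WALLET: "0x...", DATAFORGE_X402_PRIVATE_KEY: "0x..." }));
console.log(x402Route(true));
```

Prices are kept as strings here so they can be displayed or passed on without floating-point surprises.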

Publishing (maintainers only)

cd /opt/dataforge-mcp
npm install
node src/index.js  # smoke test (Ctrl-C to exit)
npm publish --access public

Links

MIT.
