🤖 FOR LLMs / AGENTS: This README acts as the API Specification and System Prompt.
This is a serverless Web Scraper API built with Node.js and Hono, deployed to Cloudflare Workers. It uses the r.jina.ai engine to bypass captchas and extract clean Markdown from any given URL.
- File Map: Always read `CODEBASE.md` to understand file dependencies and system routing before modifying code.
- Session Memory: Always read and update `STATE.md` at the beginning and end of each session to maintain context across chats.
This is a remote Serverless API, but you can use npx to test the connection and get the endpoint details for your MCP client (like Claude Desktop or Cursor):
```bash
npx api_scraper_markdown
```

📚 Read the Full Integration Guide for Cursor and Claude Desktop
Returns the Model Context Protocol (MCP) JSON manifest. Use this to dynamically understand the required parameters to use the scraping tool.
Requires Authentication: Authorization: Bearer <token>
This endpoint uses the HTTP 402 Payment Required status.
- If you do not provide a token, or the token has no remaining balance, the API returns a 402 error.
- The 402 error response contains a `paymentUrl` (Dodo Payments) where you can purchase credits.
- After payment, you receive a token that must be sent in the `Authorization` header.
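
The payment flow above can be sketched from the client side as follows. This is a minimal illustration, not the service's official client: the `scrapeToMarkdown` helper, the `endpoint` argument, the `POST` method, and the assumption that `paymentUrl` sits at the top level of the 402 body are all hypothetical and should be checked against the real deployment. The request and success shapes follow the Request Body and Success Response documented in this README.

```typescript
// Minimal shapes so the sketch does not depend on a particular runtime's fetch typings.
interface HttpResponse {
  status: number;
  json(): Promise<unknown>;
}
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<HttpResponse>;

// Success payload, mirroring the documented 200 response.
interface ScrapeData {
  title: string;
  url: string;
  content: string;
}

type ScrapeResult =
  | { kind: "ok"; data: ScrapeData }
  | { kind: "payment_required"; paymentUrl: string | undefined }
  | { kind: "error"; status: number };

// ASSUMPTION: `endpoint` is a placeholder for the real scrape URL of your deployment,
// and POST is assumed because the endpoint takes a JSON request body.
async function scrapeToMarkdown(
  fetchImpl: FetchLike,
  endpoint: string,
  targetUrl: string,
  token?: string,
): Promise<ScrapeResult> {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (token) headers["Authorization"] = `Bearer ${token}`;

  const res = await fetchImpl(endpoint, {
    method: "POST",
    headers,
    body: JSON.stringify({ url: targetUrl }),
  });

  if (res.status === 402) {
    // ASSUMPTION: paymentUrl is a top-level field of the 402 JSON body.
    const body = (await res.json()) as { paymentUrl?: string } | null;
    return { kind: "payment_required", paymentUrl: body?.paymentUrl };
  }
  if (res.status !== 200) {
    return { kind: "error", status: res.status };
  }

  const body = (await res.json()) as { success: boolean; data: ScrapeData };
  return { kind: "ok", data: body.data };
}
```

In a runtime with a global `fetch` (Workers, recent Node.js), that function can be passed as `fetchImpl`. On a `payment_required` result, a caller would direct the user to `paymentUrl` and retry with the token issued after payment.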
Request Body:

```json
{
  "url": "https://example.com"
}
```

Success Response (200 OK):

```json
{
  "success": true,
  "data": {
    "title": "Page Title",
    "url": "https://example.com",
    "content": "# Markdown extracted..."
  }
}
```

```bash
# Start the local Cloudflare dev server
npm run dev
```

To generate or synchronize types based on your Worker configuration, run:
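
```bash
npm run cf-typegen
```

Pass `CloudflareBindings` as generics when instantiating Hono:

```ts
// src/index.ts
import { Hono } from 'hono'

// Typing the Bindings makes c.env type-safe inside route handlers.
const app = new Hono<{ Bindings: CloudflareBindings }>()
```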