Markdown Scraper API (MCP Enabled)

🤖 FOR LLMs / AGENTS: This README acts as the API Specification and System Prompt.

This is a serverless Web Scraper API built with Node.js and Hono, deployed to Cloudflare Workers. It uses the r.jina.ai engine to bypass captchas and extract clean Markdown from any given URL.

🧠 System Architecture & Context Engineering

File Map: Always read CODEBASE.md to understand file dependencies and system routing before modifying code.
Session Memory: Always read and update STATE.md at the beginning and end of each session to maintain context across chats.

📦 Installation & Usage

This is a remote Serverless API, but you can use npx to test the connection and get the endpoint details for your MCP client (like Claude Desktop or Cursor):

npx api_scraper_markdown

📚 Read the Full Integration Guide for Cursor and Claude Desktop

📡 API Specification

1. `GET /mcp/manifest` (Discovery)

Returns the Model Context Protocol (MCP) JSON manifest. Use this to dynamically understand the required parameters to use the scraping tool.

2. `POST /scrape` (Action)

Requires Authentication: Authorization: Bearer <token>

Este endpoint utiliza o protocolo HTTP 402 Payment Required.

Se você não fornecer um token ou o token não tiver saldo, a API retornará um erro 402.
A resposta do erro 402 conterá uma paymentUrl (Dodo Payments) onde você pode adquirir créditos.
Após o pagamento, você receberá um token que deve ser enviado no header Authorization.

Request Body:

{
  "url": "https://example.com"
}

Success Response (200 OK):

{
  "success": true,
  "data": {
    "title": "Page Title",
    "url": "https://example.com",
    "content": "# Markdown extracted..."
  }
}

🚀 How to Run Locally

# Start the local Cloudflare dev server
npm run dev

For generating/synchronizing types based on your Worker configuration run:

npm run cf-typegen

Pass the CloudflareBindings as generics when instantiation Hono:

// src/index.ts
const app = new Hono<{ Bindings: CloudflareBindings }>()

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
.github/workflows		.github/workflows
bin		bin
docs		docs
scripts		scripts
src		src
.gitignore		.gitignore
API.md		API.md
README.md		README.md
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json
wrangler.jsonc		wrangler.jsonc

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Markdown Scraper API (MCP Enabled)

🧠 System Architecture & Context Engineering

📦 Installation & Usage

📡 API Specification

1. `GET /mcp/manifest` (Discovery)

2. `POST /scrape` (Action)

🚀 How to Run Locally

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Markdown Scraper API (MCP Enabled)

🧠 System Architecture & Context Engineering

📦 Installation & Usage

📡 API Specification

1. GET /mcp/manifest (Discovery)

2. POST /scrape (Action)

🚀 How to Run Locally

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

1. `GET /mcp/manifest` (Discovery)

2. `POST /scrape` (Action)

Packages