Skip to content

NormVg/webfn

Repository files navigation

Webfn Banner

Agent-oriented CLI for browser-backed search, fetch, and crawl workflows.

Node Version License PRs Welcome


Overview

Webfn is the ultimate web data extraction tool designed specifically for AI Agents and Automated Pipelines.

Whether you need to bypass anti-bot protections, extract clean markdown from heavily javascript-rendered pages, or perform deep site crawls, Webfn provides a seamless and robust interface.

  • Search the web and extract structured JSON results.
  • Fetch rendered SPA pages and extract human-readable markdown.
  • Crawl entire websites from sitemaps or internal links.
  • Screenshot pages natively or via the cloud.
  • Auto-Detect agent mode — outputs pure JSON when piped, rich CLI UI when in a terminal.

Architecture

Webfn Banner

Benchmarks

Webfn supports both Chrome and Lightpanda engines out of the box. End-to-end fetch benchmark (news.ycombinator.com):

Engine Average Time (ms) Speedup
Chrome 2630ms Baseline
Lightpanda 2537ms 1.04x faster

Official Engine Benchmarks (Raw Execution):

  • Memory Footprint: Lightpanda uses 9x less memory than Chrome.
  • Raw Execution Speed: Lightpanda is up to 11x faster than Chrome.

Installation

# Install globally via npm
npm install -g webfn-cli

# Or via pnpm
pnpm add -g webfn-cli

Once installed, the webfn command is available globally:

webfn fetch https://example.com
webfn search "ai agents"

No install needed with npx:

npx webfn-cli fetch https://example.com
npx webfn-cli screenshot https://example.com --full

Prerequisites

Webfn works out of the box with Chrome/Chromium (auto-detected). For the best experience, you should also set up Lightpanda — a fast, lightweight headless browser that webfn uses by default.

Lightpanda (Recommended)

Lightpanda is the default headless engine for search, fetch, and crawl commands. Without it, webfn falls back to your local Chrome installation.

Install Lightpanda:

Please refer to the official docs for installation instructions: https://lightpanda.io/docs/open-source/installation

After installing, verify it works:

webfn doctor

If you skip Lightpanda, all commands still work — webfn will automatically fall back to your local Chrome. Use --engine chrome to always force Chrome.

Chrome / Chromium

Make sure Google Chrome or Chromium is installed. Webfn will auto-detect it. You can also point to a custom binary:

webfn fetch https://example.com --chrome /path/to/chrome

Getting Started

Ensure you have Node.js installed, then:

pnpm install
pnpm build

Verify your environment dependencies:

pnpm dev doctor

Agent-First Design

Webfn is built to be controlled by LLMs and Agents.

Official Agent Skill

If you use an agent that supports the Skills CLI (like Claude Code, Cursor, Antigravity, or Trae), you can install the official webfn skill. This automatically provides your agent with full context on how to use the CLI efficiently, bypassing bot protections and extracting clean data.

# Install the webfn skill for your AI agent
npx skills add NormVg/webfn

Once installed, simply ask your agent to "look something up", "fetch a webpage", or "do research", and it will autonomously invoke webfn.

Piped Outputs

When stdout is piped (e.g., an AI agent calling the CLI), webfn automatically suppresses all progress bars and interactive UI, outputting a single, strictly formatted JSON envelope.

# Agent mode (auto-detected):
webfn fetch https://example.com | jq .page.title

# Force JSON in a TTY:
webfn fetch https://example.com --json

# Human-friendly rich output (default in terminal):
webfn fetch https://example.com

All JSON output follows a consistent format:

{
  "ok": true,
  "command": "fetch",
  "page": { "title": "Example", "markdown": "..." },
  "browser": { "engine": "lightpanda", "headless": true },
  "files": [ ... ],
  "storage": { ... }
}

Core Commands

# Search for a topic
webfn search "ai agents"

# Search and download the full HTML/Markdown of the top results
webfn collect "ai agents"

# Fetch a specific page and convert to markdown
webfn fetch https://example.com

# Take a full-page screenshot
webfn screenshot https://example.com --full

# Crawl a site up to a specific depth
webfn crawl https://example.com --depth 2 --max-pages 50

For comprehensive configuration and documentation, see docs.md.

Built With

Webfn leverages powerful open-source libraries:

Contributing

Contributions are always welcome!

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature/amazing-feature).
  3. Commit your changes (git commit -m 'Add some amazing feature').
  4. Push to the branch (git push origin feature/amazing-feature).
  5. Open a Pull Request.

Please ensure you run pnpm run format and verify your changes don't break the agent JSON output structures.

License

This project is licensed under the MIT License - see the LICENSE file for details.

About

Agent-oriented CLI for browser-backed search, fetch, crawl, and scrape workflows.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors