webpull

Pull any public docs site into local markdown files.

$ webpull https://docs.example.com

  ⚡ webpull · 16 workers
  docs.example.com → ./docs.example.com

  ●●●·●●●●·●●●●●●●·
  ├─ ✓ getting-started/installation.md
  ├─ ✓ api/authentication.md
  ├─ ✓ guides/deployment.md
  █████████████░░░░░░░ 68% 102/150 · 6p/s · 17.2s

Install

bun install -g webpull

Usage

webpull <url> [options]

Options:
  -o, --out <dir>   Output directory (default: ./<hostname>)
  -m, --max <n>     Max pages to pull (default: 500)

Examples

# Pull React docs
webpull https://react.dev/reference

# Custom output dir, limit to 100 pages
webpull https://docs.python.org -o ./python-docs -m 100

How it works

1. Discovers pages via sitemap.xml, nav link extraction, or link crawling
2. Fetches in parallel using a worker pool sized to your CPU cores
3. Converts to markdown using Defuddle for intelligent content extraction
4. Writes to disk preserving the URL path structure with YAML frontmatter
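
The parallel fetch step can be pictured with a small worker pool. This is a minimal sketch and not webpull's actual source: `runPool`, `defaultPoolSize`, and `fetchAll` are hypothetical names, and sizing the pool with `cpus().length` is an assumption about how "sized to your CPU cores" might be implemented.

```typescript
import { cpus } from "node:os";

// Hypothetical helper: pool size derived from the CPU core count (assumption).
function defaultPoolSize(): number {
  return Math.max(1, cpus().length);
}

// Hypothetical helper: run `task` over `items` with at most `size` tasks in flight.
// Results are returned in the same order as `items`.
async function runPool<T, R>(
  items: T[],
  size: number,
  task: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each worker claims the next index until the queue is empty.
  // The check and the claim happen synchronously, so no index is taken twice.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await task(items[i]);
    }
  }

  const count = Math.min(size, items.length);
  await Promise.all(Array.from({ length: count }, () => worker()));
  return results;
}

// Hypothetical usage: fetch every page body with the pool.
// A single failed request rejects the whole batch in this sketch.
async function fetchAll(urls: string[]): Promise<string[]> {
  return runPool(urls, defaultPoolSize(), async (url) => {
    const res = await fetch(url);
    if (!res.ok) throw new Error(`HTTP ${res.status} for ${url}`);
    return res.text();
  });
}
```

A fixed set of workers pulling from a shared queue keeps the number of open requests bounded regardless of how many pages were discovered.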

Each markdown file includes metadata:

---
title: "Getting Started"
url: "https://docs.example.com/getting-started"
---
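
To make the output layout concrete, here is a rough sketch of how a page URL might map to a file path and how the frontmatter block might be assembled. The function names are hypothetical, and details such as naming the site root `index.md` or stripping a trailing `.html` are assumptions rather than documented webpull behavior.

```typescript
// Hypothetical helper: default output directory, following "./<hostname>".
function defaultOutDir(siteUrl: string): string {
  return `./${new URL(siteUrl).hostname}`;
}

// Hypothetical helper: map a page URL to a relative markdown path
// that mirrors the URL path structure.
function urlToRelativePath(pageUrl: string): string {
  const { pathname } = new URL(pageUrl);
  // Drop leading and trailing slashes.
  const trimmed = pathname.replace(/^\/+|\/+$/g, "");
  // Assumption: the site root is written as "index", and ".html" is dropped.
  const base = trimmed === "" ? "index" : trimmed.replace(/\.html?$/i, "");
  return `${base}.md`;
}

// Hypothetical helper: prepend YAML frontmatter to a markdown body.
// JSON.stringify produces a double-quoted, escaped string, which is
// also a valid YAML scalar.
function withFrontmatter(title: string, url: string, body: string): string {
  return [
    "---",
    `title: ${JSON.stringify(title)}`,
    `url: ${JSON.stringify(url)}`,
    "---",
    "",
    body,
  ].join("\n");
}
```

Under these assumptions, `https://docs.example.com/getting-started/installation` would be written to `getting-started/installation.md` inside `./docs.example.com`.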

Requirements

Bun runtime

License

MIT
