Curated AI search for agents. Bring your own sources, your own LLM. The self-hosted, agent-ready alternative to Tavily and Perplexity for trusted-source research.
vouch is a Python library that lets you register a list of trusted sites with metadata (category, description, tags), and then runs intelligent searches across them using any LLM — including local models via Ollama. The AI dynamically figures out how to search each site (no manual selectors), caches what it learns, and returns structured results ready to feed an agent.
from vouch import SearchEngine, Site
engine = SearchEngine(llm="ollama/qwen2.5:14b")
engine.add(Site("cvm.gov.br", category="regulação financeira BR"))
engine.add(Site("arxiv.org", category="papers acadêmicos", tags=["ml", "ai"]))
results = engine.search("LLM agents in 2026", depth=2)
for r in results.chunks:
    print(f"[{r.source_url}] {r.title}\n{r.snippet}\n")

- Why vouch
- How it compares
- Installation
- Quick start
- Three ways to use
- Core concepts
- The Site model
- Search depth
- Smart Site Routing
- Selector cache
- Profile registry
- LLM configuration (BYOK)
- Using as an agent tool
- Optional features
- Configuration reference
- Architecture
- Cost and performance
- Limits and what's out of scope
- Roadmap
- Contributing
- References and prior art
- CSS selector pinning — the AI now examines DOM context and emits reusable `{container, title, url, snippet, date, author}` selectors. Cached in SQLite; subsequent calls replay via `lxml.cssselect` — no LLM, milliseconds.
- Auto-escalation chain — `http → browser → stealth`. The engine learns the cheapest tier each site needs and persists it.
- Browser pool — one shared Chromium across all searches in a session. Eliminates the per-call launch overhead and the Windows-on-fresh-fork instability.
- Plugin model — `pip install vouch-adapter-arxiv` (or anything matching `vouch-adapter-*`, `vouch-router-*`, `vouch-profiles-*`) and the next `SearchEngine` picks it up automatically via Python entry_points. See docs/plugins.md.
- Profile registry + auto-update — 24 curated profiles ship in the box (arxiv, github, huggingface, gov.uk, irs.gov, sede.agenciatributaria, jusbrasil, MDN, pypi, …). `vouch profiles update` pulls the latest community registry.
- Auto-DNS resolution — fixes wrong-host catalog entries automatically (`agenciatributaria.es` → `sede.agenciatributaria.gob.es`).
- Async API — `await engine.asearch(query, ...)` alongside the sync API.
- Result formatting — `result.to_markdown()`, `to_json()`.
- Smarter quality detector — flags megamenu shells, recursive search-page links, and no-results sentinels in PT/ES/EN.
- Probe crawl on `add()` — opt-in: the site profile is learned the moment you add it, not on the first user query.
- Package renamed from `curio` to `vouch` (the `curio` name on PyPI has been taken by David Beazley's async library since 2016).
See CHANGELOG.md for the full list.
The agentic AI ecosystem has two kinds of search tools:
- Open-web search (Tavily, Exa, Perplexica, Firecrawl `/search`): great for broad questions, but you can't tell them "only use these sources I trust."
- Browser agents (Stagehand, Skyvern, browser-use): great at navigating one site at a time, but they don't know which site to navigate to and don't keep a registry of your trusted sources.
vouch fills the gap in between: a persistent catalog of sites you trust, with an LLM that reads your descriptions to route each query to the right sources, a browser layer that learns how to search each site once and caches the selectors, and a clean tool interface for CrewAI / LangChain / PydanticAI / MCP.
It's what you'd build if you wanted Tavily, but for your own list of sources, using your own LLM, without paying anyone.
| | vouch | Tavily / Exa | Perplexica | browser-use | Stagehand | Kagi Lenses |
|---|---|---|---|---|---|---|
| Curated source list with metadata | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| LLM routing by description | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Dynamic search-bar discovery | ✅ | ❌ | ❌ | ✅ | ✅ | ❌ |
| Cached selectors after first visit | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ |
| Adjustable depth (0–3) | ✅ | ❌ | ❌ | ❌ | | |
| BYOK + local LLM (Ollama) | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ |
| Self-hosted Python `pip install` | ✅ | ❌ | ❌ (TS) | ✅ | ❌ (TS) | ❌ |
| First-class agent tool | ✅ | ✅ | ❌ | ❌ | | |
Minimal install (HTTP-only, no browser):
pip install vouch

Most users want the browser layer too:
pip install "vouch[browser]"
vouch install-browser   # downloads Chromium for Playwright

Optional extras:
pip install "vouch[browser,stealth]" # patchright for harder sites
pip install "vouch[browser,ocr]" # Tesseract CAPTCHA tier (CPU, no GPU)
pip install "vouch[browser,vision]" # image handling for vision-LLM CAPTCHA
pip install "vouch[browser,monitor]" # APScheduler for change tracking
pip install "vouch[browser,server]" # FastAPI dashboard
pip install "vouch[mcp]" # MCP server (Claude Desktop, Cursor)
pip install "vouch[all]" # everything above
pip install "vouch[crewai]" # framework integrations (separate)
pip install "vouch[langchain]"
pip install "vouch[pydantic-ai]"

Requirements: Python 3.10+. Linux, macOS, or Windows. ~500MB disk after install-browser.
# 1. Make sure Ollama is running with a capable model
ollama pull qwen2.5:14b
# 2. Install vouch
pip install "vouch[browser]"
vouch install-browser
# 3. One-liner search
python -c "from vouch import search; print(search('arxiv attention transformers', sites=['arxiv.org']))"

Or with a cloud LLM:

export ANTHROPIC_API_KEY=sk-ant-...
pip install "vouch[browser]"
vouch install-browser
python -c "from vouch import search; print(search('FDA drug approvals 2025', sites=['fda.gov'], llm='anthropic/claude-haiku-4-5'))"

vouch has three progressive levels of API. Pick the one that fits your case — they all coexist and use the same engine underneath.
For quick scripts or trying it out:
from vouch import search
results = search(
query="regulamentação fundos 2026",
sites=["cvm.gov.br", "valor.globo.com"],
llm="ollama/qwen2.5:14b",
depth=1,
)

No catalog, no metadata, no persistence. Ephemeral. Good for `python -c "..."`.
The typical workflow. Build a catalog in code:
from vouch import SearchEngine, Site
engine = SearchEngine(
llm="ollama/qwen2.5:14b",
cache_dir="~/.vouch", # selectors and metadata persist here
)
engine.add(Site(
url="cvm.gov.br",
category="regulação financeira BR",
description="Comissão de Valores Mobiliários — regulamentos, deliberações, consultas públicas",
tags=["gov", "financeiro", "regulação"],
))
engine.add(Site(
url="arxiv.org",
category="papers acadêmicos",
description="Preprints em CS, math, physics, biology",
tags=["research", "ml", "ai"],
))
# All metadata fields are optional. The LLM router uses whatever is provided.
engine.add(Site("news.ycombinator.com"))
# Search with automatic routing
results = engine.search("transformer architectures", depth=2)

The catalog persists in SQLite (`~/.vouch/catalog.db`). You can `engine.list_sites()`, `engine.remove("arxiv.org")`, `engine.update("arxiv.org", tags=[...])`. (`engine.list()` still works as a deprecated alias through v0.x.)
For projects you want to version-control and share:
# sites.yaml
defaults:
  llm: ollama/qwen2.5:14b
  depth: 1
  humanize: true

sites:
  - url: cvm.gov.br
    category: regulação financeira BR
    description: Comissão de Valores Mobiliários
    tags: [gov, financeiro]

  - url: arxiv.org
    category: papers acadêmicos
    description: Preprints em CS, math, physics
    tags: [research, ml, ai]

  - url: news.ycombinator.com
    # no metadata — that's fine, the LLM will infer from the URL

  - url: site-protegido.com
    behavior: stealth       # use patchright for this site
    rate_limit: "1/30s"     # be extra polite

from vouch import SearchEngine
engine = SearchEngine.from_yaml("sites.yaml")
results = engine.search("LLM regulation")Same engine, same SQLite cache, just a different way to declare the catalog. Good for CI, sharing with teammates, or templating per-project.
vouch is built around four primitives. Understanding them helps you customize what you need:
┌──────────────────────────────────────────────────────────────┐
│ SearchEngine │
│ the orchestrator — you mostly talk to this │
└──────────────────────────────────────────────────────────────┘
│
├─── Catalog ──────── persistent registry of Sites
│
├─── Router ────────── picks relevant Sites for a query
│ (uses Site metadata + LLM)
│
├─── SiteAdapter ───── per-site search executor
│ ├─ discovers selectors (LLM, first time)
│ ├─ caches selectors (SQLite, every time after)
│ └─ executes search + extraction
│
└─── Aggregator ────── ranks, dedups, returns SearchResult
Each is a Protocol — you can replace any of them. Want a custom router that uses a vector DB instead of LLM? Implement Router. Want to plug a paid scraping service? Implement SiteAdapter.
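For instance, a custom router might look like the sketch below. This is a standalone illustration, not vouch's actual interface — the `route(query, sites, top_k)` shape and the simplified `Site` stand-in are assumptions; check `vouch/router/base.py` for the real Protocol.

```python
from dataclasses import dataclass, field
from typing import Protocol

@dataclass
class Site:  # simplified stand-in for vouch's Site model
    url: str
    tags: list[str] = field(default_factory=list)

class Router(Protocol):
    def route(self, query: str, sites: list[Site], top_k: int) -> list[Site]: ...

class KeywordRouter:
    """Ranks sites by how many of their tags appear as words in the query."""
    def route(self, query: str, sites: list[Site], top_k: int) -> list[Site]:
        words = set(query.lower().split())
        ranked = sorted(sites, key=lambda s: -len(words & set(s.tags)))
        return ranked[:top_k]

sites = [Site("arxiv.org", tags=["ml", "papers"]), Site("cvm.gov.br", tags=["gov"])]
print(KeywordRouter().route("new ml papers on agents", sites, top_k=1)[0].url)
# → arxiv.org
```

Because `Router` is structural (a Protocol), `KeywordRouter` needs no inheritance — any object with a matching `route` method plugs in.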
from vouch import Site
Site(
url: str, # required: "cvm.gov.br" or "https://cvm.gov.br"
category: str | None = None, # optional: "regulação financeira"
description: str | None = None, # optional: free-text, fed to router LLM
tags: list[str] | None = None, # optional: ["gov", "financeiro"]
# Behavioral overrides (all optional)
behavior: str = "natural", # "natural" | "stealth" | "external"
rate_limit: str | None = None, # "10/min", "1/30s", etc
requires_login: bool = False, # tells engine to handle auth flow
search_url_template: str | None = None, # bypass discovery: "/search?q={query}"
# Custom hooks (advanced)
pre_search: Callable | None = None,
post_extract: Callable | None = None,
)

- Category and description are concatenated and fed to the router LLM when there are too many sites to search them all. The LLM reads your descriptions and picks the most relevant subset for the query. With no description, the router falls back to the URL and any cached info.
- Tags can be used as filters (`engine.search(query, only_tags=["ml"])`) and as routing hints.
- Behavior controls the browser strategy (see Optional features).
- `search_url_template` is the escape hatch: if you already know how the site's search works (`https://github.com/search?q={query}`), provide it and vouch skips discovery entirely.
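Conceptually, filling a template amounts to URL-encoding the query and substituting the `{query}` slot. A minimal sketch (how vouch does this internally is not documented here; the encoding step is an assumption):

```python
from urllib.parse import quote_plus

def build_search_url(domain: str, template: str, query: str) -> str:
    # Assumption: the query is URL-encoded before filling the {query} slot
    return f"https://{domain}{template.format(query=quote_plus(query))}"

print(build_search_url("github.com", "/search?q={query}", "vector db rust"))
# → https://github.com/search?q=vector+db+rust
```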
The depth parameter is the central knob for cost vs. quality:
| Depth | Behavior | Latency | Cost (typical) |
|---|---|---|---|
| 0 | Fetch homepage + extract recent content. No search interaction. | ~2s/site | ~$0 |
| 1 | Discover & execute search, return result snippets. | ~5–10s/site | $0.005–0.02 |
| 2 | Enter top-N results, extract full content. | ~20–40s/site | $0.05–0.15 |
| 3 | Follow internal links from results, deep crawl. | ~1–3min/site | $0.20–0.60 |
# Quick check — just see what's on the homepages
engine.search("breaking news", depth=0)
# Standard — get search results
engine.search("LLM benchmarks", depth=1)
# Research mode — full content extraction
engine.search("GPT-5 architecture details", depth=2)
# Deep dive — agent navigates internal links
engine.search("complete history of CVM rule 175", depth=3)

Costs above assume Claude Haiku or Gemini Flash. With Ollama: $0 for everything except hardware/electricity.
When your catalog grows beyond ~5 sites, searching all of them for every query wastes time and tokens. vouch uses an LLM to route: given a query and your catalog metadata, pick the top-K most relevant sites.
engine = SearchEngine(
llm="ollama/qwen2.5:14b",
router_top_k=3, # search top 3 sites per query
router_strategy="llm", # "llm" | "embedding" | "all" | "tags"
router_explain=True, # log why each site was chosen
)

- `llm` (default): a single LLM call reads catalog + query and returns ranked sites. ~200 tokens, ~$0.0001 with Haiku, free with Ollama. Most accurate.
- `embedding`: pre-embeds catalog descriptions; routes via cosine similarity. ~10ms, no LLM call. Cheaper at scale, slightly less accurate. Uses `sentence-transformers` if installed.
- `all`: searches every site every time. Honest fallback for small catalogs (≤5).
- `tags`: filter by tag match. Useful when the query domain is explicit.
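The embedding strategy boils down to cosine similarity between the query and each site description. A toy bag-of-words version for intuition — the real router uses `sentence-transformers` vectors, which also match synonyms, while this sketch only matches literal words:

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

catalog = {
    "arxiv.org": "academic preprints machine learning papers",
    "cvm.gov.br": "brazilian financial regulation rules",
}

def route(query: str, top_k: int = 1) -> list[str]:
    q = Counter(query.lower().split())
    ranked = sorted(catalog, key=lambda s: -cosine(q, Counter(catalog[s].split())))
    return ranked[:top_k]

print(route("machine learning papers"))
# → ['arxiv.org']
```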
# Examples of router behavior
engine.search("Resolução CVM 175")
# → routes to: cvm.gov.br, valor.globo.com (skips arxiv, hacker news)
engine.search("attention is all you need paper")
# → routes to: arxiv.org (skips financial sites)
engine.search("python async best practices")
# → routes to: news.ycombinator.com (skips arxiv if no description match)

Inspect what the router did:
results = engine.search("LLM eval methods")
print(results.routing_decisions)
# [
# RouteDecision(site="arxiv.org", score=0.92, reason="academic ML papers"),
# RouteDecision(site="news.ycombinator.com", score=0.61, reason="developer discussions"),
# ]The first time vouch visits a site, it spends an LLM call to figure out the search interface (where's the input? where's the submit? how do results render?). It caches what it learns in SQLite. Subsequent searches on the same site use Playwright directly, with no LLM in the loop — milliseconds, $0.
The cache is keyed by (domain, dom_fingerprint). If the site redesigns and the cached selectors fail, vouch invalidates the entry and re-discovers automatically.
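The fingerprint function itself is internal to vouch, but the idea is to hash the page's tag skeleton while ignoring text, so content updates keep the cache key stable and layout changes invalidate it. One plausible implementation, using only the standard library:

```python
import hashlib
from html.parser import HTMLParser

class _Skeleton(HTMLParser):
    """Collects tag names + attribute names, ignoring text and attribute values."""
    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        self.parts.append(tag + ":" + ",".join(sorted(k for k, _ in attrs)))

def dom_fingerprint(html: str) -> str:
    p = _Skeleton()
    p.feed(html)
    return hashlib.sha256("|".join(p.parts).encode()).hexdigest()[:16]

same = dom_fingerprint("<form><input name='q'><button>Go</button></form>")
text_changed = dom_fingerprint("<form><input name='q'><button>Search</button></form>")
assert same == text_changed  # text edits keep the cache key stable
```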
# First search on cvm.gov.br: ~8s, $0.01 (LLM discovery)
# Second search on cvm.gov.br: ~2s, $0 (cached)
# After 1000 searches: still ~2s, still $0
engine.cache_stats()
# {
# "cvm.gov.br": {"discovered": "2026-05-09", "hits": 47, "fails": 0},
# "arxiv.org": {"discovered": "2026-05-10", "hits": 12, "fails": 1},
# }
# Force re-discovery for a site
engine.invalidate_cache("cvm.gov.br")

The cache lives at `~/.vouch/selectors.db` — a tiny SQLite file you can ship in version control if you want a head start for teammates.
vouch ships with curated profiles for popular sites — preconfigured Site objects with the right search_url_template, tags, and category. You don't have to figure out arxiv's search URL or how to phrase HuggingFace's description.
from vouch import SearchEngine, get_profile, list_profiles
engine = SearchEngine(llm="ollama/qwen2.5:14b")
# List what's bundled
list_profiles()
# → ['arxiv.org', 'bbc.com', 'crates.io', 'developer.mozilla.org',
# 'docs.python.org', 'elpais.com', 'en.wikipedia.org', 'es.wikipedia.org',
# 'folha.uol.com.br', 'github.com', 'gov.uk', 'huggingface.co', 'irs.gov',
# 'jusbrasil.com.br', 'news.ycombinator.com', 'npmjs.com',
# 'paperswithcode.com', 'pkg.go.dev', 'planalto.gov.br', 'pt.wikipedia.org',
# 'pypi.org', 'reddit.com', 'sede.agenciatributaria.gob.es',
# 'stackoverflow.com']
# Use a profile directly
engine.add(get_profile("arxiv.org"))
engine.add(get_profile("github.com"))
engine.add(get_profile("jusbrasil.com.br")) # for BR legal/tax queries
results = engine.search("LLM benchmarks 2026")

Or via CLI:
vouch profiles list # show bundled profiles
vouch profiles show arxiv.org # see one in detail
vouch profiles import arxiv.org,github.com,huggingface.co
vouch profiles import all              # add everything to your local catalog

The profile YAML lives at `vouch/profiles/builtin.yaml`. Community contributions via PR welcome — see the `new_profile` issue template.
A separate community-maintained registry (vouch-profiles repo) is on the roadmap; teams can publish their domain-specific profiles there and run vouch profiles update to pull them. The update command needs internet access — it falls back silently to the bundled profiles on any network failure, so air-gapped installs keep working.
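The silent fallback described above amounts to a try/except around the registry fetch. A sketch under assumptions — the registry URL and JSON shape are illustrative, not the real format:

```python
import json
import urllib.request

def load_profiles(registry_url: str, bundled: dict) -> dict:
    """Prefer the remote registry; fall back silently to bundled profiles
    so air-gapped installs keep working."""
    try:
        with urllib.request.urlopen(registry_url, timeout=5) as resp:
            return json.load(resp)
    except (OSError, ValueError):  # network failure or bad JSON -> bundled
        return bundled

# An unreachable host (reserved .invalid TLD) exercises the fallback path:
profiles = load_profiles("https://registry.invalid/profiles.json", {"arxiv.org": {}})
print(sorted(profiles))
# → ['arxiv.org']
```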
vouch uses LiteLLM underneath, which means any of 100+ providers work with the same syntax.
Recommended models:
- `ollama/qwen2.5:14b` — best quality/size balance, runs in 16GB RAM
- `ollama/qwen2.5:7b` — faster, lower hardware requirements
- `ollama/qwen2.5-vl:7b` — multimodal, needed for vision features
- `ollama/llama3.2:8b` — alternative
engine = SearchEngine(llm="ollama/qwen2.5:14b")
# Or split: cheap model for routing, capable model for extraction
engine = SearchEngine(
llm="ollama/qwen2.5:14b", # discovery + extraction
router_llm="ollama/qwen2.5:7b", # routing decisions
)

# Anthropic
engine = SearchEngine(llm="anthropic/claude-haiku-4-5")
# requires ANTHROPIC_API_KEY env var
# OpenAI
engine = SearchEngine(llm="openai/gpt-4.1-mini")
# Google
engine = SearchEngine(llm="gemini/gemini-2.5-flash")
# Mix providers
engine = SearchEngine(
llm="anthropic/claude-sonnet-4-6",
router_llm="gemini/gemini-2.5-flash-lite", # cheaper routing
vision_llm="anthropic/claude-haiku-4-5", # for CAPTCHA / visual extraction
)

engine = SearchEngine(
llm="anthropic/claude-haiku-4-5",
api_keys={
"anthropic": "sk-ant-...",
"openai": "sk-...",
},
)

engine = SearchEngine(
llm=["anthropic/claude-haiku-4-5", "openai/gpt-4.1-mini", "ollama/qwen2.5:14b"],
)
# Tries each in order if the previous fails

vouch's primary audience is agentic frameworks. First-class adapters:
from crewai import Agent, Task, Crew
from vouch.integrations.crewai import VouchSearchTool
search_tool = VouchSearchTool(catalog="sites.yaml", default_depth=2)
researcher = Agent(
role="Financial Regulation Researcher",
goal="Find recent CVM rules on fund management",
tools=[search_tool],
llm="anthropic/claude-sonnet-4-6",
)

from langchain.agents import AgentExecutor
from vouch.integrations.langchain import VouchSearchTool
tools = [VouchSearchTool.from_yaml("sites.yaml")]
agent = AgentExecutor(tools=tools, ...)

from pydantic_ai import Agent
from vouch.integrations.pydantic_ai import vouch_tool
agent = Agent("anthropic:claude-sonnet-4-6", tools=[vouch_tool("sites.yaml")])

vouch mcp serve --catalog sites.yaml

Configure in claude_desktop_config.json:
{
"mcpServers": {
"vouch": {
"command": "vouch",
"args": ["mcp", "serve", "--catalog", "/path/to/sites.yaml"]
}
}
}Now Claude Desktop can call search_curated_sources(query, depth) directly.
Reduces bot-detection signals on sites with simple behavioral analytics. Does not bypass Cloudflare/DataDome/PerimeterX — see limits.
engine = SearchEngine(
llm="ollama/qwen2.5:14b",
humanize=True, # default: lognormal pauses, variable typing speed
typing_speed="natural", # "fast" | "natural" | "slow"
business_hours_only=False, # restrict to user-local business hours
)

What it does:
- Lognormal pause distribution between actions (60–250ms typing, 0.5–3s reading)
- Random typo injection with backspace correction (~2% rate)
- Optional business-hours-only execution (`business_hours_only=True`)
- Single, recent Chrome user-agent applied per session (no rotation)
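The lognormal choice is deliberate: it clusters delays near a typical value but keeps the occasional long pause humans produce, unlike uniform jitter. A sketch — the mu/sigma values below are illustrative assumptions, not vouch's actual parameters:

```python
import random

def humanized_pause(kind: str = "typing") -> float:
    """Lognormal delay in seconds: most samples near the mode, long right tail."""
    # Illustrative parameters tuned to land typing roughly in the 60-250ms band
    mu, sigma = (-2.0, 0.4) if kind == "typing" else (0.3, 0.5)
    return random.lognormvariate(mu, sigma)

random.seed(42)
delays = sorted(humanized_pause() for _ in range(1000))
print(f"median {delays[500] * 1000:.0f}ms, p95 {delays[950] * 1000:.0f}ms")
```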
Patchright is a drop-in replacement for Playwright. It helps with mild Cloudflare Bot Fight Mode and basic fingerprint checks; it does not defeat invisible Turnstile or DataDome.
engine.add(Site("portal-protegido.com", behavior="stealth"))
# OR globally:
engine = SearchEngine(default_behavior="stealth")Optional, opt-in, multi-backend. Two tiers used in order from cheapest to heaviest:
1. Tesseract OCR — for distorted-text CAPTCHAs (Receita, CVM, Jusbrasil, agencia tributaria, gov.uk). CPU-only, ~50-200 ms per image, $0, no GPU, no model download.
pip install "vouch[ocr]"
# plus the system binary:
# Ubuntu/Debian: sudo apt install tesseract-ocr tesseract-ocr-por tesseract-ocr-spa
# macOS: brew install tesseract tesseract-lang
# Windows: https://github.com/UB-Mannheim/tesseract/wiki

That's already enough for the common text-CAPTCHA case — no vision_llm needed.
2. Vision LLM — upgrade for image-grid CAPTCHAs ("select all traffic lights").
engine = SearchEngine(
llm="ollama/qwen2.5:14b",
vision_llm="ollama/qwen2.5vl:7b", # ~5 GB, enables image-grid solving
captcha_min_confidence=0.7,
captcha_max_attempts=2,
)

The CaptchaSolver chains them automatically: Tesseract first → vision LLM on low confidence → return best. `result.solver` tells you which one produced the answer.
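The chaining logic amounts to "cheapest solver first, stop when confident, otherwise keep the best attempt". A standalone sketch with stub solvers — the real CaptchaSolver interface may differ:

```python
def solve(image, solvers, min_confidence: float = 0.7):
    """Run solvers cheapest-first; return (text, confidence, solver_name)."""
    best = ("", 0.0, "none")
    for name, fn in solvers:
        text, conf = fn(image)
        if conf >= min_confidence:   # confident enough: stop early
            return text, conf, name
        if conf > best[1]:           # otherwise remember the best attempt so far
            best = (text, conf, name)
    return best

# Stubs: OCR is unsure on a distorted image, the vision model is confident.
ocr = lambda img: ("k3x9", 0.40)
vision = lambda img: ("K3X8", 0.90)
print(solve(None, [("tesseract", ocr), ("vision_llm", vision)]))
# → ('K3X8', 0.9, 'vision_llm')
```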
Supported types: text OCR, reCAPTCHA v2 image grid, hCaptcha image grid (~50–80% success rate).
Not supported: reCAPTCHA v3 (invisible), Cloudflare Turnstile, Arkose 3D puzzles. These return status="blocked".
# When solver fails or type is unsupported:
result.status # "blocked"
result.reason # "captcha_unsupported"
result.suggestion  # "Use the official API: https://..."

Watch sites for changes, get notified.
from vouch import Monitor
monitor = Monitor(engine)
monitor.watch(
site="cvm.gov.br",
query="resoluções recentes",
interval="1h",
notify_via="email", # or "slack", "discord", "webhook"
)
monitor.start()

Uses 3-tier change detection: hash → diff → embedding similarity. Avoids LLM calls when content is unchanged, keeping monitoring cheap (~$1–3/month for 100 sites hourly with Gemini Flash-Lite).
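The first two tiers can be sketched with the standard library; the embedding tier (not shown) would only run when the diff ratio is ambiguous. The 0.95 threshold is an illustrative assumption:

```python
import difflib
import hashlib

def changed(old: str, new: str, threshold: float = 0.95) -> tuple[bool, str]:
    """Tier 1: hash equality (free). Tier 2: diff ratio (cheap).
    Returns (did it change?, which tier decided)."""
    if hashlib.sha256(old.encode()).digest() == hashlib.sha256(new.encode()).digest():
        return False, "hash"
    ratio = difflib.SequenceMatcher(None, old, new).ratio()
    return ratio < threshold, "diff"

print(changed("same page", "same page"))
# → (False, 'hash')
print(changed("stable boilerplate " * 40, "stable boilerplate " * 40 + "ts"))
# → (False, 'diff')
```

Only pages that make it past both cheap tiers would be worth an embedding comparison or an LLM extraction call, which is what keeps per-site monitoring cost low.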
vouch serve --port 8080
# → web UI at http://localhost:8080
# → manage catalog, view cache, run searches, see costs

SearchEngine(
# LLM
llm: str | list[str] = "ollama/qwen2.5:14b",
router_llm: str | None = None, # defaults to llm
vision_llm: str | None = None, # disables CAPTCHA features if None
api_keys: dict | None = None,
# Routing
router_strategy: str = "llm", # "llm" | "embedding" | "all" | "tags"
router_top_k: int = 3,
router_explain: bool = False,
# Search
default_depth: int = 1,
parallel_sites: int = 3, # how many sites to search concurrently
# Browser
default_behavior: str = "natural", # "natural" | "stealth" | "external"
humanize: bool = True,
typing_speed: str = "natural",
headless: bool = True,
# Caching
cache_dir: str = "~/.vouch",
cache_ttl_days: int = 30,
# Politeness
respect_robots_txt: bool = True,
default_rate_limit: str = "2/min",
user_agent: str | None = None, # defaults to recent Chrome
# Captcha (only if vision_llm is set)
captcha_min_confidence: float = 0.7,
captcha_max_attempts: int = 2,
# Misc
verbose: bool = False,
)

engine.search(
query: str,
depth: int = None, # overrides default_depth
sites: list[str] | None = None, # restrict to specific sites
only_tags: list[str] | None = None,
exclude_tags: list[str] | None = None,
max_results: int = 10,
timeout: float = 60.0,
) -> SearchResult

SearchResult(
query: str,
chunks: list[Chunk],
sites_searched: list[str],
routing_decisions: list[RouteDecision],
duration_ms: float,
cost_estimate_usd: float,
tokens_used: dict[str, int], # {"input": ..., "output": ...}
cache_stats: dict, # {"hits": ..., "misses": ...}
)
Chunk(
source_url: str,
site: str, # "cvm.gov.br"
site_category: str | None,
title: str,
snippet: str, # ~200 chars
content: str | None, # full markdown if depth >= 2
relevance_score: float, # 0–1
extracted_at: datetime,
metadata: dict, # extra fields per adapter
)

┌──────────────────────────────────────────────────────────────────────┐
│ SearchEngine │
└──────────────────────────────────────────────────────────────────────┘
│
┌──────────────────┴──────────────────┐
│ │
▼ ▼
┌──────────────┐ ┌──────────────┐
│ Catalog │ │ Router │
│ (SQLite) │ ─── metadata ────▶ │ (LLM/emb) │
└──────────────┘ └──────────────┘
│
│ ranked sites
▼
┌────────────────────┐
│ parallel fetch │
└────────────────────┘
│
┌───────────────────────────────┼───────────────────────────────┐
│ │ │
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ SiteAdapter │ │ SiteAdapter │ │ SiteAdapter │
│ cvm.gov.br │ │ arxiv.org │ │ news.yc.com │
│ │ │ │ │ │
│ ┌──────────┐ │ │ ┌──────────┐ │ │ ┌──────────┐ │
│ │ selector │ │ │ │ selector │ │ │ │ selector │ │
│ │ cache │ │ │ │ cache │ │ │ │ cache │ │
│ └────┬─────┘ │ │ └────┬─────┘ │ │ └────┬─────┘ │
│ │ MISS │ │ │ HIT │ │ │ HIT │
│ ▼ │ │ ▼ │ │ ▼ │
│ discover │ │ playwright │ │ playwright │
│ via LLM │ │ replay │ │ replay │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘
│ │ │
└──────────────────────────────┴──────────────────────────────┘
│
▼
┌────────────────────┐
│ Aggregator/Rank │
└────────────────────┘
│
▼
SearchResult
vouch/
├── __init__.py # public API: search, SearchEngine, Site
├── engine.py # SearchEngine orchestrator
├── catalog.py # Catalog (SQLite-backed Site registry)
├── models.py # Site, Chunk, SearchResult, RouteDecision
├── config.py # SearchEngine config dataclass
├── exceptions.py # VouchError + back-compat CurioError alias
├── dns_resolver.py # auto-fixes wrong-host catalog entries
├── plugins.py # entry_points-based plugin discovery
├── cli.py # typer-based CLI (vouch ...)
├── server.py # optional: FastAPI dashboard
├── _lang.py # language detection helpers
├── _llm.py # LiteLLM wrapper (sync + async)
├── router/
│ ├── base.py # Router Protocol
│ ├── llm_router.py # LLM-based routing (default)
│ ├── embedding_router.py# cosine-similarity routing
│ ├── tag_router.py # tag-match routing
│ └── all_router.py # "search every site" fallback
├── adapters/
│ ├── base.py # SiteAdapter Protocol
│ ├── http.py # depth=0, fast HTTP+Trafilatura
│ ├── browser.py # depth=1+, Playwright
│ ├── browser_pool.py # shared Chromium pool across searches
│ └── stealth.py # patchright wrapper
├── discovery/
│ ├── search_bar.py # LLM-powered selector discovery
│ ├── cache.py # SQLite selector cache
│ ├── humanize.py # lognormal pauses, typing, business hours
│ └── probe.py # opt-in profile probe on add()
├── extraction/
│ ├── trafilatura.py # HTML → markdown
│ ├── pdf.py # PDF handling (pypdf)
│ ├── llm.py # structured extraction via Pydantic
│ ├── llm_extract.py # quality detector + LLM-driven extraction
│ └── css_selectors.py # cached-selector replay (lxml.cssselect)
├── captcha/ # optional: Tesseract OCR + vision_llm CAPTCHA
│ ├── solver.py
│ └── tesseract.py
├── monitor/ # optional: change tracking + notifications
│ ├── watcher.py
│ └── notify.py
├── profiles/ # 24 curated site profiles + registry
│ ├── registry.py
│ ├── update.py
│ └── builtin.yaml
├── integrations/
│ ├── _common.py
│ ├── crewai.py # VouchSearchTool (CrewAI)
│ ├── langchain.py # VouchSearchTool (LangChain)
│ ├── pydantic_ai.py # vouch_tool (PydanticAI)
│ └── mcp.py # MCP server (Claude Desktop, Cursor)
- playwright — browser automation
- litellm — multi-LLM, BYOK, fallback chains
- pydantic v2 — models everywhere
- sqlalchemy + sqlite — catalog and cache
- trafilatura — fast HTML→markdown extraction
- typer — CLI
- apscheduler — optional, for monitor
- fastapi + uvicorn — optional, for dashboard
- patchright — optional, for stealth
- sentence-transformers — optional, for embedding router
No CrewAI/LangChain/etc as hard deps. They're integration extras only.
Typical numbers from internal runs. They depend heavily on the LLM, network, site, and cache state — treat them as rough order-of-magnitude figures, not guarantees. Reproducible benchmark scripts live under examples/ if you want to validate on your stack.
Typical query, depth=1, 3 sites routed, all cache hits:
| LLM | Routing | Discovery | Extraction | Total |
|---|---|---|---|---|
| Ollama qwen2.5:14b | $0 | $0 (cached) | $0 | $0 |
| Gemini Flash-Lite | $0.0001 | $0 (cached) | $0.001 | ~$0.001 |
| Claude Haiku 4.5 | $0.0005 | $0 (cached) | $0.005 | ~$0.005 |
| GPT-4.1-mini | $0.0003 | $0 (cached) | $0.003 | ~$0.003 |
First search on a new site adds ~$0.01–0.05 for discovery (one-time per site).
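A back-of-envelope check using the Haiku column above shows how quickly the one-time discovery cost amortizes (the $0.03 discovery figure is the midpoint of the quoted range):

```python
discovery = 0.03    # one-time LLM discovery per site (midpoint of the range above)
per_search = 0.005  # depth=1, cache hit, Claude Haiku (cost table above)

for n in (1, 10, 100):
    avg = (discovery + n * per_search) / n
    print(f"{n:>3} searches -> ${avg:.4f}/search")
```

After a hundred cached searches the discovery cost is noise next to the per-search extraction cost.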
- HTTP-only fetch (depth=0): ~1–3s/site
- Cached browser search (depth=1): ~2–5s/site
- First-time discovery (depth=1): ~8–15s/site
- Full content extraction (depth=2): ~15–40s/site
Searches across multiple sites run in parallel (controlled by parallel_sites).
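On top of that per-query site parallelism, independent queries can fan out concurrently via the async API. A sketch with a stub coroutine standing in for `engine.asearch` (the real call hits the network; the stub just simulates I/O latency):

```python
import asyncio

async def asearch(query: str) -> str:
    """Stand-in for engine.asearch: simulates an I/O-bound site search."""
    await asyncio.sleep(0.01)
    return f"results for {query!r}"

async def main() -> list[str]:
    queries = ["Resolução CVM 175", "LLM benchmarks 2026"]
    # gather() runs both searches concurrently instead of back-to-back
    return list(await asyncio.gather(*(asearch(q) for q in queries)))

print(asyncio.run(main()))
```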
- Pure HTTP mode: any machine with Python
- Browser mode: 4GB RAM minimum, 8GB recommended
- Local LLM mode: depends on model. Qwen2.5:14b needs ~16GB RAM (CPU) or ~10GB VRAM
- Vision mode: Qwen2.5-VL 7B needs ~8GB VRAM
vouch is opinionated about what it tries to do and what it doesn't.
What's in scope:
- Public sites that respond well to standard web requests
- Sites where you want to discover search behavior dynamically
- Sites you've explicitly added to your catalog
- Polite, rate-limited, robots.txt-respecting access
- Educational, research, regulatory, professional use cases
What's explicitly out of scope:
- Bypassing Cloudflare's modern bot protection (invisible Turnstile, aggressive Bot Fight Mode). vouch includes humanize and stealth options that help with mild detection, but no open-source library reliably defeats current Cloudflare. For those cases, plug in a commercial bypass service (Browserbase, Bright Data, ZenRows) via the `behavior="external"` site option.
- Mass scraping — vouch is built for "search across my list of trusted sources", not "scrape the entire web."
- CAPTCHA solving as a sales pitch. The optional vision-LLM CAPTCHA assist is best-effort and only works on simple visual challenges. It won't (and shouldn't) work on adversarial challenges.
- Logged-in personal accounts (LinkedIn, Instagram, etc.) — even when technically possible, this is usually a ToS violation. vouch supports authentication for sites where it's legitimate (gov portals, enterprise dashboards you own), not for this.
This isn't a limitation we apologize for — it's the position. A library that promises to bypass everything breaks every week and isn't sustainable as open-source.
Shipped (see CHANGELOG.md for the full list):
- v0.1 — Levels 1-3 API, HTTP + browser adapters, LLM router, selector cache, BYOK via LiteLLM, CrewAI + LangChain integrations, CLI.
- v0.2 — embedding/tag/all routers, vision-LLM + Tesseract CAPTCHA assist, MCP server, PydanticAI integration, browser pool, plugin model via entry_points, profile registry + auto-update, async API, FastAPI dashboard, stealth mode (patchright), change monitor, auto-escalation ladder, auto-DNS resolution, quality detector, CSS selector pinning, Accept-Language header.
Coming next (no committed dates — direction shaped by feedback and real use cases):
- Auth flow helpers for sites that legitimately need a login (cookie persistence, refresh-token rotation)
- Distributed search workers (multi-machine fanout for large catalogs)
- Built-in vector store for chunk reuse across queries
- Docling-backed PDF extraction
- Workflow chains (search → reason → search → extract)
- Standalone community profile registry repo (`vouch-profiles`)
- Per-site adapter packages on PyPI (`vouch-adapter-*`) as the plugin ecosystem matures
See CONTRIBUTING.md. Quick start for dev:
git clone https://github.com/evertonsjjj/vouch
cd vouch
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev,all]"
playwright install chromium
pytest

Issues, PRs, ideas welcome. Especially:
- New `SiteAdapter` implementations (specific site optimizations)
- New `Router` strategies
- Integrations with other agent frameworks
- Translations of docs
vouch stands on the shoulders of these projects. Read their docs — most of them solve adjacent problems brilliantly.
- Tavily & Exa — defined the "search API for agents" category. vouch is "Tavily, but for your own sources."
- Kagi Lenses & Brave Goggles — proved curated-source search is a real product. vouch brings it to OSS.
- Skyvern — pioneered "discover-then-cache" for browser automation. vouch applies the same pattern to search specifically.
- Stagehand — the `act()`/`observe()` API inspired the discovery layer.
- Playwright — browser automation
- browser-use — DOM-to-LLM serialization patterns
- Crawl4AI — extraction strategies, JsonCssExtractionStrategy
- LiteLLM — multi-provider LLM client
- Trafilatura — HTML to clean text
- Patchright — stealth Playwright
- Camoufox — for serious anti-fingerprinting
- semantic-router — embedding-based routing patterns
- APScheduler — for the optional monitor module
- Apprise — 100+ notification channels for monitor
- Perplexica — open-source Perplexity clone (open-web)
- gpt-researcher — research agent (open-web)
- local-deep-research — local research bot (engine-based, not site-based)
- changedetection.io — site change tracking (no LLM extraction)
- bdambrosio/llmsearch — small precedent for source-curated LLM search
- Firecrawl — `/search` endpoint backed by Google/Bing
- ScrapeGraphAI — LLM-driven scraping
- SearXNG — federated meta-search engine
- Stagehand caching internals: https://www.browserbase.com/blog/stagehand-caching
- Skyvern adaptive caching: https://github.com/Skyvern-AI/skyvern (recent PRs)
- Anti-detection landscape 2026: https://github.com/pim97/anti-detect-browser-tools-tech-comparison
- LLM Gateway comparison: https://github.com/BerriAI/litellm
MIT — see LICENSE.
Built by @evertonsjjj. Early-stage open source — looking for real-world feedback, contributors, and battle-tested use cases. The gap this fills (curated-source search for agents) is one I needed myself, but I'd rather hear what's wrong with it from people using it than guess in isolation.
If vouch helps you, a ⭐ on GitHub goes a long way.