Structured knowledge for hungry agents.
Your AI agent is only as smart as what you feed it. Brainfood turns your messy knowledge — YouTube transcripts, PDFs, docs, notes, websites — into clean, structured data your agent actually understands.
You have knowledge trapped in files your AI can't use. Brainfood fixes that.
| Source | What brainfood does |
|---|---|
| PDFs | Extracts text, structures it, outputs clean JSON/Markdown/Obsidian notes |
| Local docs (Markdown, HTML, text, DOCX) | Parses, organizes, builds a knowledge graph |
| Websites | Crawls pages, extracts content, maps structure |
| Sitemaps | Reads sitemap XML, fetches and processes all listed pages |
Every source becomes structured, linked, agent-ready output.
# Install
npm install -g brainfood
# Process local files (PDFs, docs, transcripts)
brainfood local ./my-knowledge --format both
# Crawl a website
brainfood crawl https://example.com --depth 2
# Generate Obsidian-ready notes
brainfood local ./research --format obsidian
# Read from a sitemap
brainfood sitemap https://example.com/sitemap.xml
Or run without installing:
npx brainfood local ./docs --format json
- json: Structured JSON nodes, suited to AI agent ingestion, pipelines, and APIs.
- markdown: Clean Markdown files, readable by humans and machines.
- obsidian: Obsidian-ready Markdown with YAML frontmatter, tags, and [[wiki-links]] that drops directly into your vault.
- both: JSON + Markdown together.
Every format also writes a brainfood.json knowledge graph index.
Brainfood speaks Obsidian natively:
brainfood local ./research-papers --format obsidian --output ~/Documents/Obsidian\ Vault/research/
Output includes:
- YAML frontmatter: title, date, source, tags, type
- Wiki-links: entity names automatically linked as [[Entity Name]]
- Clean filenames: slugified, no special characters
- Tags: extracted topics become Obsidian tags
Your vault becomes a living knowledge base — searchable, linked, and graph-ready.
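To make the output features above concrete, here is a minimal Python sketch of how a note with YAML frontmatter, wiki-links, and a slugified filename could be assembled. It is an illustration only, not brainfood's actual implementation: the helper names (`slugify`, `link_entities`, `render_note`) and the exact frontmatter layout are assumptions.

```python
import json
import re


def slugify(title: str) -> str:
    """Lowercase the title and collapse runs of other characters into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def link_entities(text: str, entities: list) -> str:
    """Wrap every whole-word occurrence of an entity name in [[...]]."""
    if not entities:
        return text
    # Longest names first so "Knowledge Graph" wins over "Graph".
    names = sorted(entities, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    return pattern.sub(lambda m: f"[[{m.group(1)}]]", text)


def render_note(title, body, source, date, tags, entities, note_type="note"):
    """Return (filename, markdown) for one note; the field layout is assumed."""
    frontmatter = [
        "---",
        f"title: {json.dumps(title)}",  # a JSON string is a valid YAML double-quoted scalar
        f"date: {date}",
        f"source: {json.dumps(source)}",
        "tags: [" + ", ".join(tags) + "]",
        f"type: {note_type}",
        "---",
    ]
    markdown = "\n".join(frontmatter) + "\n\n" + link_entities(body, entities) + "\n"
    return slugify(title) + ".md", markdown
```

Matching all entity names in a single regex pass avoids linking a short name inside an already linked longer one.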
- Ingest — point brainfood at files, a folder, a URL, or a sitemap
- Extract — content is parsed, cleaned, and structured using Mozilla Readability + Cheerio
- Structure — topics, entities, and relationships are identified and linked
- Output — clean knowledge nodes in your chosen format
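The four stages above can be sketched in a few lines of Python. This is a toy stand-in under stated assumptions: brainfood itself uses Mozilla Readability and Cheerio, whereas this sketch uses only the Python standard library, picks topics by simple word frequency, and derives the `id` from a SHA-256 hash. All function names here are hypothetical.

```python
import hashlib
import re
from collections import Counter
from html.parser import HTMLParser

STOPWORDS = {"the", "and", "that", "with", "from", "this", "into", "your"}


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def extract(html: str) -> str:
    """Extract stage: reduce an HTML document to its visible text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def structure(text: str, max_topics: int = 3) -> list:
    """Structure stage: the most frequent words of 4+ letters act as topics."""
    words = re.findall(r"[a-z]{4,}", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(max_topics)]


def build_node(title: str, html: str) -> dict:
    """Output stage: assemble a node-like dict (field choices are assumed)."""
    content = extract(html)
    return {
        "id": hashlib.sha256(content.encode("utf-8")).hexdigest()[:12],
        "title": title,
        "content": content,
        "summary": re.split(r"(?<=[.!?])\s+", content)[0],  # first sentence
        "topics": structure(content),
        "metadata": {"sourceType": "local", "wordCount": len(content.split())},
    }
```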
Each document becomes a knowledge node:
{
  "id": "a1b2c3d4e5f6",
  "title": "Document Title",
  "content": "# Clean structured content...",
  "summary": "AI-generated or extractive summary",
  "topics": ["topic1", "topic2"],
  "entities": [
    { "name": "Key Concept", "type": "topic" }
  ],
  "relationships": [],
  "metadata": {
    "sourceType": "local",
    "wordCount": 1250,
    "generatedAt": "2026-03-17T00:00:00.000Z"
  }
}
- brainfood local: Process local Markdown, HTML, text, PDF, or DOCX files.
- brainfood crawl: Crawl a website with configurable depth and rate limiting.
- brainfood sitemap: Parse a sitemap and fetch all listed pages.
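On the consuming side, an agent pipeline might load a knowledge node of the shape shown earlier roughly as follows. This is a sketch, not an official client: the `KnowledgeNode` class, the choice of `id`, `title`, and `content` as required fields, and the `to_prompt_context` helper are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass, field


@dataclass
class KnowledgeNode:
    id: str
    title: str
    content: str
    summary: str = ""
    topics: list = field(default_factory=list)
    entities: list = field(default_factory=list)
    relationships: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


# Assumed minimum; the README does not say which fields are mandatory.
REQUIRED = ("id", "title", "content")


def parse_node(raw: str) -> KnowledgeNode:
    """Parse one JSON node, rejecting input that lacks the assumed required fields."""
    data = json.loads(raw)
    missing = [key for key in REQUIRED if key not in data]
    if missing:
        raise ValueError(f"node is missing required fields: {missing}")
    # Ignore unknown keys so newer output versions do not break the loader.
    known = {k: data[k] for k in KnowledgeNode.__dataclass_fields__ if k in data}
    return KnowledgeNode(**known)


def to_prompt_context(node: KnowledgeNode, max_chars: int = 500) -> str:
    """Format a node as a compact text block for an LLM prompt."""
    topics = ", ".join(node.topics) or "none"
    return f"# {node.title}\nTopics: {topics}\n\n{node.content[:max_chars]}"
```

Dropping unknown keys rather than failing on them is a deliberate choice here, since the node schema may grow over time.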
| Option | Default | Description |
|---|---|---|
| `-o, --output <dir>` | `./brainfood-output` | Output directory |
| `-f, --format <format>` | `json` | Output format: json, markdown, obsidian, or both |
| `--summarize` | off | Generate AI summaries (requires OPENAI_API_KEY) |
| `--model <model>` | `gpt-4.1-mini` | OpenAI model for summaries |
| `--depth <n>` | `2` | Max crawl depth (crawl mode) |
| `--max-pages <n>` | `50` | Max pages to process |
| `--concurrency <n>` | `3` | Concurrent requests (max 10) |
| `--rate-limit <ms>` | `1000` | Minimum ms between requests |
| `--exclude <patterns>` | none | Comma-separated URL patterns to skip |
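To show how the depth, max-pages, and exclude options interact, here is a small Python sketch of a bounded breadth-first crawl over an in-memory link graph. It is not brainfood's crawler: `plan_crawl` is a hypothetical function, the `links` mapping stands in for real page fetches, and treating exclude patterns as plain substrings is an assumption.

```python
from collections import deque


def plan_crawl(start, links, depth=2, max_pages=50, exclude=()):
    """Return URLs in breadth-first order, honoring depth, page cap, and excludes.

    `links` maps each URL to the URLs it links to; a real crawler would
    discover these by fetching and parsing each page.
    """
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue and len(order) < max_pages:
        url, level = queue.popleft()
        order.append(url)
        if level >= depth:
            continue  # do not follow links beyond the depth limit
        for target in links.get(url, []):
            if target in seen or any(pattern in target for pattern in exclude):
                continue
            seen.add(target)
            queue.append((target, level + 1))
    return order
```

With the defaults from the table, a crawl visits the start page, the pages it links to, and the pages those link to, stopping early once 50 pages are collected.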
Already running an OpenClaw agent? Just tell it what to process:
"Install brainfood and process my research folder into Obsidian notes"
Your agent will run:
npm install -g brainfood
brainfood local ./research --format obsidian --output ~/Documents/Obsidian\ Vault/research/
"Crawl my company website and give me structured data"
brainfood crawl https://yoursite.com --depth 2 --format json
"Convert these PDFs into something you can actually read"
brainfood local ./documents --format both
That's it. One install, one command, your agent gets structured knowledge it can actually use.
Tip for non-technical users: Copy any of the commands above and paste them to your OpenClaw agent in chat. It handles the rest.
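For anyone scripting this instead of chatting with an agent, the commands above could be assembled programmatically. The sketch below is hypothetical glue code: `build_local_command` is not part of brainfood, and it uses only the `local` subcommand and the `--format` and `--output` flags documented in this README.

```python
import shlex

FORMATS = {"json", "markdown", "obsidian", "both"}


def build_local_command(path, fmt="json", output=None):
    """Return the argv list for `brainfood local`, using only documented flags."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    argv = ["brainfood", "local", path, "--format", fmt]
    if output is not None:
        argv += ["--output", output]
    return argv


# shlex.join quotes arguments that contain spaces, such as a vault path.
command = build_local_command(
    "./research", fmt="obsidian", output="/home/me/Obsidian Vault/research/"
)
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` would execute it. Note that `~` is not expanded when arguments are passed as a list, so a path like `~/Documents` should go through `os.path.expanduser` first.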
Feed your AI agent — Convert your knowledge base into structured data any LLM agent can ingest.
Build an Obsidian vault — Turn PDFs, transcripts, and research into linked, searchable notes.
Audit a website — Extract and map all content from any site for analysis or migration.
Power a knowledge pipeline — Automate ingestion from docs folders, sitemaps, or web sources.
Capxel builds AI-native intelligence infrastructure. Brainfood is open source under MIT.
PRs welcome. See issues for open work.
License: MIT