A lightweight Flask web app that helps you read the web faster and incrementally build a personal knowledge graph:
- Search the web from a single input bar (DuckDuckGo).
- Summarize any page by pasting its URL and clicking Summarize.
- Summarize pasted text by pasting raw text instead of a URL.
- One-click "Summarize this" on every search result.
- Knowledge Graph — highlight any text in a summary, or paste a chunk in the input and click + KG, to save it (with source metadata, tags, and notes) to a staging area. When you're ready, Integrate curated chunks into the persistent overall graph — the LLM extracts entities and links chunks to existing nodes, so the graph compounds over time the more you read.
Summaries can be generated by any of:
| Provider | Env var for API key | Default model |
|---|---|---|
| Anthropic (Claude) | `ANTHROPIC_API_KEY` | `claude-haiku-4-5-20251001` |
| OpenAI | `OPENAI_API_KEY` | `gpt-4o-mini` |
| Qwen (DashScope) | `DASHSCOPE_API_KEY` | `qwen-plus` |
| DeepSeek | `DEEPSEEK_API_KEY` | `deepseek-chat` |
Pick the provider and model from the dropdowns in the UI. The model field is free-form, so any model ID the provider supports also works. If no API keys are set, the app falls back to a local extractive summarizer that needs no key.
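The fallback described above can be pictured with a short sketch. `resolve_provider` is a hypothetical helper, not the app's actual code. It assumes that `auto` tries providers in table order and that a provider without a key falls back to the extractive summarizer; the README does not confirm either detail.

```python
import os

# Provider -> (env var, default model), taken from the table above.
PROVIDERS = {
    "anthropic": ("ANTHROPIC_API_KEY", "claude-haiku-4-5-20251001"),
    "openai": ("OPENAI_API_KEY", "gpt-4o-mini"),
    "qwen": ("DASHSCOPE_API_KEY", "qwen-plus"),
    "deepseek": ("DEEPSEEK_API_KEY", "deepseek-chat"),
}


def resolve_provider(requested="auto", model=None, env=None):
    """Return (provider, model). Hypothetical helper for illustration only."""
    env = os.environ if env is None else env
    if requested == "extractive":
        return "extractive", None
    # Assumption: "auto" walks the providers in table order.
    candidates = list(PROVIDERS) if requested == "auto" else [requested]
    for name in candidates:
        key_var, default_model = PROVIDERS.get(name, (None, None))
        if key_var and env.get(key_var):
            # A free-form model ID overrides the provider default.
            return name, model or default_model
    # No usable key: fall back to the local extractive summarizer.
    return "extractive", None
```

With no keys exported, `resolve_provider()` returns `("extractive", None)`. With only `OPENAI_API_KEY` set, it returns `("openai", "gpt-4o-mini")`.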
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
# Optional — set any one or more for higher-quality AI summaries:
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
export DASHSCOPE_API_KEY=sk-... # Qwen
export DEEPSEEK_API_KEY=sk-...
python app.py

Then open http://localhost:5000.
- Search: type a query, press Search (or hit `Enter`).
- Summarize a URL: paste a URL, press Summarize (or `Shift+Enter`).
- Summarize text: paste raw text into the input and press Summarize.
- From any search result, click Summarize this to summarize that page.
- Search: type a query, press Search (or `Enter`).
- Summarize a URL: paste a URL, press Summarize (or `Shift+Enter`).
- Save a chunk to KG: either paste text in the input and click `+ KG`, or highlight any text in a summary to surface a floating "Save to KG" button. A modal lets you add tags, a note, and source metadata.
- Ingest a corpus: at the top of the tab, three modes:
  - Files: drag-drop or pick `.txt`, `.md`, `.html`, `.pdf` (up to 50 MB).
  - URLs: one URL per line; each page is fetched and chunked.
  - Text: paste a large document with an optional source title.
- All modes share chunk size (default 800 chars, ~200 tokens) and overlap (default 120 chars, ~15%). The chunker splits on paragraph boundaries first, then sentence boundaries inside long paragraphs, and carries an overlap (snapped to a word boundary) between chunks. Chunks land in staging tagged with their source and `part:i/N`.
- Browse what's in staging (recently saved chunks awaiting integration).
- Browse the integrated graph (chunks + extracted entities + edges).
- Click Integrate → to move all staged chunks into the overall graph. The selected provider/model (from the Read tab) extracts entities and links them. If no API key is set or you select Extractive, the app falls back to a heuristic (capitalized phrase) entity extractor.
- Click a node in the graph or sidebar to see its full text, source, tags, note, and linked nodes.
- Search the graph by free text — matches chunk text, entity names, tags, notes, and source titles.
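The chunking behavior described in the list above (paragraphs first, sentences inside long paragraphs, overlap snapped to a word boundary) can be sketched roughly as follows. This is an illustrative approximation, not the app's chunker. The function names, the regexes used to detect paragraph and sentence boundaries, and the 1-based `part:i/N` numbering are all assumptions.

```python
import re


def split_units(text, chunk_size):
    """Paragraphs first; paragraphs longer than chunk_size become sentences."""
    units = []
    for para in re.split(r"\n\s*\n", text):
        para = " ".join(para.split())  # collapse internal whitespace
        if not para:
            continue
        if len(para) <= chunk_size:
            units.append(para)
        else:
            # Assumed sentence boundary: ., ! or ? followed by whitespace.
            units.extend(s for s in re.split(r"(?<=[.!?])\s+", para) if s)
    return units


def overlap_tail(chunk, overlap):
    """Last `overlap` chars of chunk, trimmed forward to a word boundary."""
    if overlap <= 0 or len(chunk) <= overlap:
        return ""  # chunk too short to carry a partial overlap
    tail = chunk[-overlap:]
    # If the cut landed mid-word, drop the partial word at the start.
    if not chunk[-overlap - 1].isspace():
        _, _, tail = tail.partition(" ")
    return tail.strip()


def chunk_text(text, chunk_size=800, overlap=120):
    """Greedily pack units into chunks of roughly chunk_size characters."""
    chunks, current = [], ""
    for unit in split_units(text, chunk_size):
        candidate = f"{current} {unit}".strip()
        if current and len(candidate) > chunk_size:
            chunks.append(current)
            # Start the next chunk with the overlap carried from this one.
            current = f"{overlap_tail(current, overlap)} {unit}".strip()
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def tag_parts(chunks):
    """Pair each chunk with a part:i/N tag (1-based numbering is assumed)."""
    total = len(chunks)
    return [(c, f"part:{i}/{total}") for i, c in enumerate(chunks, start=1)]
```

In this sketch the size limit is approximate: a single sentence longer than `chunk_size` is kept whole, and the carried overlap can push a chunk slightly over the limit.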
The graph is stored as plain JSON in `data/current.json` (staging) and
`data/overall.json` (integrated). These are git-friendly files you can
diff, back up, or edit by hand.
- `GET /api/providers`: list configured providers and their suggested models.
- `POST /api/search`: `{ "query": "..." }` → DuckDuckGo results.
- `POST /api/summarize`: `{ "input": "<url-or-text>", "provider": "...", "model": "..." }` → page summary. `provider` may be `auto`, `anthropic`, `openai`, `qwen`, `deepseek`, or `extractive`. `model` is optional and falls back to the provider's default.
- `GET /api/kg/stats`: `{current: {chunks, entities, edges}, overall: {...}}`.
- `GET /api/kg/graph?where=current|overall`: full nodes + edges.
- `POST /api/kg/add`: `{ "text", "source_title?", "source_url?", "tags?", "note?" }` adds a chunk to the staging graph.
- `POST /api/kg/integrate`: `{ "provider?", "model?", "use_ai?" }` moves staged chunks into the overall graph, extracting entities and edges.
- `POST /api/kg/query`: `{ "query", "where?" }` returns matching nodes.
- `DELETE /api/kg/node/<id>?where=current|overall`: remove a node.
- `POST /api/kg/ingest/text`: chunk a pasted document into staging. Body: `{ "text", "source_title?", "source_url?", "tags?", "chunk_size?", "overlap?" }`.
- `POST /api/kg/ingest/urls`: fetch each URL, parse, chunk into staging. Body: `{ "urls": [...], "tags?", "chunk_size?", "overlap?" }`.
- `POST /api/kg/ingest/files`: multipart upload of `.txt`/`.md`/`.html`/`.pdf`, with form fields `tags`, `chunk_size`, `overlap`.
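The endpoints above can be exercised from a script with the standard library alone. The sketch below assumes the server runs on the default `http://localhost:5000`, that POST bodies are sent as JSON with a `Content-Type: application/json` header, and that responses are JSON. Response field names are not documented here, so the sketch prints whatever comes back. `build_request`, `call`, and `demo` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # default port from this README


def build_request(path, payload=None, method="POST"):
    """Build a urllib Request; a JSON body is attached only when payload is given."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def call(path, payload=None, method="POST"):
    """Send the request and decode the response (assumed to be JSON)."""
    req = build_request(path, payload, method)
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))


def demo():
    """Requires the app to be running locally; not executed on import."""
    print(call("/api/providers", method="GET"))
    print(call("/api/summarize", {"input": "https://example.com", "provider": "auto"}))
    # Stage a chunk, then integrate with the heuristic extractor only.
    call("/api/kg/add", {"text": "Example chunk.", "source_url": "https://example.com"})
    print(call("/api/kg/integrate", {"use_ai": False}))
    print(call("/api/kg/stats", method="GET"))
```

Calling `demo()` while the app is running walks through listing providers, summarizing a page, staging a chunk, and integrating it.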
| Variable | Purpose |
|---|---|
| `ANTHROPIC_API_KEY` | Enables Claude-powered summaries. |
| `OPENAI_API_KEY` | Enables OpenAI-powered summaries. |
| `DASHSCOPE_API_KEY` | Enables Qwen (DashScope) summaries. |
| `DEEPSEEK_API_KEY` | Enables DeepSeek summaries. |
| `OPENAI_BASE_URL` | Override OpenAI base URL (proxy, Azure, etc.). |
| `QWEN_BASE_URL` | Override DashScope base URL (use China endpoint, etc.). |
| `DEEPSEEK_BASE_URL` | Override DeepSeek base URL. |
| `KG_DATA_DIR` | Directory for KG JSON files (default `data/`). |
| `PORT` | Port to bind (default `5000`). |
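As a rough illustration of how these variables might be read at startup, the sketch below collects them into a settings dictionary. `load_settings` is a hypothetical helper, not the app's code. The defaults for `PORT` and `KG_DATA_DIR` come from the table. The default base URLs are not documented here, so unset overrides are left as `None`.

```python
import os
from pathlib import Path

# Base-URL override variables from the table above, keyed by provider.
BASE_URL_VARS = {
    "openai": "OPENAI_BASE_URL",
    "qwen": "QWEN_BASE_URL",
    "deepseek": "DEEPSEEK_BASE_URL",
}


def load_settings(env=None):
    """Read configuration from the environment. Hypothetical helper."""
    env = os.environ if env is None else env
    data_dir = Path(env.get("KG_DATA_DIR", "data"))
    return {
        "port": int(env.get("PORT", "5000")),
        "staging_path": data_dir / "current.json",
        "overall_path": data_dir / "overall.json",
        # None means "use the provider's own default endpoint".
        "base_urls": {name: env.get(var) for name, var in BASE_URL_VARS.items()},
    }
```

For example, `load_settings({"PORT": "8080"})["port"]` evaluates to `8080`, and with nothing set the staging file resolves to `data/current.json`.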