An LLM-powered knowledge base with an interactive graph viewer. You drop raw sources into raw/, tell the LLM to ingest them, and it writes and maintains structured wiki pages — summaries, concepts, entities, and synthesis — all cross-linked and indexed. A browser-based graph viewer lets you explore the knowledge base visually.
Built on Andrej Karpathy's "LLM Wiki" pattern.
raw/ Sources you collect (articles, transcripts, notes) — you never edit these
wiki/ LLM-written & maintained pages — you never edit these directly
CLAUDE.md Schema that tells the LLM how to structure everything
src/ Graph viewer — Node.js server + browser frontend
Division of responsibility: You curate raw sources and direct queries. The LLM reads, writes, and links all wiki pages. The graph viewer lets you navigate the result.
Prerequisites: Node.js v18 or later. Python 3.8+ and pip (required for URL ingestion only).
```sh
# From the repo root
./start.sh
```

`start.sh` installs dependencies on first run and starts the server at http://localhost:3000.
Or manually:
```sh
cd src
npm install   # first run only
npm start
```

There are six operations. Type them in the chat with your LLM (Claude Code, Claude.ai, or any LLM that can read your repo).
Trigger: `ingest <source>` — where `<source>` is a local file path or a URL
The LLM will:
- Read the source in full (fetching it first if it is a URL — see below)
- Create `wiki/summaries/<source-slug>.md`
- Identify every concept, entity, and strategy mentioned
- Create a new page for each concept/entity that doesn't have one yet; update existing pages with new information
- Add cross-links in both directions across all touched pages
- Update `wiki/index.md` with new and changed entries
- Append a timestamped entry to `wiki/log.md`
- Flag any contradictions with existing wiki content
Local file examples:
```
ingest raw/podcast-transcript-episode-42.txt
I just added raw/q3-earnings-call.txt — please ingest it
```
URL examples:
```
ingest https://example.com/article-about-graph-databases
ingest https://signalovernoise.karlekar.cloud/issue-007.html
```
When given a URL, the LLM automatically invokes the `ingest-url` skill, which runs `src/tools/fetch_md.py` to download the page and its images, save the result to `raw/`, and then proceed with the standard ingest steps above. Images are saved to `raw/images/<slug>/` and embedded with relative paths. No API calls or external services are used — pure local Python.
After ingestion you will see new or updated files in wiki/summaries/, wiki/concepts/, wiki/entities/, and possibly wiki/synthesis/. Click Refresh in the graph viewer to see the changes.
Trigger: Ask any natural-language question
The LLM searches the wiki and synthesises an answer with citations. It will:
- Read `wiki/index.md` to identify relevant pages
- Read those pages
- Synthesise a cited answer using wiki links
- If the answer reveals new cross-cutting insight: create a synthesis page in `wiki/synthesis/` and update the index and log
Examples:
```
What are the key differences between community detection and node similarity?
Which sources mention AWS Neptune? What do they say about its limitations?
Summarise everything the wiki knows about vector embeddings and how they relate to graph databases.
What strategies does the wiki recommend for handling high-cardinality graphs?
```
The LLM answers inline and, when appropriate, writes a new wiki/synthesis/ page capturing the insight for future reference.
Trigger: `lint` or `health check`
Audits the entire wiki and fixes what it can automatically. The LLM will:
- Read every wiki page
- Check for:
- Orphan pages (no inbound links)
- Missing cross-links (concept mentioned but not linked)
- Contradictions between pages
- Incomplete required sections
- Low-confidence claims that could be strengthened with existing sources
- Fix issues it can resolve automatically (add missing links, fill incomplete sections)
- Report issues that need human judgement (genuine contradictions, gaps requiring new sources)
- Suggest topics or sources worth investigating
- Append a lint summary to `wiki/log.md`
Examples:
```
lint
health check
Run a lint and tell me which concepts have the least source coverage.
```
Trigger: `research <topic>`
Searches the web for credible sources on a topic, evaluates them, extracts attributed claims, and populates the wiki — without you providing a specific source. Use this when you want the LLM to go find and compile knowledge on a subject rather than ingest something you already have.
The LLM will:
- Check existing wiki coverage to avoid duplicating what's already there
- Run web searches to find 5–7 candidate sources
- Evaluate each for credibility (author, publisher, recency, sourcing quality) — accept 3–5, skip the rest
- Extract key claims tagged to their source URL
- Map consensus, disagreement, and gaps across sources
- Save a research log to `raw/research-<topic-slug>-<date>.md` with full source provenance
- Create concept, entity, and synthesis wiki pages from the findings
- Update `wiki/index.md` and `wiki/log.md`
Examples:
```
research "transformer attention mechanisms"
research "agentic AI frameworks 2025"
research "graph database performance benchmarks"
```
The LLM uses the research skill, which handles web search and source evaluation automatically. Contested claims across sources are noted explicitly — never silently merged. A synthesis page is created whenever multiple competing perspectives are found.
Trigger: `newsletter <topic>`
Transforms the wiki's accumulated knowledge on a topic into a compelling long-form newsletter in the Signal Over Noise style. If wiki coverage on the topic is insufficient, the LLM automatically invokes the research operation first — enriching the wiki as a side effect — then writes the newsletter.
The LLM will:
- Check wiki coverage: look for 3+ substantive pages covering what the topic is, how it works, and its challenges or threats
- If coverage is insufficient: run the research workflow (web search → wiki pages) before proceeding
- Read original source files for direct quotes and specific citations
- Write a 4,000–5,500 word newsletter with: narrative hook, problem/context with comparison table, deep analysis sections, threats, toolscape (open-source and commercial tools), action item audit, and closing signal
- Save to `wiki/newsletters/newsletter-<topic-slug>-<YYYY-MM-DD>.md`
- Update `wiki/index.md` and `wiki/log.md`
Examples:
```
newsletter "harness engineering"
newsletter "graph databases for agentic AI"
newsletter "LLM Wiki pattern"
```
The newsletter follows the Signal Over Noise voice: energetic, active, direct address, present-tense urgency, inline citations (arXiv IDs, author names), and named Friction Point callouts that explain why adoption is hard — not just technically but organizationally.
Trigger: `./reset-wiki.sh`
Wipes all local wiki content and raw sources, then restores the five wiki root files to their pristine template state. Use this to start fresh with a new knowledge domain or to recover from a corrupted wiki state.
```sh
./reset-wiki.sh
```

The script will prompt for confirmation before doing anything. What it clears:
- `raw/` — all files and subdirectories except `.gitkeep`
- `wiki/concepts/`, `wiki/entities/`, `wiki/summaries/`, `wiki/synthesis/`, `wiki/newsletters/`, `wiki/presentations/` — all `.md` files
- `wiki/journal/` — all `.md` files except `template.md`
- `wiki/index.md`, `wiki/log.md`, `wiki/analytics.md`, `wiki/dashboard.md`, `wiki/flashcards.md` — overwritten with pristine template content
The script is fully self-contained — it does not call git or any external service. Pristine template content is embedded directly in the script.
Every page the LLM creates lives in one of these directories and follows a fixed structure.
One page per raw source. Created automatically during ingest.
```markdown
---
title: "Source Title"
type: summary
tags: [tag1, tag2]
created: YYYY-MM-DD
updated: YYYY-MM-DD
sources: ["raw/filename.txt"]
confidence: high | medium | low
---

## Key Points
- Main claims and ideas from the source

## Relevant Concepts
Links to concept pages this source touches

## Source Metadata
Type, author/speaker, date, URL or identifier
```

One page per idea, framework, or strategy. Created or updated during ingest; also created on demand.
```markdown
---
title: "Concept Name"
type: concept
tags: [tag1, tag2]
created: YYYY-MM-DD
updated: YYYY-MM-DD
sources: ["raw/source1.txt", "raw/source2.txt"]
confidence: high | medium | low
---

## Definition
Plain-English definition in one paragraph

## How It Works
Mechanics, process, or structure

## Key Parameters
Important variables, dimensions, or factors

## When To Use
Situations and contexts where this applies

## Risks & Pitfalls
Known failure modes, common mistakes, limitations

## Related Concepts
Links to related wiki pages

## Sources
Which raw sources inform this page
```

One page per named thing — person, tool, organisation, product, dataset.
```markdown
---
title: "Entity Name"
type: entity
tags: [tag1, tag2]
created: YYYY-MM-DD
updated: YYYY-MM-DD
sources: ["raw/source.txt"]
confidence: high | medium | low
---

## Overview
What this entity is

## Characteristics
Key properties, attributes, structure

## Common Strategies
Links to concept pages for methods associated with this entity

## Related Entities
Links to related entity pages
```

Cross-cutting comparisons and analyses. Created when a query reveals novel insight, or on demand.
```markdown
---
title: "Comparison or Analysis Title"
type: synthesis
tags: [tag1, tag2]
created: YYYY-MM-DD
updated: YYYY-MM-DD
sources: ["raw/source1.txt", "raw/source2.txt"]
confidence: high | medium | low
---

## Comparison
Table or structured comparison

## Analysis
Cross-cutting insights

## Recommendations
When to prefer which approach

## Pages Compared
Links to all pages involved
```

Every page carries a `confidence` field in its frontmatter.
| Level | Meaning |
|---|---|
| `high` | Well-established; multiple corroborating sources; demonstrated with concrete examples |
| `medium` | Supported by sources but limited examples or single-source |
| `low` | Single mention, anecdotal, or speculative |
When in doubt the LLM sets `low` and notes the uncertainty inline. The lint workflow surfaces low-confidence pages and suggests how to strengthen them.
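Because the frontmatter is flat `key: value` pairs, low-confidence pages can also be surfaced with a short script outside the LLM loop. A minimal sketch; the `read_frontmatter` helper is hypothetical (not part of this repo) and handles only the flat fields used in the page templates, not arbitrary YAML:

```python
import re

def read_frontmatter(text):
    # Grab the block between the opening and closing "---" markers.
    match = re.match(r"^---\n(.*?)\n---", text, re.DOTALL)
    if not match:
        return {}
    fields = {}
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields

page = '''---
title: "Context Graph"
type: concept
confidence: low
---

## Definition
...
'''

print(read_frontmatter(page)["confidence"])  # low
```

Point this at every `.md` file under `wiki/` and filter on `confidence == "low"` to get the same list the lint workflow reports.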
The LLM follows these rules when writing pages — useful to know when reading the wiki or navigating the graph:
- Links use standard Markdown relative syntax: `[Display Text](relative/path.md)`
- Paths are relative to the current file's location, not the wiki root
  - Same folder: `[Decision Trace](decision-trace.md)`
  - Sibling folder: `[AWS Neptune](../entities/aws-neptune.md)`
  - From `summaries/` to `concepts/`: `[Context Graph](../concepts/context-graph.md)`
- Every page links to at least one other page — no orphans
- When a concept is mentioned by name in a page, it is always linked if a page exists for it
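The resolution rule above is easy to reproduce in a script, for example to check links before the viewer renders them. A sketch using only the Python standard library; the regex and helper are illustrative, not the viewer's actual `graph.js` logic:

```python
import posixpath
import re

# Matches [text](target.md) style links.
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)]+\.md)\)")

def resolve_links(page_path, markdown):
    """Resolve relative Markdown links to wiki-root-relative paths.

    page_path is relative to the wiki root, e.g. "summaries/foo.md".
    """
    base = posixpath.dirname(page_path)
    return [
        posixpath.normpath(posixpath.join(base, target))
        for _text, target in LINK_RE.findall(markdown)
    ]

links = resolve_links(
    "summaries/episode-42.md",
    "See [Context Graph](../concepts/context-graph.md) and "
    "[Episode 41](episode-41.md).",
)
print(links)  # ['concepts/context-graph.md', 'summaries/episode-41.md']
```

Collecting these edges across all pages is enough to spot orphans: any page that never appears as a link target (and links to nothing) violates the no-orphans rule.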
| Feature | Description |
|---|---|
| Force-directed graph | Nodes coloured by page type (concept, entity, summary, synthesis, journal, …) with a live legend |
| Content panel | Renders Markdown with a metadata bar showing type, tags, confidence, and updated |
| Bidirectional navigation | Click nodes in the graph or links in the content panel — both stay in sync |
| Breadcrumb trail | Last 10 visited nodes, each clickable |
| Search | Instant dropdown search across node names and file paths |
| Type filters | Toggle-button filters that show/hide node types; graph re-stabilizes automatically |
| Graph statistics | Node count, edge count, nodes per type, orphan count |
| Pan / zoom / drag | Scroll to zoom, drag background to pan, drag nodes to reposition |
| Fit to view | One-click "Fit" button to see the whole graph |
| Refresh | Rebuilds the graph from wiki/ without a full page reload; preserves the active node |
| Upload to raw/ | Upload source files directly from the browser to the raw/ directory |
| Shortcut | Action |
|---|---|
| `Ctrl+/` / `Cmd+/` | Focus the search input |
| `Escape` | Clear search and close dropdown |
| `Backspace` (search not focused) | Navigate back |
| `Home` | Navigate to `index.md` |
```
.
├── CLAUDE.md                # Schema — the LLM's instructions
├── start.sh                 # Convenience launcher
├── reset-wiki.sh            # Reset raw/ and wiki/ to pristine template state
├── raw/                     # Your source documents (immutable, not in git)
├── .claude/
│   └── commands/
│       ├── ingest-url.md    # Project skill — fetch URL and save to raw/
│       └── research.md      # Project skill — web research, source evaluation, claim extraction
├── docs/
│   ├── specification.md     # Full software requirements (EARS format)
│   └── tasks.md             # Implementation task list
├── src/
│   ├── package.json
│   ├── tools/
│   │   ├── fetch_md.py      # HTML-to-Markdown converter for URL ingest
│   │   └── requirements.txt # Python deps: markdownify, beautifulsoup4
│   ├── server/
│   │   └── index.js         # Express server — file API + upload endpoint
│   └── public/
│       ├── index.html
│       ├── css/styles.css
│       ├── js/
│       │   ├── app.js           # Entry point — wires modules together
│       │   ├── graph.js         # Graph model builder (file discovery, link extraction)
│       │   ├── visualization.js # D3 force-directed graph rendering
│       │   ├── content.js       # Markdown renderer + metadata bar
│       │   ├── navigation.js    # Breadcrumb, Back, Home
│       │   ├── search.js        # Search input + type filter toggles
│       │   └── utils.js         # Shared helpers
│       └── lib/                 # Vendored dependencies (no CDN at runtime)
│           ├── d3.v7.min.js
│           ├── marked.min.js
│           ├── js-yaml.min.js
│           └── dompurify.min.js
└── wiki/
    ├── index.md             # Master catalog — default selected node
    ├── log.md               # Append-only activity log
    ├── dashboard.md         # Dataview dashboard (Obsidian)
    ├── analytics.md         # Charts View analytics (Obsidian)
    ├── flashcards.md        # Spaced repetition cards
    ├── summaries/           # One page per source document (not in git)
    ├── concepts/            # Concept and framework pages (not in git)
    ├── entities/            # People, tools, organizations, etc. (not in git)
    ├── synthesis/           # Cross-cutting analyses and comparisons (not in git)
    ├── newsletters/         # Long-form newsletter issues (not in git)
    ├── journal/             # Research/session journal entries (not in git)
    │   └── template.md
    └── presentations/       # Marp slide decks (not in git)
```
Note: `raw/` and all `wiki/` subdirectory content is excluded from git — these are LLM-generated or user-collected files that live only on your machine. The repo tracks infrastructure only: source code, schema, skills, and the wiki root files (`index.md`, `log.md`, etc.) at their initial state.
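The exclusions above are the kind of thing a `.gitignore` expresses. A hypothetical sketch consistent with that description; the repo's actual ignore rules may differ, so treat this as illustrative only:

```
raw/*
!raw/.gitkeep
wiki/summaries/
wiki/concepts/
wiki/entities/
wiki/synthesis/
wiki/newsletters/
wiki/presentations/
wiki/journal/*
!wiki/journal/template.md
```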
| Method | Endpoint | Description |
|---|---|---|
| `GET` | `/api/wiki/files` | Returns a JSON array of all `.md` paths under `wiki/` |
| `GET` | `/api/wiki/file?path=<rel>` | Returns the raw content of a wiki file |
| `POST` | `/api/raw/upload` | Accepts `multipart/form-data`; writes the file to `raw/` (rejects overwrites) |
The server binds to 127.0.0.1 only and never modifies files in wiki/.
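The two GET endpoints can be scripted against with the standard library alone. A sketch, assuming the server is running on its default port; the helper names are hypothetical:

```python
import json
import urllib.parse
import urllib.request

BASE = "http://127.0.0.1:3000"

def file_url(rel_path):
    # Build the /api/wiki/file URL, percent-encoding the path parameter.
    return BASE + "/api/wiki/file?" + urllib.parse.urlencode({"path": rel_path})

def list_wiki_files():
    # GET /api/wiki/files returns a JSON array of .md paths under wiki/.
    with urllib.request.urlopen(BASE + "/api/wiki/files") as resp:
        return json.load(resp)

if __name__ == "__main__":
    for path in list_wiki_files():
        print(path)
```

Fetching each listed path through `file_url` gives you the raw Markdown of the whole wiki, which is handy for backups or external indexing.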
Edit CLAUDE.md:
- Purpose — Replace the placeholder paragraph with a description of your knowledge domain
- Tagging taxonomy — Replace the placeholder categories with your own (e.g., for a cooking KB: `cuisine`, `technique`, `ingredient`, `equipment`)
- Confidence levels — Adjust the descriptions to match your domain's evidence standards
- Entity types — Update the entity page description to match what entities mean in your domain
- Journal template — Customize `wiki/journal/template.md` for your workflow
Page formats, linking conventions, workflows, and graph viewer behaviour are domain-agnostic and work as-is.
| Role | Library |
|---|---|
| Graph visualization | D3.js v7 (d3-force) |
| Markdown rendering | marked |
| HTML sanitization | DOMPurify |
| YAML / frontmatter | js-yaml |
| Server | Express + multer |
All frontend dependencies are bundled locally — no CDN requests at runtime.
MIT