A production-grade personal knowledge base with semantic search, designed to run on a single machine. Index codebases, web pages, PDFs, YouTube transcripts, and notes — then search everything with natural language through an MCP server, HTTP API, or CLI.
Built for developers who want their own searchable knowledge graph without cloud dependencies.
LLMs are powerful but stateless. Every conversation starts from scratch. Knowledge Hub gives your tools persistent memory:
- Index your entire codebase — AST-aware chunking understands Python class/function boundaries, not just line counts
- Search with natural language — "how does the exit manager calculate stop losses" finds the exact code
- Inject into any Claude Code session — MCP server makes your knowledge base a native tool
- RAG queries — ask questions, get synthesized answers with source citations
- Zero cloud lock-in — runs entirely on your machine (Ollama embeddings, Qdrant vector DB, SQLite metadata)
```
┌─────────────────────────────────────────────────────┐
│                     Interfaces                      │
│ ┌──────────┐ ┌──────────────┐ ┌───────────────┐     │
│ │ MCP Server│ │ HTTP API    │ │ CLI           │     │
│ │ (stdio)   │ │ (port 8006) │ │               │     │
│ └─────┬─────┘ └──────┬──────┘ └───────┬───────┘     │
│       └──────────────┼────────────────┘             │
│                      ▼                              │
│             ┌─────────────────┐                     │
│             │ IngestPipeline  │                     │
│             │ (orchestrator)  │                     │
│             └────────┬────────┘                     │
│       ┌──────────────┼──────────────┐               │
│       ▼              ▼              ▼               │
│ ┌──────────┐  ┌────────────┐  ┌───────────┐         │
│ │ Ingestors │ │ Chunker    │  │ Search    │         │
│ │ Web/PDF/  │ │ AST/Prose/ │  │ Engine    │         │
│ │ YouTube/  │ │ Markdown   │  │ (hybrid)  │         │
│ │ Code/Text │ │            │  │           │         │
│ └──────────┘  └────────────┘  └───────────┘         │
│       ┌──────────────┼──────────────┐               │
│       ▼              ▼              ▼               │
│ ┌──────────┐  ┌──────────┐  ┌───────────┐           │
│ │ Ollama    │ │ Qdrant   │  │ SQLite    │           │
│ │ Embeddings│ │ Vectors  │  │ Metadata  │           │
│ │ (768-dim) │ │ (HNSW)   │  │ (WAL)     │           │
│ └──────────┘  └──────────┘  └───────────┘           │
└─────────────────────────────────────────────────────┘
```
| Component | Technology | Purpose |
|---|---|---|
| Vector DB | Qdrant (Docker) | Cosine similarity search, HNSW index, mmap disk storage |
| Embeddings | nomic-embed-text (Ollama) | 768-dim vectors, local inference, free |
| Metadata | SQLite (WAL mode) | Document tracking, dedup via content hashing |
| Chunking | Python ast module | Class/function boundary detection for code |
| Search | Hybrid scoring | Semantic similarity + time decay + source credibility |
| RAG | Claude API | Synthesized answers with source citations |
| MCP | FastMCP (stdio) | Native Claude Code integration |
| HTTP | FastAPI | REST API for remote access |
- Python 3.11+
- Docker (for Qdrant)
- Ollama with the `nomic-embed-text` model
```bash
# Install Ollama and pull the embedding model
ollama pull nomic-embed-text
```

```bash
git clone https://github.com/sowmith95/KnowledgeHub.git
cd KnowledgeHub

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install
pip install -e .
```

```bash
cd docker
docker compose up -d qdrant
```

```bash
# Index a codebase
knowledge-hub ingest /path/to/your/project --code --name "My Project"

# Index a web page
knowledge-hub ingest https://docs.python.org/3/tutorial/classes.html

# Index a PDF
knowledge-hub ingest /path/to/paper.pdf

# Index a YouTube video
knowledge-hub ingest https://www.youtube.com/watch?v=dQw4w9WgXcQ

# Store a note
knowledge-hub ingest --text "Redis HGETALL returns all fields as strings" --title "Redis Notes"
```

```bash
# Semantic search
knowledge-hub search "how does authentication work"

# Search only code
knowledge-hub search "database connection pooling" --type code

# RAG query (requires ANTHROPIC_API_KEY)
knowledge-hub ask "What is the retry strategy for failed API calls?"
```

The primary interface. Add to your Claude Code config (`~/.claude.json`):
```json
{
  "mcpServers": {
    "knowledge-hub": {
      "command": "/path/to/KnowledgeHub/.venv/bin/python3",
      "args": ["-m", "knowledge_hub"],
      "env": {
        "QDRANT_HOST": "localhost",
        "QDRANT_PORT": "6333",
        "OLLAMA_URL": "http://localhost:11434",
        "KB_SQLITE_PATH": "/path/to/knowledge-hub-data/metadata.db"
      }
    }
  }
}
```

Once configured, these tools are available in every Claude Code session:
| Tool | Description | Key Parameters |
|---|---|---|
| `kb_search` | Semantic search across all indexed content | `query`, `top_k` (default 10), `source_type` filter |
| `kb_ingest_url` | Ingest a web page, YouTube video, or PDF | `url`, `force` (re-ingest even if unchanged) |
| `kb_ingest_code` | Index a local codebase directory | `path`, `name` |
| `kb_ingest_text` | Store raw text or markdown | `text`, `title`, `source_type` |
| `kb_query` | RAG: search + synthesize answer with citations | `question`, `top_k` (default 8) |
| `kb_list` | List all indexed documents | `source_type` filter, `limit` |
| `kb_stats` | System health and document counts | — |
| `kb_delete` | Remove a document and its vectors | `doc_id` |
Start the HTTP server for remote access (e.g., via Tailscale):
```bash
knowledge-hub serve --host 0.0.0.0 --port 8006
```

| Method | Endpoint | Description |
|---|---|---|
| GET | `/health` | System health check |
| POST | `/ingest/url` | Ingest URL (web/PDF/YouTube) |
| POST | `/ingest/text` | Ingest raw text |
| POST | `/ingest/code` | Index a codebase directory |
| POST | `/search` | Semantic search |
| POST | `/query` | RAG query with synthesized answer |
| GET | `/documents` | List indexed documents |
| DELETE | `/documents/{doc_id}` | Delete a document |
| GET | `/stats` | System statistics |
```bash
curl -X POST http://localhost:8006/search \
  -H "Content-Type: application/json" \
  -d '{"query": "retry logic for API calls", "top_k": 5}'
```

Source → Detect Type → Extract Content → Hash Check → Chunk → Embed → Store
- Detect: IngestorRegistry tries YouTube → PDF → Code → Web (most specific first)
- Extract: Pull text content, title, metadata from source
- Hash Check: SHA-256 content hash; skip if unchanged (unless `force=True`)
- Chunk: Split into semantically meaningful pieces (strategy depends on content type)
- Embed: Generate 768-dim vectors via Ollama nomic-embed-text (batches of 32)
- Store: Upsert vectors to Qdrant + document metadata to SQLite
Python Code (AST-based)
- Parses with `ast.parse()` for exact node boundaries
- Imports grouped as one chunk
- Each function/class becomes its own chunk
- Large classes split into per-method chunks with class signature as context
- Decorators stay attached to their definitions
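The core idea can be sketched in a few lines (an illustration only; the real chunker also groups imports, keeps decorators attached, and splits large classes per method):

```python
import ast


def chunk_python(source: str) -> list[str]:
    """Split Python source into one chunk per top-level function/class.

    Minimal sketch of AST-based chunking: each top-level def/class
    becomes its own chunk, using the node's exact source boundaries.
    """
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # get_source_segment reads the node's lineno/end_lineno offsets
            chunks.append(ast.get_source_segment(source, node))
    return chunks
```

Because chunks follow syntactic boundaries rather than line counts, a search hit always returns a complete, compilable definition.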
Prose (sentence-aware)
- Splits at sentence boundaries (`.`, `!`, `?` followed by uppercase)
- 256-word target chunks with 32-word overlap
- Never splits mid-sentence
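The packing logic behind those three rules can be sketched roughly as follows (an illustration of the strategy, not the actual implementation):

```python
import re


def chunk_prose(text: str, target: int = 256, overlap: int = 32) -> list[str]:
    """Greedily pack whole sentences into ~target-word chunks.

    Sketch: split where ./!/? is followed by whitespace and an
    uppercase letter, pack sentences until the word budget is hit,
    then carry ~overlap trailing words into the next chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text.strip())
    chunks, current, fresh = [], [], False
    for sentence in sentences:
        current.append(sentence)
        fresh = True  # current now holds at least one new sentence
        if sum(len(s.split()) for s in current) >= target:
            chunks.append(" ".join(current))
            # carry trailing sentences totalling ~overlap words
            carried, words = [], 0
            for s in reversed(current):
                carried.insert(0, s)
                words += len(s.split())
                if words >= overlap:
                    break
            current, fresh = carried, False
    if fresh and current:
        chunks.append(" ".join(current))
    return chunks
```

Because only whole sentences are moved, no chunk ever starts or ends mid-sentence.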
Markdown (heading-aware)
- Splits at heading boundaries (`#` through `######`)
- Maintains parent heading context for sub-sections
- Falls back to sentence chunking for large sections
Safety limits:
- Max 5,000 characters per chunk
- Max 3,500 characters sent to embedding model (nomic-embed-text has 8,192 token context)
- Minimum 10 words per chunk (filters noise)
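These limits reduce to a small filtering pass, sketched here with hypothetical constant and function names:

```python
MAX_CHUNK_CHARS = 5_000   # hard cap per stored chunk
MAX_EMBED_CHARS = 3_500   # cap on text sent to the embedding model
MIN_WORDS = 10            # drop noise chunks below this


def apply_safety_limits(chunks: list[str]) -> list[str]:
    """Enforce the chunk-size safety limits described above (sketch)."""
    kept = []
    for chunk in chunks:
        if len(chunk.split()) < MIN_WORDS:
            continue  # too short to be a useful search result
        kept.append(chunk[:MAX_CHUNK_CHARS])
    return kept


def embed_input(chunk: str) -> str:
    """Truncate what is actually sent to nomic-embed-text."""
    return chunk[:MAX_EMBED_CHARS]
```

The stored chunk keeps up to 5,000 characters for display, while the embedding sees at most 3,500, comfortably inside the model's 8,192-token context.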
Hybrid scoring combines three signals:
```
final_score = (similarity × 0.70) + (time_score × 0.15) + (source_weight × 0.15)
```
| Signal | Weight | Formula |
|---|---|---|
| Semantic similarity | 70% | Cosine distance from Qdrant |
| Time decay | 15% | `exp(-0.693 × days_old / 90)`, a half-life of 90 days |
| Source credibility | 15% | code: 1.3, pdf: 1.2, markdown: 1.1, article: 1.0, youtube: 0.9, text: 0.8 |
Deduplication: Max 3 chunks per document in results to prevent one large file from dominating.
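The scoring formula and the per-document cap can be sketched together (illustrative; the `doc_id` field name is an assumption):

```python
import math

SOURCE_WEIGHTS = {"code": 1.3, "pdf": 1.2, "markdown": 1.1,
                  "article": 1.0, "youtube": 0.9, "text": 0.8}


def hybrid_score(similarity: float, days_old: float, source_type: str) -> float:
    """Combine the three ranking signals with a 70/15/15 split."""
    time_score = math.exp(-0.693 * days_old / 90)  # halves every 90 days
    source_weight = SOURCE_WEIGHTS.get(source_type, 1.0)
    return similarity * 0.70 + time_score * 0.15 + source_weight * 0.15


def cap_per_document(results: list[dict], max_per_doc: int = 3) -> list[dict]:
    """Keep at most max_per_doc chunks from any single document.

    Assumes results are already sorted by final score, descending.
    """
    seen, kept = {}, []
    for r in results:
        n = seen.get(r["doc_id"], 0)
        if n < max_per_doc:
            kept.append(r)
            seen[r["doc_id"]] = n + 1
    return kept
```

A fresh code chunk with similarity 0.9 scores 0.9·0.70 + 1.0·0.15 + 1.3·0.15 = 0.975; the same chunk at 90 days old loses half of its time component.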
When you use `kb_query` or the `/query` endpoint (requires `ANTHROPIC_API_KEY`):
- Search for the top-k most relevant chunks
- Build a context window with numbered sources
- Send to Claude (claude-sonnet-4-20250514) with a system prompt enforcing context-only answers
- Return a synthesized answer with `[Source N]` citations
Falls back to raw search results if no API key is configured.
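The context-building step might look like this sketch (the `title`/`text` field names and the prompt wording are assumptions, not the actual schema):

```python
def build_context(chunks: list[dict]) -> str:
    """Number each retrieved chunk so the model can cite [Source N]."""
    blocks = []
    for i, chunk in enumerate(chunks, start=1):
        blocks.append(f"[Source {i}] ({chunk['title']})\n{chunk['text']}")
    return "\n\n".join(blocks)


# Hypothetical system prompt enforcing context-only answers
SYSTEM_PROMPT = (
    "Answer using ONLY the numbered sources below. "
    "Cite them as [Source N]. If the sources do not contain "
    "the answer, say so."
)
```

Numbering the sources in the prompt is what lets the model's `[Source N]` citations be mapped back to the original documents in the response.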
| Type | Ingestor | Details |
|---|---|---|
| Codebases | CodeIngestor | 25+ file extensions, respects .gitignore, skips node_modules/tests/.venv, 1MB file limit |
| Web pages | WebIngestor | Strips nav/footer/ads, extracts `<article>` or `<main>` content |
| PDFs | PDFIngestor | Local files or URLs, extracts via PyPDF2 |
| YouTube | YouTubeIngestor | Transcript extraction (manual → auto-generated → any language) |
| Raw text | Direct | Notes, analysis results, anything you want searchable |
| Markdown | Direct | Heading-aware chunking with hierarchy context |
Python (AST-parsed), JavaScript, TypeScript, TSX/JSX, Go, Rust, Java, C, C++, SQL, Shell/Bash, YAML, TOML, JSON, HTML, CSS, Svelte, Vue, Terraform, HCL, Dockerfile, Makefile.
All configuration via environment variables with sensible defaults:
| Variable | Default | Description |
|---|---|---|
| `QDRANT_HOST` | `localhost` | Qdrant server host |
| `QDRANT_PORT` | `6333` | Qdrant HTTP port |
| `QDRANT_GRPC_PORT` | `6334` | Qdrant gRPC port |
| `KB_COLLECTION` | `knowledge_hub` | Qdrant collection name |
| `KB_EMBEDDING_DIM` | `768` | Embedding vector dimensions |
| `OLLAMA_URL` | `http://localhost:11434` | Ollama API endpoint |
| `KB_EMBEDDING_MODEL` | `nomic-embed-text` | Ollama model name |
| `KB_CHUNK_SIZE` | `256` | Target words per chunk |
| `KB_CHUNK_OVERLAP` | `32` | Overlap words between chunks |
| `KB_SQLITE_PATH` | `~/knowledge-hub-data/metadata.db` | SQLite database path |
| `KB_API_HOST` | `0.0.0.0` | HTTP server bind address |
| `KB_API_PORT` | `8006` | HTTP server port |
| `ANTHROPIC_API_KEY` | — | Required for RAG queries (`kb_query`) |
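Reading these variables follows the usual environment-with-defaults pattern; a sketch of what a `config.py` along these lines might contain (illustrative, not the actual file):

```python
import os

# Defaults mirror the table above; every value can be overridden via env.
QDRANT_HOST = os.environ.get("QDRANT_HOST", "localhost")
QDRANT_PORT = int(os.environ.get("QDRANT_PORT", "6333"))
KB_CHUNK_SIZE = int(os.environ.get("KB_CHUNK_SIZE", "256"))
KB_CHUNK_OVERLAP = int(os.environ.get("KB_CHUNK_OVERLAP", "32"))
KB_SQLITE_PATH = os.path.expanduser(
    os.environ.get("KB_SQLITE_PATH", "~/knowledge-hub-data/metadata.db")
)
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY")  # optional: RAG only
```

Overriding a value is just a matter of exporting it before starting any interface, e.g. `KB_API_PORT=9000 knowledge-hub serve`.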
For running both Qdrant and the HTTP API server:
```bash
cd docker

# Start everything
docker compose up -d

# Or just Qdrant (if running MCP server locally)
docker compose up -d qdrant
```

The `docker-compose.yml` binds data to host directories for persistence:

- Qdrant data: `/Users/srp/knowledge-hub-data/qdrant`
- SQLite + app data: `/Users/srp/knowledge-hub-data`

Update the volume paths in `docker-compose.yml` for your system.
```
KnowledgeHub/
├── knowledge_hub/
│   ├── __init__.py
│   ├── __main__.py        # Entry point (MCP server)
│   ├── config.py          # All configuration (env vars)
│   ├── models.py          # Domain models (Document, Chunk, SearchResult)
│   ├── pipeline.py        # Orchestrator (ingest, search, query)
│   ├── embeddings.py      # Ollama nomic-embed-text client
│   ├── vector_store.py    # Qdrant wrapper (upsert, search, delete)
│   ├── metadata_db.py     # SQLite metadata (documents, dedup)
│   ├── chunker.py         # AST/sentence/markdown chunking
│   ├── api.py             # FastAPI HTTP server
│   ├── cli.py             # CLI interface
│   ├── ingestors/
│   │   ├── base.py        # BaseIngestor ABC
│   │   ├── web.py         # Web page ingestor
│   │   ├── pdf.py         # PDF ingestor
│   │   ├── youtube.py     # YouTube transcript ingestor
│   │   └── code.py        # Codebase directory ingestor
│   ├── search/
│   │   └── engine.py      # Hybrid search + RAG query
│   └── mcp_server/
│       └── server.py      # FastMCP tool definitions
├── docker/
│   ├── docker-compose.yml
│   └── Dockerfile
├── scripts/
│   ├── start.sh
│   ├── stop.sh
│   ├── configure-claude-code.sh
│   └── index-complextading.sh
├── pyproject.toml
└── README.md
```
MIT