aitherflow-knowledge

MCP server for RAG (Retrieval-Augmented Generation) knowledge base search. Local embeddings, local vector search, no API keys needed.

What it does

A stdio-based Model Context Protocol server that gives Claude agents access to your knowledge bases. Documents are embedded locally via fastembed (ONNX runtime) and stored in LanceDB for fast vector search.

Knowledge bases are created and managed through the aitherflow desktop app. This MCP server provides read and search access to them.
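As a rough illustration of what "vector search" means here, the sketch below ranks embedded chunks by cosine similarity to a query embedding. It is a brute-force toy, not how LanceDB indexes or searches internally, and the names `cosine_similarity` and `top_k` are hypothetical helpers rather than part of this server.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns 0.0 if the lengths differ or either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Indices of the `k` chunk embeddings most similar to `query`, best first.
/// Hypothetical helper: scans every chunk, which is fine for a small example only.
fn top_k(query: &[f32], chunks: &[Vec<f32>], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = chunks
        .iter()
        .enumerate()
        .map(|(i, c)| (i, cosine_similarity(query, c)))
        .collect();
    // Sort by descending similarity; total_cmp gives a total order on f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}
```

In a real knowledge base the embeddings have hundreds of dimensions and come from the configured embedding model; a vector store replaces the linear scan.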

Tools

| Tool | Description |
| --- | --- |
| search_knowledge_base | Semantic search across a knowledge base; returns relevant document chunks |
| list_knowledge_bases | List all available knowledge bases |
| get_document_info | Get metadata for a specific document |
| reindex_document | Re-embed and reindex a document |
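MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request as a single-line string using only the standard library. The argument name `query` is an assumption made for illustration; this README does not document the tools' input schemas, so check the schema the server reports via `tools/list` before relying on it.

```rust
/// Escape a string so it can be embedded inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a JSON-RPC 2.0 `tools/call` request for the given tool.
/// ASSUMPTION: the tool takes a single string argument named "query".
fn tools_call_request(id: u64, tool: &str, query: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{id},"method":"tools/call","params":{{"name":"{tool}","arguments":{{"query":"{query}"}}}}}}"#,
        id = id,
        tool = json_escape(tool),
        query = json_escape(query)
    )
}
```

A client would normally complete the MCP `initialize` handshake first and then write each request to the server's stdin as one line. In practice an MCP client library or Claude itself produces these messages; the sketch only shows their shape.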

Document parsers

  • PDF — text extraction via pdftotext, fallback to pdf-extract, OCR fallback for scanned pages
  • EPUB — full content extraction
  • TXT / Markdown — plain text, chunked for embedding
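The chunking step mentioned above can be sketched as a fixed-size window with overlap. This is a minimal illustration, not the server's actual chunker: the function name `chunk_text`, the character-based sizing, and the `overlap` parameter are all assumptions (the real chunk size lives in the RAG settings, and its unit is not stated here).

```rust
/// Split `text` into chunks of at most `size` characters, with consecutive
/// chunks sharing `overlap` characters. Hypothetical helper for illustration.
fn chunk_text(text: &str, size: usize, overlap: usize) -> Vec<String> {
    assert!(size > 0 && overlap < size, "need size > 0 and overlap < size");
    // Work on chars rather than bytes so multi-byte UTF-8 is never split.
    let chars: Vec<char> = text.chars().collect();
    let step = size - overlap;
    let mut chunks: Vec<String> = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let end = (start + size).min(chars.len());
        chunks.push(chars[start..end].iter().collect());
        if end == chars.len() {
            break; // the last chunk reached the end of the text
        }
        start += step;
    }
    chunks
}
```

Overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, at the cost of embedding some text twice.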

OCR

Scanned PDFs are handled via PP-OCR ONNX models (Latin + Cyrillic). Models are downloaded automatically on first use. Pages are rendered to images with pdftoppm, then recognized.
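The page-rendering step could look roughly like the following, which prepares a `pdftoppm` invocation that writes one PNG per page. The helper name `pdftoppm_command` and the choice of resolution are assumptions; the flags this server actually passes are not documented here. `-png` and `-r` are standard `pdftoppm` options.

```rust
use std::path::Path;
use std::process::Command;

/// Build (but do not run) a `pdftoppm` command that renders every page of
/// `pdf` to PNG files named `<out_prefix>-<page>.png`.
/// ASSUMPTION: `dpi` is chosen by the caller; 300 is a common value for OCR.
fn pdftoppm_command(pdf: &Path, out_prefix: &Path, dpi: u32) -> Command {
    let mut cmd = Command::new("pdftoppm");
    cmd.arg("-png") // output format
        .arg("-r") // resolution in DPI
        .arg(dpi.to_string())
        .arg(pdf)
        .arg(out_prefix);
    cmd
}
```

Calling `.status()` on the returned command runs it and requires poppler-utils to be installed; the resulting images would then be passed to the OCR models.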

Embedding models

| Model | Languages | Size |
| --- | --- | --- |
| all-MiniLM-L6-v2 (default) | English | 23 MB |
| multilingual-e5-small | 100+ languages | 118 MB |
| multilingual-e5-large | 100+ languages | 560 MB |
| nomic-embed-text-v1.5 | English | 137 MB |
| bge-m3 | 100+ languages | 567 MB |

Models are downloaded on first use and cached locally for the ONNX runtime to load on later runs.

Prerequisites

  • Rust 1.75+
  • poppler-utils — provides pdftotext, pdftoppm, pdfinfo (required for PDF parsing and OCR)

Install poppler-utils:

# Arch / Manjaro
sudo pacman -S poppler

# Ubuntu / Debian
sudo apt install poppler-utils

# macOS
brew install poppler

Build

cargo build --release

The first build takes a while because fastembed and ONNX runtime are large dependencies.

Install

Register the MCP server in Claude Code:

claude mcp add --scope user aitherflow-knowledge /path/to/target/release/aitherflow-knowledge

For Claude Desktop, add this entry to your claude_desktop_config.json:

{
  "mcpServers": {
    "aitherflow-knowledge": {
      "command": "/path/to/target/release/aitherflow-knowledge"
    }
  }
}

Works with any MCP client that supports stdio transport.

Data locations

| Path | Contents |
| --- | --- |
| ~/.local/share/aither-flow/rag/ | Knowledge base data (LanceDB tables, document metadata) |
| ~/.local/share/aither-flow/rag/settings.json | RAG settings (embedding model, chunk size, etc.) |
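If you want to locate these files from your own tooling, the paths above can be resolved against the home directory as sketched below. The function names are hypothetical, and the sketch simply expands `~` as listed; whether the server also honors `XDG_DATA_HOME` or uses a different location on other platforms is not stated in this README.

```rust
use std::path::{Path, PathBuf};

/// Directory holding knowledge base data, relative to the given home directory.
fn rag_dir(home: &Path) -> PathBuf {
    home.join(".local/share/aither-flow/rag")
}

/// Location of the RAG settings file inside the data directory.
fn settings_path(home: &Path) -> PathBuf {
    rag_dir(home).join("settings.json")
}
```

Pass the value of the `HOME` environment variable (for example via `std::env::var_os("HOME")`) as `home`.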

Usage

Once registered, ask any Claude agent:

"Search my knowledge base for information about vector databases"

"What documents are in my knowledge base?"

"Reindex the file report.pdf"

Part of aitherflow

This is a companion MCP server for aitherflow — a desktop GUI wrapper for Claude Code CLI by aitherlab-dev.

License

MIT
