A SQLite-based Retrieval-Augmented Generation (RAG) system built for efficient document storage, chunking, and hybrid search capabilities.
- Local SQLite: No need to run additional servers
- Support for various embedding providers: You can use Ollama, VoyageAI or add your own
- Hybrid Search: Vector search using
sqlite-vec
combined with full-text searchFTS5
, using Reciprocal Rank Fusion - Multi-format Support: Parse 40+ file formats including PDF, DOCX, HTML, Markdown, audio and more. Or add a url!
uv pip install haiku.rag
By default Ollama (with the mxbai-embed-large
model) is used for the embeddings.
For other providers use:
- VoyageAI:
uv pip install haiku.rag --extra voyageai
If you want to use an alternative embeddings provider (Ollama being the default) you will need to set the provider details through environment variables:
By default:
EMBEDDING_PROVIDER="ollama"
EMBEDDING_MODEL="mxbai-embed-large" # or any other model
EMBEDDING_VECTOR_DIM=1024
For VoyageAI:
EMBEDDING_PROVIDER="voyageai"
EMBEDDING_MODEL="voyage-3.5" # or any other model
EMBEDDING_VECTOR_DIM=1024
haiku.rag
includes a CLI application for managing documents and performing searches from the command line:
# List all documents
haiku-rag list
# Add document from text
haiku-rag add "Your document content here"
# Add document from file or URL
haiku-rag add-src /path/to/document.pdf
haiku-rag add-src https://example.com/article.html
# Get and display a specific document
haiku-rag get 1
# Delete a document by ID
haiku-rag delete 1
# Search documents
haiku-rag search "machine learning"
# Search with custom options
haiku-rag search "python programming" --limit 10 --k 100
All commands support the --db
option to specify a custom database path. Run
haiku-rag command -h
to see additional parameters for a command.
from pathlib import Path
from haiku.rag.client import HaikuRAG
# Use as async context manager (recommended)
async with HaikuRAG("path/to/database.db") as client:
# Create document from text
doc = await client.create_document(
content="Your document content here",
uri="doc://example",
metadata={"source": "manual", "topic": "example"}
)
# Create document from file (auto-parses content)
doc = await client.create_document_from_source("path/to/document.pdf")
# Create document from URL
doc = await client.create_document_from_source("https://example.com/article.html")
# Retrieve documents
doc = await client.get_document_by_id(1)
doc = await client.get_document_by_uri("file:///path/to/document.pdf")
# List all documents with pagination
docs = await client.list_documents(limit=10, offset=0)
# Update document content
doc.content = "Updated content"
await client.update_document(doc)
# Delete document
await client.delete_document(doc.id)
# Search documents using hybrid search (vector + full-text)
results = await client.search("machine learning algorithms", limit=5)
for chunk, score in results:
print(f"Score: {score:.3f}")
print(f"Content: {chunk.content}")
print(f"Document ID: {chunk.document_id}")
print("---")
async with HaikuRAG("database.db") as client:
results = await client.search(
query="machine learning",
limit=5, # Maximum results to return, defaults to 5
k=60 # RRF parameter for reciprocal rank fusion, defaults to 60
)
# Process results
for chunk, relevance_score in results:
print(f"Relevance: {relevance_score:.3f}")
print(f"Content: {chunk.content}")
print(f"From document: {chunk.document_id}")