QRST

On-device hybrid search engine for markdown files. Single static Rust binary, zero runtime dependencies.

What it does

Indexes markdown documents (notes, transcripts, documentation) and provides keyword, semantic, and hybrid search — all running locally with no external API calls.

  • BM25 full-text search via SQLite FTS5
  • Vector semantic search via usearch (HNSW)
  • Hybrid fusion via ConvexFusion (default, alpha=0.95) and RRF
  • Pluggable chunking with markdown-aware boundary detection
  • Trait-based architecture — swappable embedding backends (Embed), vector stores (VectorStore), and fusion strategies (FusionStrategy)
  • Works for both humans and AI agents through the same CLI
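The convex fusion mentioned above combines the two score lists as a weighted sum. The sketch below is a hypothetical illustration, not qrst's actual ConvexFusion code, and it assumes both scores have already been normalized to a comparable range:

```rust
// Hypothetical sketch of convex score fusion (not qrst's actual code).
// Assumes both inputs are pre-normalized, e.g. to [0, 1].
fn convex_fuse(alpha: f64, semantic: f64, keyword: f64) -> f64 {
    // With the default alpha = 0.95, the semantic score dominates and
    // BM25 contributes the remaining 5%.
    alpha * semantic + (1.0 - alpha) * keyword
}
```

With alpha = 0.95, a document scoring 0.8 semantically and 0.4 on BM25 fuses to 0.95 × 0.8 + 0.05 × 0.4 = 0.78.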

Install

cargo install qrst

Usage

# Index a directory of markdown files
qrst index ~/notes

# Search (keyword mode)
qrst search "quarterly planning"

# Search (hybrid mode — requires embedding model)
qrst search -m hybrid "quarterly planning"

# Check index status
qrst status

# Get a document by ID
qrst get 42

Configuration

TOML config at ~/.config/qrst/config.toml:

data_dir = "~/.qrst"

[indexer]
max_chunk_size = 1500
min_chunk_size = 50
respect_gitignore = true

[search]
fusion_strategy = "convex"   # "convex" (default) or "rrf"
convex_alpha = 0.95
default_limit = 10
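For reference, the [search] defaults above can be written as a plain Rust struct. This is illustrative only; qrst's real config types are not shown in this README:

```rust
// The [search] defaults, expressed as a Rust struct (illustrative;
// field names mirror the TOML keys above, types are assumptions).
#[derive(Debug, Clone, PartialEq)]
struct SearchConfig {
    fusion_strategy: String, // "convex" or "rrf"
    convex_alpha: f64,       // weight on the semantic score in convex fusion
    default_limit: usize,    // results returned when no limit is given
}

impl Default for SearchConfig {
    fn default() -> Self {
        SearchConfig {
            fusion_strategy: "convex".to_string(),
            convex_alpha: 0.95,
            default_limit: 10,
        }
    }
}
```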

Development

Requires the Rust toolchain. Tasks are run with mise.

mise run build-dev      # debug build
mise run tests          # run all tests
mise run lint           # clippy with -D warnings
mise run verify         # lint + tests
mise run build-prod     # release build

Architecture

The core components are behind traits for extensibility:

| Trait          | Default Impl | Purpose                            |
| -------------- | ------------ | ---------------------------------- |
| Embed          | OnnxEmbedder | Text → vector embedding            |
| VectorStore    | UsearchIndex | ANN vector search                  |
| FusionStrategy | ConvexFusion | Combine keyword + semantic results |

This allows swapping in alternative backends (e.g. llama.cpp for embeddings, sqlite-vec for vectors, RRF for fusion) without touching the core search pipeline.
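The seams between those components might look roughly like the sketch below. The trait and method signatures are illustrative assumptions (qrst's real API is not shown in this README), and the toy reciprocal-rank-fusion impl only demonstrates how a fusion strategy would plug in:

```rust
use std::collections::HashMap;

// Illustrative signatures for the three seams; qrst's actual traits
// may differ (these are assumptions, not the project's real API).
trait Embed {
    fn embed(&self, text: &str) -> Vec<f32>;
}

trait VectorStore {
    fn add(&mut self, id: u64, vector: Vec<f32>);
    fn search(&self, query: &[f32], k: usize) -> Vec<(u64, f32)>;
}

trait FusionStrategy {
    // Merge ranked (doc_id, score) lists from keyword and semantic search.
    fn fuse(&self, keyword: &[(u64, f32)], semantic: &[(u64, f32)]) -> Vec<(u64, f32)>;
}

// Toy RRF implementation showing the FusionStrategy seam in use.
struct Rrf {
    k: f32,
}

impl FusionStrategy for Rrf {
    fn fuse(&self, keyword: &[(u64, f32)], semantic: &[(u64, f32)]) -> Vec<(u64, f32)> {
        let mut scores: HashMap<u64, f32> = HashMap::new();
        for list in [keyword, semantic] {
            for (rank, (id, _)) in list.iter().enumerate() {
                // RRF ignores raw scores; only the rank position matters.
                *scores.entry(*id).or_insert(0.0) += 1.0 / (self.k + rank as f32 + 1.0);
            }
        }
        let mut fused: Vec<(u64, f32)> = scores.into_iter().collect();
        fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
        fused
    }
}
```

A document appearing in both result lists accumulates reciprocal-rank credit from each, so it outranks documents found by only one retriever.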

See the PRD for design decisions and roadmap.

Evaluation caveat

BEIR benchmark results are document-level: qrst indexes and retrieves whole documents rather than the 256–512 token passages used by published BEIR baselines. Scores are therefore not directly comparable to passage-level leaderboard numbers. The custom panzerotti evaluation uses a single annotator; inter-annotator agreement has not yet been measured.

License

MIT
