Wayy-Research/zrag

zrag

Tiny RAG with implicit knowledge graph, powered by zvec.

Ingest a directory of any file type into a local vector store. Query with natural language. Explore connections through an implicit knowledge graph — no servers, no API keys.

Install

pip install zerag

For PDF support:

pip install "zerag[pdf]"

Quick Start

CLI

# Index a codebase
zrag ingest ./src

# Search it
zrag query "how does authentication work?"

# Explore the knowledge graph
zrag graph "authentication" --depth 1

# Check stats
zrag stats

Python API

from zrag import ZragStore

with ZragStore("./my_index") as store:
    # Ingest files
    store.ingest("./docs")

    # Query
    results = store.query("how does auth work?", topk=5)
    for r in results:
        print(f"{r['file_path']}:{r['chunk_index']} ({r['score']:.3f})")
        print(f"  {r['content'][:100]}")

    # Knowledge graph
    graph = store.graph("auth", depth=1)
    print(f"{len(graph['nodes'])} nodes, {len(graph['edges'])} edges")

How It Works

Vector search — Files are chunked (~512 chars with paragraph-aware boundaries), embedded locally with all-MiniLM-L6-v2 (384-dim), and stored in a zvec HNSW index. Everything runs in-process.
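The chunking step can be pictured with a minimal sketch. This is not zrag's actual chunker: `chunk_text` is a hypothetical helper, and treating a blank line as the paragraph boundary is an assumption. Only the 512-character size and 64-character overlap defaults come from the CLI reference.

```python
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into chunks of at most chunk_size chars, preferring paragraph breaks.

    Hypothetical helper for illustration; not part of the zrag API.
    """
    if not 0 <= overlap <= chunk_size // 2:
        raise ValueError("overlap must be between 0 and chunk_size // 2")
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        if end < len(text):
            # Prefer to cut at the last blank line inside the window,
            # but only if the chunk stays at least half the target size.
            cut = text.rfind("\n\n", start, end)
            if cut > start + chunk_size // 2:
                end = cut
        chunk = text[start:end].strip()
        if chunk:
            chunks.append(chunk)
        if end >= len(text):
            break
        # Step back by `overlap` chars so neighbouring chunks share context.
        start = end - overlap
    return chunks
```

Capping the overlap at half the chunk size is a design choice of this sketch: it guarantees that every iteration moves forward, even when a chunk is cut short at a paragraph break.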

Implicit knowledge graph — No stored graph. Edges are derived at query time:

  • same_file — chunks from the same file (free from metadata)
  • adjacent — consecutive chunks in a file (free from metadata)
  • similar — chunks above a cosine similarity threshold (computed via zvec)
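The three edge types above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not zrag's implementation: the `id` and `vector` keys and the `derive_edges` helper are hypothetical, similarity is computed here by hand instead of through zvec, and emitting both `same_file` and `adjacent` for consecutive chunks is a guess. The `file_path` and `chunk_index` keys match the fields shown in the query results.

```python
import math
from itertools import combinations


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def derive_edges(chunks: list[dict], threshold: float = 0.5) -> list[tuple[str, str, str]]:
    """Derive (source, target, kind) edges from chunk metadata and vectors.

    Hypothetical helper: nothing is stored, every edge is recomputed per call.
    """
    edges = []
    for a, b in combinations(chunks, 2):
        if a["file_path"] == b["file_path"]:
            # Metadata-only edges: no vector math needed.
            edges.append((a["id"], b["id"], "same_file"))
            if abs(a["chunk_index"] - b["chunk_index"]) == 1:
                edges.append((a["id"], b["id"], "adjacent"))
        if cosine(a["vector"], b["vector"]) >= threshold:
            edges.append((a["id"], b["id"], "similar"))
    return edges
```

A short usage example with toy two-dimensional vectors:

```python
chunks = [
    {"id": "a", "file_path": "x.py", "chunk_index": 0, "vector": [1.0, 0.0]},
    {"id": "b", "file_path": "x.py", "chunk_index": 1, "vector": [0.0, 1.0]},
    {"id": "c", "file_path": "y.py", "chunk_index": 0, "vector": [1.0, 0.1]},
]
for edge in derive_edges(chunks):
    print(edge)
```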

Supported file types: .md, .txt, .py, .js, .ts, .tsx, .jsx, .json, .yaml, .yml, .toml, .csv, .html, .rst, .pdf. Unknown extensions are attempted as UTF-8 text; binary files are skipped.
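One plausible way to implement the "attempt as UTF-8, skip binary" rule is sketched below. The helpers `decode_or_skip` and `read_text_or_skip` are hypothetical, and the NUL-byte check is a common heuristic assumed here, not something the README documents.

```python
from pathlib import Path
from typing import Optional


def decode_or_skip(data: bytes, sniff_bytes: int = 8192) -> Optional[str]:
    """Return the decoded text, or None if the data looks binary.

    Hypothetical helper: a NUL byte near the start or a UTF-8 decoding
    error is treated as a sign that the file is not text.
    """
    if b"\x00" in data[:sniff_bytes]:
        return None
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


def read_text_or_skip(path: str) -> Optional[str]:
    """Read a file from disk and apply the same text-or-binary check."""
    return decode_or_skip(Path(path).read_bytes())
```

Returning `None` instead of raising lets an ingest loop skip the file and move on to the next one.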

CLI Reference

zrag [--index PATH] <command>

Commands:
  ingest <dir>   Ingest files from a directory
    --chunk-size   Chunk size in chars (default: 512)
    --overlap      Overlap between chunks (default: 64)

  query <text>   Search the index
    --topk         Number of results (default: 5)

  graph <text>   Build implicit knowledge graph
    --depth        Expansion depth (default: 1)
    --topk         Seed results (default: 5)
    --threshold    Similarity threshold for edges (default: 0.5)

  stats          Show index statistics
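
To make the `--topk` and `--depth` options of `graph` concrete, here is a minimal sketch of a depth-limited expansion from seed results. It assumes a breadth-first traversal, which the README does not confirm. The `expand` function and the `neighbors` callback are hypothetical names, with `neighbors` standing in for whatever derives a chunk's edges at query time.

```python
from collections import deque
from typing import Callable, Hashable, Iterable


def expand(
    seeds: Iterable[Hashable],
    neighbors: Callable[[Hashable], Iterable[Hashable]],
    depth: int = 1,
) -> tuple[set, set]:
    """Breadth-first expansion from seed nodes, following at most `depth` hops.

    Hypothetical helper: returns the reached nodes and the traversed edges.
    """
    seeds = list(seeds)
    seen = set(seeds)
    edges = set()
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        node, dist = frontier.popleft()
        if dist == depth:
            # Nodes at the depth limit are kept but not expanded further.
            continue
        for other in neighbors(node):
            edges.add((node, other))
            if other not in seen:
                seen.add(other)
                frontier.append((other, dist + 1))
    return seen, edges
```

In this picture the seeds are the top-k query hits, and each extra unit of depth pulls in one more ring of connected chunks.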

Requirements

  • Python 3.10-3.12 (zvec constraint)
  • No API keys, no servers — embeddings run locally

License

MIT — Wayy Research
