VectorLess Retrieval-Augmented Generation
Powerful document intelligence — no embeddings, no vector databases, no GPU required.
VL.RAG is an open-source document intelligence system that lets you ask questions about large documents — books, research papers, technical manuals, PDFs — without ever generating a single embedding or spinning up a vector database.
Traditional RAG:
Document → Chunking → Embeddings → Vector DB → Semantic Search → Answer
VL.RAG:
Document → Topic Detection → Topic Tree → Keyword Search → Page Retrieval → Answer
By organizing documents into a hierarchical topic tree and using battle-tested information retrieval (BM25, TF-IDF, inverted index), VL.RAG delivers high-quality answers at a fraction of the cost and complexity.
| | Traditional RAG | VL.RAG |
|---|---|---|
| Embeddings required | ✅ Yes | ❌ No |
| Vector database | ✅ Required | ❌ Not needed |
| GPU / embedding API for indexing | ✅ Often | ❌ Never (only final topic blocks go to an LLM for titles and summaries) |
| Index size | Large | Tiny (metadata only) |
| Indexing speed (500-page PDF) | Minutes | ~10 seconds |
| Runs in browser | ❌ No | ✅ Yes |
| Self-hostable | Complex | Simple |
Well-suited for: books & long-form documents, research papers, technical documentation, large multi-document knowledge bases, and browser-based or offline applications.
Upload PDF → Extract text → Detect topic boundaries → Segment into blocks
→ Generate titles & summaries (LLM) → Extract keywords → Build topic tree → Store index
The result is a compact metadata-only index — no raw text stored.
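As a rough illustration of the "Extract keywords" step, the sketch below ranks terms by frequency after removing stop words. The function name `extract_keywords`, the stop-word list, and the `top_k` parameter are assumptions made for this example; the project itself may extract keywords differently.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
              "for", "on", "with", "that", "this", "by", "as", "it", "be"}


def extract_keywords(text: str, top_k: int = 5) -> list[str]:
    """Return the top_k most frequent terms that are not stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS and len(t) > 2)
    return [term for term, _ in counts.most_common(top_k)]


sample = "Regression predicts values. Regression models fit data. Data matters."
print(extract_keywords(sample, top_k=2))  # ['regression', 'data']
```

Only the resulting keyword list would need to be stored in the index, which is consistent with the metadata-only design described above.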
Content is organized hierarchically rather than as flat chunks:
Machine Learning Book
├── Introduction
├── Supervised Learning
│   ├── Regression
│   └── Classification
└── Neural Networks
    ├── Feedforward Networks
    └── Backpropagation
Each node stores title, summary, keywords, and a page range.
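One way to picture such a node is a small recursive data structure. The sketch below is a hypothetical model, not the project's actual schema: the class name `TopicNode`, its field names, the summaries, and the page numbers are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class TopicNode:
    """One node of the topic tree: metadata only, no raw text."""
    title: str
    summary: str
    keywords: list[str]
    page_start: int
    page_end: int  # inclusive
    children: list["TopicNode"] = field(default_factory=list)

    def walk(self) -> Iterator["TopicNode"]:
        """Yield this node and all of its descendants, depth-first."""
        yield self
        for child in self.children:
            yield from child.walk()


# Summaries and page ranges below are made up for the example.
book = TopicNode("Machine Learning Book", "Core ML methods", ["machine", "learning"], 1, 200, [
    TopicNode("Introduction", "Scope and notation", ["overview"], 1, 12),
    TopicNode("Supervised Learning", "Learning from labels", ["labels"], 13, 110, [
        TopicNode("Regression", "Continuous targets", ["regression"], 13, 60),
        TopicNode("Classification", "Discrete targets", ["classification"], 61, 110),
    ]),
    TopicNode("Neural Networks", "Layered models", ["neural", "networks"], 111, 200, [
        TopicNode("Feedforward Networks", "Layer stacks", ["feedforward"], 111, 150),
        TopicNode("Backpropagation", "Gradient training", ["gradient"], 151, 200),
    ]),
])

for node in book.walk():
    print(f"{node.title}: pages {node.page_start}-{node.page_end}")
```

Because each node carries its own page range, a search hit on any node can be turned directly into the pages to fetch.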
User question → Extract keywords → Search topic tree (BM25/TF-IDF)
→ Identify relevant node → Retrieve page range → Send to LLM → Answer with source
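The search step in this flow can be sketched with a plain Okapi BM25 scorer over each node's text. This is a minimal illustration under stated assumptions: the function names, the `k1` and `b` defaults, the node texts, and the page ranges are chosen for the example, and the IDF uses the common non-negative variant `log(1 + (N - df + 0.5) / (df + 0.5))`. It is not necessarily how VL.RAG scores nodes.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query: str, docs: dict[str, str], k1: float = 1.5, b: float = 0.75):
    """Rank docs (id -> text) against query with Okapi BM25, best first."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = (sum(len(t) for t in tokenized.values()) / n_docs) or 1.0
    df = Counter()  # number of docs containing each term
    for toks in tokenized.values():
        df.update(set(toks))
    query_terms = set(tokenize(query))
    scores = {}
    for doc_id, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores[doc_id] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical node texts (title + keywords) and page ranges.
node_text = {
    "Regression": "regression linear least squares prediction",
    "Classification": "classification logistic decision boundary labels",
    "Backpropagation": "backpropagation gradient chain rule weights",
}
node_pages = {"Regression": (13, 60), "Classification": (61, 110), "Backpropagation": (151, 200)}

ranked = bm25_rank("How does the chain rule compute a gradient?", node_text)
best_node = ranked[0][0]
print(best_node, node_pages[best_node])
```

The page range of the top-ranked node is what would then be read from the source file and sent to the LLM together with the question.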
VL.RAG uses a hybrid heuristic + statistical approach:
- Heading detection — Parses structural headings to define topic boundaries.
- Block segmentation — Splits unstructured text into sentence blocks.
- Similarity scoring — TF-IDF cosine similarity between blocks; low similarity signals a boundary.
- LLM title generation — Only final detected blocks are sent to an LLM, keeping API costs minimal.
This keeps indexing fast: a 500-page PDF takes roughly 10 seconds.
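The similarity-scoring idea can be sketched as follows: build a TF-IDF vector per block, compare each block with the one before it, and mark a boundary where the cosine similarity drops below a threshold. The function names, the smoothed IDF formula, the `threshold` value of 0.1, and the sample blocks are assumptions for this example, not the project's tuned settings.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(blocks: list[str]) -> list[dict[str, float]]:
    """Return one sparse TF-IDF vector (term -> weight) per block."""
    tokenized = [tokenize(b) for b in blocks]
    n = len(tokenized)
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF, so terms present in every block keep a small weight.
        vectors.append({t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()})
    return vectors


def cosine(u: dict[str, float], v: dict[str, float]) -> float:
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def find_boundaries(blocks: list[str], threshold: float = 0.1) -> list[int]:
    """Return each index i where blocks[i] starts a new topic."""
    vectors = tfidf_vectors(blocks)
    return [i for i in range(1, len(blocks))
            if cosine(vectors[i - 1], vectors[i]) < threshold]


blocks = [
    "Linear regression fits a line to data. Regression minimizes squared error.",
    "Regression error is measured on held-out data.",
    "Neural networks stack layers of neurons. Each neuron applies an activation.",
    "Backpropagation trains neural networks by updating neuron weights.",
]
print(find_boundaries(blocks))  # [2]
```

Here the second and third blocks share no terms, so their similarity is zero and a boundary is placed before the third block. Everything in this step is plain arithmetic on term counts, which is why no LLM call is needed until titles are generated.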
┌─────────────────────────────────────┐
│          Frontend (React)           │
│  Chat UI · Upload · Topic Explorer  │
└────────────────┬────────────────────┘
                 │
┌────────────────▼────────────────────┐
│         Document Processing         │
│  Text Extraction · Topic Detection  │
│  Keyword Extraction · Tree Builder  │
└────────────────┬────────────────────┘
                 │
┌────────────────▼────────────────────┐
│         Knowledge Database          │
│  documents · sections · tree_nodes  │
└────────────────┬────────────────────┘
                 │
┌────────────────▼────────────────────┐
│            Search Engine            │
│       BM25 · TF-IDF · Ranking       │
└────────────────┬────────────────────┘
                 │
┌────────────────▼────────────────────┐
│               LLM API               │
│   OpenRouter · OpenAI · Local LLM   │
└─────────────────────────────────────┘
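The diagram names three tables in the knowledge database: `documents`, `sections`, and `tree_nodes`. The sketch below shows one way they might relate, using SQLite from Python's standard library. Only the table names come from the diagram; the choice of SQLite, every column, and the sample rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical schema: only the three table names are taken from the diagram.
SCHEMA = """
CREATE TABLE documents (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL,
    title TEXT,
    page_count INTEGER
);
CREATE TABLE sections (
    id INTEGER PRIMARY KEY,
    document_id INTEGER NOT NULL REFERENCES documents(id),
    title TEXT NOT NULL,
    summary TEXT,
    keywords TEXT,
    page_start INTEGER NOT NULL,
    page_end INTEGER NOT NULL
);
CREATE TABLE tree_nodes (
    id INTEGER PRIMARY KEY,
    section_id INTEGER NOT NULL REFERENCES sections(id),
    parent_id INTEGER REFERENCES tree_nodes(id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Sample rows, invented for the example.
conn.execute("INSERT INTO documents (path, title, page_count) VALUES (?, ?, ?)",
             ("ml_book.pdf", "Machine Learning Book", 200))
conn.execute("INSERT INTO sections (document_id, title, summary, keywords, page_start, page_end) "
             "VALUES (?, ?, ?, ?, ?, ?)",
             (1, "Regression", "Continuous targets", "regression,linear", 13, 60))
conn.execute("INSERT INTO tree_nodes (section_id, parent_id) VALUES (?, ?)", (1, None))

# Resolve a section title to the file and page range to fetch.
row = conn.execute(
    "SELECT d.path, s.page_start, s.page_end "
    "FROM tree_nodes n "
    "JOIN sections s ON s.id = n.section_id "
    "JOIN documents d ON d.id = s.document_id "
    "WHERE s.title = ?", ("Regression",)).fetchone()
print(row)  # ('ml_book.pdf', 13, 60)
```

Keeping the tree structure in its own table, separate from section metadata, is one possible design; it lets the hierarchy be rebuilt without touching titles or summaries.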
Prerequisites: Node.js 18+, Python 3.10+
# Clone the repository
git clone https://github.com/your-org/vl-rag.git
cd vl-rag
# Install dependencies
npm install
# Configure environment
cp .env.example .env
# Add your LLM API key to .env
# Run locally
npm run dev
Open http://localhost:3000, upload a PDF, and start asking questions.
Docker:
docker compose up
Upload mode: drag and drop any PDF directly into the interface.
Folder mode: point VL.RAG at a local directory, for example:
C:/Users/you/Documents/research_papers/
Files are indexed without copying — creating a personal knowledge vault with AI-powered Q&A, similar to Obsidian or Logseq.
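Indexing in place can be pictured as recording where each file lives instead of copying it. The sketch below is an assumption about how that might look: the function name `scan_folder` and the recorded fields are invented here, and the project may track files differently.

```python
from pathlib import Path


def scan_folder(root: str) -> list[dict]:
    """List every PDF under root with its path, size, and modification time.

    Files are only read for metadata; nothing is copied or moved.
    """
    entries = []
    for pdf in sorted(Path(root).rglob("*.pdf")):
        stat = pdf.stat()
        entries.append({
            "path": str(pdf.resolve()),
            "size": stat.st_size,
            "mtime": stat.st_mtime,
        })
    return entries


if __name__ == "__main__":
    for entry in scan_folder("."):
        print(entry["path"], entry["size"])
```

Storing the modification time makes it possible to re-index only files that changed since the last scan. Note that the `*.pdf` pattern is case-sensitive on most non-Windows systems.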
VL.RAG uses proven information retrieval methods, so retrieval itself needs no neural network (the LLM is only called to compose the final answer):
- BM25 — industry-standard probabilistic ranking
- TF-IDF — term frequency weighting
- Inverted index — fast keyword lookup
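To make the inverted index concrete, the sketch below maps each term to the set of topic nodes that contain it, so a keyword lookup becomes a few set intersections instead of a scan over all text. The function names, the AND semantics of `lookup`, and the sample data are assumptions for this example.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each term to the ids of the documents that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def lookup(index: dict[str, set[str]], query: str) -> set[str]:
    """Return ids of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical node keywords.
docs = {
    "Regression": "linear regression gradient descent",
    "Classification": "logistic classification decision boundary",
    "Backpropagation": "gradient descent chain rule weights",
}
index = build_inverted_index(docs)
print(sorted(lookup(index, "gradient descent")))  # ['Backpropagation', 'Regression']
print(sorted(lookup(index, "chain rule")))        # ['Backpropagation']
```

In practice the candidate set returned by the index would then be ordered by a ranking function such as BM25 or TF-IDF.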
For browser deployments, FlexSearch and Lunr.js are supported out of the box.
- Multi-document cross-search
- Document graph linking (topic relationships across files)
- Citation-based answers (exact page + topic attribution)
- Automatic folder sync / watch mode
- Fully offline mode (local LLM via Ollama)
- Browser extension for on-page RAG
- REST API for programmatic access
MIT — free to use, modify, and distribute. See LICENSE for details.