VL.RAG

VectorLess Retrieval-Augmented Generation

Powerful document intelligence — no embeddings, no vector databases, no GPU required.

What is VL.RAG?

VL.RAG is an open-source document intelligence system that lets you ask questions about large documents — books, research papers, technical manuals, PDFs — without ever generating a single embedding or spinning up a vector database.

Traditional RAG:

Document → Chunking → Embeddings → Vector DB → Semantic Search → Answer

VL.RAG:

Document → Topic Detection → Topic Tree → Keyword Search → Page Retrieval → Answer

By organizing documents into a hierarchical topic tree and searching it with established information retrieval techniques (BM25, TF-IDF, an inverted index), VL.RAG delivers high-quality answers at a fraction of the cost and complexity of an embedding-based pipeline.

Why VL.RAG?

Feature	Traditional RAG	VL.RAG
Embeddings required	✅ Yes	❌ No
Vector database	✅ Required	❌ Not needed
GPU / paid API for indexing	✅ Often	❌ No GPU or embedding API (an LLM is called only for titles and summaries)
Index size	Large	Tiny (metadata only)
Indexing speed (500-page PDF)	Minutes	~10 seconds
Runs in browser	❌ No	✅ Yes
Self-hostable	Complex	Simple

Well-suited for: books & long-form documents, research papers, technical documentation, large multi-document knowledge bases, and browser-based or offline applications.

How It Works

1. Document Ingestion

Upload PDF → Extract text → Detect topic boundaries → Segment into blocks
    → Generate titles & summaries (LLM) → Extract keywords → Build topic tree → Store index

The result is a compact, metadata-only index; no raw document text is stored in it.

2. Topic Tree Structure

Content is organized hierarchically rather than as flat chunks:

Machine Learning Book
├── Introduction
├── Supervised Learning
│   ├── Regression
│   └── Classification
└── Neural Networks
    ├── Feedforward Networks
    └── Backpropagation

Each node stores title, summary, keywords, and a page range.
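
As an illustration of the node shape described above, here is a minimal TypeScript sketch. The names `TopicNode`, `FlatSection`, and `buildTopicTree`, the field names, and the page numbers are all assumptions made for this example; they are not taken from the VL.RAG source. It builds the tree from a flat, in-order list of detected headings by keeping a stack of open ancestors.

```typescript
// Hypothetical node shape: title, summary, keywords, and an inclusive page range.
interface TopicNode {
  title: string;
  summary: string;
  keywords: string[];
  pageStart: number;
  pageEnd: number;
  children: TopicNode[];
}

// A detected heading before it is placed in the tree (level 1 = chapter, 2 = subsection).
interface FlatSection {
  level: number;
  title: string;
  pageStart: number;
  pageEnd: number;
}

// Build the tree from sections given in document order.
function buildTopicTree(rootTitle: string, sections: FlatSection[]): TopicNode {
  const lastPage = sections.reduce((max, s) => Math.max(max, s.pageEnd), 1);
  const root: TopicNode = {
    title: rootTitle, summary: "", keywords: [],
    pageStart: 1, pageEnd: lastPage, children: [],
  };
  // Stack of currently open ancestors; the root sits at level 0 and is never popped.
  const stack: { level: number; node: TopicNode }[] = [{ level: 0, node: root }];
  for (const s of sections) {
    const node: TopicNode = {
      title: s.title, summary: "", keywords: [],
      pageStart: s.pageStart, pageEnd: s.pageEnd, children: [],
    };
    // Pop until the top of the stack is a shallower heading, which is the parent.
    while (stack.length > 1 && stack[stack.length - 1]!.level >= s.level) {
      stack.pop();
    }
    stack[stack.length - 1]!.node.children.push(node);
    stack.push({ level: s.level, node });
  }
  return root;
}

// The example tree from this section, with made-up page ranges.
const tree = buildTopicTree("Machine Learning Book", [
  { level: 1, title: "Introduction", pageStart: 1, pageEnd: 10 },
  { level: 1, title: "Supervised Learning", pageStart: 11, pageEnd: 40 },
  { level: 2, title: "Regression", pageStart: 11, pageEnd: 25 },
  { level: 2, title: "Classification", pageStart: 26, pageEnd: 40 },
  { level: 1, title: "Neural Networks", pageStart: 41, pageEnd: 60 },
  { level: 2, title: "Feedforward Networks", pageStart: 41, pageEnd: 50 },
  { level: 2, title: "Backpropagation", pageStart: 51, pageEnd: 60 },
]);
```

Summaries and keywords are left empty here; in the ingestion pipeline they would be filled in by the title/summary and keyword extraction steps.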

3. Query Pipeline

User question → Extract keywords → Search topic tree (BM25/TF-IDF)
    → Identify relevant node → Retrieve page range → Send to LLM → Answer with source
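
The query steps above can be sketched roughly as follows. This is an illustrative simplification, not the project's code: `IndexedTopic`, `extractKeywords`, `findBestTopic`, and `buildPrompt` are hypothetical names, the stopword list is a small made-up sample, and a plain keyword-overlap count stands in for the BM25/TF-IDF ranking.

```typescript
// Hypothetical flattened view of a topic node as stored in the index.
interface IndexedTopic {
  title: string;
  keywords: string[];
  pageStart: number;
  pageEnd: number;
}

// Tiny sample stopword list (an assumption; a real list would be longer).
const STOPWORDS = new Set([
  "a", "an", "the", "is", "are", "what", "how", "why", "does", "do",
  "of", "in", "to", "and", "for", "on", "with",
]);

// Step 1: lowercase, split on non-alphanumerics, drop stopwords and 1-letter tokens.
function extractKeywords(question: string): string[] {
  return question
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter(w => w.length > 1 && !STOPWORDS.has(w));
}

// Steps 2-3: pick the topic sharing the most terms with the question.
function findBestTopic(question: string, topics: IndexedTopic[]): IndexedTopic | undefined {
  const queryTerms = Array.from(new Set(extractKeywords(question)));
  let best: IndexedTopic | undefined;
  let bestScore = 0;
  for (const topic of topics) {
    const topicTerms = new Set(
      topic.keywords.map(k => k.toLowerCase()).concat(extractKeywords(topic.title)),
    );
    const score = queryTerms.filter(t => topicTerms.has(t)).length;
    if (score > bestScore) {
      best = topic;
      bestScore = score;
    }
  }
  return best; // undefined when no topic shares a term with the question
}

// Steps 4-5: wrap the retrieved page text in a prompt that names its source.
function buildPrompt(question: string, topic: IndexedTopic, pageText: string): string {
  return [
    "Answer using only the excerpt below and cite the source.",
    `Source: "${topic.title}", pages ${topic.pageStart}-${topic.pageEnd}`,
    "",
    pageText,
    "",
    `Question: ${question}`,
  ].join("\n");
}

const topics: IndexedTopic[] = [
  { title: "Regression", keywords: ["linear", "least", "squares"], pageStart: 11, pageEnd: 25 },
  { title: "Backpropagation", keywords: ["gradient", "chain", "rule"], pageStart: 51, pageEnd: 60 },
];
const question = "How does backpropagation compute gradients?";
const hit = findBestTopic(question, topics);
// The page text is a placeholder; it would be read from the source document.
const prompt = hit ? buildPrompt(question, hit, "(text of the retrieved pages)") : "";
```

Note that without stemming, "gradients" does not match the keyword "gradient"; the match here comes from the title term alone.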

Topic Detection

VL.RAG uses a hybrid heuristic + statistical approach:

Heading detection — Parses structural headings to define topic boundaries.
Block segmentation — Splits unstructured text into sentence blocks.
Similarity scoring — TF-IDF cosine similarity between blocks; low similarity signals a boundary.
LLM title generation — Only final detected blocks are sent to an LLM, keeping API costs minimal.

Because only the final blocks reach the LLM, a 500-page PDF can be indexed in roughly 10 seconds.
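
To make the similarity-scoring step concrete, here is a small TypeScript sketch of boundary detection with TF-IDF cosine similarity between adjacent blocks. The function names, the smoothed IDF formula, and the 0.1 threshold are assumptions chosen for illustration; the actual weighting and threshold used by VL.RAG are not documented here.

```typescript
// Lowercase, split on non-alphanumerics, keep tokens longer than two characters.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(t => t.length > 2);
}

// Build one sparse TF-IDF vector (term -> weight) per block.
function tfidfVectors(blocks: string[]): Map<string, number>[] {
  const tokenized = blocks.map(tokenize);
  const docFreq = new Map<string, number>();
  for (const tokens of tokenized) {
    for (const term of Array.from(new Set(tokens))) {
      docFreq.set(term, (docFreq.get(term) ?? 0) + 1);
    }
  }
  return tokenized.map(tokens => {
    const vec = new Map<string, number>();
    for (const term of tokens) vec.set(term, (vec.get(term) ?? 0) + 1);
    vec.forEach((tf, term) => {
      // Smoothed IDF (an assumed variant) so common terms keep a positive weight.
      const idf = Math.log((1 + blocks.length) / (1 + (docFreq.get(term) ?? 0))) + 1;
      vec.set(term, tf * idf);
    });
    return vec;
  });
}

// Cosine similarity between two sparse vectors; 0 if either is empty.
function cosine(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((w, term) => {
    dot += w * (b.get(term) ?? 0);
    normA += w * w;
  });
  b.forEach(w => {
    normB += w * w;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Return the indices i where a new topic starts, i.e. where block i
// is dissimilar to block i - 1.
function findBoundaries(blocks: string[], threshold = 0.1): number[] {
  const vectors = tfidfVectors(blocks);
  const boundaries: number[] = [];
  for (let i = 1; i < vectors.length; i++) {
    if (cosine(vectors[i - 1]!, vectors[i]!) < threshold) boundaries.push(i);
  }
  return boundaries;
}

const blocks = [
  "Linear regression fits a line to data by minimizing squared error.",
  "Regression error is measured on held-out data to detect overfitting.",
  "Neural networks stack layers of neurons with nonlinear activations.",
  "Hidden neurons apply nonlinear activations before passing values forward.",
];
const boundaries = findBoundaries(blocks); // [2]: the topic changes at the third block
```

Blocks 1 and 2 share no terms, so their similarity is 0 and a boundary is placed there, while the regression pair and the neural-network pair each share enough vocabulary to stay together.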

Architecture

┌─────────────────────────────────────┐
│         Frontend (React)            │
│  Chat UI · Upload · Topic Explorer  │
└────────────────┬────────────────────┘
                 │
┌────────────────▼────────────────────┐
│       Document Processing           │
│  Text Extraction · Topic Detection  │
│  Keyword Extraction · Tree Builder  │
└────────────────┬────────────────────┘
                 │
┌────────────────▼────────────────────┐
│         Knowledge Database          │
│   documents · sections · tree_nodes │
└────────────────┬────────────────────┘
                 │
┌────────────────▼────────────────────┐
│           Search Engine             │
│       BM25 · TF-IDF · Ranking       │
└────────────────┬────────────────────┘
                 │
┌────────────────▼────────────────────┐
│             LLM API                 │
│   OpenRouter · OpenAI · Local LLM   │
└─────────────────────────────────────┘

Getting Started

Prerequisites: Node.js 18+, Python 3.10+

# Clone the repository
git clone https://github.com/your-org/vl-rag.git
cd vl-rag

# Install dependencies
npm install

# Configure environment
cp .env.example .env
# Add your LLM API key to .env

# Run locally
npm run dev

Open http://localhost:3000, upload a PDF, and start asking questions.

Docker:

docker compose up

Knowledge Base Modes

Upload mode — Drag and drop any PDF directly into the interface.

Folder mode — Point VL.RAG at a local directory:

C:/Users/you/Documents/research_papers/

Files are indexed in place, without being copied, which turns the folder into a personal knowledge vault with AI-powered Q&A, similar in spirit to Obsidian or Logseq.

Search Engine

VL.RAG ranks topics with proven information retrieval methods, so retrieval itself needs no neural networks (the LLM is called only to write the final answer):

BM25 — industry-standard probabilistic ranking
TF-IDF — term frequency weighted by inverse document frequency
Inverted index — fast keyword lookup
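
For readers unfamiliar with these methods, the following TypeScript sketch shows how an inverted index and BM25 fit together. It is a generic textbook-style illustration, not VL.RAG's implementation: all names are hypothetical, the parameters `k1 = 1.2` and `b = 0.75` are common defaults rather than the project's settings, and the IDF uses the non-negative form found in Lucene.

```typescript
interface SearchDoc {
  id: string;
  text: string; // e.g. a node's title and keywords joined together
}

interface Bm25Index {
  postings: Map<string, Map<string, number>>; // term -> (doc id -> term frequency)
  docLengths: Map<string, number>;            // doc id -> token count
}

function toTerms(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(t => t.length > 0);
}

// Inverted index: each term maps to the documents that contain it.
function buildIndex(docs: SearchDoc[]): Bm25Index {
  const index: Bm25Index = { postings: new Map(), docLengths: new Map() };
  for (const doc of docs) {
    const tokens = toTerms(doc.text);
    index.docLengths.set(doc.id, tokens.length);
    for (const term of tokens) {
      let posting = index.postings.get(term);
      if (!posting) {
        posting = new Map<string, number>();
        index.postings.set(term, posting);
      }
      posting.set(doc.id, (posting.get(doc.id) ?? 0) + 1);
    }
  }
  return index;
}

// Score only the documents that appear in the postings of a query term.
function searchBm25(
  index: Bm25Index,
  query: string,
  k1 = 1.2,
  b = 0.75,
): { id: string; score: number }[] {
  const n = index.docLengths.size;
  let totalLength = 0;
  index.docLengths.forEach(len => {
    totalLength += len;
  });
  const avgLength = n === 0 ? 0 : totalLength / n;
  const scores = new Map<string, number>();
  for (const term of Array.from(new Set(toTerms(query)))) {
    const posting = index.postings.get(term);
    if (!posting) continue;
    // Rare terms get a higher weight; this form never goes negative.
    const idf = Math.log(1 + (n - posting.size + 0.5) / (posting.size + 0.5));
    posting.forEach((tf, id) => {
      const len = index.docLengths.get(id) ?? 0;
      // Length normalization: long documents need more occurrences to score as high.
      const denom = tf + k1 * (1 - b + b * (len / avgLength));
      scores.set(id, (scores.get(id) ?? 0) + (idf * tf * (k1 + 1)) / denom);
    });
  }
  const results: { id: string; score: number }[] = [];
  scores.forEach((score, id) => results.push({ id, score }));
  return results.sort((x, y) => y.score - x.score);
}

const index = buildIndex([
  { id: "regression", text: "Regression linear models least squares" },
  { id: "classification", text: "Classification logistic regression decision boundary" },
  { id: "backprop", text: "Backpropagation gradient chain rule neural networks" },
]);
const results = searchBm25(index, "logistic regression");
// "classification" ranks first because it matches both query terms.
```

Because the postings list only the documents containing each term, a query touches a small part of the index, which is what keeps keyword lookup fast.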

For browser deployments, FlexSearch and Lunr.js are supported out of the box.

Roadmap

Multi-document cross-search
Document graph linking (topic relationships across files)
Citation-based answers (exact page + topic attribution)
Automatic folder sync / watch mode
Fully offline mode (local LLM via Ollama)
Browser extension for on-page RAG
REST API for programmatic access

License

MIT — free to use, modify, and distribute. See LICENSE for details.
