NodeMind — Binary Document Intelligence

48× smaller online · 32× smaller offline · up to 100× on images. 75× faster search. No GPU. No vector database.

Patent Pending · AU 2026901656 · AU 2026901657
Built by Sai Kiran Bathula · Coleambally, NSW, Australia

🔗 Try the live demo → nodemind.space
📄 Full project page → qlni.github.io/NodeMind


The Problem with RAG at Scale

When you index documents for RAG (Retrieval-Augmented Generation), the float32 vector index balloons to roughly 10× the size of the source data, and serving it adds infrastructure of its own:

  • A 1 GB document collection → 10 GB float32 vector index
  • A 100 GB collection → ~1 TB in your vector database
  • Requires expensive GPUs for fast cosine similarity search
  • Requires a managed vector database (Pinecone, Weaviate, Qdrant) running 24/7

This is the standard industry approach. NodeMind replaces it entirely.


NodeMind's Approach

NodeMind converts float32 RAG embeddings into 1024-bit binary fingerprints using our proprietary patent-pending codec, then searches them using Multi-Index Hashing (MIH) — pure integer arithmetic, no GPU, no vector DB.

| Original Documents | RAG Index (float32 · 10× expansion) | NodeMind Index (binary · 48× smaller online) | Annual Savings vs Managed VDB |
| --- | --- | --- | --- |
| 1 GB | 10 GB | 210 MB | $290 / yr |
| 10 GB | 100 GB | 2.1 GB | $2,940 / yr |
| 100 GB | 1 TB | 21 GB | $29,400 / yr |
| 1 TB | 10 TB | 210 GB | $294,000 / yr |

Costs use S3 Standard ($0.023/GB/mo) vs Pinecone managed vector DB ($2.50/GB/mo).
RAG 10× expansion confirmed by Elasticsearch, Pure Storage, and Milvus benchmarks.
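
The savings column follows from simple arithmetic on those two rates. A naive back-of-envelope sketch (this model lands slightly above the table, whose figures are rounded conservatively):

```python
# Back-of-envelope storage cost comparison at the quoted rates.
S3_RATE = 0.023   # USD per GB per month (S3 Standard)
VDB_RATE = 2.50   # USD per GB per month (managed vector DB)

def annual_savings(corpus_gb: float) -> float:
    rag_index_gb = corpus_gb * 10            # float32 index ~ 10x the corpus
    nm_index_gb = rag_index_gb / 48          # 48x smaller than the float32 index
    rag_cost = rag_index_gb * VDB_RATE * 12  # hosted in the vector DB
    nm_cost = nm_index_gb * S3_RATE * 12     # hosted as flat files on S3
    return rag_cost - nm_cost

for gb in (1, 10, 100, 1000):
    print(f"{gb:>5} GB corpus -> ~${annual_savings(gb):,.0f} / yr")
```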

Note on scale and overhead. Compression ratios scale with corpus size. On small datasets (< 10,000 chunks), compression measures closer to 31× online due to the fixed structural overhead of the 64 MIH sub-tables. At production scale (> 100,000 chunks) that fixed overhead is amortised, recovering the full 48× online compression. The 32× offline claim refers to the portable index file; if the raw corpus text is optionally bundled in the zip for fully self-contained portability, the total bundle footprint still comes in well below a standard RAG index.


Performance

| Metric | RAG (float32) | NodeMind (binary) |
| --- | --- | --- |
| Index size vs original docs | 10× larger | ~5× smaller |
| Search algorithm | Cosine similarity — O(N·D) float multiply | Hamming distance — POPCNT on 64-bit ints |
| Search speed | Baseline | 75× faster ¹ |
| GPU required | Yes (at scale) | No — pure CPU |
| RAM for 250M chunks | ~1 TB | ~21 GB |
| Portable / offline | No — needs live vector DB | Yes — runs from a zip file |
| Compression vs float32 (text, online) | 1× (baseline) | 48× |
| Compression vs float32 (text, offline bundle) | 1× (baseline) | 32× |
| Compression vs float32 (image embeddings) | 1× (baseline) | up to 100× ² |

¹ Speedup measured at >100,000 chunks. Below that, both indexes hit the 1 ms latency floor on small documents and report ~1× — the asymptotic gap appears once the cosine scan has enough work to do.
² Projection based on the binary codec applied to image-embedding distributions — not yet measured in production. Image, audio, and video pipelines are scheduled for the next release.
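
The search primitive named in the table is just XOR plus population count. A minimal sketch of Hamming distance between two 1024-bit fingerprints held as Python integers (illustrative only; a production scan would run POPCNT over packed 64-bit words):

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance between two 1024-bit fingerprints held as ints."""
    return (a ^ b).bit_count()  # Python 3.10+; lowers to POPCNT internally

fp_a = (1 << 1023) | 0b1011  # fingerprint with four bits set
fp_b = 0                     # all-zero fingerprint
print(hamming(fp_a, fp_b))   # -> 4
```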


How It Works

```
Document (PDF / TXT / MD)
        │
        ▼  chunk + embed
BGE-M3 (1024-dim float32)
        │
        ▼  NodeMind binary codec  ← patent pending
1024-bit binary fingerprint  (128 bytes vs 4096 bytes)
        │
        ▼  build MIH index
64 sub-tables × 16-bit keys  →  sub-linear Hamming search
        │
        ▼
Portable .zip  →  download & run offline
```

Stage 1 — BGE-M3 Embeddings

State-of-the-art multilingual embedding model. 1024-dimensional dense vectors. Runs on community hardware — RTX 3080-class GPU with 128 GB system RAM. No datacenter, no $2/hr A100.
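
BGE-M3 is openly available, so this stage needs nothing proprietary. A minimal sketch of producing the 1024-dim dense vectors with the FlagEmbedding package (this mirrors the model's published usage, not NodeMind's internal pipeline):

```python
# pip install FlagEmbedding
from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)  # fits RTX 3080-class GPUs

chunks = [
    "NodeMind converts float32 embeddings into binary fingerprints.",
    "Multi-Index Hashing gives sub-linear exact Hamming search.",
]
dense = model.encode(chunks)["dense_vecs"]  # shape (2, 1024), float32
print(dense.shape)
```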

Stage 2 — Proprietary Binary Codec

Each 4,096-byte float32 embedding is transformed into a 128-byte (1024-bit) binary fingerprint using our patent-pending algorithm. This is not standard binary quantization (which gives 32× at ~5% quality loss). Our codec applies a spectral transform before binarization, achieving 48× compression on text with higher recall and up to 100× on image embeddings. The algorithm is not disclosed — it is protected under patent AU 2026901656.
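
The codec itself is undisclosed, so no example can show it. For contrast, here is a sketch of the standard sign-threshold binary quantization baseline the paragraph mentions, i.e. the well-known 32× approach, with NodeMind's spectral transform deliberately absent:

```python
import numpy as np

def naive_binary_quantize(embedding: np.ndarray) -> bytes:
    """Standard sign-based binarization: 1024 float32 -> 1024 bits (128 bytes).

    This is the textbook 32x baseline, NOT NodeMind's patent-pending codec,
    which applies an undisclosed spectral transform before binarization.
    """
    bits = (embedding > 0).astype(np.uint8)  # sign threshold per dimension
    return np.packbits(bits).tobytes()       # 1024 bits -> 128 bytes

emb = np.random.randn(1024).astype(np.float32)    # stand-in for a BGE-M3 vector
fp = naive_binary_quantize(emb)
print(len(emb.tobytes()), "->", len(fp), "bytes")  # 4096 -> 128 (32x)
```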

Stage 3 — Multi-Index Hashing (MIH)

The 1024-bit fingerprint is split into 64 sub-strings of 16 bits each, and each sub-string is stored in its own hash table. At query time, candidates from exact lookups in each sub-table are merged and then verified, giving sub-linear exact Hamming nearest-neighbour search — 75× faster than cosine on float32, with no approximation. Protected under patent AU 2026901657.
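
A minimal sketch of the classic multi-index hashing scheme this stage describes (the centroid refinements covered by the patent are not shown). By the pigeonhole principle, any fingerprint within Hamming radius r < 64 of the query must match it exactly in at least one of the 64 sub-tables, so exact lookups suffice for candidate generation:

```python
from collections import defaultdict

SUBSTRINGS, BITS = 64, 16  # 64 sub-tables of 16-bit keys = 1024 bits

def split(fp: int) -> list[int]:
    """Split a 1024-bit fingerprint into 64 16-bit sub-keys."""
    return [(fp >> (i * BITS)) & 0xFFFF for i in range(SUBSTRINGS)]

class MIHIndex:
    def __init__(self):
        self.tables = [defaultdict(list) for _ in range(SUBSTRINGS)]
        self.fingerprints = []

    def add(self, fp: int) -> None:
        idx = len(self.fingerprints)
        self.fingerprints.append(fp)
        for table, key in zip(self.tables, split(fp)):
            table[key].append(idx)

    def search(self, query: int, radius: int) -> list[int]:
        """Exact r-neighbour search for radius < 64: gather candidates via
        exact sub-table lookups, then verify with a full popcount."""
        candidates = set()
        for table, key in zip(self.tables, split(query)):
            candidates.update(table.get(key, ()))
        return [i for i in candidates
                if (self.fingerprints[i] ^ query).bit_count() <= radius]

idx = MIHIndex()
idx.add((1 << 1023) | 0b1011)
print(idx.search(0b1011, radius=1))  # [0]: differs only in bit 1023
```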


Modalities

| Modality | Status | Compression vs float32 |
| --- | --- | --- |
| Text / Documents (PDF, TXT, MD) | Live now | 48× online · 32× offline |
| Images | 🔜 Coming soon | up to 100× (projection — not yet measured) |
| Audio | 🔜 Coming soon | 48× online · 32× offline (via Whisper transcript) |
| Video | 🔜 Coming soon | 48× online · 32× offline (via transcript + frame embeddings) |

Live Demo

Visit nodemind.space to:

  1. Sign in with Google (one click — no password, no email round-trip)
  2. Upload any PDF, TXT, or Markdown file (10 MB per file, 50 MB lifetime per account)
  3. Watch it index on community hardware — typical 5,500-page PDF indexes in ~7 minutes on an RTX 3080
  4. Download your NodeMind binary index + RAG float32 index
  5. Query both side-by-side and compare speed, size, and results

Patents

| Patent | Number | Status | Covers |
| --- | --- | --- | --- |
| NodeMind WHT Binary Codec | AU 2026901656 | Provisional | The spectral encoding algorithm achieving 48× text compression / up to 100× on images |
| NodeMind Centroid MIH | AU 2026901657 | Provisional | Sub-linear Hamming search via centroid multi-index hashing |

Both filed 2026. Inventor: Sai Kiran Bathula, independent researcher, Coleambally NSW, Australia.


Contact

Licensing · Enterprise integration · Research collaboration

📧 saikiranbathula1@gmail.com


© 2026 Sai Kiran Bathula. Patent Pending AU 2026901656 & AU 2026901657.
