48× smaller online · 32× smaller offline · up to 100× on images. 75× faster search. No GPU. No vector database.
Patent Pending · AU 2026901656 · AU 2026901657
Built by Sai Kiran Bathula · Coleambally, NSW, Australia
🔗 Try the live demo → nodemind.space
📄 Full project page → qlni.github.io/NodeMind
When you index documents for RAG (Retrieval-Augmented Generation), the index typically grows to about 10× the size of the source data, and serving it is expensive:
- A 1 GB document collection → a 10 GB float32 vector index
- A 100 GB collection → ~1 TB in your vector database
- Fast cosine similarity search at that scale requires expensive GPUs
- A managed vector database (Pinecone, Weaviate, Qdrant) must run 24/7
This is the standard industry approach. NodeMind replaces it entirely.
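Where does the 10× come from? A back-of-envelope sketch; all three parameters below are illustrative assumptions chosen to show how the ratio arises, not published NodeMind numbers:

```python
# Rough sanity check on the 10x expansion claim (assumed parameters).
chunk_chars  = 1_000        # assume ~1 KB of raw text per chunk
vector_bytes = 1_024 * 4    # 1024-dim float32 embedding = 4,096 bytes
overhead     = 2.5          # assumed multiplier for index structures,
                            # metadata, and replicas in a managed vector DB

print(f"{vector_bytes * overhead / chunk_chars:.1f}x")   # ~10.2x
```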
NodeMind converts float32 RAG embeddings into 1024-bit binary fingerprints using our proprietary patent-pending codec, then searches them using Multi-Index Hashing (MIH) — pure integer arithmetic, no GPU, no vector DB.
| Original Documents | RAG Index (float32 · 10× expansion) | NodeMind Index (binary · 48× smaller online) | Annual Savings vs Managed VDB |
|---|---|---|---|
| 1 GB | 10 GB | 210 MB | $294 / yr |
| 10 GB | 100 GB | 2.1 GB | $2,940 / yr |
| 100 GB | 1 TB | 21 GB | $29,400 / yr |
| 1 TB | 10 TB | 210 GB | $294,000 / yr |
Costs use S3 Standard ($0.023/GB/mo) vs Pinecone managed vector DB ($2.50/GB/mo).
RAG 10× expansion confirmed by Elasticsearch, Pure Storage, and Milvus benchmarks.
Note on scale and overhead. Compression ratios scale with corpus size. On small datasets (< 10,000 chunks) compression measures closer to 31× online due to the fixed structural overhead of the 64 MIH sub-tables. At production scale (> 100,000 chunks) this fixed overhead is amortised, recovering the full 48× online compression. The 32× offline claim refers to the portable index file; if the raw corpus text is optionally bundled in the zip for fully self-contained portability, the total bundle footprint is roughly 5× smaller than standard RAG.
| Metric | RAG (float32) | NodeMind (binary) |
|---|---|---|
| Index size vs original docs | 10× larger | ~5× smaller |
| Search algorithm | Cosine similarity — O(N·D) float multiply | Hamming distance — POPCNT on 64-bit ints |
| Search speed | Baseline | 75× faster ¹ |
| GPU required | Yes (at scale) | No — pure CPU |
| RAM for 250M chunks | ~1 TB | ~21 GB |
| Portable / offline | No — needs live vector DB | Yes — runs from a zip file |
| Compression vs float32 (text, online) | 1× | 48× |
| Compression vs float32 (text, offline bundle) | 1× | 32× |
| Compression vs float32 (image embeddings) | 1× | up to 100× ² |
¹ Speedup measured at >100,000 chunks. Below that, both indexes hit the 1 ms latency floor on small documents and report ~1× — the asymptotic gap appears once the cosine scan has enough work to do.
² Projection based on the binary codec applied to image-embedding distributions — not yet measured in production. Image, audio, and video pipelines are scheduled for the next release.
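The "POPCNT on 64-bit ints" row is easy to reproduce. Below is a brute-force Hamming scan in numpy, as an illustration only: the shipped engine uses the MIH index described further down instead of scanning every code, and a native kernel would issue the hardware POPCNT instruction directly.

```python
import numpy as np

# Per-byte popcount lookup table (portable stand-in for hardware POPCNT).
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)

def hamming_scan(query: np.ndarray, codes: np.ndarray) -> np.ndarray:
    """Hamming distance from one 128-byte fingerprint to every row.
    query: (128,) uint8, codes: (N, 128) uint8 -> (N,) distances."""
    return POPCOUNT[np.bitwise_xor(codes, query)].sum(axis=1)

rng = np.random.default_rng(0)
codes = rng.integers(0, 256, size=(100_000, 128), dtype=np.uint8)
print(hamming_scan(codes[42], codes)[42])    # 0: a code matches itself
```

The whole scan is integer reads, XORs, and adds; there is no floating-point work anywhere, which is why it needs no GPU.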
```
Document (PDF / TXT / MD)
        │
        ▼  chunk + embed
BGE-M3 (1024-dim float32)
        │
        ▼  NodeMind binary codec  ← patent pending
1024-bit binary fingerprint (128 bytes vs 4,096 bytes)
        │
        ▼  build MIH index
64 sub-tables × 16-bit keys → sub-linear Hamming search
        │
        ▼
Portable .zip → download & run offline
```
The embedding stage uses BGE-M3, a state-of-the-art multilingual embedding model that produces 1024-dimensional dense float32 vectors. It runs on community hardware: an RTX 3080-class GPU with 128 GB of system RAM. No datacenter, no $2/hr A100.
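For reference, producing these vectors with the open-source FlagEmbedding package looks roughly like this. This is a sketch of the public BGE-M3 API, not NodeMind's internal pipeline:

```python
from FlagEmbedding import BGEM3FlagModel   # pip install FlagEmbedding

# use_fp16 roughly halves VRAM usage, which matters on consumer GPUs.
model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)

chunks = [
    "NodeMind converts float32 embeddings into 1024-bit fingerprints.",
    "Multi-Index Hashing gives sub-linear exact Hamming search.",
]
out = model.encode(chunks, batch_size=12, max_length=8192)
dense = out["dense_vecs"]    # (2, 1024): 4 KB per chunk once stored as float32
print(dense.shape)
```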
Each 4,096-byte float32 embedding is transformed into a 128-byte (1024-bit) binary fingerprint using our patent-pending algorithm. This is not standard binary quantization (which gives 32× at ~5% quality loss). Our codec applies a spectral transform before binarization, achieving 48× compression on text with higher recall and up to 100× on image embeddings. The algorithm is not disclosed — it is protected under patent AU 2026901656.
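The codec itself is not published, so the sketch below implements only the standard sign-quantization baseline mentioned above (the 32× reference point). The patent-pending spectral transform is deliberately absent:

```python
import numpy as np

def baseline_binarize(dense: np.ndarray) -> np.ndarray:
    """Plain sign-based binary quantization: the public 32x baseline,
    NOT the NodeMind codec (whose spectral transform is undisclosed).
    dense: (N, 1024) float32 -> (N, 128) uint8, 8 bits packed per byte."""
    bits = (dense > 0).astype(np.uint8)    # one bit per dimension
    return np.packbits(bits, axis=1)       # 1,024 bits -> 128 bytes

codes = baseline_binarize(np.random.randn(4, 1024).astype(np.float32))
print(codes.shape)    # (4, 128): 4,096 bytes in, 128 bytes out per vector
```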
The 1024-bit fingerprint is split into 64 sub-strings of 16 bits each, and each sub-string is stored in its own hash table. At query time, exact lookups into each of the 64 sub-tables produce a candidate set, which is then verified with full Hamming distance. By the pigeonhole principle, any code within Hamming distance r < 64 of the query must match it exactly on at least one sub-string, so no true neighbour is missed. The result is sub-linear exact Hamming nearest-neighbour search: 75× faster than cosine on float32, with no approximation. Protected under patent AU 2026901657.
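A minimal sketch of that pigeonhole construction, using plain MIH over random codes. The patented centroid variant is not shown, and the `radius` parameter below is illustrative:

```python
import numpy as np
from collections import defaultdict

POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)

def build_mih(codes: np.ndarray) -> list:
    """codes: (N, 128) uint8, C-contiguous. Builds 64 hash tables, one per
    16-bit sub-string; table t maps a sub-string value to a list of row ids."""
    subs = codes.view(np.uint16)               # (N, 64) sub-strings
    tables = [defaultdict(list) for _ in range(subs.shape[1])]
    for row, sub in enumerate(subs):
        for t, key in enumerate(sub):
            tables[t][int(key)].append(row)
    return tables

def query_mih(q: np.ndarray, codes: np.ndarray, tables: list, radius: int = 16):
    """Exact r-near-neighbour search for radius < 64: any code within
    Hamming distance r of q agrees with q on at least one sub-string,
    so exact lookups in the 64 tables can never miss a true neighbour."""
    candidates = set()
    for t, key in enumerate(q.view(np.uint16).ravel()):
        candidates.update(tables[t].get(int(key), ()))
    cand = np.fromiter(candidates, dtype=np.int64, count=len(candidates))
    dists = POPCOUNT[np.bitwise_xor(codes[cand], q)].sum(axis=1)   # verify
    keep = dists <= radius
    return cand[keep], dists[keep]

rng = np.random.default_rng(0)
codes = rng.integers(0, 256, size=(10_000, 128), dtype=np.uint8)
ids, dists = query_mih(codes[7], codes, build_mih(codes))
print(ids, dists)    # row 7 returns at distance 0
```

Only codes that collide with the query on some 16-bit key are ever touched, which is where the sub-linear behaviour comes from; the final distance check makes the result exact rather than approximate.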
| Modality | Status | Compression vs float32 |
|---|---|---|
| Text / Documents (PDF, TXT, MD) | ✅ Live now | 48× online · 32× offline |
| Images | 🔜 Coming soon | up to 100× (projection — not yet measured) |
| Audio | 🔜 Coming soon | 48× online · 32× offline (via Whisper transcript) |
| Video | 🔜 Coming soon | 48× online · 32× offline (via transcript + frame embeddings) |
Visit nodemind.space to:
- Sign in with Google (one click — no password, no email round-trip)
- Upload any PDF, TXT, or Markdown file (10 MB per file, 50 MB lifetime per account)
- Watch it index on community hardware — a typical 5,500-page PDF indexes in ~7 minutes on an RTX 3080
- Download your NodeMind binary index + RAG float32 index
- Query both side-by-side and compare speed, size, and results
| Patent | Number | Status | Covers |
|---|---|---|---|
| NodeMind WHT Binary Codec | AU 2026901656 | Provisional | The spectral encoding algorithm achieving 48× text compression / up to 100× on images |
| NodeMind Centroid MIH | AU 2026901657 | Provisional | Sub-linear Hamming search via centroid multi-index hashing |
Both filed 2026. Inventor: Sai Kiran Bathula, independent researcher, Coleambally NSW, Australia.
Licensing · Enterprise integration · Research collaboration
© 2026 Sai Kiran Bathula. Patent Pending AU 2026901656 & AU 2026901657.