Local AI Search Engine for your file system!
While macOS already has semantic search for the file system, it extends to all file types. I felt a more precise and personable local agent could offer a more user-friendly way of interacting with your files.
I also wanted to explore the use cases for local LLMs, as I feel like small models are under-utilized in consumer tech products.
Ingestion Pipeline:
Inference Pipeline:

Tech Stack:
- Frontend: React
- Backend: Electron, Node.js
- Inference: llama.cpp
- Model: Gemma-4-E2B (Quantized)
1. Automatic Embedding
- Obi uses chokidar to watch the parent directory for new files, edits, and deletions
- Custom embedding strategy depending on the file type (text, image of text, image of subject, audio)
- Gradual chunking and embedding strategy prevents large batch jobs
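A minimal sketch of how type-dependent strategies and gradual chunking might fit together. The extension map, `pickStrategy`, `chunkText`, and `EmbedQueue` are hypothetical names for illustration, not Obi's actual code, and the `embed` callback stands in for whatever call reaches the embedding model. In the real app, file events from chokidar would feed `EmbedQueue.add`.

```javascript
const path = require("node:path");

// Hypothetical mapping from file extension to embedding strategy.
// Telling "image of text" from "image of subject" needs a look at the
// image itself, so both share one "image" entry in this sketch.
const STRATEGY_BY_EXT = {
  ".txt": "text", ".md": "text",
  ".png": "image", ".jpg": "image", ".jpeg": "image",
  ".mp3": "audio", ".wav": "audio",
};

function pickStrategy(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  return STRATEGY_BY_EXT[ext] ?? "skip";
}

// Split text into fixed-size chunks that overlap by `overlap` characters.
function chunkText(text, size = 800, overlap = 100) {
  if (overlap >= size) throw new RangeError("overlap must be smaller than size");
  const chunks = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return chunks;
}

// Embeds one chunk at a time, so a large file never becomes one big batch job.
class EmbedQueue {
  constructor(embed) {
    this.embed = embed; // async ({ filePath, index, chunk }) => void (assumed)
    this.pending = [];
    this.running = false;
  }

  add(filePath, text) {
    for (const [index, chunk] of chunkText(text).entries()) {
      this.pending.push({ filePath, index, chunk });
    }
    return this.drain();
  }

  async drain() {
    if (this.running) return; // an earlier call is already working through the queue
    this.running = true;
    try {
      while (this.pending.length > 0) {
        await this.embed(this.pending.shift());
      }
    } finally {
      this.running = false;
    }
  }
}
```

Processing chunks sequentially trades throughput for a steady, low resource footprint, which matches the low-RAM goal of the project.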
2. Automatic Context Ingestion
- Relational database schema allows for runtime RAG responses
- Custom tool-calling allows for efficient context retrieval from database
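One way a retrieval tool could look, as a sketch only. The tool name `search_chunks`, the row shape, and the in-memory `rows` array are assumptions; in Obi the rows would come from the database, and the exact tool-call format depends on the model's chat template.

```javascript
// Cosine similarity between two equal-length vectors.
function cosine(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Return the k rows most similar to the query embedding, best first.
function topK(rows, queryEmbedding, k = 3) {
  return rows
    .map((row) => ({ ...row, score: cosine(row.embedding, queryEmbedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Hypothetical tool registry: the model asks for a tool by name,
// and the app runs it against the stored chunks.
const tools = {
  async search_chunks({ query, k }, { rows, embed }) {
    const queryEmbedding = await embed(query); // embed is an assumed async function
    return topK(rows, queryEmbedding, k).map(({ filePath, text, score }) => ({
      filePath,
      text,
      score,
    }));
  },
};

// Dispatch a parsed tool call of the form { name, arguments }.
async function runToolCall(call, deps) {
  const tool = tools[call.name];
  if (!tool) throw new Error(`unknown tool: ${call.name}`);
  return tool(call.arguments, deps);
}
```

Returning only the file path, text, and score keeps the tool result compact, so the context handed back to the model stays small.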
3. Local Deployment and Low RAM Friendly
- Using a quantized gemma-E2B with llama.cpp keeps RAM usage low
- Chunked context allows for small prompt sizes, which lowers FLOPs, battery drain, and CPU temperature
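To illustrate how chunked context can keep prompts small, here is a sketch that packs only the best-scoring chunks under a fixed budget. The `buildPrompt` name, the chunk shape, and the character budget (a crude stand-in for a token count) are all assumptions for illustration.

```javascript
// Build a prompt from the highest-scoring chunks that fit in the budget.
// Each chunk is assumed to look like { filePath, text, score }.
function buildPrompt(question, chunks, maxContextChars = 2000) {
  const picked = [];
  let used = 0;
  // Copy before sorting so the caller's array is left untouched.
  for (const chunk of [...chunks].sort((a, b) => b.score - a.score)) {
    if (used + chunk.text.length > maxContextChars) continue; // does not fit, try a smaller one
    picked.push(chunk);
    used += chunk.text.length;
  }
  const context = picked.map((c) => `[${c.filePath}]\n${c.text}`).join("\n\n");
  return `Context:\n${context}\n\nQuestion: ${question}`;
}
```

A shorter prompt means fewer tokens for the model to process before it starts generating, which is where the compute and battery savings would come from.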
In the CLI
- git clone git@github.com:durpdur/Obsidian_RAG.git
- Navigate to local-rag directory (Important!)
- npm install
- npx electron-rebuild
  - Fixes: The module better_sqlite3.node was compiled against a different Node.js version using
In resources/models, place your .gguf models
- Download Qwen3.5-2B-Q4_K_M.gguf for the text model
- Download nomic-embed-text-v2-moe.Q4_K_M.gguf for the embedding model
Potential Errors:
Apple could not verify “llama-server” is free of malware that may harm your Mac or compromise your privacy.
- You'll need to sign the binaries for macOS to trust them. Pasting the error into ChatGPT will walk you through the steps.
Download llama.cpp binaries
Replace the binaries in resources/bin
In progress...
