A document Q&A chat application using IBM Granite and in-memory vector search.
Prerequisites:

- Deno installed
- Ollama installed and running with:
  - `ibm/granite4:3b` (for chat and embeddings)
Quick start:

- Place your documents (`.txt`, `.md`, or `.pdf` files) in `data/documents/`
- Build embeddings: `deno task build`
- Start the server: `deno task dev`
- Open a browser to `http://localhost:8000`
Available tasks:

- `deno task build` - Process documents and generate embeddings
- `deno task dev` - Run the development server in watch mode
- `deno task start` - Run the production server
This RAG application consists of:
- Vector Search: In-memory cosine similarity search for document retrieval (see the retrieval sketch after this list)
- LLM Integration: IBM Granite 4 (3B) via Ollama for answer generation (see the generation sketch below)
- Embeddings: IBM Granite 4 (3B) for semantic search
- Frontend: HTMX-based chat interface with Tailwind CSS
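The retrieval path can be pictured with the sketch below. This is a minimal illustration, not the app's actual code: it assumes Ollama's `/api/embeddings` endpoint is used for embeddings, and the `Chunk` shape, constants, and function names are hypothetical.

```ts
// Minimal retrieval sketch. Assumptions: Ollama's /api/embeddings endpoint,
// and a hypothetical Chunk shape for the processed embeddings.
interface Chunk {
  text: string;
  embedding: number[];
}

const OLLAMA_URL = "http://localhost:11434"; // Ollama's default port
const MODEL = "ibm/granite4:3b";

// Embed one piece of text via Ollama.
async function embed(text: string): Promise<number[]> {
  const res = await fetch(`${OLLAMA_URL}/api/embeddings`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: MODEL, prompt: text }),
  });
  const { embedding } = await res.json();
  return embedding;
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Score every chunk against the query and keep the top k (5 by default).
async function retrieve(query: string, chunks: Chunk[], k = 5): Promise<Chunk[]> {
  const q = await embed(query);
  return chunks
    .map((c) => ({ c, score: cosine(q, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map(({ c }) => c);
}
```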
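Generation then stuffs the retrieved chunks into the prompt. Again a sketch, assuming Ollama's non-streaming `/api/chat` endpoint; the prompt wording is illustrative.

```ts
// Generation sketch: answer a question grounded in the retrieved chunks,
// via Ollama's /api/chat endpoint with streaming disabled.
async function answer(question: string, context: Chunk[]): Promise<string> {
  const res = await fetch(`${OLLAMA_URL}/api/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: MODEL,
      stream: false,
      messages: [
        {
          role: "system",
          content: "Answer using only the following context:\n\n" +
            context.map((c) => c.text).join("\n---\n"),
        },
        { role: "user", content: question },
      ],
    }),
  });
  const data = await res.json();
  return data.message.content;
}
```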
```
/project-root
  /data
    /documents   # Place source documents here
    /processed   # Generated embeddings (created by build script)
  /src
    /server      # HTTP server and routes
    /services    # Core business logic
    /lib         # Utility functions
    /scripts     # Build scripts
  /public        # Frontend assets
```
Notes:

- The build script processes `.txt`, `.md`, and `.pdf` files
- Default chunk size: 500 words with a 100-word overlap (see the chunking sketch below)
- The top 5 retrieved chunks are used as context
- Ensure Ollama is running before starting the application
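For illustration, a word-based chunker matching those defaults could look like the sketch below; the function name and whitespace-splitting details are assumptions, not the build script's actual logic.

```ts
// Chunking sketch: 500-word windows that advance 400 words at a time,
// so consecutive chunks share a 100-word overlap.
function chunkWords(text: string, size = 500, overlap = 100): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += size - overlap) {
    chunks.push(words.slice(start, start + size).join(" "));
    if (start + size >= words.length) break; // last window reached the end
  }
  return chunks;
}
```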