An AI-powered voice & CLI interface for inventory (DB) management.
I built this project mainly to learn about DevOps and Agentic AI.
- Frontend: ReactJS, Vite, Tailwind CSS
- Backend & Database: FastAPI, SQLite
- AIML: Deepgram, LangGraph, LangChain, Groq, Ollama, ChromaDB, MLflow
- DevOps: DVC, Docker, Kubernetes, Prometheus, Grafana
Audio is recorded in the browser, sent to Deepgram for transcription (Speech to text), and the transcript is routed through a LangGraph agent backed by two LLMs:
- Groq (`llama-3.3-70b-versatile`): routes intent, generates read-only SQL and analysis, and synthesizes responses.
- Ollama (`vocalops-sql`): fine-tuned with Unsloth on Colab; generates mutating SQL (`UPDATE`, `INSERT`, `DELETE`) locally, so sensitive operations never leave the machine.
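The two-model split can be sketched as a simple dispatch rule. This is a minimal illustration, not the project's actual code: `pick_model` and the backend labels are made up.

```python
# Hypothetical sketch: mutating SQL goes to the local Ollama model,
# everything else to Groq. Names are illustrative only.
MUTATING = ("UPDATE", "INSERT", "DELETE")

def pick_model(statement: str) -> str:
    """Return which LLM backend should handle a generated SQL statement."""
    verb = statement.strip().split()[0].upper()
    if verb in MUTATING:
        return "ollama/vocalops-sql"       # sensitive writes stay on-machine
    return "groq/llama-3.3-70b-versatile"  # read-only SQL and analysis

print(pick_model("UPDATE orders SET status='shipped' WHERE id=7"))
# -> ollama/vocalops-sql
print(pick_model("SELECT count(*) FROM orders"))
# -> groq/llama-3.3-70b-versatile
```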
The agent uses six tools: `check_order`, `update_order`, `create_order`, `delete_order`, `analyze_orders`, and `search_policies` (RAG over a ChromaDB vector store for policy queries).
```text
Browser mic → React UI ──┐
                         ├─ FastAPI → Deepgram STT ─┐
Terminal mic → Voice CLI ─┘                         │
                            LangGraph Agent → Groq (router)
                                ├─ mutations → Ollama (local) → SQLite
                                ├─ analytics → Groq → SQLite
                                └─ policies  → ChromaDB (nomic-embed-text)
```
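The FastAPI → Deepgram hop amounts to forwarding the recorded audio with an API-key header. Deepgram's `/v1/listen` endpoint and `Token` auth scheme are real; the helper name and the WAV content type are assumptions for illustration.

```python
import urllib.request

DEEPGRAM_URL = "https://api.deepgram.com/v1/listen"  # Deepgram STT endpoint

def build_stt_request(audio: bytes, api_key: str) -> urllib.request.Request:
    """Sketch of the HTTP request the backend would forward audio with.

    Hypothetical helper; the real code likely uses the Deepgram SDK.
    """
    return urllib.request.Request(
        DEEPGRAM_URL,
        data=audio,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "audio/wav",  # assuming WAV from the recorder
        },
        method="POST",
    )
```

Sending the request (e.g. with `urllib.request.urlopen`) would return Deepgram's JSON transcript, which the agent then consumes.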
Start the FastAPI backend:

```bash
python src/vocalops_voice/main.py
```
Start the ReactJS frontend:

```bash
cd vocalops-frontend
npm install
npm run dev
```
This mode includes Human-In-The-Loop (HITL) approval for all database changes.
```bash
python src/voice_cli.py
```
- Type your query directly for a text interaction.
- Type `v` and press Enter to start voice recording.
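The text-vs-voice switch boils down to a one-line check on each input. A tiny sketch with hypothetical names, not the actual `src/voice_cli.py`:

```python
# Illustrative CLI input handling: "v" triggers recording, anything else
# is treated as a text query for the agent.
def classify_input(line: str) -> str:
    """Return "record" for the voice trigger, "text" otherwise."""
    if line.strip().lower() == "v":
        return "record"  # start microphone capture
    return "text"        # send the line straight to the agent

print(classify_input("v"))                    # -> record
print(classify_input("how many open orders?"))  # -> text
```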
Prerequisites: Python 3.10+, Node 18+, Ollama running locally.
```bash
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env                 # DEEPGRAM_API_KEY, GROQ_API_KEY, OLLAMA_BASE_URL
dvc pull
cd data && ollama create vocalops-sql -f Modelfile && cd ..
ollama pull nomic-embed-text
python setup_db.py                   # creates and seeds data/orders.db
python src/vocalops_core/ingest.py   # embeds policies into ChromaDB
```
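Conceptually, `search_policies` embeds the query and returns the policy chunk whose stored embedding is most similar. A toy illustration with made-up two-dimensional vectors — real embeddings come from `nomic-embed-text` and live in ChromaDB:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def search_policies(query_vec, store):
    """store: list of (text, embedding) pairs, as a vector DB would hold.

    Returns the text of the nearest chunk by cosine similarity.
    """
    return max(store, key=lambda item: cosine(query_vec, item[1]))[0]

store = [
    ("Refunds are accepted within 30 days.", [1.0, 0.0]),
    ("Free shipping on orders over $50.",    [0.0, 1.0]),
]
print(search_policies([0.9, 0.1], store))
# -> Refunds are accepted within 30 days.
```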

