Natural-language analytics assistant for a Walmart-style retail warehouse.
The app converts business questions into SQL, executes read-only queries on SQLite, and returns SQL + tabular results in an interactive dark UI.
Karan Badlani
- Accepts plain-English analytics questions
- Retrieves relevant schema context with RAG (ChromaDB)
- Generates SQL using OpenAI
- Enforces execution safety (read-only guard + approval flow)
- Executes SQL and returns results to the frontend
- Logs query activity in query_log
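The last two steps (execute, then log) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the `query_log` columns, the `run_and_log` helper, and the toy `stores` table are all hypothetical, and the real definitions live under model/ and data/raw/schema.sql.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical query_log layout; the real columns may differ.
LOG_DDL = """
CREATE TABLE IF NOT EXISTS query_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    question TEXT NOT NULL,
    sql_text TEXT NOT NULL,
    row_count INTEGER NOT NULL,
    created_at TEXT NOT NULL
)
"""


def run_and_log(conn, question, sql):
    """Execute an already-validated query, record it, and return (columns, rows)."""
    cur = conn.execute(sql)
    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()
    # Record the execution so every query leaves an audit trail.
    conn.execute(LOG_DDL)
    conn.execute(
        "INSERT INTO query_log (question, sql_text, row_count, created_at) "
        "VALUES (?, ?, ?, ?)",
        (question, sql, len(rows), datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
    return columns, rows


def demo_connection():
    """In-memory database with a toy stores table (columns are made up)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stores (store_id INTEGER, state TEXT)")
    conn.executemany(
        "INSERT INTO stores VALUES (?, ?)", [(1, "TX"), (2, "TX"), (3, "CA")]
    )
    return conn
```

Returning column names alongside rows keeps the result easy to render as a table in the frontend.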
This project is configured for the Walmart synthetic dataset in data/raw/.
Core tables include:
- states, stores, customers, customer_addresses
- categories, subcategories, brands, products, product_listings
- offers, offer_products
- orders, order_items, payments, shipments, returns
- inventory_snapshots
Schema metadata is read from:
- data/raw/schema.sql
- data/raw/data_dictionary.csv
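One plausible way to turn a DDL file such as schema.sql into per-table chunks for retrieval is sketched below. It is an assumption about how such a file could be processed, not a description of what agent/ actually does; `split_tables` is a hypothetical helper, and the naive split on `;` would break on semicolons inside string literals or comments.

```python
import re

# Matches CREATE TABLE [IF NOT EXISTS] name, with optional identifier quoting.
CREATE_TABLE_RE = re.compile(
    r"CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?[\"`\[]?(\w+)[\"`\]]?",
    re.IGNORECASE,
)


def split_tables(ddl: str) -> dict[str, str]:
    """Map each table name to the text of its CREATE TABLE statement."""
    chunks: dict[str, str] = {}
    for stmt in ddl.split(";"):
        stmt = stmt.strip()
        match = CREATE_TABLE_RE.search(stmt)
        if match:
            chunks[match.group(1)] = stmt
    return chunks
```

Each value can then be embedded as one document, so a question about orders retrieves only the tables it needs.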
- frontend/: React + Vite UI
- api/: FastAPI routes and app bootstrap
- agent/: SQL generation chain, semantic schema, retriever, index builder
- data/seed.py: Loads Walmart CSVs into SQLite from schema.sql
- model/: DB connection + query_log model
python3.11 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cd frontend
npm install
cd ..

cp .env.example .env
# set OPENAI_API_KEY in .env

Default database:
DATABASE_URL=sqlite:///./data/walmart.db

source .venv/bin/activate
python -m data.seed

source .venv/bin/activate
python -m agent.build_index

source .venv/bin/activate
uvicorn api.main:app --reload --port 8000

cd frontend
npm run dev

Open:
- Frontend: http://127.0.0.1:5173
- API docs: http://127.0.0.1:8000/docs
- Health: http://127.0.0.1:8000/api/health
- OPENAI_API_KEY (required)
- OPENAI_MODEL (default: gpt-4o)
- DATABASE_URL (default: sqlite:///./data/walmart.db)
- CHROMA_PERSIST_DIR (default: ./chroma_store)
- EMBEDDING_MODEL (default: text-embedding-3-small)
- LOG_LEVEL (default: INFO)
- ALLOWED_ORIGINS (default: * for local dev)
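For reference, these variables and defaults could be loaded along the following lines. This is a sketch only: the `Settings` class and `load_settings` function are hypothetical names, and treating ALLOWED_ORIGINS as a comma-separated list is an assumption rather than something the project documents.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    openai_model: str
    database_url: str
    chroma_persist_dir: str
    embedding_model: str
    log_level: str
    allowed_origins: tuple


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (os.environ by default), applying defaults."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        # The only variable without a default.
        raise RuntimeError("OPENAI_API_KEY is required")
    # Assumption: multiple origins are comma-separated.
    origins = tuple(
        o.strip() for o in env.get("ALLOWED_ORIGINS", "*").split(",") if o.strip()
    )
    return Settings(
        openai_api_key=api_key,
        openai_model=env.get("OPENAI_MODEL", "gpt-4o"),
        database_url=env.get("DATABASE_URL", "sqlite:///./data/walmart.db"),
        chroma_persist_dir=env.get("CHROMA_PERSIST_DIR", "./chroma_store"),
        embedding_model=env.get("EMBEDDING_MODEL", "text-embedding-3-small"),
        log_level=env.get("LOG_LEVEL", "INFO"),
        allowed_origins=origins,
    )
```

Failing fast on a missing key surfaces misconfiguration at startup instead of on the first question.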
- Query execution allows only single-statement SELECT/WITH
- Mutating SQL is blocked unless explicitly approved via approval endpoint
- All executions are logged to query_log
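A read-only guard in the spirit of the first rule might look like the sketch below. It illustrates the idea under stated assumptions and is not the project's implementation: `is_read_only` is a hypothetical name, and the keyword list is deliberately conservative, so a legitimate query with an unquoted column literally named `update` would be rejected.

```python
import re

# Comments, string literals, and quoted identifiers, matched left to right
# so an apostrophe inside a comment cannot be mistaken for a string start.
_SKIP_RE = re.compile(
    r"--[^\n]*|/\*.*?\*/|'(?:[^']|'')*'|\"(?:[^\"]|\"\")*\"",
    re.DOTALL,
)
# Statements that change data or schema. SQLite allows WITH in front of
# INSERT/UPDATE/DELETE, so checking the first keyword alone is not enough.
_MUTATING_RE = re.compile(
    r"\b(?:INSERT|UPDATE|DELETE|DROP|ALTER|CREATE|TRUNCATE|ATTACH|DETACH"
    r"|PRAGMA|VACUUM|REINDEX)\b|\bREPLACE\s+INTO\b",
    re.IGNORECASE,
)


def is_read_only(sql: str) -> bool:
    """Return True only for a single SELECT or WITH statement with no mutating keywords."""
    cleaned = _SKIP_RE.sub(" ", sql).strip()
    # Tolerate one trailing semicolon.
    if cleaned.endswith(";"):
        cleaned = cleaned[:-1].rstrip()
    # Any remaining semicolon means more than one statement.
    if not cleaned or ";" in cleaned:
        return False
    first_word = cleaned.split(None, 1)[0].upper()
    if first_word not in {"SELECT", "WITH"}:
        return False
    return _MUTATING_RE.search(cleaned) is None
```

For example, `is_read_only("SELECT 1; DROP TABLE stores")` returns False because a second statement follows the first. A query that fails this check would be the kind routed to the approval flow rather than executed directly.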