Problem. HR/benefits questions live in tables, but users think in natural language.
Approach. Convert natural-language questions into deterministic SQL over a compact schema, execute it on SQLite, and return the answer together with the SQL used. Start with simple regex/keyword patterns; optionally plug in an LLM provider later.
Result. A reproducible, testable baseline (no external API keys required) with a small eval set and latency you can measure locally.
https://github.com//benefits-sql-agent (update link)
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
python app/scripts/bootstrap.py # creates benefits.db with schema + seed data
uvicorn app.app:app --reload
# then in another shell:
curl -X POST http://127.0.0.1:8000/ask -H "Content-Type: application/json" -d '{"question":"List benefit plans for the Full-Time class"}'
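Under the hood, `/ask` can be a very small FastAPI route that runs the matched SQL and returns it alongside the rows. A minimal sketch of what `app/app.py` might look like; the `to_sql` and `run_query` helpers and the response shape are assumptions, not the actual implementation:

```python
from fastapi import FastAPI
from pydantic import BaseModel

# Assumed helpers based on the repo layout (not the actual code):
# nl2sql.patterns.to_sql() maps a question to (sql, params) or None,
# db.run_query() executes parameterized SQL against benefits.db.
from app.nl2sql.patterns import to_sql
from app.db import run_query

app = FastAPI()

class AskRequest(BaseModel):
    question: str

@app.post("/ask")
def ask(req: AskRequest):
    matched = to_sql(req.question)
    if matched is None:
        return {"answer": None, "sql": None, "note": "no pattern matched"}
    sql, params = matched
    rows = run_query(sql, params)
    # Return the rows together with the SQL that produced them,
    # so every answer stays auditable and reproducible.
    return {"answer": rows, "sql": sql}
```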
- "How many employees are in the Executive class?"
- "List benefit plans available to Full-Time employees."
- "What is the employer contribution for Health Basic for the Full-Time class?"
- "Which employees are enrolled in Dental?"
python app/eval/run_eval.py
# outputs app/eval/report.md
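`run_eval.py` can be little more than a loop over `dataset.csv` that times each question, checks the result, and writes a markdown report. A rough sketch, assuming `question`/`expected_answer` columns and the helpers from the sketches above:

```python
import csv
import time
from pathlib import Path

from app.nl2sql.patterns import to_sql  # assumed helper (see sketch above)
from app.db import run_query            # assumed helper

def main():
    results, passed = [], 0
    with Path("app/eval/dataset.csv").open(newline="") as f:
        for case in csv.DictReader(f):
            start = time.perf_counter()
            matched = to_sql(case["question"])
            answer = run_query(*matched) if matched else None
            latency_ms = (time.perf_counter() - start) * 1000
            ok = matched is not None and str(answer) == case["expected_answer"]
            passed += ok
            results.append((case["question"], "pass" if ok else "fail", f"{latency_ms:.1f}"))
    lines = ["| question | result | latency_ms |", "|---|---|---|"]
    lines += [f"| {q} | {r} | {ms} |" for q, r, ms in results]
    lines.append(f"\n{passed}/{len(results)} passed")
    Path("app/eval/report.md").write_text("\n".join(lines) + "\n")

if __name__ == "__main__":
    main()
```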
app/
  app.py                  # FastAPI service (/ask)
  config.py               # settings
  db.py                   # sqlite engine + helpers
  schema.sql              # DB schema
  seed.sql                # sample data
  scripts/
    bootstrap.py
  nl2sql/
    patterns.py           # simple regex/keyword rules → SQL
    provider.py           # interface + router
    openai_provider.py    # optional (off by default)
  eval/
    dataset.csv           # tiny, high-signal questions
    rubric.md
    run_eval.py
  tests/
    test_nl2sql_basic.py
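The `provider.py` interface and router can stay deliberately small: the deterministic pattern matcher is the default, and an LLM-backed provider is only selected when explicitly enabled in config. A hypothetical sketch; the names are not the repo's actual API:

```python
from typing import Optional, Protocol, Tuple

class NL2SQLProvider(Protocol):
    """Anything that can turn a question into (sql, params), or None."""
    def to_sql(self, question: str) -> Optional[Tuple[str, dict]]: ...

class PatternProvider:
    """Deterministic default: regex/keyword rules, no network calls."""
    def to_sql(self, question: str):
        from app.nl2sql.patterns import to_sql  # assumed helper
        return to_sql(question)

def get_provider(settings) -> NL2SQLProvider:
    # Patterns by default; an OpenAI-backed provider (openai_provider.py)
    # could be swapped in behind the same interface when explicitly enabled.
    if getattr(settings, "use_llm", False):
        from app.nl2sql.openai_provider import OpenAIProvider  # optional, off by default
        return OpenAIProvider(settings)
    return PatternProvider()
```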
FastAPI / SQLAlchemy / SQLite / pytest / GitHub Actions
MIT