An AI-powered natural language to SQL query engine built on a medical database. Ask questions in plain English and get back structured SQL results — powered by Groq's LLaMA 3.3 70B model, FastAPI, React, and PostgreSQL, all orchestrated with Docker Compose.
Watch the demo video on Google Drive
User → React UI (Nginx :3000)
→ FastAPI Backend (Uvicorn :8000)
→ Groq LLM API (llama-3.3-70b-versatile)
→ PostgreSQL :5432 (llm schema)
← SQL + formatted results
- Natural Language Queries — type a question like "What is the average BMI of diabetic patients?" and get SQL + results
- Intelligent Agent — maintains conversation history (last 5 exchanges) for context-aware follow-ups
- Schema-Aware — automatically extracts and caches the database schema so the LLM generates accurate queries
- Safety First — only
SELECT/WITHqueries are allowed;INSERT,UPDATE,DELETEare rejected - Deterministic Output — temperature set to 0.0 for consistent SQL generation
- Dark-Themed UI — responsive chat interface with collapsible SQL blocks and scrollable result tables
- Suggestion Chips — pre-built example questions to get started quickly
- One-Command Deployment —
docker-compose up --buildspins up the entire stack
| Layer | Technology |
|---|---|
| LLM | Groq API — llama-3.3-70b-versatile |
| Backend | FastAPI 0.115 · Uvicorn · psycopg2 |
| Frontend | React 19 · Vite 6 · Axios |
| Database | PostgreSQL 16 (Alpine) |
| Proxy | Nginx (Alpine) |
| Orchestration | Docker Compose |
Five medical datasets are auto-loaded from CSV on first startup into the llm schema:
| Table | Records | Description |
|---|---|---|
llm.heartattack_data |
~8,700 | Heart attack risk factors across countries |
llm.breastcancer_data |
~286 | Breast cancer diagnosis & treatment |
llm.livercirrhosis_data |
~418 | Liver cirrhosis progression & lab results |
llm.diabetes_data |
~768 | Diabetes risk factors & outcomes |
llm."glaucoma_Data" |
~100,000 | Glaucoma diagnosis & eye measurements |
Text2SQLgen/
├── docker-compose.yml # Full-stack orchestration
├── .env.example # Environment variable template
│
├── backend/
│ ├── Dockerfile
│ ├── requirements.txt
│ └── app/
│ ├── main.py # FastAPI app, CORS, health check
│ ├── config.py # Env var loading
│ ├── database.py # PostgreSQL connection & schema extraction
│ ├── agents/
│ │ ├── base.py # Abstract Agent class
│ │ └── text2sql.py # Text2SQL agent (Groq + query pipeline)
│ └── routes/
│ └── chat.py # /api/chat, /api/schema, /api/suggestions
│
├── frontend/
│ ├── Dockerfile # Multi-stage: Node build → Nginx serve
│ ├── nginx.conf # SPA routing + API proxy
│ ├── vite.config.js
│ └── src/
│ ├── App.jsx # Main component & state
│ ├── App.css # Dark theme styles
│ ├── api/
│ │ └── chat.js # Axios API client
│ └── components/
│ ├── ChatWindow.jsx # Scrollable message container
│ ├── MessageBubble.jsx # Message display + SQL toggle + result table
│ ├── InputBar.jsx # User input form
│ └── Suggestions.jsx # Example question chips
│
├── db/
│ ├── init.sql # Schema & table creation
│ └── load-data.sh # CSV import via psql \COPY
│
└── datatables/ # Medical CSV datasets
├── heart_attack_prediction_dataset.csv
├── breast-cancer.csv
├── cirrhosis.csv
├── diabetes.csv
└── glaucoma_dataset.csv
- Docker & Docker Compose
- A free Groq API key
-
Clone the repository
git clone https://github.com/gowthamarulmozhi/Text2SQLgen.git cd Text2SQLgen -
Create your environment file
cp .env.example .env
Open
.envand add your Groq API key:GROQ_API_KEY=gsk_xxxxxxxxxxxxxxxxxxxxxxxx POSTGRES_DB=wothmag POSTGRES_USER=postgres POSTGRES_PASSWORD=qwerty DATABASE_URL=postgresql://postgres:qwerty@db:5432/wothmag
-
Start the application
docker-compose up --build
-
Open in browser
Service URL Frontend http://localhost:3000 Backend API http://localhost:8000 API Docs (Swagger) http://localhost:8000/docs
- PostgreSQL starts and passes its health check
init.sqlcreates thellmschema and tablesload-data.shimports all CSV datasets- Backend waits for DB readiness, then starts Uvicorn
- Frontend builds the React app, Nginx begins serving
Send a natural language question and receive SQL + results.
Request
{
"message": "How many patients have diabetes?",
"history": [
["Previous question", "Previous answer"]
]
}Response
{
"response": "There are 268 patients with diabetes.",
"sql": "SELECT COUNT(*) FROM llm.diabetes_data WHERE \"Outcome\" = 1;",
"columns": ["count"],
"results": [[268]],
"error": null
}Returns the full database schema (tables, columns, data types).
Returns a list of 8 example questions to get started.
Health check endpoint.
{ "status": "ok" }- How many patients are there in total across all tables?
- What is the average BMI of diabetic patients?
- Show the top 5 countries with the highest heart attack rates
- What percentage of breast cancer patients received irradiation?
- List all liver cirrhosis patients in Stage 4
- What is the average intraocular pressure for glaucoma patients by type?
| Variable | Description | Default |
|---|---|---|
GROQ_API_KEY |
Your Groq API key | (required) |
POSTGRES_DB |
Database name | wothmag |
POSTGRES_USER |
PostgreSQL user | postgres |
POSTGRES_PASSWORD |
PostgreSQL password | qwerty |
DATABASE_URL |
Full connection string | postgresql://postgres:qwerty@db:5432/wothmag |
cd backend
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8000Requires a running PostgreSQL instance. Update
DATABASE_URLin.envto point tolocalhostinstead ofdb.
cd frontend
npm install
npm run devVite dev server runs on port 5173 and proxies
/apirequests tolocalhost:8000.
MIT — Gowtham Arulmozhi
