In many organizations, valuable data lives in relational databases, but accessing it requires SQL expertise. This project bridges that gap by allowing users to:
- Ask questions in natural language
- Upload ER diagram images
- Automatically generate validated SQL queries
All without writing SQL manually.
In many organizations, data is stored in databases, but non-technical users can’t query it directly because they don’t know SQL. They rely on data analysts for even simple queries.
Non-technical users struggle to query databases because:
- SQL has a steep learning curve
- Even simple queries require analyst intervention
- Complex queries take time and are error-prone
Example:
“Show me the total sales in 2024 by region.”
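To make the example concrete, here is a minimal sketch of the kind of SQL such a question could map to, run against an in-memory SQLite database. The `sales` table, its columns, and the rows are assumptions invented for illustration; the SQL the pipeline actually generates depends on your schema.

```python
import sqlite3

# Hypothetical table and rows, invented only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL, sold_on TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", 120.0, "2024-01-15"),
        ("North", 80.0, "2024-06-02"),
        ("South", 200.0, "2024-03-10"),
        ("South", 50.0, "2023-12-31"),  # outside 2024, so it is filtered out
    ],
)

# One plausible translation of "Show me the total sales in 2024 by region."
GENERATED_SQL = """
SELECT region, SUM(amount) AS total_sales
FROM sales
WHERE sold_on >= '2024-01-01' AND sold_on < '2025-01-01'
GROUP BY region
ORDER BY region
"""

rows = conn.execute(GENERATED_SQL).fetchall()
print(rows)  # [('North', 200.0), ('South', 200.0)]
```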
- Writing effective, complex queries can be overwhelming for non-technical users
- The pipeline generates complex queries in seconds, saving the organization time and cost
I’ve written a detailed blog post that walks through the thinking, trade-offs, and occasional “why is this not working?” moments :) behind building this NL2SQL pipeline. It covers:
- How ER diagrams can be turned into executable schemas
- The design decisions that didn’t make it into the final code (for good reasons)
- And what actually breaks when theory meets production
We built an NL2SQL pipeline that helps users quickly generate SQL queries, either from a natural language question or from an ER diagram. It converts:
- 📝 Natural language queries → SQL
- 🖼️ ER diagram images → Database schema → SQL

And many more features...
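As a rough sketch of the "Database schema" step, the snippet below turns an already-extracted schema into `CREATE TABLE` statements and loads them into SQLite. The dictionary layout and the `schema_to_ddl` helper are hypothetical, not the project's actual format; the image-to-schema extraction itself is not shown.

```python
import sqlite3

# Hypothetical result of reading an ER diagram (the layout is an assumption).
extracted_schema = {
    "customers": {
        "columns": {"id": "INTEGER", "name": "TEXT"},
        "primary_key": "id",
        "foreign_keys": {},
    },
    "orders": {
        "columns": {"id": "INTEGER", "customer_id": "INTEGER", "total": "REAL"},
        "primary_key": "id",
        "foreign_keys": {"customer_id": ("customers", "id")},
    },
}


def schema_to_ddl(schema):
    """Build one CREATE TABLE statement per table in the schema dict."""
    statements = []
    for table, spec in schema.items():
        parts = []
        for col, col_type in spec["columns"].items():
            suffix = " PRIMARY KEY" if col == spec["primary_key"] else ""
            parts.append(f"{col} {col_type}{suffix}")
        for col, (ref_table, ref_col) in spec["foreign_keys"].items():
            parts.append(f"FOREIGN KEY ({col}) REFERENCES {ref_table}({ref_col})")
        statements.append(f"CREATE TABLE {table} ({', '.join(parts)});")
    return statements


ddl = schema_to_ddl(extracted_schema)

# Loading the DDL into an in-memory database confirms it is executable.
conn = sqlite3.connect(":memory:")
conn.executescript("\n".join(ddl))
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)  # ['customers', 'orders']
```

Identifiers are interpolated without quoting here, which is fine for a sketch but would need escaping for names taken from untrusted diagrams.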
User Input (Natural Language)
↓
Text Preprocessing
↓
Schema Understanding
↓
NL2SQL Model (LLM or Encoder-Decoder)
↓
SQL Query Generator
↓
SQL Validator & Executor
↓
Database (e.g., PostgreSQL/MySQL)
↓
Results Display (Frontend/UI)

- 🔤 Natural Language → SQL generation
- 🖼️ ER Diagram image → Schema extraction → SQL
- 🧠 LLM-based reasoning with schema awareness
- 🔐 SQL security & validation checks
- 📋 Copy-ready SQL output
- 🧪 Syntax validation before execution
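One common way to get schema awareness is to put the table and column names directly into the prompt sent to the model. The sketch below shows that idea only; `build_prompt`, the prompt wording, and the schema shape are assumptions, and the model call is deliberately left out.

```python
def build_prompt(question, schema):
    """Compose a prompt that lists the schema before the user's question."""
    lines = [
        "You translate questions into SQL.",
        "Use only these tables and columns:",
    ]
    for table, columns in schema.items():
        lines.append(f"- {table}({', '.join(columns)})")
    lines.append(f"Question: {question}")
    lines.append("Return a single SELECT statement and nothing else.")
    return "\n".join(lines)


# Hypothetical schema, invented for illustration.
schema = {"sales": ["region", "amount", "sold_on"]}
prompt = build_prompt("Show me the total sales in 2024 by region.", schema)
print(prompt)
```

Restricting the model to the listed tables makes the later schema consistency check more likely to pass, since the model has less room to invent column names.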
- Syntactic Validity
  - Uses a syntactic metric that marks each generated query as valid or invalid
  - Checks whether generated SQL is parseable
- Schema Consistency
  - Ensures correct table & column references
  - Foreign Key Validation
- Security Rules
  - Blocks unsafe SQL patterns
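The checks above can be approximated with the standard library alone. This is a sketch under stated assumptions, not the project's validator: the `validate` helper and the keyword list are hypothetical, SQLite's `EXPLAIN` stands in for a real parser, and foreign key validation is not covered.

```python
import re
import sqlite3

# Assumed deny-list; a real deployment would tune this to its own policy.
UNSAFE = re.compile(
    r"\b(DROP|DELETE|UPDATE|INSERT|ALTER|TRUNCATE|ATTACH|PRAGMA)\b",
    re.IGNORECASE,
)


def validate(sql, conn):
    """Return a list of problems; an empty list means the query passed."""
    problems = []
    statement = sql.strip().rstrip(";")
    # Crude checks: they also reject ';' or keywords inside string literals.
    if ";" in statement:
        problems.append("security: multiple statements")
    if UNSAFE.search(statement):
        problems.append("security: unsafe keyword")
    if problems:
        return problems
    try:
        # EXPLAIN compiles the statement against the schema without running it.
        conn.execute("EXPLAIN " + statement)
    except sqlite3.Error as exc:
        message = str(exc)
        # SQLite reports unknown tables and columns as "no such ...".
        kind = "schema" if message.startswith("no such") else "syntax"
        problems.append(f"{kind}: {message}")
    return problems


# Hypothetical schema used only to exercise the checks.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")

for candidate in [
    "SELECT region FROM sales",
    "SELEC region FROM sales",
    "SELECT nope FROM sales",
    "DROP TABLE sales",
]:
    print(candidate, "->", validate(candidate, conn) or "ok")
```

Running the security rules first means an unsafe statement is rejected before it ever reaches the database connection.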
cd nl2sql
pip install -r requirements.txt
cd .\frontend
npm run dev
cd .\backend
uv run .\main.py
- 🔄 SQL execution & result visualization
- 📊 Query optimization hints
- 🧪 Automated test coverage
- 🗂️ Multi-schema support
- 🔍 Reasoning and Streaming of Response
Aditya Katkar
