Skip to content

khaled24ao/SQLGenAI

Repository files navigation

SQLGenAI

Python Flask License

SQLGenAI Cover

AI-powered SQL query generator that converts natural language into clean, production-ready SQL queries.

What is SQLGenAI?

Transform plain English questions into optimized SQL queries instantly. Perfect for developers, analysts, and anyone working with databases.

Example

Input:

Find top 10 customers who spent the most in the last 30 days

Output:

SELECT c.name, c.email, SUM(o.total_amount) AS total_spent,
       COUNT(o.id) AS num_orders
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE o.status = 'completed' 
  AND o.created_at >= NOW() - INTERVAL '30 days'
GROUP BY c.id, c.name, c.email
ORDER BY total_spent DESC
LIMIT 10;

Features

  • Smart SQL Generation - Clean, optimized queries
  • Multi-Dialect - PostgreSQL, MySQL, SQLite
  • Schema-Aware - Uses your table definitions
  • Complexity Analysis - Shows query difficulty
  • Copy & Export - One-click copy to clipboard
  • Dark Theme UI - Modern interface
  • Keyboard Shortcuts - Ctrl+Enter to generate
  • Docker & Kubernetes Ready

Tech Stack

  • Backend: Flask, Groq SDK, Pydantic, Flask-Limiter
  • Caching: Redis
  • Frontend: Vanilla JS, CSS (dark theme)
  • AI Model: llama3-8b-8192
  • DevOps: Docker, Kubernetes, GitHub Actions

Project Structure

SQLGenAI/
├── app.py                      # Entry point
├── backend/
│   ├── app.py                  # Flask app factory, error handlers, logging
│   ├── models/
│   │   └── schemas.py          # Pydantic validation models
│   ├── middleware/
│   │   ├── auth.py             # API Key authentication
│   │   └── rate_limit.py       # Rate limiting
│   ├── services/
│   │   ├── ai_service.py       # Singleton AIService
│   │   └── cache.py            # Redis caching
│   ├── routes/
│   │   └── generate.py        # API routes
│   └── utils/
│       └── logger.py            # Logging utility
├── templates/
│   └── index.html              # Frontend UI
├── kubernetes/
│   └── deployment.yaml         # K8s manifests
├── tests/
│   ├── test_api.py             # API tests
│   └── test_schemas.py         # Schema validation tests
├── docker-compose.yml          # Docker Compose
├── Dockerfile                  # Docker image
├── pytest.ini                  # Pytest config
├── requirements.txt
├── .env.example
├── .gitignore
├── .dockerignore
├── .github/workflows/ci.yml    # CI/CD pipeline
└── README.md

Quick Start

Development

# Clone and setup
git clone <repo>
cd SQLGenAI

# Virtual environment
python -m venv venv
source venv/bin/activate  # Linux/Mac
venv\Scripts\activate     # Windows

# Install dependencies
pip install -r requirements.txt

# Configure
cp .env.example .env
# Edit .env and add your GROQ_API_KEY

# Run
python app.py

Docker

# Build
docker build -t sqlgenai .

# Run with docker-compose
docker-compose up -d

Kubernetes

kubectl apply -f kubernetes/deployment.yaml

Environment Variables

Variable Description Default
GROQ_API_KEY Groq API key Required
REDIS_URL Redis URL simple://localhost
API_KEY API authentication key None
REQUIRE_AUTH Require API key true
DEFAULT_RATE_LIMIT Rate limit per hour 100
FLASK_ENV Environment production

API Endpoints

Method Endpoint Description
GET / UI homepage
GET /health Health check
POST /api/v1/generate Generate SQL

Authentication

Include API key in header:

curl -X POST http://localhost:5000/api/v1/generate \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-api-key" \
  -d '{"question": "Get all users"}'

Rate Limiting

  • Default: 30 requests/minute
  • Returns 429 when exceeded

Caching

  • Responses cached for 1 hour
  • Cache key: SHA256(question:schema:dialect)

API Usage

POST /api/v1/generate

curl -X POST http://localhost:5000/api/v1/generate \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Get top 10 customers by orders",
    "schema": "users(id,name) orders(id,user_id,total)",
    "dialect": "postgresql"
  }'

Request Fields:

  • question (required): Natural language question, max 500 chars
  • schema (optional): Table schema, max 2000 chars
  • dialect (optional): mysql, postgresql, sqlite (default: postgresql)

Response:

{
  "result": "{\"sql\": \"SELECT...\", \"explanation\": [...], \"tables_used\": [...], \"complexity\": \"medium\", \"warnings\": []}",
  "cached": false
}

Testing

# Run tests
pytest

# Run with coverage
pytest --cov=backend --cov-report=html

CI/CD

GitHub Actions workflow includes:

  • Python 3.11 test suite
  • Code coverage reporting
  • Docker image build & push

Production Checklist

  • Set FLASK_ENV=production
  • Configure API_KEY
  • Set up Redis for caching
  • Configure logging aggregation
  • Set up monitoring (Sentry/Prometheus)
  • Enable HTTPS/TLS
  • Configure resource limits (K8s)

Keyboard Shortcuts

  • Ctrl + Enter: Generate SQL from current input

License

MIT

About

AI-powered SQL query generator that converts natural language into clean, production-ready SQL queries

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors