Skip to content

Convert natural language questions to SQL queries using AI. Includes Docker setup with PostgreSQL, stock market data loader, and REST API

Notifications You must be signed in to change notification settings

jpukg/nl-to-sql-query

Repository files navigation

Natural Language to SQL Query System

A TypeScript-based system that converts natural language questions into SQL queries using Claude AI and executes them safely on PostgreSQL databases.

Features

  • Natural Language Processing: Convert plain English questions to SQL queries using Claude AI
  • Database Schema Introspection: Automatically reads your database structure to generate accurate queries
  • Safe Query Execution: Built-in validation to prevent destructive operations (only SELECT queries allowed)
  • Multiple Interfaces:
    • CLI (Command Line Interface) for interactive querying
    • REST API server for web applications
    • Library/module for integration into your own applications
  • Real-time Results: Execute queries and get formatted results instantly

Prerequisites

  • Docker and Docker Compose (recommended)
  • OR Node.js (v18 or higher) + PostgreSQL database
  • AI Provider API key (Sarvam, Anthropic Claude, or Ollama)

Quick Start with Docker (Recommended)

  1. Clone or download this repository

  2. Copy and configure environment variables:

cp .env.example .env
# Edit .env and add your API keys
  1. Start with Docker:
docker-compose up --build

The application will be available at http://localhost:3000 with PostgreSQL automatically configured and pre-loaded with stock market data.

For detailed Docker instructions, see docs/DOCKER.md

Manual Installation

  1. Install dependencies:
yarn install
  1. Create a .env file with your configuration:
cp .env.example .env
  1. Edit .env with your credentials - see .env.example for all available options

Usage

Option 1: CLI (Command Line Interface)

Interactive Mode

npm run query

Then type your questions in natural language:

❓ Your question: Show me all customers from New York
❓ Your question: What are the top 5 products by sales?
❓ Your question: List all orders from the last 30 days

Single Query Mode

npm run query "Show me all customers from New York"

Option 2: REST API Server

  1. Start the server:
npm run dev
  1. The API will be available at http://localhost:3000

API Endpoints

GET /health

  • Check API and database health status

GET /schema

  • Get database schema information

POST /query

  • Execute a natural language query
  • Body: { "query": "your question here" }

Example API Usage

Using curl:

curl -X POST http://localhost:3000/query \
  -H "Content-Type: application/json" \
  -d '{"query": "Show me all customers from New York"}'

Using JavaScript/fetch:

const response = await fetch('http://localhost:3000/query', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    query: 'Show me all customers from New York'
  })
});

const result = await response.json();
console.log(result);

Option 3: Library/Module

Import into your TypeScript/JavaScript application:

import { queryDatabase, testConnection } from './src/index';

async function main() {
  // Test connection
  await testConnection();

  // Execute query
  const result = await queryDatabase('Show me all customers from New York');

  if (result.success) {
    console.log('SQL:', result.sql);
    console.log('Data:', result.data);
    console.log('Row count:', result.rowCount);
  } else {
    console.error('Error:', result.error);
  }
}

main();

Example Queries

Here are some example natural language queries you can try:

  • "Show me all customers"
  • "What are the top 10 products by revenue?"
  • "List all orders placed in the last 7 days"
  • "Find customers who have spent more than $1000"
  • "Show me the average order value by month"
  • "Which products have never been ordered?"
  • "Get the total sales for each product category"

Security Features

The system includes several safety mechanisms:

  1. Query Validation: Only SELECT queries are allowed
  2. Keyword Filtering: Prevents INSERT, UPDATE, DELETE, DROP, and other destructive operations
  3. Query Chaining Prevention: Multiple statements are blocked
  4. Schema Isolation: Only accesses the 'public' schema
  5. No Direct SQL Execution: All queries go through Claude AI and validation

Project Structure

test-ai/
├── src/
│   ├── config/
│   │   └── database.ts          # Database connection configuration
│   ├── services/
│   │   ├── nlToSql.ts           # Natural language to SQL converter
│   │   └── queryExecutor.ts     # Safe query execution
│   ├── utils/
│   │   └── schemaIntrospection.ts  # Database schema reader
│   ├── cli.ts                   # Command-line interface
│   ├── server.ts                # REST API server
│   └── index.ts                 # Main module exports
├── data/
│   └── symbols_fundamentals.json # Stock market data
├── docs/
│   ├── DOCKER.md                # Docker setup guide
│   └── USAGE_GUIDE.md           # Detailed usage instructions
├── docker-entrypoint-initdb.d/  # PostgreSQL initialization scripts
│   ├── 01-init-symbols-fundamentals.sql
│   └── 03-load-json.sh
├── .dockerignore                # Docker ignore patterns
├── .env.example                 # Environment variables template
├── docker-compose.yml           # Docker orchestration
├── Dockerfile                   # Docker image definition
├── package.json
├── tsconfig.json
└── README.md

Building for Production

  1. Build the TypeScript code:
npm run build
  1. Run the compiled code:
npm start

Troubleshooting

Connection Issues

  • Verify your database credentials in .env
  • Ensure PostgreSQL is running and accessible
  • Check firewall settings if connecting to a remote database

API Key Issues

  • Verify your Anthropic API key is valid
  • Check that you have sufficient API credits
  • Ensure the key is correctly set in .env

Query Issues

  • If queries aren't generating correctly, try rephrasing your question
  • Be specific about table and column names when possible
  • Check that your database schema is properly set up

How It Works

  1. Schema Introspection: The system reads your database structure (tables, columns, relationships)
  2. Context Building: Schema information is formatted and provided to Claude AI
  3. Query Generation: Claude converts your natural language question into a SQL query
  4. Validation: The generated SQL is validated for safety
  5. Execution: The query is executed on your database
  6. Results: Data is returned in a formatted, user-friendly manner

License

MIT

Support

For issues, questions, or contributions, please refer to the project repository.

About

Convert natural language questions to SQL queries using AI. Includes Docker setup with PostgreSQL, stock market data loader, and REST API

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published