Usage Guide

This page provides ready-to-use curl commands for every endpoint. All examples use the Docker Compose defaults (localhost:5181, API key meow123). Adjust the variables at the top for your environment.

Setup Variables

Set these once in your terminal session:

BASE_URL="http://localhost:5181"
CHAT_ID="test-session-001"
API_KEY="meow123"

Health Check

curl -s -X GET "${BASE_URL}/ping" | jq .

Full Pipeline (Stage 1 + Stage 2)

Minimal Request

The simplest call — only basePrompt is required. The engine uses defaults for everything else (retryCount=5, schemaExamples=5, feedbackExamples=3).

curl -s -X POST "${BASE_URL}/inference/sqlQueryEngine/${CHAT_ID}" \
  -H "Content-Type: application/json" \
  -d '{
    "basePrompt": "How many orders were placed in the last 30 days?"
  }' | jq .

Full Request with All Parameters

curl -s -X POST "${BASE_URL}/inference/sqlQueryEngine/${CHAT_ID}" \
  -H "Content-Type: application/json" \
  -d '{
    "basePrompt": "How many orders were placed in the last 30 days?",
    "retryCount": 5,
    "schemaExamples": 5,
    "feedbackExamples": 3,
    "schemaDescriptionKey": "dbSchemaDescription",
    "extraPayload": {"source": "curl-test"}
  }' | jq .

Force Fresh Schema Generation

Change schemaDescriptionKey to bypass the cached schema and regenerate from scratch:

curl -s -X POST "${BASE_URL}/inference/sqlQueryEngine/${CHAT_ID}" \
  -H "Content-Type: application/json" \
  -d '{
    "basePrompt": "Show me the top 10 customers by total order value.",
    "schemaDescriptionKey": "dbSchemaDescription_v2"
  }' | jq .

Multi-Turn Conversation

Using the same chatID reuses cached schema context. The second call skips schema introspection entirely:

# Turn 1 — schema is introspected and cached
curl -s -X POST "${BASE_URL}/inference/sqlQueryEngine/${CHAT_ID}" \
  -H "Content-Type: application/json" \
  -d '{
    "basePrompt": "How many orders were placed in the last 30 days?"
  }' | jq .

# Turn 2 — same chatID, schema reused from Redis
curl -s -X POST "${BASE_URL}/inference/sqlQueryEngine/${CHAT_ID}" \
  -H "Content-Type: application/json" \
  -d '{
    "basePrompt": "Now break that down by country."
  }' | jq .

Stage 1 Only — SQL Generation (No Execution)

Generate SQL from natural language without executing it. Useful for previewing queries before running them.

curl -s -X POST "${BASE_URL}/inference/sqlQueryGeneration/${CHAT_ID}" \
  -H "Content-Type: application/json" \
  -d '{
    "basePrompt": "List all active users who signed up this year."
  }' | jq .

With all optional parameters:

curl -s -X POST "${BASE_URL}/inference/sqlQueryGeneration/${CHAT_ID}" \
  -H "Content-Type: application/json" \
  -d '{
    "basePrompt": "List all active users who signed up this year.",
    "schemaExamples": 10,
    "schemaDescriptionKey": "dbSchemaDescription",
    "extraPayload": null
  }' | jq .

Stage 2 Only — Query Evaluation & Repair

Execute a query you already have and let the engine repair it if it fails.

Valid Query

curl -s -X POST "${BASE_URL}/inference/sqlQueryEvaluation/${CHAT_ID}" \
  -H "Content-Type: application/json" \
  -d '{
    "basePrompt": "How many orders were placed in the last 30 days?",
    "baseQuery": "SELECT COUNT(*) FROM orders WHERE created_at >= NOW() - INTERVAL '\''30 days'\''",
    "baseDescription": "Count orders placed in the last 30 days."
  }' | jq .

Deliberately Broken Query (Repair Loop Demo)

Pass a query referencing a nonexistent table to see the repair loop in action:

curl -s -X POST "${BASE_URL}/inference/sqlQueryEvaluation/${CHAT_ID}" \
  -H "Content-Type: application/json" \
  -d '{
    "basePrompt": "Show me customer names and their total spend.",
    "baseQuery": "SELECT customer_name, SUM(total) FROM nonexistent_table GROUP BY customer_name",
    "baseDescription": "Get customer names and total spend from orders.",
    "retryCount": 3
  }' | jq .

The engine will detect the error, feed it to the LLM, and iteratively fix the query until it succeeds or exhausts retries. If retries exhaust, the best result seen across all attempts is returned.

OpenAI-Compatible Endpoints

List Models

curl -s -X GET "${BASE_URL}/v1/models" \
  -H "Authorization: Bearer ${API_KEY}" | jq .

Chat Completions — Streaming

curl -s -X POST "${BASE_URL}/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -d '{
    "model": "SQLBot",
    "messages": [
      {"role": "user", "content": "How many orders were placed in the last 30 days?"}
    ],
    "stream": true
  }'

Chat Completions — Non-Streaming

curl -s -X POST "${BASE_URL}/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -d '{
    "model": "SQLBot",
    "messages": [
      {"role": "user", "content": "How many active users registered this month?"}
    ],
    "stream": false
  }' | jq .

Extract Just the Answer

curl -s -X POST "${BASE_URL}/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -d '{
    "model": "SQLBot",
    "messages": [
      {"role": "user", "content": "What are the top 3 selling products?"}
    ],
    "stream": false
  }' | jq -r '.choices[0].message.content'

Multi-Turn with `chat_id`

# Turn 1 — schema generated and cached
curl -s -X POST "${BASE_URL}/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -d '{
    "model": "SQLBot",
    "chat_id": "my-session-001",
    "messages": [
      {"role": "user", "content": "How many orders were placed last month?"}
    ],
    "stream": false
  }' | jq .

# Turn 2 — schema reused from cache
curl -s -X POST "${BASE_URL}/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -d '{
    "model": "SQLBot",
    "chat_id": "my-session-001",
    "messages": [
      {"role": "user",      "content": "How many orders were placed last month?"},
      {"role": "assistant", "content": "There were 842 orders placed last month."},
      {"role": "user",      "content": "Break that down by product category."}
    ],
    "stream": false
  }' | jq .

Legacy Text Completions

# Streaming
curl -s -X POST "${BASE_URL}/v1/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -d '{
    "model": "SQLBot",
    "prompt": "How many users signed up in the last 7 days?",
    "stream": true
  }'

# Non-streaming
curl -s -X POST "${BASE_URL}/v1/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -d '{
    "model": "SQLBot",
    "prompt": "List all tables and row counts in the database.",
    "stream": false
  }' | jq .

Using Explicit Connection Overrides

When environment variables aren't set on the server, or when you want to target a different LLM/database/Redis backend for a single request:

LLM_BASE_URL="http://localhost:11434/v1"
LLM_MODEL="qwen2.5-coder:7b"
LLM_API_KEY="ollama"
LLM_TEMPERATURE="0.1"
PG_HOST="localhost"
PG_PORT="5432"
PG_DB="mydb"
PG_USER="postgres"
PG_PASS="secret"
RD_HOST="localhost"
RD_PORT="6379"
RD_PASS=""
RD_DB="0"

curl -s -X POST "${BASE_URL}/inference/sqlQueryEngine/${CHAT_ID}?\
llmBaseURL=${LLM_BASE_URL}&\
llmAPIKey=${LLM_API_KEY}&\
llmModel=${LLM_MODEL}&\
llmTemperature=${LLM_TEMPERATURE}&\
postgreHost=${PG_HOST}&\
postgrePort=${PG_PORT}&\
postgreDBName=${PG_DB}&\
postgreUser=${PG_USER}&\
postgrePassword=${PG_PASS}&\
redisHost=${RD_HOST}&\
redisPort=${RD_PORT}&\
redisPassword=${RD_PASS}&\
redisDB=${RD_DB}" \
  -H "Content-Type: application/json" \
  -d '{"basePrompt": "How many records are in each table?"}' | jq .

This works with all three /inference/* endpoints — just swap the path and request body fields.

OpenAPI / Swagger UI

# Download the OpenAPI schema
curl -s "${BASE_URL}/openapi.json" | jq .

# Open Swagger UI in browser
open "${BASE_URL}/docs"

Smoke Tests

Run the built-in test suite to validate all endpoints:

bash curlCommands.sh

Or with custom settings:

BASE_URL="http://your-server:5181" API_KEY="your-key" bash curlCommands.sh

The test suite validates: health check, model listing, chat completions (streaming + non-streaming), multi-turn context preservation, all native inference endpoints, the repair loop, and OpenAPI schema availability.

📄 Paper: arXiv:2604.16511 | 📊 Dataset: Hugging Face | 💻 Source: GitHub

SQL Query Engine

SQL Query Engine

Design

Architecture

Setup

API

Internals

Evaluation

Help

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Usage Guide

Setup Variables

Health Check

Full Pipeline (Stage 1 + Stage 2)

Minimal Request

Full Request with All Parameters

Force Fresh Schema Generation

Multi-Turn Conversation

Stage 1 Only — SQL Generation (No Execution)

Stage 2 Only — Query Evaluation & Repair

Valid Query

Deliberately Broken Query (Repair Loop Demo)

OpenAI-Compatible Endpoints

List Models

Chat Completions — Streaming

Chat Completions — Non-Streaming

Extract Just the Answer

Multi-Turn with `chat_id`

Legacy Text Completions

Using Explicit Connection Overrides

OpenAPI / Swagger UI

Smoke Tests

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally

Usage Guide

Setup Variables

Health Check

Full Pipeline (Stage 1 + Stage 2)

Minimal Request

Full Request with All Parameters

Force Fresh Schema Generation

Multi-Turn Conversation

Stage 1 Only — SQL Generation (No Execution)

Stage 2 Only — Query Evaluation & Repair

Valid Query

Deliberately Broken Query (Repair Loop Demo)

OpenAI-Compatible Endpoints

List Models

Chat Completions — Streaming

Chat Completions — Non-Streaming

Extract Just the Answer

Multi-Turn with chat_id

Legacy Text Completions

Using Explicit Connection Overrides

OpenAPI / Swagger UI

Smoke Tests

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally

Multi-Turn with `chat_id`