OpenAI Compatibility

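SQL Query Engine implements the model-listing, chat-completion, and legacy-completion endpoints of the OpenAI API, so it can serve as a drop-in backend for tools built against those endpoints. This page lists what is supported, shows how to integrate with popular clients, and notes where behavior differs from the standard OpenAI API.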

Supported OpenAI Endpoints

OpenAI Endpoint	Supported	Notes
`GET /v1/models`	Yes	Returns the engine's model name
`POST /v1/chat/completions`	Yes	Full support including streaming
`POST /v1/completions`	Yes	Legacy text completions
`POST /v1/embeddings`	No	Not applicable
`POST /v1/images/*`	No	Not applicable

Python — OpenAI SDK

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:5181/v1",
    api_key="meow123",  # example value; use your own OPENAI_API_KEY value
)

# Non-streaming
response = client.chat.completions.create(
    model="SQLBot",
    messages=[{"role": "user", "content": "How many orders were placed last month?"}],
    stream=False,
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="SQLBot",
    messages=[{"role": "user", "content": "Show me the top 5 products by revenue."}],
    stream=True,
)
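for chunk in stream:
    # Skip chunks that carry no choices or no content (e.g. the initial role chunk)
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)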

Multi-Turn Conversations

Pass `chat_id` through the SDK's `extra_body` argument to preserve context across turns:

# Turn 1
response = client.chat.completions.create(
    model="SQLBot",
    messages=[{"role": "user", "content": "How many tables are in the database?"}],
    stream=False,
    extra_body={"chat_id": "my-session-001"},
)
answer_1 = response.choices[0].message.content

# Turn 2 — same chat_id, schema is reused from cache
response = client.chat.completions.create(
    model="SQLBot",
    messages=[
        {"role": "user", "content": "How many tables are in the database?"},
        {"role": "assistant", "content": answer_1},
        {"role": "user", "content": "Show me the columns in the largest table."},
    ],
    stream=False,
    extra_body={"chat_id": "my-session-001"},
)
print(response.choices[0].message.content)
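The two-turn pattern above repeats the same bookkeeping on every turn: append the new user message, resend the full history, and attach the same `chat_id`. The sketch below wraps that bookkeeping in a small helper. `ChatSession`, `request_kwargs`, and `record_answer` are hypothetical names introduced for illustration and are not part of the engine or the OpenAI SDK; the helper only builds the keyword arguments and does not send anything itself.

```python
from dataclasses import dataclass, field


@dataclass
class ChatSession:
    """Accumulates the message history for one chat_id (hypothetical helper)."""

    chat_id: str
    model: str = "SQLBot"
    messages: list = field(default_factory=list)

    def request_kwargs(self, user_message: str) -> dict:
        """Record a user turn and return kwargs for chat.completions.create."""
        self.messages.append({"role": "user", "content": user_message})
        return {
            "model": self.model,
            "messages": list(self.messages),  # copy, so later turns don't alter it
            "stream": False,
            "extra_body": {"chat_id": self.chat_id},
        }

    def record_answer(self, answer: str) -> None:
        """Record the assistant reply so the next turn includes it."""
        self.messages.append({"role": "assistant", "content": answer})
```

With the `client` from the earlier example, a turn would then look like `response = client.chat.completions.create(**session.request_kwargs("..."))` followed by `session.record_answer(response.choices[0].message.content)`.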

JavaScript / TypeScript — OpenAI SDK

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'http://localhost:5181/v1',
  apiKey: 'meow123',
});

// Non-streaming
const response = await client.chat.completions.create({
  model: 'SQLBot',
  messages: [{ role: 'user', content: 'How many orders last month?' }],
  stream: false,
});
console.log(response.choices[0].message.content);

// Streaming
const stream = await client.chat.completions.create({
  model: 'SQLBot',
  messages: [{ role: 'user', content: 'Top 5 products by revenue?' }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

Python — LangChain

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="http://localhost:5181/v1",
    api_key="meow123",
    model="SQLBot",
    streaming=True,
)

# Simple invocation
response = llm.invoke("How many active users are there?")
print(response.content)

# Streaming
for chunk in llm.stream("Show me revenue by month for this year."):
    print(chunk.content, end="")

OpenWebUI Integration

OpenWebUI is pre-configured in the Docker Compose setup. If you're connecting manually:

1. Open OpenWebUI (default: http://localhost:5182).
2. Go to Settings → Connections.
3. Add a new OpenAI-compatible connection:
   - URL: http://sqlqueryengine:8080/v1 (Docker internal) or http://localhost:5181/v1 (external)
   - API Key: Your OPENAI_API_KEY value
4. The model SQLBot should appear in the model selector.

OpenWebUI automatically:

- Sends `chat_id` for context preservation across turns
- Renders `<think>...</think>` blocks as collapsible reasoning sections
- Shows streaming progress in real time

curl

See the Usage Guide for comprehensive curl examples covering every endpoint, streaming modes, multi-turn sessions, and connection overrides.

`chat_id` Context Preservation

The `chat_id` field controls session-level context caching:

When `chat_id` is provided: The engine stores the schema description in Redis under the key `{chat_id}:SQLQueryEngine`. The first request in a session introspects the database and caches the schema; later requests with the same `chat_id` reuse the cached schema instead of introspecting again.

When `chat_id` is omitted: The engine derives a stable ID from `MD5(first_user_message)[:16]`. Any two requests whose first user message is identical therefore share the same cached context. That works well for single-turn usage, but unrelated conversations that happen to open with the same message will collide, so it is less reliable for multi-turn sessions.

Recommendation: Always provide `chat_id` explicitly for multi-turn conversations.
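To make the fallback concrete, here is a minimal sketch of how such an ID and cache key could be computed. It assumes the message is UTF-8 encoded and that the digest is taken in hexadecimal before truncation; neither detail is stated on this page, and the function names are hypothetical rather than the engine's own.

```python
import hashlib


def fallback_chat_id(first_user_message: str) -> str:
    """First 16 hex characters of the MD5 digest of the first user message.

    Assumption: UTF-8 encoding and a hex digest; the engine may differ.
    """
    return hashlib.md5(first_user_message.encode("utf-8")).hexdigest()[:16]


def schema_cache_key(chat_id: str) -> str:
    """Redis key of the form {chat_id}:SQLQueryEngine."""
    return f"{chat_id}:SQLQueryEngine"


# Two conversations that open with the same message map to the same key,
# which is why an explicit chat_id is safer for multi-turn sessions.
first = fallback_chat_id("How many tables are in the database?")
second = fallback_chat_id("How many tables are in the database?")
same_key = schema_cache_key(first) == schema_cache_key(second)
```

Here `same_key` is true by construction, which illustrates the collision described above.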

Streaming Format

Streaming responses are delivered as Server-Sent Events (SSE), using the same chunk format as the OpenAI API and ending with a `[DONE]` sentinel:

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"<think>\n"},"finish_reason":null}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"next token"},"finish_reason":null}]}

data: [DONE]

Pipeline progress (schema generation, query execution, repair loop steps) is wrapped in <think>...</think> tags within the streamed content. Clients that support reasoning display (like OpenWebUI) render these as collapsible sections.
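For clients without built-in reasoning display, the progress text can be separated from the final answer by hand. The sketch below reads raw SSE lines in the format shown above, concatenates the `delta.content` pieces, and then splits the result on `<think>...</think>` blocks. The function names are hypothetical, and the split assumes every `<think>` block has been closed, so it is meant for a finished response and not for partial output.

```python
import json
import re

THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def collect_content(sse_lines):
    """Concatenate delta.content from an iterable of raw SSE lines."""
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)


def split_reasoning(content):
    """Return (reasoning, answer): text inside <think> blocks, and the rest."""
    reasoning = "\n".join(m.strip() for m in THINK_BLOCK.findall(content))
    answer = THINK_BLOCK.sub("", content).strip()
    return reasoning, answer
```

When using the OpenAI SDK, the first step is unnecessary: accumulate `chunk.choices[0].delta.content` as in the streaming examples and pass the joined string to `split_reasoning`.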

Differences from Standard OpenAI API

Feature	OpenAI API	SQL Query Engine
`model` field	Selects the model	Ignored — the engine always uses the configured pipeline
`chat_id` field	Not present	Custom field for session management
`temperature`	Controls sampling	Ignored — set via `LLM_TEMPERATURE` env var
`max_tokens`	Limits output	Ignored — not applicable to the SQL engine
`usage` token counts	Accurate	Returns zeros (token counting not implemented)
Multiple models	Many models	Single model (the engine itself)
Function calling	Supported	Not supported
Vision	Supported	Not supported
`<think>` tags in streaming	Not present	Used for pipeline progress visibility

📄 Paper: arXiv:2604.16511 | 📊 Dataset: Hugging Face | 💻 Source: GitHub
