Session Management

SQL Query Engine uses Redis for two critical functions: session caching (storing schema descriptions and chat histories) and real-time streaming (Pub/Sub progress updates). This page explains the Redis architecture, key structure, and how context is preserved across requests.

Redis Architecture Overview

graph TB
    subgraph Redis
        direction TB
        H["Hash Store<br/>{chatID}:SQLQueryEngine"]
        PS["Pub/Sub Channels"]
    end

    subgraph HashFields["Hash Fields"]
        SD["dbSchemaDescription<br/>(cached schema)"]
        QG["dbQueryGenerator<br/>(generation history)"]
        VC["validatorChat:1, :2, ...<br/>(repair loop histories)"]
        HC["historyCounterManager<br/>(auto-increment counter)"]
    end

    subgraph Channels["Pub/Sub Channels"]
        CH1["{chatID}<br/>(tagged progress)"]
        CH2["{chatID}:stream<br/>(clean content mirror)"]
    end

    H --> HashFields
    PS --> Channels

    QGen["QueryGenerator"] -->|write| SD
    QGen -->|write| QG
    QGen -->|publish| CH1
    QEval["QueryEvaluator"] -->|write| VC
    QEval -->|publish| CH1
    OAI["openaiCompat.py"] -->|subscribe| CH1
    OAI -->|publish| CH2

Hash Store: Session Context

All session data is stored in a single Redis hash per session, keyed as {chatID}:SQLQueryEngine.

Hash Fields

Field	Written By	Contents
`dbSchemaDescription`	QueryGenerator, QueryEvaluator	JSON-serialized LangChain messages containing the full schema description. This is the most expensive piece to generate (requires DB introspection + LLM call) and is cached for reuse.
`dbQueryGenerator`	QueryGenerator	JSON-serialized chat history from the query generation step. Appended to on each query generation call.
`validatorChat:N`	QueryEvaluator	JSON-serialized chat history from the Nth repair loop invocation. A new key is created each time the evaluator runs.
`historyCounterManager`	QueryEvaluator	Auto-incrementing integer counter used to generate unique `validatorChat:N` keys.

Schema Caching Strategy

The schema description is the most important cached artifact. Here's how it works:

flowchart TD
    A["Request arrives with chatID"] --> B{"Redis has<br/>dbSchemaDescription?"}
    B -->|Yes| C["Load cached schema<br/>(fast path)"]
    B -->|No| D["Introspect PostgreSQL<br/>(tables, columns, samples)"]
    D --> E["LLM generates schema<br/>description (streamed)"]
    E --> F["Cache in Redis"]
    F --> C
    C --> G["Proceed with<br/>query generation"]

The cache key can be controlled via the schemaDescriptionKey parameter. Changing it (e.g., from "dbSchemaDescription" to "dbSchemaDescription_v2") forces a fresh schema generation, which is useful when the database schema has changed.

Pub/Sub: Real-Time Progress Streaming

The engine publishes progress messages to Redis Pub/Sub channels during pipeline execution. This enables real-time streaming in the OpenAI-compatible endpoint.

Message Format

All Pub/Sub messages follow this format:

</{Component}:{Event}>{SPLIT_IDENTIFIER}{Content}

Where SPLIT_IDENTIFIER defaults to <|-/|-/>.

Example:

</SQLQueryEvaluator:QueryFixAttempt#1><|-/|-/>Current Query : SELECT COUNT(*) FROM orders

Stage 1 Messages (QueryGenerator)

Tag	Content
`</SQLQueryGenerator:schemaDescriptionChat>`	Streamed chunks of the LLM schema description

Stage 2 Messages (QueryEvaluator)

On early-accept (query returned rows — no LLM call needed):

Message	Content
`QueryFixAttempt#N`	`Query executed successfully with {N} rows`

Before LLM call (query failed or returned empty — per retry):

Message	Content
`QueryFixAttempt#N — Current Query`	The SQL being executed
`QueryFixAttempt#N — Current Observation`	Current status/observation
`QueryFixAttempt#N — Execution Errors`	PostgreSQL error details, or `No errors encountered.`

After LLM evaluation (per retry):

Message	Content
`QueryFixAttempt#N — LLM Observation`	What the LLM diagnosed
`QueryFixAttempt#N — Fixed Query`	Corrected SQL query
`QueryFixAttempt#N`	`Will verify fixed query on next iteration`

Streaming Subscriber Architecture

sequenceDiagram
    participant Client
    participant FastAPI as FastAPI (openaiCompat.py)
    participant Redis as Redis Pub/Sub
    participant Engine as SQLQueryEngine (thread)

    Client->>FastAPI: POST /v1/chat/completions (stream: true)
    FastAPI->>Redis: Subscribe to {chatID}
    FastAPI->>Engine: Launch in thread pool

    loop Pipeline Progress
        Engine->>Redis: Publish progress message
        Redis->>FastAPI: Receive message
        FastAPI->>Client: SSE chunk (wrapped in think tags)
        FastAPI->>Redis: Mirror to {chatID}:stream
    end

    Engine->>FastAPI: Return final result
    FastAPI->>Redis: Drain residual messages
    FastAPI->>Client: SSE chunk (final formatted result)
    FastAPI->>Client: data: [DONE]

Key details:

The subscriber attaches to the channel before the engine starts, preventing message loss
Progress content is extracted after the SPLIT_IDENTIFIER delimiter (<|-/|-/>)
All progress is wrapped in <think>...</think> tags (collapsible in OpenWebUI)
After the engine completes, residual messages are drained (100 retries, 100ms timeout each)
A mirror channel {chatID}:stream publishes clean content for external subscribers

Session Isolation

Each chatID gets its own Redis hash namespace. There is no cross-session data sharing:

Session abc123 → abc123:SQLQueryEngine hash + abc123 Pub/Sub channel
Session xyz789 → xyz789:SQLQueryEngine hash + xyz789 Pub/Sub channel

This means multiple users (or multiple conversations for the same user) can run concurrently without interference.

`chatID` vs `chat_id`

The engine uses two naming conventions for the session identifier, depending on the API surface:

API	Parameter	How It's Set
Native (`/inference/*`)	`chatID` (path parameter)	Always provided explicitly in the URL
OpenAI-compat (`/v1/*`)	`chat_id` (request body)	Optional. If omitted, derived from `MD5(first_user_message)[:16]`

Both resolve to the same Redis key prefix internally. The MD5 fallback ensures that identical first messages share context even without an explicit session ID, but explicit chat_id is more reliable for multi-turn usage.

Schema Context Resolution in the Evaluator

When the evaluator (Stage 2) needs schema context, it follows a three-tier fallback:

flowchart TD
    A["QueryEvaluator needs<br/>schema context"] --> B{"Generator payload<br/>available?"}
    B -->|Yes| C["Use schema from<br/>generator output<br/>(fastest — in-memory)"]
    B -->|No| D{"Redis cache<br/>has schema?"}
    D -->|Yes| E["Load from Redis<br/>(fast — network call)"]
    D -->|No| F["Generate from scratch<br/>(slow — DB + LLM)"]

    C --> G["Proceed with evaluation"]
    E --> G
    F --> G

From payload — When the full pipeline runs, the generator passes its schema context directly to the evaluator (zero-cost)
From Redis — When only the evaluator is called, it loads the cached schema from a previous session turn
From scratch — Full schema introspection + LLM description (slowest, used only when no cache exists)

Redis Connection Configuration

Environment Variable	Default	Description
`REDIS_HOST`	`""` (empty)	Redis server hostname
`REDIS_PORT`	not set	Redis server port
`REDIS_PASSWORD`	not set	Redis authentication password
`REDIS_DB`	not set	Redis database number (0–15)

The Redis client is configured with decode_responses=True for string-based storage.

📄 Paper: arXiv:2604.16511 | 📊 Dataset: Hugging Face | 💻 Source: GitHub

SQL Query Engine

SQL Query Engine

Design

Architecture

Setup

API

Internals

Evaluation

Help

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Session Management

Redis Architecture Overview

Hash Store: Session Context

Hash Fields

Schema Caching Strategy

Pub/Sub: Real-Time Progress Streaming

Message Format

Stage 1 Messages (QueryGenerator)

Stage 2 Messages (QueryEvaluator)

Streaming Subscriber Architecture

Session Isolation

`chatID` vs `chat_id`

Schema Context Resolution in the Evaluator

Redis Connection Configuration

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally

Session Management

Redis Architecture Overview

Hash Store: Session Context

Hash Fields

Schema Caching Strategy

Pub/Sub: Real-Time Progress Streaming

Message Format

Stage 1 Messages (QueryGenerator)

Stage 2 Messages (QueryEvaluator)

Streaming Subscriber Architecture

Session Isolation

chatID vs chat_id

Schema Context Resolution in the Evaluator

Redis Connection Configuration

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally

`chatID` vs `chat_id`