Skip to content

Troubleshooting

codeadeel edited this page Apr 7, 2026 · 3 revisions

Common issues, debugging tips, and solutions for SQL Query Engine.

Connection Issues

Engine Won't Start

Symptom: docker compose up fails or the container exits immediately.

Check these in order:

  1. Port conflicts — Ensure ports 5181, 5182, and 6380 aren't already in use: lsof -i :5181
  2. Docker resources — The LLM model server needs significant memory. Check docker stats.
  3. Environment variables — Verify all required env vars are set in docker-compose.yml

GET /ping Returns Connection Refused

Cause: The FastAPI server hasn't started yet or the container isn't running.

# Check if the container is running
docker compose ps

# Check container logs
docker compose logs sql-query-engine

Redis Connection Failed

Symptom: Errors mentioning redis.exceptions.ConnectionError in logs.

Solutions:

  • Verify Redis is running: docker compose ps redisqueryengine
  • Check Redis password matches: the REDIS_PASSWORD env var must match the password in the redis service command section
  • If running locally, ensure Redis is accessible at the configured host/port

PostgreSQL Connection Failed

Symptom: Errors mentioning psycopg.OperationalError or connection timeouts.

Solutions:

  • Verify the PostgreSQL server is reachable from the engine container
  • Check credentials: POSTGRES_USER and POSTGRES_PASSWORD must have at least SELECT privileges
  • Firewall: ensure the PostgreSQL port is open between the Docker container and the database server
  • If your database runs on the host machine, use host.docker.internal (see Docker Issues below)

LLM Server Not Responding

Symptom: Requests hang or timeout with no response.

Solutions:

  • Verify LLM_BASE_URL points to a running server: curl -s YOUR_LLM_URL/v1/models
  • Check that LLM_MODEL matches a model actually loaded on the server
  • If using Ollama, ensure the model is pulled: ollama list
  • Increase timeout if the model is slow to respond (especially on first load)

Query Issues

Empty Results (Query Succeeds but Returns No Rows)

The engine treats empty result sets as failures and enters the repair loop. If the repair loop also produces empty results:

  • The query might be correct but the data simply doesn't exist
  • Check the original question — it may be asking about data not present in your database
  • Increase retryCount to give the LLM more attempts
  • Try rephrasing the basePrompt

Repair Loop Exhausts All Retries

Symptom: The response shows the final query attempt but results are empty or invalid.

Solutions:

  • Increase retryCount (default is 5): "retryCount": 10
  • Increase schemaExamples so the LLM sees more sample data: "schemaExamples": 10
  • Force a fresh schema description: "schemaDescriptionKey": "dbSchemaDescription_v2"
  • Check if your database schema is complex — the LLM may need a more capable model

Note: Even when retries exhaust, the engine returns the best result seen across all attempts (the attempt with the most rows), rather than nothing.

Wrong Table or Column Names in Generated SQL

The LLM generates SQL based on the schema description it creates. If it consistently picks wrong names:

  1. Force schema regeneration by changing schemaDescriptionKey
  2. Increase schemaExamples — more sample rows help the LLM understand table relationships
  3. Check the schema description in Redis: use redis-cli HGET "your-chatID:SQLQueryEngine" dbSchemaDescription to see what the LLM produced
  4. Try a more capable model — larger models produce more accurate schema descriptions

Query Takes Too Long

The engine has multiple LLM calls in its pipeline. Slow responses are usually caused by:

  • Slow LLM server — The biggest factor. Use a faster model or more powerful hardware.
  • Large schema — Many tables means a longer schema description step. The first request in a session is always slower.
  • Complex queries + repair loop — Each retry involves an LLM call. Reduce retryCount if speed matters more than accuracy.

Streaming Issues

No SSE Data Received

Symptom: Streaming requests return nothing or hang.

Check:

  • Redis Pub/Sub is working: redis-cli -h localhost -p 6380 -a YOUR_PASSWORD SUBSCRIBE test-channel
  • The engine is actually running: check container logs during the request
  • The SPLIT_IDENTIFIER env var matches between components (default: <|-/|-/>)

Streaming Stops Mid-Response

This can happen if:

  • The LLM server times out during a long generation
  • Redis Pub/Sub connection drops
  • The engine encounters an unhandled exception

Check docker compose logs sql-query-engine for the error.

Authentication Issues

401 Unauthorized on /v1/* Endpoints

Solutions:

  • Include the Authorization header: Authorization: Bearer YOUR_KEY
  • Verify the key matches one of the comma-separated values in OPENAI_API_KEY
  • If you don't need auth, set OPENAI_API_KEY to empty string or remove it entirely

Native /inference/* Endpoints Don't Require Auth

This is by design. The native endpoints rely on network-level security. If you need auth on these endpoints, put a reverse proxy (nginx, Caddy, Traefik) in front of the engine.

Docker-Specific Issues

Container Can't Reach Host Network Services

If your PostgreSQL or LLM server runs on the host machine (not in Docker):

# In docker-compose.yml, use host.docker.internal (macOS/Windows)
- POSTGRES_HOST=host.docker.internal
- LLM_BASE_URL=http://host.docker.internal:11434/v1

# On Linux, add to the service definition:
extra_hosts:
  - "host.docker.internal:host-gateway"

Redis Data Persistence

The Docker Compose setup uses a named volume (sqlEngineRedis) with append-only mode enabled, so Redis data persists across container restarts. To completely clear session data, remove the volume: docker volume rm sqlqueryengine_sqlEngineRedis.

Debugging Tips

View Raw Redis Session Data

# Connect to Redis
redis-cli -h localhost -p 6380 -a sqlPass

# List all keys for a session
HKEYS "your-chatID:SQLQueryEngine"

# View the cached schema description
HGET "your-chatID:SQLQueryEngine" "dbSchemaDescription"

# View the last repair loop history
HGET "your-chatID:SQLQueryEngine" "validatorChat:1"

# Watch Pub/Sub messages in real time
SUBSCRIBE "your-chatID"

Enable Verbose Logging

Check the FastAPI/Uvicorn logs:

docker compose logs -f sql-query-engine

Test Individual Components

Use the Stage 1 and Stage 2 endpoints separately to isolate issues:

# Test just SQL generation (no execution)
curl -s -X POST "http://localhost:5181/inference/sqlQueryGeneration/debug-session" \
  -H "Content-Type: application/json" \
  -d '{"basePrompt": "List all tables"}' | jq .

# Test just query execution (provide your own SQL)
curl -s -X POST "http://localhost:5181/inference/sqlQueryEvaluation/debug-session" \
  -H "Content-Type: application/json" \
  -d '{
    "basePrompt": "List tables",
    "baseQuery": "SELECT table_name FROM information_schema.tables WHERE table_schema = '\''public'\''",
    "baseDescription": "List all public tables"
  }' | jq .

Check the OpenAPI Schema

curl -s http://localhost:5181/openapi.json | jq '.paths | keys'

BIRD Evaluation Issues

SQLite Database Files Not Found

Symptom: bird-runner logs show FileNotFoundError or No SQLite file found for db_id.

Solution: Download the BIRD dataset and place the SQLite files at:

evaluation/bird/bird_data/dev_databases/{db_id}/{db_id}.sqlite

The runner expects one .sqlite file per database directory. See BIRD Evaluation for download links.

High Conversion Error Rate

Symptom: More than 15% of questions are excluded due to gold SQL conversion errors.

This is expected for certain databases (thrombosis_prediction, california_schools) that rely heavily on SQLite-specific date functions. The conversion pipeline handles 14 dialect transformations, but some complex SQLite idioms don't have clean PostgreSQL equivalents.

BIRD Evaluation Runs Very Slowly

BIRD has 11 databases. Each requires a schema warmup on first access (one sequential LLM call per database). After warmup, questions run in parallel.

Expected timelines:

  • Schema warmup: ~5-15 minutes (11 databases × 1-2 min each)
  • Per config (500 questions): ~30-90 minutes (depends on LLM speed and EVAL_MAX_WORKERS)
  • Full run (3 configs): ~2-4 hours

Solutions:

  • Increase EVAL_MAX_WORKERS if your LLM provider allows higher throughput
  • Use a faster LLM endpoint (Groq is typically fastest for open models)
  • Reduce TIMEOUT_SECONDS if you want to skip very slow questions

PostgreSQL Out of Memory (BIRD)

Symptom: bird-postgres container crashes or FATAL: could not resize shared memory segment.

Solution: The BIRD Docker Compose uses shm_size: '2g'. If you're running on a memory-constrained machine, reduce it to 1g, though some large BIRD databases may struggle.

FAQ

Q: Can the engine modify my database? No. All PostgreSQL connections are forced to read-only mode via conn.set_read_only(True). INSERT, UPDATE, DELETE, and DDL statements will be rejected by PostgreSQL itself.

Q: What happens if the LLM generates a destructive query? It will fail at execution time because the connection is read-only. The repair loop will then try to generate a valid SELECT query instead.

Q: Can I use multiple databases? Not simultaneously in a single request. However, you can use query parameter overrides to point different requests at different databases. Each chatID caches schema independently, so switching databases requires a new session ID.

Q: How do I clear a cached schema? Either use a different schemaDescriptionKey (e.g., "dbSchemaDescription_v2") or delete the Redis key manually: HDEL "your-chatID:SQLQueryEngine" "dbSchemaDescription"

Q: What models work best? Code-focused models (Qwen Coder, CodeLlama, DeepSeek Coder, GPT-4) tend to produce better SQL. The engine uses multi-strategy response parsing (JSON → code blocks → regex extraction) so it works with any model that can generate SQL — no structured output or function calling support required. In our benchmarks, Llama 4 Scout 17B achieved the best self-healing performance (58.7% accuracy, +10.7pp delta, zero regressions).

Q: Is there a query result size limit? Yes. Results are capped at 50 rows (hardLimit in the evaluator). This prevents memory exhaustion on large result sets.

Q: Can I run the BIRD evaluation without Docker? Yes, but you'll need to set up PostgreSQL and Redis manually, configure all environment variables, and run the BIRD entrypoint script directly. Docker is strongly recommended because the BIRD pipeline involves creating up to 11 PostgreSQL databases at runtime.

Q: Why are some BIRD questions excluded from results? About 12.6% of BIRD mini-dev questions have gold SQL that can't be cleanly converted from SQLite to PostgreSQL dialect. These are excluded from accuracy calculations but reported in conversion_report.json. Common causes: SQLite-specific date functions (JULIANDAY), unquoted multi-word column names, and GROUP BY strictness differences.

SQL Query Engine

Design

Setup

API

Internals

Evaluation

Help

Clone this wiki locally