-
Notifications
You must be signed in to change notification settings - Fork 1
Troubleshooting
Common issues, debugging tips, and solutions for SQL Query Engine.
Symptom: docker compose up fails or the container exits immediately.
Check these in order:
-
Port conflicts — Ensure ports 5181, 5182, and 6380 aren't already in use:
lsof -i :5181 -
Docker resources — The LLM model server needs significant memory. Check
docker stats. -
Environment variables — Verify all required env vars are set in
docker-compose.yml
Cause: The FastAPI server hasn't started yet or the container isn't running.
# Check if the container is running
docker compose ps
# Check container logs
docker compose logs sql-query-engineSymptom: Errors mentioning redis.exceptions.ConnectionError in logs.
Solutions:
- Verify Redis is running:
docker compose ps redisqueryengine - Check Redis password matches: the
REDIS_PASSWORDenv var must match the password in theredisservicecommandsection - If running locally, ensure Redis is accessible at the configured host/port
Symptom: Errors mentioning psycopg.OperationalError or connection timeouts.
Solutions:
- Verify the PostgreSQL server is reachable from the engine container
- Check credentials:
POSTGRES_USERandPOSTGRES_PASSWORDmust have at leastSELECTprivileges - Firewall: ensure the PostgreSQL port is open between the Docker container and the database server
- If your database runs on the host machine, use
host.docker.internal(see Docker Issues below)
Symptom: Requests hang or timeout with no response.
Solutions:
- Verify
LLM_BASE_URLpoints to a running server:curl -s YOUR_LLM_URL/v1/models - Check that
LLM_MODELmatches a model actually loaded on the server - If using Ollama, ensure the model is pulled:
ollama list - Increase timeout if the model is slow to respond (especially on first load)
The engine treats empty result sets as failures and enters the repair loop. If the repair loop also produces empty results:
- The query might be correct but the data simply doesn't exist
- Check the original question — it may be asking about data not present in your database
- Increase
retryCountto give the LLM more attempts - Try rephrasing the
basePrompt
Symptom: The response shows the final query attempt but results are empty or invalid.
Solutions:
- Increase
retryCount(default is 5):"retryCount": 10 - Increase
schemaExamplesso the LLM sees more sample data:"schemaExamples": 10 - Force a fresh schema description:
"schemaDescriptionKey": "dbSchemaDescription_v2" - Check if your database schema is complex — the LLM may need a more capable model
Note: Even when retries exhaust, the engine returns the best result seen across all attempts (the attempt with the most rows), rather than nothing.
The LLM generates SQL based on the schema description it creates. If it consistently picks wrong names:
-
Force schema regeneration by changing
schemaDescriptionKey -
Increase
schemaExamples— more sample rows help the LLM understand table relationships -
Check the schema description in Redis: use
redis-cli HGET "your-chatID:SQLQueryEngine" dbSchemaDescriptionto see what the LLM produced - Try a more capable model — larger models produce more accurate schema descriptions
The engine has multiple LLM calls in its pipeline. Slow responses are usually caused by:
- Slow LLM server — The biggest factor. Use a faster model or more powerful hardware.
- Large schema — Many tables means a longer schema description step. The first request in a session is always slower.
-
Complex queries + repair loop — Each retry involves an LLM call. Reduce
retryCountif speed matters more than accuracy.
Symptom: Streaming requests return nothing or hang.
Check:
- Redis Pub/Sub is working:
redis-cli -h localhost -p 6380 -a YOUR_PASSWORD SUBSCRIBE test-channel - The engine is actually running: check container logs during the request
- The
SPLIT_IDENTIFIERenv var matches between components (default:<|-/|-/>)
This can happen if:
- The LLM server times out during a long generation
- Redis Pub/Sub connection drops
- The engine encounters an unhandled exception
Check docker compose logs sql-query-engine for the error.
Solutions:
- Include the Authorization header:
Authorization: Bearer YOUR_KEY - Verify the key matches one of the comma-separated values in
OPENAI_API_KEY - If you don't need auth, set
OPENAI_API_KEYto empty string or remove it entirely
This is by design. The native endpoints rely on network-level security. If you need auth on these endpoints, put a reverse proxy (nginx, Caddy, Traefik) in front of the engine.
If your PostgreSQL or LLM server runs on the host machine (not in Docker):
# In docker-compose.yml, use host.docker.internal (macOS/Windows)
- POSTGRES_HOST=host.docker.internal
- LLM_BASE_URL=http://host.docker.internal:11434/v1
# On Linux, add to the service definition:
extra_hosts:
- "host.docker.internal:host-gateway"The Docker Compose setup uses a named volume (sqlEngineRedis) with append-only mode enabled, so Redis data persists across container restarts. To completely clear session data, remove the volume: docker volume rm sqlqueryengine_sqlEngineRedis.
# Connect to Redis
redis-cli -h localhost -p 6380 -a sqlPass
# List all keys for a session
HKEYS "your-chatID:SQLQueryEngine"
# View the cached schema description
HGET "your-chatID:SQLQueryEngine" "dbSchemaDescription"
# View the last repair loop history
HGET "your-chatID:SQLQueryEngine" "validatorChat:1"
# Watch Pub/Sub messages in real time
SUBSCRIBE "your-chatID"Check the FastAPI/Uvicorn logs:
docker compose logs -f sql-query-engineUse the Stage 1 and Stage 2 endpoints separately to isolate issues:
# Test just SQL generation (no execution)
curl -s -X POST "http://localhost:5181/inference/sqlQueryGeneration/debug-session" \
-H "Content-Type: application/json" \
-d '{"basePrompt": "List all tables"}' | jq .
# Test just query execution (provide your own SQL)
curl -s -X POST "http://localhost:5181/inference/sqlQueryEvaluation/debug-session" \
-H "Content-Type: application/json" \
-d '{
"basePrompt": "List tables",
"baseQuery": "SELECT table_name FROM information_schema.tables WHERE table_schema = '\''public'\''",
"baseDescription": "List all public tables"
}' | jq .curl -s http://localhost:5181/openapi.json | jq '.paths | keys'Symptom: bird-runner logs show FileNotFoundError or No SQLite file found for db_id.
Solution: Download the BIRD dataset and place the SQLite files at:
evaluation/bird/bird_data/dev_databases/{db_id}/{db_id}.sqlite
The runner expects one .sqlite file per database directory. See BIRD Evaluation for download links.
Symptom: More than 15% of questions are excluded due to gold SQL conversion errors.
This is expected for certain databases (thrombosis_prediction, california_schools) that rely heavily on SQLite-specific date functions. The conversion pipeline handles 14 dialect transformations, but some complex SQLite idioms don't have clean PostgreSQL equivalents.
BIRD has 11 databases. Each requires a schema warmup on first access (one sequential LLM call per database). After warmup, questions run in parallel.
Expected timelines:
- Schema warmup: ~5-15 minutes (11 databases × 1-2 min each)
- Per config (500 questions): ~30-90 minutes (depends on LLM speed and
EVAL_MAX_WORKERS) - Full run (3 configs): ~2-4 hours
Solutions:
- Increase
EVAL_MAX_WORKERSif your LLM provider allows higher throughput - Use a faster LLM endpoint (Groq is typically fastest for open models)
- Reduce
TIMEOUT_SECONDSif you want to skip very slow questions
Symptom: bird-postgres container crashes or FATAL: could not resize shared memory segment.
Solution: The BIRD Docker Compose uses shm_size: '2g'. If you're running on a memory-constrained machine, reduce it to 1g, though some large BIRD databases may struggle.
Q: Can the engine modify my database?
No. All PostgreSQL connections are forced to read-only mode via conn.set_read_only(True). INSERT, UPDATE, DELETE, and DDL statements will be rejected by PostgreSQL itself.
Q: What happens if the LLM generates a destructive query? It will fail at execution time because the connection is read-only. The repair loop will then try to generate a valid SELECT query instead.
Q: Can I use multiple databases?
Not simultaneously in a single request. However, you can use query parameter overrides to point different requests at different databases. Each chatID caches schema independently, so switching databases requires a new session ID.
Q: How do I clear a cached schema?
Either use a different schemaDescriptionKey (e.g., "dbSchemaDescription_v2") or delete the Redis key manually: HDEL "your-chatID:SQLQueryEngine" "dbSchemaDescription"
Q: What models work best? Code-focused models (Qwen Coder, CodeLlama, DeepSeek Coder, GPT-4) tend to produce better SQL. The engine uses multi-strategy response parsing (JSON → code blocks → regex extraction) so it works with any model that can generate SQL — no structured output or function calling support required. In our benchmarks, Llama 4 Scout 17B achieved the best self-healing performance (58.7% accuracy, +10.7pp delta, zero regressions).
Q: Is there a query result size limit?
Yes. Results are capped at 50 rows (hardLimit in the evaluator). This prevents memory exhaustion on large result sets.
Q: Can I run the BIRD evaluation without Docker? Yes, but you'll need to set up PostgreSQL and Redis manually, configure all environment variables, and run the BIRD entrypoint script directly. Docker is strongly recommended because the BIRD pipeline involves creating up to 11 PostgreSQL databases at runtime.
Q: Why are some BIRD questions excluded from results?
About 12.6% of BIRD mini-dev questions have gold SQL that can't be cleanly converted from SQLite to PostgreSQL dialect. These are excluded from accuracy calculations but reported in conversion_report.json. Common causes: SQLite-specific date functions (JULIANDAY), unquoted multi-word column names, and GROUP BY strictness differences.
📄 Paper: arXiv:2604.16511 | 📊 Dataset: Hugging Face | 💻 Source: GitHub
SQL Query Engine
Design
Setup
API
Internals
Evaluation
Help