A Model Context Protocol (MCP) server that lets AI assistants query any PostgreSQL database using natural language. Ask questions in plain English, get SQL and results back.
It learns over time — every query is logged with embeddings so similar future questions get better answers. Optionally connects to a Neo4j knowledge graph to store business term definitions and corrections.
When you ask a question like "How many users signed up last month?", this is what happens:
- Context Gathering (parallel) — Searches the knowledge graph for domain terms, finds similar past queries from memory, and lists all available tables.
- Table Selection — Fast heuristic picks which tables are relevant based on keywords in your question, past query patterns, and knowledge graph facts. No LLM call needed.
- Schema Discovery (parallel) — For each candidate table, discovers columns, types, foreign keys, sample rows, distinct values, and NULL rates from `pg_stats`. All results are cached for 5 minutes.
- SQL Generation — Sends the question + full schema context + domain knowledge + proven query patterns to OpenAI, which generates a PostgreSQL SELECT query.
- Validation & Execution — Validates the SQL is read-only (no INSERT/UPDATE/DELETE/DROP), runs EXPLAIN, then executes.
- Memory Logging — Logs the question, SQL, tables used, and result count with an embedding vector for future similarity search.
- Retry on Error — If the SQL fails (wrong column name, type mismatch, etc.), it discovers any missing tables mentioned in the error and retries up to 3 times.
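The read-only validation in the pipeline above can be sketched as a keyword blocklist plus a statement-shape check. This is a simplified, hypothetical version, not the server's actual validator:

```python
import re

# Simplified sketch of a read-only SQL check (hypothetical; the real
# validator may be stricter, e.g. using a proper SQL parser).
FORBIDDEN = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|TRUNCATE|CREATE|GRANT|REVOKE)\b",
    re.IGNORECASE,
)

def is_read_only(sql: str) -> bool:
    """Accept only a single SELECT (or WITH ... SELECT) statement."""
    stripped = sql.strip().rstrip(";").strip()
    if ";" in stripped:  # reject multi-statement payloads
        return False
    if FORBIDDEN.search(stripped):  # note: also matches keywords inside string literals
        return False
    return stripped.upper().startswith(("SELECT", "WITH"))
```

A query that passes this check still goes through `EXPLAIN` before execution, which catches wrong column names and type errors early.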
Every successful query is stored in mcp_query_memory with an OpenAI embedding. When a new question comes in, the server finds the most similar past queries and includes their SQL as proven patterns for the LLM. This means repeated or similar questions get faster, more accurate answers over time.
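The similarity lookup can be sketched as a cosine-similarity ranking over stored embeddings — a hypothetical illustration; in practice the vectors come from the OpenAI embeddings API and the rows from `mcp_query_memory`:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k_similar(query_vec, memory, k=30):
    """Rank past queries by similarity to the new question.

    memory: list of (question, sql, embedding) tuples.
    Returns the k best (score, question, sql) triples, best first.
    """
    scored = [
        (cosine(query_vec, emb), question, sql)
        for question, sql, emb in memory
    ]
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored[:k]
```

The default `k` mirrors `RAG_MEMORY_TOP_K`; the winning queries' SQL is then included in the LLM prompt as proven patterns.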
If you connect Neo4j, you can:
- Teach terms — Tell the system that "contractor" means the `candidate` table, or that "active" means `status = 'Active'`.
- Correct queries — Record what was wrong about a past query so the system learns from mistakes.
- Seed knowledge — Pre-load domain definitions via `seed_knowledge.json`.
The knowledge graph is optional. Without it, `teach_term` and `correct_query` return errors, but `query_database` and `search_context` still work using query memory.
| Tool | Description |
|---|---|
| `query_database` | Ask a question in natural language, get SQL + results |
| `teach_term` | Teach a business term mapping (requires Neo4j) |
| `correct_query` | Record a correction for a past query (requires Neo4j) |
| `search_context` | Look up what business terms mean in the knowledge graph and query memory |
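Over the wire, MCP tools are invoked as standard JSON-RPC `tools/call` requests. A `query_database` call might look like the following (the `question` argument name is illustrative, not confirmed from the server's schema):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "query_database",
    "arguments": {
      "question": "How many users signed up last month?"
    }
  }
}
```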
- Python 3.11+
- PostgreSQL database (any provider — local, AWS RDS, Azure, etc.)
- OpenAI API key (for embeddings and SQL generation)
```bash
git clone https://github.com/JeffreyBennet/mcp-sql-server.git
cd mcp-sql-server
chmod +x install.sh
./install.sh
```

Or manually:

```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Copy the example and fill in your credentials:

```bash
cp .env.example .env
```

Edit `.env`:
```ini
# Required
POSTGRES_HOST=your-host.com
POSTGRES_PORT=5432
POSTGRES_DB=your_database
POSTGRES_USER=your_user
POSTGRES_PASSWORD=your_password
OPENAI_API_KEY=sk-your-openai-key

# Optional
ANTHROPIC_API_KEY=sk-ant-your-key
EMBEDDING_MODEL=text-embedding-3-small
LOG_LEVEL=INFO
```

Edit `tables.conf` to list only the tables you want the AI to access, one table per line. Leave the file empty or delete it to allow all tables.

```
users
orders
products
```
Edit `seed_knowledge.json` to pre-load domain definitions:

```json
[
  ["users", "user = account = member. Table: users."],
  ["orders", "order = purchase = transaction. Table: orders. Links to users via user_id."],
  ["products", "product = item = SKU. Table: products."]
]
```

Each entry is `[name, description]`. These entries are loaded into the Neo4j knowledge graph on first startup.
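As a sketch, parsing and validating these entries could look like the following (hypothetical; the server's actual loader may differ):

```python
import json

def load_seed_entries(text):
    """Parse seed_knowledge.json content into (name, description) pairs."""
    entries = json.loads(text)
    pairs = []
    for entry in entries:
        # Each entry must be a two-element [name, description] list.
        if not (isinstance(entry, list) and len(entry) == 2):
            raise ValueError(f"expected [name, description], got {entry!r}")
        name, description = entry
        pairs.append((str(name), str(description)))
    return pairs
```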
Set `MCP_INSTRUCTIONS` in `.env` to customize what the AI sees:

```ini
MCP_INSTRUCTIONS=You are a database assistant for our e-commerce platform. Use query_database for all data questions. Key tables: users, orders, products.
```

Add to your Claude Desktop config file:

- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`
```json
{
  "mcpServers": {
    "sql-server": {
      "command": "/path/to/mcp-sql-server/venv/bin/python",
      "args": ["/path/to/mcp-sql-server/server.py"],
      "env": {
        "POSTGRES_HOST": "your-host.com",
        "POSTGRES_PORT": "5432",
        "POSTGRES_DB": "your_database",
        "POSTGRES_USER": "your_user",
        "POSTGRES_PASSWORD": "your_password",
        "OPENAI_API_KEY": "sk-your-key",
        "ANTHROPIC_API_KEY": "sk-ant-your-key"
      }
    }
  }
}
```

Replace `/path/to/mcp-sql-server` with the actual path (e.g., `/Users/you/mcp-sql-server`).

You can either put credentials in the `env` block (shown above) or rely on the `.env` file in the project directory.
Add to your Claude Code MCP settings:

- Project-level: `.claude/settings.json` in your project root
- User-level: `~/.claude/settings.json`

```json
{
  "mcpServers": {
    "sql-server": {
      "command": "/path/to/mcp-sql-server/venv/bin/python",
      "args": ["/path/to/mcp-sql-server/server.py"]
    }
  }
}
```

Or use the Claude Code CLI:

```bash
claude mcp add sql-server /path/to/mcp-sql-server/venv/bin/python /path/to/mcp-sql-server/server.py
```

Build and run with Docker:

```bash
docker build -t mcp-sql-server .
docker run -p 8000:8000 \
  -e POSTGRES_HOST=your-host \
  -e POSTGRES_PORT=5432 \
  -e POSTGRES_DB=your_db \
  -e POSTGRES_USER=your_user \
  -e POSTGRES_PASSWORD=your_pass \
  -e OPENAI_API_KEY=sk-your-key \
  mcp-sql-server
```

The SSE server exposes:

- `GET /health` — health check
- `GET /sse` — SSE connection for MCP clients
- `POST /messages/` — MCP message handler
| Variable | Required | Default | Description |
|---|---|---|---|
| `POSTGRES_HOST` | Yes | — | PostgreSQL host |
| `POSTGRES_PORT` | Yes | `5432` | PostgreSQL port |
| `POSTGRES_DB` | Yes | — | Database name |
| `POSTGRES_USER` | Yes | — | Database user |
| `POSTGRES_PASSWORD` | Yes | — | Database password |
| `OPENAI_API_KEY` | Yes | — | OpenAI API key (embeddings + SQL generation) |
| `ANTHROPIC_API_KEY` | No | — | Anthropic API key |
| `EMBEDDING_MODEL` | No | `text-embedding-3-small` | OpenAI embedding model |
| `CLAUDE_MODEL` | No | `claude-opus-4-6` | Claude model name |
| `MCP_INSTRUCTIONS` | No | (built-in) | Custom instructions shown to the AI |
| `DEFAULT_TABLES` | No | — | Comma-separated fallback tables |
| `NEO4J_URI` | No | `bolt://localhost:7687` | Neo4j connection URI |
| `NEO4J_USER` | No | `neo4j` | Neo4j username |
| `NEO4J_PASSWORD` | No | `compass2025` | Neo4j password |
| `GRAPHITI_GROUP_ID` | No | `compass` | Knowledge graph group ID |
| `RAG_MEMORY_TOP_K` | No | `30` | Max similar queries to retrieve |
| `EMBEDDING_BACKFILL_LIMIT` | No | `200` | Rows to backfill on startup |
| `LOG_LEVEL` | No | `INFO` | Logging level |
```
mcp-sql-server/
  server.py            # Main MCP server (stdio transport)
  server_sse.py        # HTTP/SSE transport for remote deployment
  requirements.txt     # Python dependencies
  .env.example         # Environment template
  tables.conf          # Table allowlist (empty = allow all)
  seed_knowledge.json  # Pre-loaded knowledge graph entries
  install.sh           # One-click setup script
  Dockerfile           # Container deployment
```
- Only `SELECT` queries are allowed. `INSERT`, `UPDATE`, `DELETE`, `DROP`, and other DDL/DML are blocked.
- SQL is validated before execution — forbidden keywords are rejected, and `EXPLAIN` is run first.
- Use `tables.conf` to restrict which tables the AI can access.
- Database credentials are stored in `.env` (gitignored) or passed via environment variables.
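A minimal sketch of how a `tables.conf` allowlist can be applied (hypothetical; the server's actual loading logic may differ):

```python
from pathlib import Path

def load_allowlist(path="tables.conf"):
    """Return the set of allowed table names, or None to allow all tables."""
    p = Path(path)
    if not p.exists():
        return None  # missing file = allow everything
    names = {
        line.strip()
        for line in p.read_text().splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }
    return names or None  # empty file also allows everything

def is_allowed(table, allowlist):
    """Check one table name against the allowlist (None = no restriction)."""
    return allowlist is None or table in allowlist
```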