A self-hosted stack for agentic analytics — your chat, your models, your data warehouse. Powered by MCP Toolbox for Databases, LibreChat, and Langfuse.
This project runs a fully self-hosted agentic analytics environment with Docker Compose. It connects a chat UI (LibreChat) to your data warehouse via MCP, with full LLM observability (Langfuse) — all brought up by a single `docker compose up` command.
Supported warehouses: BigQuery, Snowflake, and ClickHouse — configure one or more in `tools.yaml`.
| Component | Purpose | Port |
|---|---|---|
| LibreChat | Chat UI with multi-model support (OpenAI, Anthropic, Google) | 3080 |
| MCP Toolbox | Warehouse-agnostic MCP server (BigQuery, Snowflake, ClickHouse) | 5050 |
| Langfuse | LLM observability — traces, cost tracking, evals, prompt management | 3000 |
| ClickHouse | Analytical database (used internally by Langfuse) | 8123 |
| PostgreSQL | Transactional database for Langfuse | 5432 |
| MongoDB | Transactional database for LibreChat | 27017 |
| MinIO | S3-compatible object storage | 9090 |
| Redis | Caching and queue | 6379 |
| Meilisearch | Full-text search for LibreChat | 7700 |
| pgvector | Vector database for RAG | 5433 |
| RAG API | Retrieval-augmented generation for file uploads | 8001 |
- Docker and Docker Compose v2+
- Credentials for at least one data warehouse (BigQuery, Snowflake, or ClickHouse)
- An API key for at least one LLM provider (OpenAI, Anthropic, or Google)
```shell
./scripts/prepare-demo.sh
```

This generates a `.env` file with random credentials for all services, then presents an interactive menu to configure API keys for OpenAI, Anthropic, and/or Google. Any providers you skip remain set to `user_provided`, letting users enter their own keys in the LibreChat UI.
You can also generate credentials separately and customize the admin account:
```shell
USER_EMAIL="you@example.com" USER_PASSWORD="supersecret" USER_NAME="YourName" ./scripts/generate-env.sh
```

Edit `tools.yaml` to uncomment and configure the section for your warehouse. Each warehouse section has a source (connection details) and a tool (what the agent can do).
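For illustration, here is a rough sketch of how a script like `generate-env.sh` can derive random secrets. The variable names below are examples, not necessarily the ones the real script emits; see `.env.example` for the actual set.

```shell
# Illustrative only: generate a demo env file with random secrets.
# Variable names are examples; the real script's output may differ.
set -eu
secret() { openssl rand -hex 16; }   # 32 hex characters

cat > .env.demo <<EOF
POSTGRES_PASSWORD=$(secret)
MINIO_ROOT_PASSWORD=$(secret)
USER_EMAIL=${USER_EMAIL:-admin@example.com}
USER_PASSWORD=${USER_PASSWORD:-$(secret)}
EOF
```

Exporting `USER_EMAIL`/`USER_PASSWORD` before running overrides the defaults, which mirrors how the real script accepts them on the command line.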
BigQuery:

```yaml
sources:
  bigquery:
    kind: bigquery
    project: your-gcp-project-id
    location: US

tools:
  query-bigquery:
    kind: bigquery-execute-sql
    source: bigquery
    description: "Execute a SQL query against BigQuery using GoogleSQL syntax."
```

Then configure authentication — see BigQuery Authentication below.
Snowflake:

```yaml
sources:
  snowflake:
    kind: snowflake
    account: your-account.us-east-1
    user: ${SNOWFLAKE_USER}
    password: ${SNOWFLAKE_PASSWORD}
    database: YOUR_DATABASE
    schema: PUBLIC
    warehouse: COMPUTE_WH

tools:
  query-snowflake:
    kind: snowflake-sql
    source: snowflake
    description: "Execute a SQL query against Snowflake."
    statement: "{{.sql}}"
    parameters:
      - name: sql
        type: string
        description: "The SQL query to execute"
```

Then set `SNOWFLAKE_USER` and `SNOWFLAKE_PASSWORD` in your `.env` file.
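The `statement` field uses Go template syntax: at call time, Toolbox renders `{{.sql}}` with the value of the `sql` parameter the agent supplies. A rough illustration of the substitution (a toy stand-in, not Toolbox's actual renderer):

```shell
python3 - <<'EOF'
# Toy stand-in for Go template rendering: replace {{.sql}} with the parameter value.
statement = "{{.sql}}"
params = {"sql": "SELECT CURRENT_DATE"}
print(statement.replace("{{.sql}}", params["sql"]))
EOF
```

So with `statement: "{{.sql}}"` the agent's query passes through verbatim; a more restrictive statement could wrap the parameter in a fixed query shape instead.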
ClickHouse (external):

```yaml
sources:
  clickhouse:
    kind: clickhouse
    host: your-clickhouse-host
    protocol: http
    port: 8123
    user: ${TOOLBOX_CLICKHOUSE_USER}
    password: ${TOOLBOX_CLICKHOUSE_PASSWORD}
    database: default

tools:
  query-clickhouse:
    kind: clickhouse-sql
    source: clickhouse
    description: "Execute a SQL query against ClickHouse."
    statement: "{{.sql}}"
    parameters:
      - name: sql
        type: string
        description: "The SQL query to execute"
```

Then set `TOOLBOX_CLICKHOUSE_USER` and `TOOLBOX_CLICKHOUSE_PASSWORD` in your `.env` file.
For TLS-enabled ClickHouse deployments, use `protocol: https` and `port: 8443`.
You can configure multiple warehouses at once — just include multiple sources and tools in `tools.yaml`.
For the full list of supported databases and configuration options, see the MCP Toolbox documentation.
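One way to sanity-check your edits before starting the stack: a sketch that assumes Python 3 with PyYAML available. It validates a self-contained example file here, but the same check works against your real `tools.yaml`:

```shell
# Write a minimal example config, then verify every tool references a defined source.
cat > tools.example.yaml <<'EOF'
sources:
  clickhouse:
    kind: clickhouse
    host: your-clickhouse-host
    port: 8123
tools:
  query-clickhouse:
    kind: clickhouse-sql
    source: clickhouse
EOF

python3 - <<'EOF'
import yaml

cfg = yaml.safe_load(open("tools.example.yaml"))
sources = set(cfg.get("sources", {}))
for name, tool in cfg.get("tools", {}).items():
    assert tool["source"] in sources, f"{name} references undefined source {tool['source']}"
print("tools.example.yaml OK")
EOF
```

A typo in a `source:` reference would otherwise only surface when the Toolbox container starts.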
```shell
docker compose up -d
```

| Service | URL | Credentials |
|---|---|---|
| LibreChat | http://localhost:3080 | From `.env` (`USER_EMAIL` / `USER_PASSWORD`) |
| Langfuse | http://localhost:3000 | From `.env` (`LANGFUSE_INIT_USER_EMAIL` / `LANGFUSE_INIT_USER_PASSWORD`) |
| MinIO Console | http://localhost:9091 | From `.env` (`MINIO_ROOT_USER` / `MINIO_ROOT_PASSWORD`) |
An admin user is created automatically on first startup using the credentials from your .env file.
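Once the containers report healthy, a quick smoke test of the ports from the table above (a sketch; adjust if you changed the port mappings):

```shell
# Probe each service's HTTP port; reports up/down rather than failing hard.
check() { curl -fsS -o /dev/null "http://localhost:$1" 2>/dev/null && echo "port $1 up" || echo "port $1 down"; }
check 3080   # LibreChat
check 3000   # Langfuse
check 5050   # MCP Toolbox
```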
- Open LibreChat at http://localhost:3080
- Click Create New Agent in the sidebar
- Select a provider and model (e.g., Google / gemini-2.0-flash)
- Open MCP Settings and verify the `data-warehouse` server is connected
- Save the agent and start chatting — ask it to query your data
All agent interactions are automatically traced in Langfuse. Open http://localhost:3000 to see traces, token usage, cost, and latency for every conversation.
Two authentication methods are supported:
Service account key (recommended for production):
- Uncomment the credentials volume mount in `toolbox-mcp-compose.yml`
- Uncomment `GOOGLE_APPLICATION_CREDENTIALS` in the environment section
- Set `GCP_CREDENTIALS_FILE` in `.env` to your service account JSON path, for example `./secrets/gcp-service-account.json`
- Keep the JSON key outside the repository or in `./secrets/` (gitignored by default)
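Before wiring the key into the compose file, you can sanity-check it locally. A sketch; the default path mirrors the `./secrets/` example above:

```shell
# Verify the file at GCP_CREDENTIALS_FILE looks like a service account key.
KEY_CHECK=$(python3 - <<'EOF'
import json, os

path = os.environ.get("GCP_CREDENTIALS_FILE", "./secrets/gcp-service-account.json")
if not os.path.exists(path):
    print(f"no key file at {path}")
else:
    info = json.load(open(path))
    if info.get("type") == "service_account":
        print("valid key for " + info.get("client_email", "?"))
    else:
        print("not a service account key")
EOF
)
echo "$KEY_CHECK"
```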
Application Default Credentials (convenient for local dev):
Create a `docker-compose.override.yml` (gitignored) to mount your local ADC:
```yaml
services:
  toolbox-mcp:
    command: ["--tools-file", "/app/tools.yaml", "--address", "0.0.0.0", "--port", "5000"]
    volumes:
      - type: bind
        source: ./tools.yaml
        target: /app/tools.yaml
        read_only: true
      - type: bind
        source: ~/.config/gcloud/application_default_credentials.json
        target: /app/credentials.json
        read_only: true
    environment:
      GOOGLE_APPLICATION_CREDENTIALS: /app/credentials.json
```

Make sure you have valid ADC credentials:

```shell
gcloud auth application-default login
```
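To confirm the file the override mounts actually exists (a sketch; the path below is gcloud's default on Linux/macOS):

```shell
# Check for Application Default Credentials at gcloud's default location.
ADC="$HOME/.config/gcloud/application_default_credentials.json"
if [ -f "$ADC" ]; then
  echo "ADC found at $ADC"
else
  echo "no ADC file; run: gcloud auth application-default login"
fi
```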
Set these in your `.env` file:

```env
SNOWFLAKE_USER=your_user
SNOWFLAKE_PASSWORD=your_password
```
Set these in your `.env` file:

```env
TOOLBOX_CLICKHOUSE_USER=your_user
TOOLBOX_CLICKHOUSE_PASSWORD=your_password
```
LibreChat connects to your data warehouse through MCP Toolbox, allowing AI agents to query and analyze your data using natural language. All LLM interactions are traced in Langfuse for observability, cost tracking, and evaluation.
| File | Purpose |
|---|---|
| `tools.yaml` | Data warehouse connections and MCP tools |
| `librechat.yaml` | LLM endpoints, MCP servers, and agent capabilities |
| `.env` | All credentials and service configuration (see `.env.example`) |
| `docker-compose.yml` | Includes the three compose files below |
| `langfuse-compose.yml` | Langfuse, ClickHouse, PostgreSQL, Redis, MinIO |
| `toolbox-mcp-compose.yml` | MCP Toolbox for Databases |
| `librechat-compose.yml` | LibreChat, MongoDB, Meilisearch, pgvector, RAG API |
Local overrides: Create `docker-compose.override.yml` for machine-specific config (gitignored by default). You can also mount a gitignored `tools.local.yaml` from that override if you want per-machine MCP tool config.
| Script | Description |
|---|---|
| `scripts/prepare-demo.sh` | Generate `.env` and interactively configure API keys |
| `scripts/generate-env.sh` | Generate `.env` with random credentials |
| `scripts/reset-all.sh` | Stop all containers and wipe all data/volumes |
| `scripts/create-librechat-user.sh` | Manually create a LibreChat admin user |
| `scripts/init-librechat-user.sh` | Auto-init user on container startup (used internally) |
To tear down all containers and delete all data:
```shell
./scripts/reset-all.sh
```

Then set up again and start fresh:

```shell
./scripts/prepare-demo.sh
docker compose up -d
```

Port 5050 conflict: If port 5050 is already in use, change the host mapping in `toolbox-mcp-compose.yml` (for example, `127.0.0.1:5051:5000`) and keep `librechat.yaml` pointed at http://toolbox-mcp:5000/mcp.
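To see whether something is already listening on 5050, a portable sketch using Python's socket module (avoids relying on `lsof` or `netstat` being installed):

```shell
PORT_STATUS=$(python3 - <<'EOF'
import socket

# A successful connect means some process is already listening on the port.
s = socket.socket()
in_use = s.connect_ex(("127.0.0.1", 5050)) == 0
s.close()
print("port 5050 in use" if in_use else "port 5050 free")
EOF
)
echo "$PORT_STATUS"
```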
"No key found" in LibreChat: You need to configure an LLM API key. Either set it in .env (e.g., GOOGLE_KEY=your-key) and restart LibreChat, or run ./scripts/prepare-demo.sh to set keys interactively.
MCP server not showing in agent config: Check that LibreChat can reach the Toolbox container. Run docker logs <toolbox-mcp-container> to confirm Toolbox initialized 1+ tools, and docker logs <librechat-container> for MCP client initialization messages.
Note: To use LibreChat's file search / RAG features, the RAG API needs a real API key for embeddings — `user_provided` won't work because the RAG API calls the embeddings endpoint directly. If `OPENAI_API_KEY` is set to `user_provided`, set `RAG_OPENAI_API_KEY` to a valid OpenAI key (it overrides `OPENAI_API_KEY` for RAG only). You can also switch embedding providers via `EMBEDDINGS_PROVIDER` (`openai`, `azure`, `huggingface`, `huggingfacetei`, `ollama`). See the RAG API docs for details.
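For example, a `.env` fragment for that setup (values illustrative):

```env
# Chat users supply their own OpenAI keys in the UI...
OPENAI_API_KEY=user_provided
# ...but the RAG API needs a real key for embedding calls.
RAG_OPENAI_API_KEY=sk-your-real-openai-key
EMBEDDINGS_PROVIDER=openai
```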
- MCP Toolbox for Databases — Warehouse-agnostic MCP server
- LibreChat — Chat UI
- Langfuse — LLM observability
- LibreChat Documentation — Full LibreChat configuration guide
