
Livepeer Data MCP Server

MCP server for intelligent access to Livepeer's data stack. Ask questions in plain English and get routed to the right data source — ClickHouse for infrastructure/streaming data, PostHog for product analytics.

Setup (Claude Desktop)

Prerequisites

  • Node.js 20+
  • Access to the Livepeer ClickHouse cluster (ask infra team for credentials)
  • Clone of the analytics-dbt repo (dbt models live under dbt/)

1. Clone and build

git clone git@github.com:livepeer/livepeer-data-mcp.git
cd livepeer-data-mcp
npm install
npm run build

2. Get your credentials

ClickHouse — Ask the infra team for read-only credentials to the ClickHouse cluster.

PostHog (optional) — Generate a personal API key at https://us.posthog.com/settings/user-api-keys. The key should start with phx_.
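
Before wiring the key into Claude Desktop, you can sanity-check it. A minimal sketch, assuming the US PostHog cloud instance and its `/api/users/@me/` endpoint (adjust the host if you're on a different region or self-hosting); `looksLikePersonalApiKey` is an illustrative helper, not part of this repo:

```typescript
// Shape check: personal API keys generated in PostHog user settings
// start with "phx_". (Assumption: US cloud host and the /api/users/@me
// endpoint; adjust if your instance differs.)
function looksLikePersonalApiKey(key: string): boolean {
  return key.startsWith("phx_") && key.length > 4;
}

const key = process.env.POSTHOG_API_KEY;
if (key && looksLikePersonalApiKey(key)) {
  // Round-trip the key against the API to confirm it's accepted.
  const res = await fetch("https://us.posthog.com/api/users/@me/", {
    headers: { Authorization: `Bearer ${key}` },
  });
  console.log(res.ok ? "PostHog key accepted" : `Rejected: HTTP ${res.status}`);
}
```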

dbt schemas — Clone the analytics repo that contains our dbt models (under dbt/):

git clone git@github.com:livepeer/analytics-dbt.git

3. Configure Claude Desktop

Open your Claude Desktop config file:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json

Add the livepeer-data server to your mcpServers block. Replace the paths and credentials with your own:

{
  "mcpServers": {
    "livepeer-data": {
      "command": "node",
      "args": [
        "/absolute/path/to/livepeer-data-mcp/build/index.js"
      ],
      "env": {
        "CLICKHOUSE_URL": "https://fdkweodknu.us-east-1.aws.clickhouse.cloud:8443",
        "CLICKHOUSE_DATABASE": "semantic",
        "CLICKHOUSE_USER": "default",
        "CLICKHOUSE_PASSWORD": "<your-clickhouse-password>",
        "DBT_LOCAL_PATH": "/absolute/path/to/analytics-dbt/dbt",
        "DBT_MODELS_GLOB": "models/marts/**/*.yml",
        "CONTEXT_DIR": "/absolute/path/to/livepeer-data-mcp/context",
        "POSTHOG_API_KEY": "<your-posthog-personal-api-key>"
      }
    }
  }
}

4. Restart Claude Desktop

Quit and reopen Claude Desktop. You should see the livepeer-data server in the MCP tools list (hammer icon).

5. Try it out

Ask Claude questions like:

  • "How many visitors did we have yesterday?"
  • "What's the stream startup success rate this week?"
  • "Show me all feature flags"
  • "What datasets do we have for transcoding?"

The server automatically routes your question to the right data source.
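
The router's actual heuristics aren't documented here, but conceptually the routing works like this hypothetical keyword-scoring sketch (names and keyword lists below are illustrative only, not the server's implementation):

```typescript
// Hypothetical sketch of question routing with a confidence score.
// Product-analytics vocabulary → PostHog; infrastructure/streaming
// vocabulary → ClickHouse. Keyword lists are made up for illustration.
type Route = { source: "clickhouse" | "posthog"; confidence: number };

const POSTHOG_HINTS = ["visitor", "funnel", "feature flag", "pageview", "insight"];
const CLICKHOUSE_HINTS = ["stream", "transcod", "orchestrator", "gateway", "segment"];

function routeQuestion(question: string): Route {
  const q = question.toLowerCase();
  const ph = POSTHOG_HINTS.filter((k) => q.includes(k)).length;
  const ch = CLICKHOUSE_HINTS.filter((k) => q.includes(k)).length;
  const total = ph + ch;
  if (total === 0) return { source: "clickhouse", confidence: 0.5 }; // no signal
  return ph >= ch
    ? { source: "posthog", confidence: ph / total }
    : { source: "clickhouse", confidence: ch / total };
}
```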

Setup (Claude Code)

claude mcp add --transport http livepeer-data https://data-mcp.sre.livepeer.technology/mcp \
  --header "CF-Access-Client-Id: <id>" --header "CF-Access-Client-Secret: <secret>"

Available Tools

| Tool | Source | Description |
| --- | --- | --- |
| route_question | — | Routes a question to ClickHouse or PostHog with confidence scoring |
| list_datasets | ClickHouse | Lists all available dbt datasets with descriptions |
| describe_dataset | ClickHouse | Shows full schema, columns, and example queries for a dataset |
| query | ClickHouse | Executes read-only SQL against ClickHouse (max 1000 rows) |
| get_glossary | — | Looks up Livepeer-specific terms (orchestrator, gateway, etc.) |
| posthog_query | PostHog | Asks a question in natural language, returns HogQL results |
| posthog_list_insights | PostHog | Lists saved PostHog insights |
| posthog_get_insight | PostHog | Gets a specific insight by ID |
| posthog_list_dashboards | PostHog | Lists all PostHog dashboards |
| posthog_get_dashboard | PostHog | Gets a specific dashboard by ID |
| posthog_list_feature_flags | PostHog | Lists all feature flags with status |
| posthog_list_errors | PostHog | Lists tracked errors |

PostHog tools only appear when POSTHOG_API_KEY is set.
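
The conditional registration can be sketched as follows. This is illustrative, not the server's actual registration code; the tool names come from the table above:

```typescript
// Sketch: PostHog tools register only when POSTHOG_API_KEY is set.
// availableTools is a hypothetical helper for illustration.
function availableTools(env: Record<string, string | undefined>): string[] {
  const tools = [
    "route_question", "list_datasets", "describe_dataset", "query", "get_glossary",
  ];
  if (env.POSTHOG_API_KEY) {
    tools.push(
      "posthog_query", "posthog_list_insights", "posthog_get_insight",
      "posthog_list_dashboards", "posthog_get_dashboard",
      "posthog_list_feature_flags", "posthog_list_errors",
    );
  }
  return tools;
}
```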

Architecture

MCP Clients (Claude Desktop, Claude Code, Cursor)
    │
    ▼ MCP Protocol (stdio or Streamable HTTP)
┌──────────────────────────────────────────────┐
│         Livepeer Data MCP Server             │
│                                              │
│  route_question → list_datasets →            │
│  describe_dataset → query                    │
│                                              │
│  Context Layer:                              │
│    glossary.yml │ routing-hints.yml │        │
│    examples.yml │ dbt schemas                │
│                                              │
│  ┌──────────────┐   ┌──────────────────────┐ │
│  │  ClickHouse  │   │  PostHog MCP Proxy   │ │
│  │ (direct SQL) │   │ (remote MCP client)  │ │
│  └──────────────┘   └──────────────────────┘ │
└──────────────────────────────────────────────┘

Troubleshooting

"CLICKHOUSE_USER is required" — Make sure all env vars are set in the Claude Desktop config JSON. The server reads from the config, not from a .env file.
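
A minimal sketch of the kind of startup check behind this error (the exact list of variables the server requires is an assumption here; the names match the config example above):

```typescript
// Illustrative startup validation: every required variable must appear in
// the "env" block of claude_desktop_config.json, because the server does
// not read a .env file. The REQUIRED list is an assumption for this sketch.
const REQUIRED = [
  "CLICKHOUSE_URL",
  "CLICKHOUSE_DATABASE",
  "CLICKHOUSE_USER",
  "CLICKHOUSE_PASSWORD",
] as const;

function missingEnvVars(env: Record<string, string | undefined>): string[] {
  // Treat empty/whitespace-only values as missing, not just absent keys.
  return REQUIRED.filter((name) => !env[name]?.trim());
}

const missing = missingEnvVars(process.env);
if (missing.length > 0) {
  console.error(`Missing required env vars: ${missing.join(", ")}`);
}
```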

PostHog tools not showing up — POSTHOG_API_KEY is missing or empty. PostHog tools are optional and only register when the key is set.

"No schema files found" — Check that DBT_LOCAL_PATH points to the dbt/ subdirectory of the analytics-dbt repo, and DBT_MODELS_GLOB is models/marts/**/*.yml. When using DBT_REPO_URL, the glob should be dbt/models/marts/**/*.yml (repo-root relative).

Tools show in Claude but queries fail — The ClickHouse password may have changed, or the cluster may be down. Run npm run start:stdio locally to see startup errors in your terminal.
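
To separate "cluster is down" from "password changed", you can hit ClickHouse's HTTP interface directly. `/ping` and basic-auth queries are standard ClickHouse HTTP endpoints; the env var names match the config above, and `basicAuth` is an illustrative helper:

```typescript
// Connectivity checks for "tools show but queries fail".
function basicAuth(user: string, pass: string): string {
  return "Basic " + Buffer.from(`${user}:${pass}`).toString("base64");
}

const url = process.env.CLICKHOUSE_URL;
if (url) {
  // 1. Reachability: /ping returns "Ok." when the cluster is up.
  const ping = await fetch(`${url}/ping`);
  console.log("ping:", ping.status, (await ping.text()).trim());

  // 2. Credentials: a trivial authenticated query.
  const sel = await fetch(`${url}/?query=${encodeURIComponent("SELECT 1")}`, {
    headers: {
      Authorization: basicAuth(
        process.env.CLICKHOUSE_USER ?? "",
        process.env.CLICKHOUSE_PASSWORD ?? "",
      ),
    },
  });
  console.log("query:", sel.status, (await sel.text()).trim());
}
```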

Hosted Deployment (Infra Team)

The server runs as a Docker container on Kubernetes, deployed via ArgoCD from livepeer/infra-helm-values. Exposed at livepeer-data-mcp.livepeer.technology via Cloudflare tunnel.

Vault Secrets

All connection credentials are stored in Vault at sre/livepeer-data-mcp. To add or change a connection, update the secrets there:

| Vault Key | Description |
| --- | --- |
| CLICKHOUSE_URL | ClickHouse cluster URL (e.g. https://....clickhouse.cloud:8443) |
| CLICKHOUSE_DATABASE | Default database for queries (currently semantic) |
| CLICKHOUSE_USER | ClickHouse username |
| CLICKHOUSE_PASSWORD | ClickHouse password |
| GITHUB_TOKEN | GitHub PAT with read access to livepeer/analytics-dbt (private repo, cloned at startup for dbt schemas) |
| POSTHOG_API_KEY | PostHog personal API key (phx_...). Omit to disable PostHog tools. |

The Helm values in infra-helm-values/livepeer-data-mcp/values.yaml reference these via ExternalSecrets. You don't need to touch the Helm values to change credentials — just update Vault and the pod will pick up the new values (via reloader).

Verify

# Health check
curl https://livepeer-data-mcp.livepeer.technology/health

# List tools (Streamable HTTP)
curl -X POST https://livepeer-data-mcp.livepeer.technology/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}'
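
To actually list tools, follow the initialize call with a tools/list request. A sketch under assumptions: Streamable HTTP servers may return an mcp-session-id response header on initialize that must be echoed on later requests, and strict implementations also expect a notifications/initialized notification first; whether this deployment enforces either is not confirmed here. The RUN_VERIFY guard is just to avoid hitting the live endpoint unintentionally:

```typescript
// Build a JSON-RPC 2.0 request envelope.
function rpc(id: number, method: string, params: object = {}) {
  return { jsonrpc: "2.0" as const, id, method, params };
}

const base = "https://livepeer-data-mcp.livepeer.technology/mcp";
const headers: Record<string, string> = {
  "Content-Type": "application/json",
  Accept: "application/json, text/event-stream",
};

if (process.env.RUN_VERIFY) {
  const init = await fetch(base, {
    method: "POST",
    headers,
    body: JSON.stringify(rpc(1, "initialize", {
      protocolVersion: "2025-03-26",
      capabilities: {},
      clientInfo: { name: "test", version: "1.0" },
    })),
  });
  // Echo the session header back if the server issued one.
  const session = init.headers.get("mcp-session-id");
  if (session) headers["mcp-session-id"] = session;

  const tools = await fetch(base, {
    method: "POST",
    headers,
    body: JSON.stringify(rpc(2, "tools/list")),
  });
  console.log(await tools.text());
}
```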

Container details

  • Image: livepeerci/livepeer-data-mcp:latest — multi-stage build, node:20-alpine with git
  • Port: 3000
  • Health check: GET /health every 30s
  • Startup: ~15s (clones dbt repo, connects to ClickHouse + PostHog)
  • Stateless: Each MCP request creates a fresh server instance backed by shared connection pools

Updating

Push to main on this repo → rebuild and push the Docker image → ArgoCD syncs automatically. The dbt schemas are cloned fresh on every container start, so schema changes in analytics-dbt/dbt/ are picked up on restart.

Development

npm run build          # Compile TypeScript
npm test               # Run tests (89 tests)
npm run dev            # Watch mode
npm run start:stdio    # Start in stdio mode (for Claude Desktop)
npm run start:http     # Start HTTP server (for Docker/remote)

MCP Inspector (debugging)

npx @modelcontextprotocol/inspector node build/index.js
