diff --git a/docs/toolhive/guides-mcp/grafana.mdx b/docs/toolhive/guides-mcp/grafana.mdx
new file mode 100644
index 0000000..cb344b0
--- /dev/null
+++ b/docs/toolhive/guides-mcp/grafana.mdx
@@ -0,0 +1,229 @@
+---
+title: Grafana MCP server guide
+sidebar_label: Grafana
+description:
+  Using the Grafana MCP server with ToolHive for dashboard management,
+  datasource queries, alerting, and incident response.
+last_update:
+  author: danbarr
+  date: 2025-11-06
+---
+
+## Overview
+
+The official [Grafana MCP server](https://github.com/grafana/mcp-grafana)
+provides comprehensive access to your Grafana instance and its surrounding
+ecosystem. With support for dashboards, datasource queries (Prometheus, Loki,
+Pyroscope), alerting, incident management, Grafana OnCall, and Sift
+investigations, this server enables AI agents to interact with your entire
+observability stack.
+
+The server works with both local Grafana instances and Grafana Cloud, making it
+ideal for tasks like troubleshooting production issues, analyzing metrics and
+logs, managing dashboards, and coordinating incident response.
+
+## Metadata
+
+
+
+## Usage
+
+You'll need a Grafana service account token to authenticate with the Grafana
+API. The token must have permissions for the Grafana features you want to access
+(such as dashboards, datasources, or alerting). Refer to the
+[Grafana service account documentation](https://grafana.com/docs/grafana/latest/administration/service-accounts/)
+for details on creating tokens and configuring permissions.
+
+
+
+Select the `grafana` MCP server in the ToolHive registry.
+
+In the **Secrets** section, add your Grafana service account token or select an
+existing secret that contains the token.
+
+In the **Environment Variables** section, configure the connection to your
+Grafana instance:
+
+- `GRAFANA_URL`: Your Grafana instance URL (for example, `http://localhost:3000`
+  for local instances or `https://myinstance.grafana.net` for Grafana Cloud)
+- `GRAFANA_ORG_ID` (optional): The numeric organization ID if your Grafana
+  instance has multiple organizations
+
+:::tip[Security tip]
+
+Enable outbound network filtering on the **Network Isolation** tab to restrict
+the server's network access. Update the allowed hosts to match your Grafana
+instance domain.
+
+:::
+
+
+
+
+Create a secret containing your Grafana service account token:
+
+```bash
+thv secret set grafana-token
+```
+
+Run the server with your Grafana instance URL and the secret:
+
+```bash
+thv run \
+  -e GRAFANA_URL=http://localhost:3000 \
+  --secret grafana-token,target=GRAFANA_SERVICE_ACCOUNT_TOKEN \
+  grafana
+```
+
+For Grafana Cloud, use your cloud instance URL:
+
+```bash
+thv run \
+  -e GRAFANA_URL=https://myinstance.grafana.net \
+  --secret grafana-token,target=GRAFANA_SERVICE_ACCOUNT_TOKEN \
+  grafana
+```
+
+Enable [network isolation](../guides-cli/network-isolation.mdx) to restrict the
+server's network access. Create a permission profile with your Grafana instance
+domain:
+
+```json title="grafana-profile.json"
+{
+  "network": {
+    "outbound": {
+      "insecure_allow_all": false,
+      "allow_host": ["myinstance.grafana.net"],
+      "allow_port": [443]
+    }
+  }
+}
+```
+
+Then run with the custom profile:
+
+```bash
+thv run \
+  -e GRAFANA_URL=https://myinstance.grafana.net \
+  --secret grafana-token,target=GRAFANA_SERVICE_ACCOUNT_TOKEN \
+  --isolate-network --permission-profile grafana-profile.json \
+  grafana
+```
+
+If your Grafana instance has multiple organizations, add the `GRAFANA_ORG_ID`
+environment variable with the numeric organization ID (for example,
+`-e GRAFANA_ORG_ID=2`).
+
+:::tip[Debug mode]
+
+Add the `--` separator followed by `-debug` to enable detailed logging of HTTP
+requests and responses:
+
+```bash
+thv run \
+  -e GRAFANA_URL=http://localhost:3000 \
+  --secret grafana-token,target=GRAFANA_SERVICE_ACCOUNT_TOKEN \
+  grafana -- -debug
+```
+
+:::
+
+
+
+
+Create a Kubernetes secret containing your Grafana service account token:
+
+```bash
+kubectl -n toolhive-system create secret generic grafana-token \
+  --from-literal=token=<YOUR_GRAFANA_TOKEN>
+```
+
+Create a Kubernetes manifest to deploy the Grafana MCP server:
+
+```yaml {14-17} title="grafana.yaml"
+apiVersion: toolhive.stacklok.dev/v1alpha1
+kind: MCPServer
+metadata:
+  name: grafana
+  namespace: toolhive-system
+spec:
+  image: docker.io/mcp/grafana:latest
+  transport: sse
+  mcpPort: 8000
+  proxyPort: 8080
+  env:
+    - name: GRAFANA_URL
+      value: 'http://localhost:3000'
+  secrets:
+    - name: grafana-token
+      key: token
+      targetEnvName: GRAFANA_SERVICE_ACCOUNT_TOKEN
+```
+
+Apply the manifest to your Kubernetes cluster:
+
+```bash
+kubectl apply -f grafana.yaml
+```
+
+For Grafana Cloud, update the `GRAFANA_URL` in the manifest:
+
+```yaml
+spec:
+  env:
+    - name: GRAFANA_URL
+      value: 'https://myinstance.grafana.net'
+```
+
+If your Grafana instance has multiple organizations, add the `GRAFANA_ORG_ID`
+environment variable with the numeric organization ID:
+
+```yaml
+spec:
+  env:
+    - name: GRAFANA_URL
+      value: 'http://localhost:3000'
+    - name: GRAFANA_ORG_ID
+      value: '2'
+```
+
+
+
+
+## Sample prompts
+
+Here are some sample prompts you can use to interact with the Grafana MCP
+server:
+
+- "Show me all dashboards related to Kubernetes monitoring"
+- "Query the Prometheus datasource for CPU usage over the last hour for the
+  `api-service` pod"
+- "Get the recent alerts that are currently firing"
+- "List all open incidents and show me details for the most recent one"
+- "Find error patterns in the logs from the `production` namespace using Loki"
+- "Who is currently on call for the backend team schedule?"
+- "Create a new incident titled 'High memory usage on production cluster' with
+  severity critical"
+- "Show me the panel queries from the 'API Performance' dashboard"
+- "Get label values for the `namespace` label from the Loki datasource"
+- "List all Sift investigations from the past week"
+
+## Recommended practices
+
+- Create service accounts with least-privilege permissions. Use fine-grained
+  RBAC scopes to limit access to only the datasources, dashboards, and features
+  required for your specific use case.
+- Regularly rotate service account tokens and update the secrets in ToolHive.
+- Enable network isolation to restrict the server's outbound network access to
+  your Grafana instance domain only.
+- For dashboards with large JSON configurations, use the `get_dashboard_summary`
+  or `get_dashboard_property` tools to minimize context window usage instead of
+  retrieving the full dashboard with `get_dashboard_by_uid`.
+- When working with multi-organization setups, always specify the
+  `GRAFANA_ORG_ID` to ensure operations target the correct organization.
+- Enable [telemetry](../guides-cli/telemetry-and-metrics.mdx) to monitor API
+  calls and track which Grafana resources are being accessed.
+- For production deployments, consider using the debug mode temporarily to
+  troubleshoot connection or permission issues, but disable it once everything
+  is working correctly.
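
Review note on the network isolation recommendation: the permission profile has to stay in sync with `GRAFANA_URL`, so it may help to derive one from the other. The following is a minimal sketch, not part of ToolHive or of this patch. `make_grafana_profile` is a hypothetical helper name, and the JSON shape is copied from the `grafana-profile.json` example in the guide. It assumes a plain `http://` or `https://` URL without userinfo or IPv6 literals, and falls back to port 80 or 443 when the URL has no explicit port.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not a thv command): print a permission profile,
# in the same shape as the guide's grafana-profile.json, for a Grafana URL.
make_grafana_profile() {
  local url="$1" rest hostport host port

  # Pick the default port from the scheme; reject anything else.
  case "$url" in
    http://*)  port=80 ;;
    https://*) port=443 ;;
    *) echo "unsupported URL: $url" >&2; return 1 ;;
  esac

  rest="${url#*://}"       # strip the scheme
  hostport="${rest%%/*}"   # strip any path
  host="${hostport%%:*}"   # host without the port

  # An explicit port in the URL overrides the scheme default.
  if [ "$hostport" != "$host" ]; then
    port="${hostport##*:}"
  fi

  # Refuse empty hosts and non-numeric ports.
  if [ -z "$host" ]; then
    echo "missing host in URL: $url" >&2
    return 1
  fi
  case "$port" in
    ''|*[!0-9]*) echo "invalid port in URL: $url" >&2; return 1 ;;
  esac

  cat <<EOF
{
  "network": {
    "outbound": {
      "insecure_allow_all": false,
      "allow_host": ["$host"],
      "allow_port": [$port]
    }
  }
}
EOF
}
```

For example, `make_grafana_profile "$GRAFANA_URL" > grafana-profile.json` would write a profile that can then be passed to `thv run` with `--isolate-network --permission-profile grafana-profile.json`, as shown in the CLI instructions.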