Provision Grafana alert rules via Pulumi #1214

@rdimitrov

Context

Today only the Grafana datasources are provisioned in code (deploy/pkg/k8s/monitoring.go:546). Alert rules and dashboards live in Grafana's PostgreSQL DB and are edited via the UI, so they are not versioned, not reviewable, and lost if Grafana's DB ever resets.

Surfaced while triaging recurring "Publish Endpoint Latency" and "Availability dropped below 95%" alerts that turned out to be metric-pipeline artifacts post-deploy. I wanted to add `for: 10m` and `noDataState: OK` to both rules, but that can only be done via the UI today.

Proposal

Add a grafana-alerts ConfigMap mounted at /etc/grafana/provisioning/alerting/ mirroring the existing datasources pattern, and move alert rules into it. Likely worth doing dashboards and notification policies at the same time.

Caveats

  • Provisioned alerts are read-only in the UI, so every change goes through a PR from then on. Worth confirming the team is OK with that.
  • Existing rules need to be exported (Grafana provisioning API) and committed.
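
For the export step, a minimal sketch of pulling the current rules in provisioning-file format, assuming Grafana's `/api/v1/provisioning/alert-rules/export` endpoint is available in the deployed version and that a service account token with alerting read access exists. The `GRAFANA_URL` and `GRAFANA_TOKEN` environment variable names are made up for this example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
)

// exportURL builds the export endpoint URL for a Grafana base URL,
// asking for YAML so the output can be committed as a provisioning file.
func exportURL(base string) string {
	return strings.TrimRight(base, "/") + "/api/v1/provisioning/alert-rules/export?format=yaml"
}

// exportAlertRules fetches all alert rules in provisioning-file format.
// The token is sent as a Bearer token (service account token assumed).
func exportAlertRules(client *http.Client, base, token string) ([]byte, error) {
	req, err := http.NewRequest(http.MethodGet, exportURL(base), nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)

	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("export failed: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	// Hypothetical variable names; adjust to however credentials are supplied.
	base, token := os.Getenv("GRAFANA_URL"), os.Getenv("GRAFANA_TOKEN")
	if base == "" || token == "" {
		fmt.Println("set GRAFANA_URL and GRAFANA_TOKEN to run the export")
		return
	}
	out, err := exportAlertRules(http.DefaultClient, base, token)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	os.Stdout.Write(out)
}
```

The output can be redirected to a file and committed as the initial provisioning content, then trimmed or edited in follow-up PRs.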

Acceptance

  • Alert rules and notification policies provisioned from code
  • Dashboards provisioned the same way (or explicitly declared out of scope)
  • deploy/README.md updated with the new edit workflow
