Problem
`.github/workflows/grafana-lint.yml` runs `dashboard-linter lint --strict deploy/grafana/charon.json` on every push and PR that touches `deploy/grafana/**`. The latest run on `main` fails (`workflow_run id 24902714948`). The same lint already failed on the PR that introduced the dashboard (`#54 feat/26-grafana-dashboard`) and has failed ever since: every subsequent fixup touched the dashboard JSON without addressing the strict-mode violations.
Concrete failures (full list in CI logs):
- Panel `Build info` has no `unit` defined.
- Every PromQL target is missing a `job=~"$job"` selector.
- Rate-based queries hardcode `[1m]` / `[5m]` instead of `$__rate_interval`.
- `instance` template `allValue` is `'.*'`; the linter expects `'.+'`.
- Dashboard top-level `editable` is `true`; the linter expects `false`.
- The dashboard has no top-level `job` template variable for those selectors to reference.
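To tell pre-existing violations from new ones without waiting for CI, the checks above can be approximated locally. The following is a minimal sketch, not the linter's own implementation: `find_violations` is a hypothetical helper, and it assumes the common Grafana dashboard layout (`panels[].targets[].expr`, `templating.list[]`, `fieldConfig.defaults.unit`) with no nested row panels.

```python
import re

# A fixed range such as [1m] or [5m] inside rate(...) or irate(...).
FIXED_RATE_RANGE = re.compile(r"\bi?rate\([^\[\]]*\[\d+[smhd]\]")


def find_violations(dash: dict) -> list:
    """Return human-readable descriptions of the violations listed above."""
    problems = []

    if dash.get("editable") is not False:
        problems.append("dashboard: editable should be false")

    # Index template variables by name for the two templating checks.
    variables = {v.get("name"): v for v in dash.get("templating", {}).get("list", [])}
    if "job" not in variables:
        problems.append("templating: no job variable")
    if variables.get("instance", {}).get("allValue") != ".+":
        problems.append("templating: instance allValue should be '.+'")

    for panel in dash.get("panels", []):
        title = panel.get("title", "<untitled>")
        if not panel.get("fieldConfig", {}).get("defaults", {}).get("unit"):
            problems.append(f"{title}: no unit defined")
        for target in panel.get("targets", []):
            expr = target.get("expr", "")
            if 'job=~"$job"' not in expr:
                problems.append(f"{title}: target missing job selector")
            if FIXED_RATE_RANGE.search(expr):
                problems.append(f"{title}: hardcoded rate interval")

    return problems
```

Running it on `json.load(open("deploy/grafana/charon.json"))` before and after a change gives a rough diff of violations; the CI linter remains the source of truth.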
Impact
- Every PR that touches `deploy/grafana/**` now arrives with a red CI tile, conditioning maintainers to merge despite a failing check.
- Future regressions in the dashboard JSON are masked by the constant-failure baseline — a real `promtool check rules` regression on `alerts.yaml` would not stand out.
- New contributors cannot tell whether a lint failure on their PR is theirs or pre-existing.
Fix
Single PR that simultaneously:
- Adds a `job` template variable with `label_values(charon_build_info, job)`, default `charon`.
- Adds `job=~"$job"` to every existing target query.
- Replaces hardcoded `[1m]` / `[5m]` with `$__rate_interval` on every `rate()` / `irate()` expression.
- Sets `unit: "short"` on `Build info` (or `unit: "none"`, since the value is always 1).
- Sets the `instance` template `allValue: ".+"`.
- Sets dashboard `editable: false`.
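Most of the edits listed above are mechanical and could be scripted. The sketch below is one possible approach under stated assumptions: `fix_dashboard` is a hypothetical helper, the `job` variable fields (`type`, `current`) follow the usual Grafana variable shape rather than anything confirmed for this dashboard, and the unit is written to `fieldConfig.defaults.unit`. It deliberately does not inject `job=~"$job"` into the queries, because rewriting PromQL label matchers safely needs a parser or a manual pass.

```python
import re

# Captures everything from rate( / irate( up to a fixed range like [1m] or [5m].
RATE_RANGE = re.compile(r"\b(i?rate\([^\[\]]*)\[\d+[smhd]\]")


def fix_dashboard(dash: dict) -> dict:
    """Apply the mechanical fixes in place and return the dashboard dict."""
    dash["editable"] = False

    variables = dash.setdefault("templating", {}).setdefault("list", [])
    if not any(v.get("name") == "job" for v in variables):
        # Assumed variable shape; adjust to match the existing variables.
        variables.insert(0, {
            "name": "job",
            "type": "query",
            "query": "label_values(charon_build_info, job)",
            "current": {"text": "charon", "value": "charon"},
        })
    for variable in variables:
        if variable.get("name") == "instance":
            variable["allValue"] = ".+"

    for panel in dash.get("panels", []):
        if panel.get("title") == "Build info":
            defaults = panel.setdefault("fieldConfig", {}).setdefault("defaults", {})
            defaults["unit"] = "short"
        for target in panel.get("targets", []):
            if "expr" in target:
                # Swap the fixed range for Grafana's $__rate_interval variable.
                target["expr"] = RATE_RANGE.sub(r"\1[$__rate_interval]", target["expr"])

    return dash
```

The output would still need the `job=~"$job"` selectors added by hand and a final `dashboard-linter lint --strict` run before the PR is opened.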
After landing, `grafana-lint` should turn green on `main` and surface real regressions.
Severity
Medium — CI noise that erodes signal. Not a runtime bug.