Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

4.21.3 - 2026-04-27

Added

  • Add envoy-vs-nginx-loadtesting and envoy-vs-kong-loadtesting dashboards to public org, split from the former envoy-gateway-loadtesting private dashboard

Fixed

  • Fix metric comparability in envoy-vs-nginx-loadtesting: align downstream RPS (rate/aggregation), success rate window ([2m]→[5m]), and replace wrong Nginx upstream RPS metric; fix Nginx downstream latency unit (seconds→ms)
  • Fix metric comparability in envoy-vs-kong-loadtesting: align downstream RPS (rate/aggregation), fix success rate regex to correctly exclude 4xx/5xx codes
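As a rough illustration of the success-rate fix above, the sketch below counts a request as successful only when its status code matches an anchored pattern, so 4xx and 5xx responses drop out of the numerator. The changelog does not show the dashboard's actual query; the pattern `[123]\d\d`, the function name, and the sample counts are all assumptions made for this example.

```python
import re

# Hypothetical pattern: 1xx, 2xx and 3xx count as success; 4xx and 5xx do not.
SUCCESS_CODE = re.compile(r"[123]\d\d")


def success_rate(counts_by_code: dict[str, float]) -> float:
    """Return the fraction of requests whose status code matches SUCCESS_CODE."""
    total = sum(counts_by_code.values())
    if total == 0:
        return 0.0
    ok = sum(
        count
        for code, count in counts_by_code.items()
        # fullmatch anchors the pattern, so "4200" or "20" cannot slip through.
        if SUCCESS_CODE.fullmatch(code)
    )
    return ok / total


# Made-up sample: 90 successful requests, 5 client errors, 5 server errors.
sample = {"200": 90, "404": 5, "503": 5}
rate = success_rate(sample)
```

An unanchored or too-permissive pattern is a common way for error codes to leak into a success rate, which is why the sketch uses a full match.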

4.21.2 - 2026-04-22

Fixed

  • Remove hardcoded label in one of the envoy-gateway-loadtesting dashboard's graphs.

4.21.1 - 2026-04-20

Changed

  • Update envoy-gateway-loadtesting dashboard with added kong graphs

4.21.0 - 2026-04-08

Added

  • Add envoy-gateway-loadtesting dashboard

Changed

  • Reorganize Grafana Cloud directories: dashboards/grafana-cloud/sources/, grafana-cloud/grafana-cloud/backup/, GC API scripts → grafana-cloud/scripts/
  • Update backup workflow and README to reflect new paths

4.20.0 - 2026-04-07

Added

  • Add monthly GitHub Actions workflow to auto-update Tempo, Mimir, Loki, and Alloy mixin dashboards
  • Add Monitoring Landscape / Customer Audit dashboard to audit monitoring tools across workload clusters and compare resource consumption with the GiantSwarm observability platform

Changed

  • Move Tempo dashboards from private_dashboards_mz to team_atlas under Giant Swarm/Observability/Tempo
  • Move Loki dashboards from private_dashboards_al to team_atlas under Giant Swarm/Observability/Loki
  • Refresh Loki dashboards from latest upstream mixin
  • Move Mimir dashboards from private_dashboards_mz to team_atlas under Giant Swarm/Observability/Mimir
  • Refresh Mimir mixin dashboards from upstream mimir-2.17.6
  • Move Alloy dashboards from private_dashboards_al to team_atlas under Giant Swarm/Observability/Alloy
  • Refresh Alloy mixin dashboards from upstream v1.15.0
    • Adds alloy-loki and alloy-otel-engine-overview dashboards

4.19.0 - 2026-03-31

Changed

  • Updated Tempo dashboards to mixins v2.10
  • Bug fixes in Tempo operational dashboard
  • Update Network Traffic Analysis Overview dashboard
    • Replace average network traffic gauges with total data transfer bar chart
    • Add a time period selector for data transfer periods
    • Make total visible in all panels
    • Remove datasource variable
    • Simplify traffic rate queries

Removed

  • Remove the object-storage-operator

4.18.0 - 2026-03-03

Fixed

  • Fix memory usage calculation in nodes-overview and cluster-overview dashboards by using node_memory_MemAvailable_bytes instead of node_memory_MemFree_bytes, which incorrectly excluded cached/buffered memory from free memory
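To make the difference between the two metrics concrete, here is a small sketch that computes memory usage once from MemFree and once from MemAvailable. The node values are invented for illustration and the function name is hypothetical; the dashboards' actual queries are not shown in this changelog.

```python
def memory_used_percent(total_bytes: float, unused_bytes: float) -> float:
    """Memory usage as a percentage, given total memory and the 'unused' figure."""
    return 100.0 * (1.0 - unused_bytes / total_bytes)


GIB = 1024 ** 3

# Made-up node: 16 GiB total, 2 GiB completely free, 10 GiB available
# (available counts memory that can be reclaimed, such as page cache).
mem_total = 16 * GIB
mem_free = 2 * GIB
mem_available = 10 * GIB

# Using MemFree treats reclaimable cache as "used" and overstates usage.
usage_from_free = memory_used_percent(mem_total, mem_free)

# Using MemAvailable reflects what workloads can actually still allocate.
usage_from_available = memory_used_percent(mem_total, mem_available)
```

With these sample numbers the MemFree-based figure is 87.5% while the MemAvailable-based figure is 37.5%, which is the kind of gap the fix addresses.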

Changed

  • Update DNS dashboard
    • Add new node and pod filters
    • Update variables description and query to use coredns_build_info as label source
    • Remove cache prefetch panel since the metric is gone
    • Fix DNS dashboard log panels
  • Kube-Builder Operators dashboard: add a logs datasource selector

4.17.0 - 2026-02-23

Added

  • Add missing team-atlas labels

Changed

  • Rename team-turtles to team-tenet

4.16.1 - 2026-02-13

Fixed

  • Fix configmap name overlap between public and private dashboards

4.16.0 - 2026-02-13

Added

  • memcached dashboard (private)

4.15.1 - 2026-02-12

Changed

  • Use Chart.AppVersion instead of Chart.Version for Kubernetes labels to support Flux OCI automatic upgrades.
  • Update team annotation to OpenContainers format (io.giantswarm.application.team).

4.15.0 - 2026-02-05

Changed

  • Update Network Traffic Analysis dashboards
    • Set variables to refresh on time window change
    • Change queries to use avg_over_time to get a smoother graph over longer time ranges
    • Sort by Mean rather than Last in graphs to match what's shown in the tables
    • Use 7 days as default time window on overview dashboard
    • Add Mean column next to Last on graphs
    • Add cluster variable description

4.14.0 - 2026-02-05

Changed

  • Bump kube-mixins to 1.4.1

Removed

  • Remove the unused home dashboard

4.13.0 - 2026-02-02

Added

  • Add Network Traffic Analysis Overview dashboard

Changed

  • Improve Network Traffic Analysis Dashboard Performance
    • [performance] Reuse panel queries, which reduces the number of queries made to the backend and eases maintenance
    • [performance] Change Pie Charts query types from range to instant
    • [performance] Set maximum datapoints to 500 and minimum interval to 2m
    • [ux] Change destination pie charts to only show top 10 destinations
    • [ux] Change the top list panels to use pagination rather than only showing top 10 elements
    • [ux] Change bottom graphs to use stacked lines and added list of values + total count
    • [ux] Remove the per-namespace section, which was merely a duplicate of the top one with an additional namespace filter. All panels now have a namespace filter that defaults to all namespaces, which keeps the old behavior of the top panels while also allowing the behavior of the bottom ones.
    • [ux] Add links between both Network Traffic Analysis dashboards
    • [ux] Add Include non-namespaced toggle to filter/include non-namespaced network traffic
    • [ux] Improve documentation panel
    • [ux] Add annotations for CiliumNetworkPolicies events
    • [ux] Set legend to "unknown" when no value is found
    • [ux] Show percentage and all values in tooltip on destination panels
    • [maintenance] Move subnets regex to a constant
  • NGINX Ingress controller dashboard: reworked variables
    • removed app selector
    • removed namespace selector
    • added ingress namespace selector

Removed

  • Remove logging-operator related data as it is now deprecated.

4.12.0 - 2026-01-20

Added

  • Add "Envoy Gateway | Configuration" dashboard.

4.11.1 - 2026-01-19

Fixed

  • Made node utilization dashboard work again with all nodes
  • Network Traffic Analysis dashboard: consider 100.64.0.0/16 private
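The classification change above can be sketched as a membership test against a list of networks. This example uses Python's `ipaddress` module rather than the regex the dashboard relies on, and the list of ranges (other than 100.64.0.0/16, which comes from the entry above) is an assumption for illustration.

```python
import ipaddress

# Hypothetical list of ranges treated as private. Only 100.64.0.0/16 is taken
# from the changelog entry; the RFC 1918 ranges are assumed for illustration.
PRIVATE_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "100.64.0.0/16")
]


def is_private(address: str) -> bool:
    """Return True if the address falls inside any configured private network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_NETWORKS)


inside = is_private("100.64.12.7")    # within 100.64.0.0/16
outside = is_private("100.65.0.1")    # just outside the /16
public = is_private("8.8.8.8")        # not in any listed range
```

Keeping the ranges in one list mirrors the idea of holding the subnet definitions in a single constant, so adding a range is a one-line change.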

4.11.0 - 2026-01-05

Changed

  • Add cluster_id to Envoy Gateway dashboards

4.10.0 - 2026-01-05

Added

  • Add Alloy Metrics Targets dashboard
  • Add beyla-based Network Traffic Analysis

Changed

  • Update Envoy Gateway dashboards and make them public.

Removed

  • Remove the alertmanager-overview dashboard in favor of the better Mimir / Alertmanager dashboard
  • Clean up alertmanager-mixin auto-update scripts
  • Remove dashboards about deprecated components (Prometheus, Promxy, Prometheus-agent, Promtail)

4.9.1 - 2025-10-07

Changed

  • Improve Unschedulable pods dashboard: better count

4.9.0 - 2025-10-06

Added

  • Add Tempo mixin dashboards.
  • Loki and Mimir resource efficiency dashboards.
  • Unschedulable pods dashboard.

Fixed

  • Updates in servicemonitor overview to comply with alloy.

4.8.2 - 2025-09-17

Fixed

  • Fixed cluster variable filter for Falco private dashboard.

4.8.1 - 2025-09-17

Added

  • Add Falco dashboards.

4.8.0 - 2025-09-11

Changed

  • Switch aws dashboards to be deployed in CAPA, plus some cleanup of unused dashboards.

4.7.1 - 2025-08-28

Fixed

  • "storage lock" graph was displaying wrong data in the zot dashboard

4.7.0 - 2025-08-26

Added

  • Add dashboard "Deprecated container registries in use"

4.6.0 - 2025-08-25

Added

  • Grafana Cloud Dashboard Backup Workflow
  • Update Cilium Agent logs dashboard and show cluster info
  • Mimir Rate Control dashboard
  • Mimir cardinality management dashboards

Changed

  • Add failed netpols panel to Cilium Agent logs dashboard

4.5.0 - 2025-05-07

Added

  • Add Konfigure Operator dashboard

4.4.3 - 2025-04-29

Fixed

  • dns dashboard: fixed logs row

4.4.2 - 2025-04-28

Changed

  • Cluster Overview dashboard: improve alerts links

4.4.1 - 2025-04-28

Changed

  • alerts timeline dashboard:
    • fix silences count
    • add links to silences, alerts, notifications
    • fix consistency between alerts timeline and alerts list
    • update title
  • alertmanager overview update to mimir alertmanager

4.4.0 - 2025-03-24

Added

  • Add Cilium Agent logs dashboard.

4.3.1 - 2025-03-20

Fixed

  • Mimir Continuous Tests: fix datasource selector

4.3.0 - 2025-03-12

Added

  • Add Observability Resource Usage dashboard
  • Add Backstage dashboard

4.2.0 - 2025-03-05

Added

  • Add Envoy Gateway dashboards

4.1.0 - 2025-02-25

Added

  • Add Loki log volume dashboard

Changed

  • Update the home dashboard
    • Change Alerts link to point to Alerting > Active notifications
    • Add grafana organization link
    • Add new data ingestion link
    • Replace Node overview with Cluster overview dashboard
    • Fix broken Custom dashboards link

Fixed

  • Fix Cluster overview dashboard to consistently open links in new tabs.

4.0.0 - 2025-02-13

Changed

  • Update Cilium Performance BPF map pressure panel datasource.
  • Dashboard labels for observability-operator provisioning.

3.30.0 - 2025-02-05

Changed

  • Add team label to karpenter dashboard.
  • Add grafana organization label to public dashboards

Removed

  • Remove CAPI Overview dashboard.

3.29.1 - 2025-01-22

Fixed

  • Fix datasource for the new Cluster Overview dashboard.

3.29.0 - 2025-01-22

Changed

  • Update DNS dashboard to include memory requests.
  • Added Cluster Overview dashboard.

Fixed

  • fix alloy mixins generation
  • fix alloy mixins
  • fix Alloy / Logs Overview dashboard.

3.28.0 - 2025-01-13

Added

  • Add Promtail overview dashboard.

Changed

  • Updated DNS dashboard and show memory usage using cAdvisor metrics.

3.27.0 - 2025-01-10

Changed

  • Fixed nodes overview dashboard to avoid master duplicating numbers.
  • Add observability organization label to GS private dashboards.

3.26.1 - 2024-11-12

Added

  • Add Mimir / Continuous test dashboard.

Fixed

  • Fixed "Management cluster Overview" dashboard

3.26.0 - 2024-10-30

Changed

  • Resync alloy, loki and mimir mixins from upstream.

3.25.0 - 2024-10-14

Added

  • Added "Loki - Slow Queries" dashboard

Changed

  • Change ownership from BigMac to Shield

3.24.0 - 2024-09-23

Changed

  • Updated alertmanager dashboard to show related logs
  • Add loki mixins dashboards update script.
  • Update mimir mixins dashboards via script.

Fixed

  • Fix Alloy mixins tags.

3.23.0 - 2024-08-22

Added

  • Added Alloy mixin dashboards

  • Added Makefile.custom.mk to group scripts usage

    • Added make update-mixin to update mixin dashboards
    • Added make lint-dashboards to lint dashboards
    • Added make install-tools to install required tools
  • Added scripts/update-alloy-mixin.sh to update the Alloy mixin dashboards

  • Added update-alertmanager-mixin and update-kubernetes-mixin Makefile targets

Changed

  • Updated all dashboards using decbytes unit to use bytes (IEC units) instead.
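The practical effect of the unit change can be sketched by formatting the same byte count with 1000-based (decimal) and 1024-based (IEC) prefixes. The helper names below are hypothetical, and the sketch treats decbytes as the decimal variant; it illustrates the arithmetic only, not Grafana's own formatting code.

```python
SI_UNITS = ["B", "kB", "MB", "GB", "TB"]
IEC_UNITS = ["B", "KiB", "MiB", "GiB", "TiB"]


def format_bytes(value: float, base: int, units: list[str]) -> str:
    """Scale value by base until it fits the current unit, then format it."""
    for unit in units[:-1]:
        if abs(value) < base:
            return f"{value:.2f} {unit}"
        value /= base
    # Anything larger than the second-to-last unit is shown in the last unit.
    return f"{value:.2f} {units[-1]}"


def format_si(value: float) -> str:
    """Decimal prefixes: 1 kB = 1000 B."""
    return format_bytes(value, 1000, SI_UNITS)


def format_iec(value: float) -> str:
    """Binary prefixes: 1 KiB = 1024 B."""
    return format_bytes(value, 1024, IEC_UNITS)


one_gib = 1024 ** 3
as_si = format_si(one_gib)
as_iec = format_iec(one_gib)
```

The same value reads as 1.07 GB with decimal prefixes and 1.00 GiB with IEC prefixes, so memory limits expressed in binary units line up with round numbers only in the IEC form.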

Fixed

  • Fix dashboards destination path in update-monitoring-mixin-dashboards.sh script

3.22.0 - 2024-08-01

Added

  • mimir query stats dashboard

Changed

  • Fixed etcd count dashboard by replacing the namespace label with exported_namespace.

Removed

  • Remove Linkerd Control Plane dashboard

3.21.0 - 2024-07-03

Changed

  • Get rid of the app label in Phoenix dashboards.

3.20.0 - 2024-07-01

Added

  • Add "BPF map pressure" graph to "Cilium performance" dashboard.
  • Add kube-builder logs in "Kube-Builder Operators" dashboard.

Changed

  • fluentbit dashboard: cluster selection

Fixed

  • Mimir Cost Estimation: fix RAM usage

Removed

  • Removed the dashboard 'Webhook Health'.

3.19.0 - 2024-06-13

Changed

  • Make Certificate Details dashboard public
  • Migrate deprecated angular panels.

3.18.0 - 2024-06-12

Added

  • Added missing datasource to all dashboards and panels.

Removed

  • Get rid of the useless analytics panels.

3.17.0 - 2024-06-07

Added

  • Add script to update all mimir dashboards from mixins.

Changed

  • Replace Cluster ID with cluster in dashboard labels.
  • remotewrite: improve legends
  • remotewrite: add count of agent replicas
  • Reviewed labels used in turtles dashboards
  • servicemonitors-overview: add info about agent pods

Fixed

  • Add missing datasource to api-security dashboard.
  • Add missing datasource to api-performance dashboard and fix tags.
  • Add missing datasource to certificate dashboard and fix tags.
  • Lint the capi and capa aggregated-error-logs dashboards to have proper datasource variables.

Removed

  • Remove vintage azure dashboards.

3.16.1 - 2024-06-04

Changed

  • Get rid of the app label in Atlas dashboards.

3.16.0 - 2024-05-30

Changed

  • Move SLO reporting dashboard to be public.

Fixed

  • Add missing data source to atlas dashboards.
  • Fix missing provider specific dashboards.
  • Fix prometheus-cost-estimation dashboard.
  • Update alertmanager-overview dashboard (angular deprecation).
  • Update fluentbit dashboard (angular deprecation).
  • Update operatorkit dashboard (angular deprecation).
  • Update prometheus-overview dashboard (angular deprecation).
  • Update prometheus-remote-write dashboard (angular deprecation).

Removed

  • Remove prometheus benchmark dashboard.
  • Remove EFK dashboards.
  • Removed KVM dashboards.

3.15.1 - 2024-05-29

Fixed

  • Fix missing ingress slo in slo reporting.

3.15.0 - 2024-05-23

Added

  • Add datasource variable to Prometheus dashboard

Fixed

  • Adjust panel positions to fill width and move Mimir related panel under related section

3.14.2 - 2024-05-20

Changed

  • ServiceMonitors Overview dashboard: add RAM usage estimation

3.14.1 - 2024-05-15

Changed

  • Change "Node utilization" dashboard to use allocatable over capacity, to better reflect the available resources of the nodes.
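A short worked example of why the denominator matters: allocatable is what remains for pods after the node reserves resources for itself, so utilization measured against capacity looks lower than it really is. The numbers and the function name below are invented for illustration and do not come from the dashboard.

```python
def utilization_percent(requested: float, denominator: float) -> float:
    """Share of the given resource pool that is requested, in percent."""
    return 100.0 * requested / denominator


# Made-up node: 8 CPU cores of capacity, 0.5 cores reserved for system
# components, leaving 7.5 cores allocatable to pods.
capacity_cpu = 8.0
allocatable_cpu = 7.5
requested_cpu = 6.0

# Against capacity the node looks less busy than it is for scheduling purposes.
against_capacity = utilization_percent(requested_cpu, capacity_cpu)

# Against allocatable the figure reflects what pods can actually still request.
against_allocatable = utilization_percent(requested_cpu, allocatable_cpu)
```

Here the same requests show as 75% of capacity but 80% of allocatable.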

3.14.0 - 2024-05-15

Fixed

  • Fix loki and mimir mixins recording rules
  • Fix atlas dashboard tags.
  • Fix storage-related panels on Zot's dashboards
  • prometheus: scraping info can now be filtered by cluster
  • add some basic linting configuration so we can track down issues in dashboards.

Added

  • ServiceMonitors overview dashboard
  • ServiceMonitors details dashboard

3.13.0 - 2024-04-24

Changed

  • Change "Worker node utilization" dashboard to "Node utilization", also allowing to analyze data for control plane nodes.

3.12.0 - 2024-04-23

Added

  • Add dashboard "Worker node utilization".

3.11.4 - 2024-04-17

Changed

  • Updated "In-cluster container registry (Zot)" dashboard to use metric kubelet_volume_stats_used_bytes for storage used.

3.11.3 - 2024-04-17

Changed

  • Move node-problem-detector to be aws only.

Fixed

  • Fix Grafana Cloud service-level dashboard in case we have duplicate cluster names in different installations.
  • Invalid datasource variable name in mimir cost estimate dashboard.

3.11.2 - 2024-04-16

Fixed

  • Fix Mimir / writes resources disk usage related graphs.

3.11.1 - 2024-04-16

Changed

  • Improved many details on the dashboard "In-cluster container registry (Zot)".
  • Change net-exporter dashboard ownership from turtles to cabbage.
  • Change cluster-total.json dashboard ownership from turtles to cabbage.
  • Change namespace-by-pod.json dashboard ownership from turtles to cabbage.
  • Change namespace-by-workload.json dashboard ownership from turtles to cabbage.
  • Change pod-total.json dashboard ownership from turtles to cabbage.
  • Change workload-total.json dashboard ownership from turtles to cabbage.
  • Update "Ingress NGINX Controller Connection Distribution" dashboard file to schema version 39.
  • Update "Giant Swarm / Kubernetes Persistent Volumes" dashboard file to replace old graph panels with new time series panels.
  • Update "Security: Falco Dashboard" dashboard file to replace old graph panel with new time series panel, old table with new table panel.

Fixed

  • Fix Mimir / Reads resources Disk usage graphs.

Removed

  • Remove "Microstorage" dashboard.

3.11.0 - 2024-04-11

Added

  • Add a CAPA aggregated error logs dashboard.

3.10.4 - 2024-04-10

Fixed

  • Fixes "All Dex requests" panel showing "No data" by increasing query interval to 2m.

3.10.3 - 2024-04-10

Changed

  • Remove app and namespace labels from the prometheus - remotewrite's nginx graphs.

Fixed

  • Fix cpu throttling panel in prometheus dashboard

3.10.2 - 2024-04-10

Fixed

  • Fix and update Flux Control Plane dashboard in various ways.

Changed

  • The private Zot dashboard is updated because of a namespace change, and some minor fixes are applied.

3.10.1 - 2024-04-08

Fixed

  • Fix Mimir / Compactor / Overview dashboard.

3.10.0 - 2024-04-03

Added

  • Add private dashboard about in-cluster container image registry (Zot)

Fixed

  • Fix Mimir / Overview resources dashboard

3.9.0 - 2024-03-27

Added

  • Add dashboard for loki-canary (private for now)

Changed

  • Migrate panels from graph to timeseries in DNS dashboard.
  • Add promxy cpu and memory graphs to prometheus cost dashboard.
  • Add promxy cpu and memory graphs to mimir cost dashboard.

3.8.5 - 2024-03-26

Added

  • Add public Cilium performance dashboard.

Fixed

  • Fix all dashboards that supported only role=master to now support role=~control-plane|master.
  • Fix Mimir - Prometheus cost dashboard to compare over real data (not missing on data from old prometheus instances)
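The role matcher change in the first fix above (from role=master to role=~control-plane|master) can be sketched with a fully anchored regular expression, which is how Prometheus evaluates regex label matchers. The node list and helper names are hypothetical, and Python's `re` stands in for Prometheus's regex engine, which behaves the same for a simple alternation like this one.

```python
import re


def label_matches(pattern: str, value: str) -> bool:
    """Mimic a Prometheus =~ matcher: the pattern must match the whole value."""
    return re.fullmatch(pattern, value) is not None


def select_nodes(nodes: list[dict[str, str]], role_pattern: str) -> list[str]:
    """Return the names of nodes whose role label matches the pattern."""
    return [n["name"] for n in nodes if label_matches(role_pattern, n["role"])]


# Made-up nodes: older clusters label control plane nodes "master",
# newer ones use "control-plane".
nodes = [
    {"name": "node-a", "role": "master"},
    {"name": "node-b", "role": "control-plane"},
    {"name": "node-c", "role": "worker"},
]

old_selection = select_nodes(nodes, "master")
new_selection = select_nodes(nodes, "control-plane|master")
```

With the old matcher only node-a is selected; the alternation picks up both node-a and node-b while still excluding workers.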

3.8.4 - 2024-03-11

Added

  • Add mimir overview mixins dashboard.

3.8.3 - 2024-03-11

Added

  • Add mimir mixins dashboard for compactor.

3.8.2 - 2024-03-07

Added

  • Add mimir mixins dashboard for ruler.

3.8.1 - 2024-03-05

Added

  • Add mimir mixins dashboards for read.

3.8.0 - 2024-02-29

Added

  • Add mimir mixins dashboards for write.

Changed

  • Change managed-apps-efk-stack-app dashboard owner to Atlas.
  • Updated k8s-resources-node dashboard.

3.7.1 - 2024-02-27

Fixed

  • Fix Prometheus - Mimir Cost Comparison dashboard.
  • Moved Dashboards from deprecated ludacris team to turtles team tags.
  • Removed Dashboards with deprecated data sources.

3.7.0 - 2024-01-31

Added

  • Create Policy Enforcement (PSS) Status workload cluster compliance overview dashboard.
  • Add first draft of the SLA reporting dashboard.

Fixed

  • Fix query in API server dashboard for CAPI clusters

3.6.2 - 2024-01-18

Fixed

  • Fix name and query for API server dashboard

3.6.1 - 2024-01-18

Changed

  • Fixed API server dashboard and made it public

3.6.0 - 2024-01-15

Added

  • Kube-Builder operators dashboard
  • Add Linkerd Control Plane dashboard

Changed

  • Better description and small spacing fixes in etcd-backup dashboard

3.5.0 - 2023-11-24

Changed

  • Loki Operational dashboard improvements:
    • Added a Backend Path panel
    • Added disk usage to Write and Backend path panel
    • Added total pods to Write, Read and Backend panels

Fixed

  • Loki cost estimation: update legends to name tenants

3.4.2 - 2023-11-21

Changed

  • Add three stat panels to "Container images from docker.io" dashboard

3.4.1 - 2023-11-20

Fixed

  • Fix queries in "Container images from docker.io" dashboard to use the image_spec label instead of image.

3.4.0 - 2023-11-20

Added

  • Public dashboard: Container images from docker.io

3.3.0 - 2023-11-09

Added

  • Add Nginx admission controller dashboard.
  • Add Node Problem Detector dashboard.

3.2.4 - 2023-11-06

Added

  • Add KEDA dashboard.

3.2.3 - 2023-10-26

Fixed

  • Fix Kubernetes Events & Resources Count dashboard.

3.2.2 - 2023-10-20

Changed

  • Added rate limit graphs to cilium

3.2.1 - 2023-10-19

Fixed

  • Fixed nginx panel in remote-writes dashboard

3.2.0 - 2023-10-19

Changed

  • Prometheus remote-writes: add nginx stats

3.1.0 - 2023-10-13

Changed

  • prometheus remote-writes: filter on URL

3.0.0 - 2023-10-11

Changed

  • Update Loki Cost Estimation dashboard.
  • [BREAKING] Split the chart in sub-chart to get around chart size limitation - all values are moved to a global section.

2.47.0 - 2023-10-11

Changed

  • Update Loki Cost Estimation dashboard.

2.46.0 - 2023-10-09

Fixed

  • Prometheus dashboard: fix node RAM

Added

  • Add custom dashboards documentation link into home panel.
  • Loki dashboards from mixins

Changed

  • Cilium: migrate to Timeseries and change rate timeframe.

2.45.0 - 2023-09-21

Changed

  • Improve DNS logs with filtering options.

Fixed

  • Alerts dashboard: show alerts with missing fields

2.44.0 - 2023-09-19

Changed

  • Alerts dashboard:
    • new filters: team, severity, alertname
    • removed "inhibitions" graph as it was not reliable
    • changed visualization for alerts from graph to timeline
    • added a table listing alerts that have previously fired
    • minor changes to global stat numbers

2.43.0 - 2023-09-19

Added

  • Add internal links between all Prometheus Dashboards.

2.42.0 - 2023-08-31

Changed

  • DNS Dashboard: add logs panel for CoreDNS
  • Kong Connection Distribution: Make public

2.41.0 - 2023-08-30

Changed

  • Prometheus dashboard: logs panel and restart logs annotation

2.40.0 - 2023-08-24

Added

  • Added dashboard showing AWS Load Balancer Controller errors
  • Added Karpenter dashboard

2.39.0 - 2023-08-22

Changed

  • Align owner tags in Grafana cloud dashboards to format owner:TEAM-NAME.

2.38.0 - 2023-08-21

Fixed

  • Added promtail to cpu & memory graphs

2.37.0 - 2023-08-18

Fixed

  • Fixed issues from Loki Cost Estimate dashboard
  • Fixed cluster id in Cilium Metrics dashboard

Changed

  • Update Kyverno health dashboard to be compatible with Kyverno 1.10.
  • Add modified "Kong (official)" dashboard
  • Update Kong-Connection-Distribution dashboard to work with new kong-app versions

2.36.0 - 2023-07-27

Added

  • Add External DNS dashboard.
  • Add Cache panes to DNS dashboard.

2.35.0 - 2023-07-13

Changed

  • Dashboards: Rename nginx-ingress-controller to ingress-nginx. (#331)
  • Allow filtering by DNS zone in the DNS dashboard.

2.34.0 - 2023-07-12

Changed

  • Add MC data into DNS dashboard.

2.33.0 - 2023-07-11

Changed

  • Add per-node breakdown of latency in DNS dashboard.

2.32.1 - 2023-07-11

Fixed

  • Set correct filter in DNS Latency graph for node-local DNS cache.

2.32.0 - 2023-07-10

Changed

  • Improved the DNS dashboard in order to be quicker and richer, especially considering node-local DNS cache and coredns as separate components.

2.31.2 - 2023-07-10

Fixed

  • Fix broken panels in DNS dashboard.

2.31.1 - 2023-06-28

Removed

  • Remove broken managed app sli dashboard.

2.31.0 - 2023-06-23

Added

  • Add CAPI Overview dashboard.
  • Add Mimir Cost Estimate dashboard.
  • Add Prometheus - Mimir Comparative dashboard.
  • Add Kyverno Health dashboard.

2.30.0 - 2023-06-20

Changed

  • Update team label of dashboards related to bigmac components.

2.29.0 - 2023-06-13

Changed

  • Update "Persistent Volume Usage" dashboard.

2.28.4 - 2023-06-06

Added

  • Add Helm charts to Flux cluster dashboard.
  • Add Prometheus Cost Estimate dashboard.

Removed

  • Stop pushing to openstack-app-collection.

2.28.3 - 2023-05-04

Fixed

  • Make Nodes Overview dashboard work for Workload Clusters and assign to team phoenix.

Changed

  • Adjusted the ENA dashboard to have an overview of the data across time

2.28.2 - 2023-05-03

Changed

  • added link from prometheus to prometheus/availability dashboard
  • added link from prometheus/availability to prometheus dashboard

2.28.1 - 2023-05-02

Fixed

  • prometheus availability over period

2.28.0 - 2023-04-20

Added

  • Add Loki cost estimation dashboard

Changed

  • rework of requests vs usage to use the same metrics as kube-mixins and change filtering options
  • requests vs usage dashboard made public

2.27.0 - 2023-04-17

Changed

  • Move cilium dashboard to public dashboards.

2.26.0 - 2023-04-06

Changed

  • Add AWS ENA Performance
  • Updated team labels for team-rocket
  • Add graph in Node Overview to identify emptydir growth
  • Update kube-mixins to 0.12
  • Added Etcd health for monitoring the Etcd key space status
  • Adjusted the K8s API Performance master nodes memory dashboard to use node_memory_MemAvailable_bytes instead of node_memory_MemFree_bytes

2.25.0 - 2023-03-24

Added

  • Prometheus availability dashboard

Changed

  • Small improvements for Prometheus dashboard

2.24.1 - 2023-03-09

Changed

  • Change the name of metric ETCD Backend Quota Low Space to ETCD Keyspace usage.

2.24.0 - 2023-03-03

Added

  • Prometheus - opsrecipe dashboard
  • Prometheus Overview dashboards - from prometheus-mixins
  • Add ETCD Backend Quota Low Space to K8s API Performance Dashboard.

Changed

  • Make main prometheus dashboard public

2.23.0 - 2023-02-28

Changed

  • Monitoring Dashboard Updated

Added

  • Alertmanager / Overview Dashboard

Changed

  • Prometheus dashboard improvements: available node resources, scraped metrics info and rules info.

2.22.0 - 2023-02-23

Changed

  • Prometheus dashboard improvements: add volume usage, add memory and cpu limits.
  • README updates
  • Update cimg/go Docker tag from v1.20.0 to v1.20.1
  • Move prometheus-remote-write dashboard to public

2.21.0 - 2023-02-20

Changed

  • updates of Prometheus dashboard
  • updates of Pod request vs usage dashboard

Fixed

  • Datasource for kong-config-reload dashboard

Removed

  • What is presumed to be a debug matcher from upstream

2.20.0 - 2023-01-18

Added

  • Add private dashboard for External Secrets.

2.19.3 - 2023-01-12

Fixed

  • Fix hardcoded interval in the DNS dashboard.

2.19.2 - 2023-01-11

Changed

  • Fix efk-stack-app dashboard

2.19.1 - 2023-01-10

Changed

  • Add dashboards to gcp.

2.19.0 - 2023-01-05

Added

Changed

  • Added uid for some dashboards when missing (etcd, managed-apps-efk-stack-app, management-cluster-kubernetes, microstorage, prometheus-remote-write, workload-cluster-kubernetes)

2.18.0 - 2022-11-22

Changed

  • Add namespace to the Flux reconciliation duration graph.

Added

  • Add user metrics for API audit logs.
  • Add private dashboard for Crossplane.

2.17.0 - 2022-10-20

Added

  • Add dashboard to display Cilium agents' metrics.

Changed

  • Make API Audit dashboard work with management clusters.
  • Move nginx Ingress Controller to public dashboards

2.16.0 - 2022-09-12

Added

  • Add NGINX Ingress Controller Connection Distribution dashboard.
  • Add Kong Connection Distribution dashboard.

2.15.0 - 2022-09-06

Changed

  • Use app label instead of service label to identify metrics in dex dashboard.

Added

  • Add new dashboard for CertificateRequests

Changed

  • Add variable app and upgrade panel types for the "NGINX Ingress controller" dashboard.
  • Metrics dashboard: replace Prometheus total memory with time series per cluster
  • Metrics dashboard: change Prometheus total time series to stat panel type

2.14.0 - 2022-08-12

Added

  • operatorkit dashboard

Changed

  • Deprecated cert-operator dashboard, superseded by operatorkit dashboard

2.13.3 - 2022-07-27

Changed

  • Certificates expiration details file name

2.13.2 - 2022-07-27

Added

  • Certificates expiration details

2.13.1 - 2022-07-26

Added

  • Webhooks performance dashboard

2.13.0 - 2022-07-18

Changed

  • Move ownership of dashboards from Celestial and Firecracker to Phoenix.
  • Set Team Shield as owner of Falco and Starboard dashboards.
  • Split the "Rate of API requests" chart by instance in the "K8s API performance" private dashboard.

2.12.1 - 2022-06-29

Fixed

  • Fix datasource using UID.

2.12.0 - 2022-06-13

Added

  • Add sync mixin scripts.
  • Add Organization filter to shared and AWS public dashboards.

2.11.2 - 2022-05-31

Fixed

  • Yet another fix for the Fairness and Priority charts in the K8s API Performance dashboard.

2.11.1 - 2022-05-30

Fixed

  • Fixed query for the Fairness and Priority charts in the K8s API Performance dashboard.

2.11.0 - 2022-05-30

Changed

  • Add Fairness and Priority charts in the K8s API Performance dashboard.

2.10.0 - 2022-05-26

Changed

  • Updated mixin dashboards

2.9.2 - 2022-05-26

Added

  • Add etcd response times histogram.

Changed

  • Improve Etcd k8s events and resources dashboard
  • Fix KVM usage dashboard

2.9.1 - 2022-05-23

Added

  • Update etcd-k8s-resources-count to etcd-k8s-events-and-resources-count

2.9.0 - 2022-05-18

Added

  • Add dashboard for Fluentbit.

2.8.0 - 2022-05-17

Added

  • Add new dashboard 'Webhook Health'.
  • Add dashboard for Fluentbit.

Changed

  • Fixes and updates to grafana dashboard.
  • Improve K8s API performance dashboard.

2.7.0 - 2022-05-12

Changed

  • Improve K8s API performance dashboard.

2.6.0 - 2022-05-10

Changed

  • Improve K8s API performance dashboard.

2.5.0 - 2022-05-09

Changed

  • Improve K8s API performance dashboard.

2.4.2 - 2022-05-05

Fixed

  • Fix query in K8s API performance dashboard to make it more reliable.

2.4.1 - 2022-05-05

Fixed

  • Only display master nodes metrics in K8s API performance dashboard.

2.4.0 - 2022-05-04

Added

  • Add new dashboard 'K8s API performance'.

2.3.0 - 2022-04-28

Changed

  • Make DNS dashboard public.

2.2.0 - 2022-04-21

Changed

  • Improved DNS dashboard to visualize data about node-local DNS cache.

2.1.1 - 2022-04-19

Changed

  • Make kube-proxy dashboard UX compliant.
  • Fix ceph-cluster dashboard datasource.

2.1.0 - 2022-04-05

Changed

  • Split each dashboard into its own configmap
  • Make mixin dashboard private

2.0.0 - 2022-04-04

Added

  • Add all dashboards from g8s-grafana.
  • Add grafana sidecar annotation to all configmaps.
  • Add a dashboard for ceph cluster usage in KVM.

1.11.0 - 2022-03-29

Added

  • Add a dashboard for kube-proxy.

1.10.2 - 2022-03-18

Fixed

  • Fix query in AWS Cluster Status dashboard.

1.10.1 - 2022-02-28

Fixed

  • Fix kube_node_status_capacity_ metrics.

1.10.0 - 2022-02-23

Added

  • Add support for arbitrary providers instead of hardcoding dashboard settings per provider.

1.9.1 - 2022-01-20

Changed

  • Dex dashboard:
    • Add new panel to show requests with more details.
    • Exclude health check requests (handler /healthz) from the successful requests graph.

1.9.0 - 2022-01-13

Added

  • Add a new dashboard showing success and error responses for SSO via dex.

1.8.1 - 2021-12-02

Fixed

  • Fix metrics dashboard showing incorrect values for number of time series per installation and customer.

1.8.0 - 2021-12-01

Removed

  • Delete Azure Load Balancer Backend Nodes dashboard.

1.7.0 - 2021-11-24

Added

  • Add new dashboard to visualize API audit metrics

1.6.0 - 2021-11-09

Added

  • Add FluxCD dashboards.

1.5.1 - 2021-10-20

Changed

  • Update description and icon

1.5.0 - 2021-09-17

Added

  • Add AWS Cluster Status Dashboard
  • Add Managed Apps dashboard.

1.4.0 - 2021-09-06

Added

  • Add Grafana Analytics dashboard.
  • Add Azure Load Balancer Instances dashboard.

1.3.0 - 2021-08-18

Added

  • Added macropower-analytics-panel to public dashboards.

1.2.0 - 2021-08-13

Changed

  • Migrate to configuration management.
  • Update architect-orb to v4.0.0.

1.1.0 - 2021-08-05

Added

  • Add KVM Resource Usage dashboard link to the home page for KVM only.
  • Allow templating in dashboards.

1.0.5 - 2021-07-27

  • Several updates to the KVM Resource Usage dashboard:
    • Change graph (old) to Grafana 8 time series panels
    • Change graphs for CPUs and memory left from absolute numbers to percentage

1.0.4 - 2021-07-26

  • Moved KVM dashboard to a provider dedicated folder.

0.1.3 - 2021-07-23

Added

  • New dashboard to expose KVM information, helping with capacity planning and observability.

Fixed

  • Updated the link to the source repo of the Prometheus rules and alerts in the Alerts dashboard.

0.1.2 - 2021-06-29

Changed

  • Update the Nodes Overview dashboard to use Grafana 8 time series instead of graph panels.
  • Update the Alerts dashboard to use Grafana 8 time series instead of graph panels.
  • Update Home dashboard to only show release notes related to dashboards.

Fixed

  • A typo in the alerts dashboard.

0.1.1 - 2021-06-25

Fixed

  • Fix CI build

0.1.0 - 2021-06-25

Added

  • Add public grafana dashboards.