Update grafana-dashboard.json #1916

itay-grudev · 2023-04-12T15:24:29Z

New Cluster Overview

A new cluster overview section at the top that features:

Base backup and last archived WAL information
Total CPU and memory usage across all nodes
An Alerts section
Last failover/switchover time
Cluster PostgreSQL version
Transactions per second
Volume usage and total database size
Replication/Write/Flush/Replay lag

Server Health

Added instance zone
Displaying the full PostgreSQL version
Bug Fix: When there are redundant kube-state-metrics instances there are duplicate status gauges
Bug Fix: Max Connections not displaying a progress bar in the gauge due to missing min and max values.
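As a rough illustration of the Max Connections fix, the sketch below sets an explicit value range on a gauge panel through `fieldConfig.defaults.min` and `fieldConfig.defaults.max` in the dashboard JSON. The helper name `with_gauge_range`, the panel contents, and the limit of 100 are assumptions for illustration, not values taken from this PR.

```python
import copy


def with_gauge_range(panel: dict, minimum: float, maximum: float) -> dict:
    """Return a copy of a panel dict with an explicit min/max value range."""
    updated = copy.deepcopy(panel)  # leave the caller's panel untouched
    defaults = updated.setdefault("fieldConfig", {}).setdefault("defaults", {})
    defaults["min"] = minimum
    defaults["max"] = maximum
    return updated


# Hypothetical gauge panel without a value range.
panel = {"type": "gauge", "title": "Max Connections"}
fixed = with_gauge_range(panel, 0, 100)  # 100 is an assumed limit
```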

Before:

After:

Configuration

The configuration parameters are transposed such that every row now contains a parameter, while every column contains the parameter setting across individual database instances. This makes it much easier to scroll through settings.
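The transposition can be sketched in a few lines of Python. This only illustrates the idea of swapping rows and columns; the dashboard does this with its own panel configuration, and the instance names and settings below are hypothetical.

```python
def transpose_settings(by_instance: dict) -> dict:
    """Turn {instance: {parameter: value}} into {parameter: {instance: value}}."""
    by_parameter: dict = {}
    for instance, settings in by_instance.items():
        for parameter, value in settings.items():
            by_parameter.setdefault(parameter, {})[instance] = value
    return by_parameter


# Hypothetical instances and settings, one row per instance (the old layout).
by_instance = {
    "cluster-example-1": {"max_connections": "100", "shared_buffers": "128MB"},
    "cluster-example-2": {"max_connections": "100", "shared_buffers": "256MB"},
}

# One row per parameter, one column per instance (the new layout).
by_parameter = transpose_settings(by_instance)
```

With one row per parameter, a setting that differs between instances (here `shared_buffers`) can be spotted by reading across a single row.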

Before:

After:

New Storage & IO Space and Inode Usage metrics

Added volume space and inode usage gauges.

Accumulated Tuple/IO graph to make the data more comprehensible

Before:

After:

General

Fixed tooltips and set shared crosshairs of all graphs.
Bug Fix: Cluster variable query picking up any metric ending with cluster
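The class of bug behind the cluster variable fix can be illustrated with a small Python sketch. The actual query used by the dashboard is not shown in this thread, so the names and patterns below are hypothetical; the point is only that a pattern anchored at the end alone matches every name ending with "cluster", while anchoring both ends does not.

```python
import re

# Hypothetical names; only the first one is the intended match.
names = ["cluster", "pgbouncer_cluster", "some_exporter_cluster", "cluster_id"]

loose = re.compile(r"cluster$")    # matches anything that ends with "cluster"
strict = re.compile(r"^cluster$")  # matches only the exact name

loose_hits = [name for name in names if loose.search(name)]
strict_hits = [name for name in names if strict.search(name)]
```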

github-actions · 2023-04-12T15:24:42Z

❗ By default, the pull request is configured to backport to all release branches.

To stop backporting this pr, remove the label: backport-requested ◀️ or add the label 'do not backport'
To stop backporting this pr to a certain release branch, remove the specific branch label: release-x.y

jsilvela · 2023-05-05T16:18:37Z

Sorry for the delay. Been deploying this locally.
There is a problem with the datasource references as you have them in your files.

"datasource": {
        "type": "prometheus",
        "uid": "${DS_PROMETHEUS}"
}

These will not work properly with the Grafana we deploy from the quickstart, which uses the kube-prometheus-stack.
We can substitute "${DS_PROMETHEUS}" with "prometheus", as that is the uid generated for the data source in the kube-prometheus-stack deployment.
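A minimal sketch of that substitution, assuming the dashboard is available as parsed JSON. The helper name `pin_datasource_uid` and the one-panel dashboard are hypothetical; only the placeholder and the replacement uid come from the comment above.

```python
import json


def pin_datasource_uid(node, placeholder="${DS_PROMETHEUS}", uid="prometheus"):
    """Return a copy of a dashboard structure with the uid placeholder replaced."""
    if isinstance(node, dict):
        return {
            key: uid
            if key == "uid" and value == placeholder
            else pin_datasource_uid(value, placeholder, uid)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [pin_datasource_uid(item, placeholder, uid) for item in node]
    return node  # strings, numbers, booleans and None are returned unchanged


# Hypothetical one-panel dashboard using the placeholder from the export.
dashboard = json.loads(
    '{"panels": [{"datasource": {"type": "prometheus", "uid": "${DS_PROMETHEUS}"}}]}'
)
fixed = pin_datasource_uid(dashboard)
```

Walking the parsed structure instead of doing a plain text replace limits the change to `uid` fields, so the same placeholder appearing elsewhere in the file would be left alone.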

jsilvela · 2023-05-05T16:21:19Z

Since you took the JSON file, in particular, from a Grafana deployed with a different mechanism, I think there is value in keeping it as-is.
The YAML, though, I would like to alter to make sure beginners following the quickstart get the dashboard working out of the box.
So I plan to add a bit more context to the documentation, and perhaps fix your files while also adding your JSON untouched under a new name.

jsilvela · 2023-05-05T16:23:36Z

On the dashboard itself, the amount of work and polish is ... 👏 bravo.

jsilvela · 2023-05-08T07:16:54Z

With Itay's latest commit, the dashboard works out of the box if following the quickstart document.
Doing a detailed review of the panels...

sxd · 2023-05-23T06:56:21Z

@itay-grudev following @jsilvela's advice, I think we should add a disclaimer to the quickstart in this PR, noting that some features may fail because they need a really new/early version of Grafana.

I'm also adding myself as a reviewer to this PR because I really want to try it! :D

benoitschipper · 2023-05-23T07:49:58Z

@sxd and @itay-grudev I'll happily test/review what I can. Already have the dashboard running :)

itay-grudev · 2023-05-23T08:40:06Z

@sxd I discussed offline with @jsilvela that the alpha metrics are enabled by default on Prometheus. I think it's a limitation of KinD that the volume data is not reported. I've tested it on AWS EKS and Digital Ocean and there it works fine.

I still need to add the change @benoitschipper proposed.

sxd · 2023-05-23T09:11:37Z

@itay-grudev I've added a warning message that will help users understand why the graphs don't have data. Can you review it, please? I understand that those are limited in KinD, but lots of users use KinD for testing, and at the very least we need to be nice and show the warning message.

benoitschipper · 2023-05-24T04:27:26Z

> @sxd I discussed offline with @jsilvela that the alpha metrics are enabled by default on Prometheus. I think it's a limitation of KinD that the volume data is not reported. I've tested it on AWS EKS and Digital Ocean and there it works fine.
>
> I still need to add the change @benoitschipper proposed.

I'll check/review it by testing as soon as you have added the regex proposal, so I can confirm whether it works.

itay-grudev · 2023-05-25T16:09:38Z

@benoitschipper Done. Test if it works for you now.

benoitschipper · 2023-05-25T16:40:57Z

> @benoitschipper Done. Test if it works for you now.

Will do!

benoitschipper

@itay-grudev I loaded the dashboard and went through all the graphs on two separate live clusters. Everything worked without me making any changes (only the name). Works like a charm!

Signed-off-by: Itay Grudev <igrudev@clustermarket.com>

* Base backup and last archived WAL information
* Total CPU and memory usage across all nodes
* An Alerts section
* Last failover/switchover time
* Cluster PostgreSQL version
* Transactions per second
* Volume usage and total database size
* Replication/Write/Flush/Replay lag
* Added instance zone
* Displaying the full PostgreSQL version
* Bug Fix: When there are redundant `kube-state-metrics` instances there are duplicate status gauges
* Bug Fix: Max Connections not displaying a progress bar in the gauge due to missing `min` and `max` values.
* Transposed configuration section
* Added volume space and inode usage gauges.
* Fixed tooltips and set shared crosshairs of all graphs.
* Bug Fix: Cluster variable query picking up any metric ending with cluster

---------

Signed-off-by: Itay Grudev <igrudev@clustermarket.com>
(cherry picked from commit aea4365)

itay-grudev requested review from fcanovai, gbartolini, leonardoce, mnencia, phisco, sxd and armru as code owners April 12, 2023 15:24

github-actions bot added the backport-requested ◀️, release-1.18 and release-1.19 labels Apr 12, 2023

itay-grudev force-pushed the grafana-dashboard branch 11 times, most recently from 744a2d8 to c5532ee Compare April 21, 2023 19:57

mnencia assigned jsilvela May 3, 2023

jsilvela force-pushed the grafana-dashboard branch from c5532ee to 75276dd Compare May 4, 2023 10:14

itay-grudev force-pushed the grafana-dashboard branch from 75276dd to 29c2aed Compare May 7, 2023 16:11

jsilvela added the no-issue label May 8, 2023

sxd self-assigned this May 23, 2023

sxd force-pushed the grafana-dashboard branch from 3abaaa2 to ad225d3 Compare May 23, 2023 07:47

itay-grudev force-pushed the grafana-dashboard branch 3 times, most recently from ae65888 to 4df2bb3 Compare May 25, 2023 16:00

benoitschipper approved these changes May 26, 2023

sxd force-pushed the grafana-dashboard branch from 4df2bb3 to 773086c Compare May 26, 2023 12:17

jsilvela approved these changes May 26, 2023

sxd force-pushed the grafana-dashboard branch from 773086c to 141aa3a Compare May 27, 2023 10:26

sxd approved these changes May 27, 2023

sxd added the ok to merge 👌 label May 27, 2023

gbartolini added the release-1.20 label May 28, 2023

jsilvela force-pushed the grafana-dashboard branch from 141aa3a to 4822f15 Compare May 31, 2023 11:34

Itay Grudev added 2 commits May 31, 2023 18:02

Added some documentation about alpha metrics used

17ec3ff

Signed-off-by: Itay Grudev <igrudev@clustermarket.com>

Grafana Dashboard Improvement

d1e7086

Signed-off-by: Itay Grudev <igrudev@clustermarket.com>

jsilvela force-pushed the grafana-dashboard branch from 4822f15 to d1e7086 Compare May 31, 2023 16:02

jsilvela merged commit aea4365 into cloudnative-pg:main May 31, 2023
13 checks passed

itay-grudev deleted the grafana-dashboard branch June 2, 2023 08:33

igorrenquin mentioned this pull request Aug 1, 2023

CNPG: MAJ SocialGouv/support#454

Open
