Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvement: Add owner_node on resource and resourcegroup for mscluster #1395

Merged
merged 17 commits into from
Apr 21, 2024

Conversation

webalexeu
Copy link
Contributor

@webalexeu webalexeu commented Jan 25, 2024

Hello,

This PR will add owner_node label on resource and resourcegroup metrics for mscluster
(It has been tested)

Let me know if you need additionnal details or some code adjustements

Thank you,
Alex

@webalexeu
Copy link
Contributor Author

webalexeu commented Feb 3, 2024

Hello reviewers @jkroepke @breed808,

Can you please have a look at this PR ?

Thank you,
Alex

@jkroepke
Copy link
Member

jkroepke commented Feb 3, 2024

Hi @webalexeu

I have this PR on my list, however I'm currently limited in PR review, since a destroyed my laptop weeks ago.

@webalexeu
Copy link
Contributor Author

Hi @webalexeu

I have this PR on my list, however I'm currently limited in PR review, since a destroyed my laptop weeks ago.

Do you want to have an output before and after the change ?

I can also provide the compiled version as well if it can help you

@jkroepke
Copy link
Member

jkroepke commented Feb 4, 2024

If you have a screenshot from the owner inside the sql system + the output from the metrics, it would help a lot. I dont have any mscluster expericence.

@webalexeu
Copy link
Contributor Author

webalexeu commented Feb 4, 2024

If you have a screenshot from the owner inside the sql system + the output from the metrics, it would help a lot. I dont have any mscluster expericence.

This is the output of the metrics and the cmdlet output of the cluster associated
(To keep output short, only the state but owner_node is defined on every metrics generated for both collectors)

windows_mscluster_resource_state gauge

windows_mscluster_resource_state{name="Cluster IP DC1",owner_group="Cluster",owner_node="node2",type="IP Address"} 3
windows_mscluster_resource_state{name="Cluster IP DC2",owner_group="Cluster",owner_node="node2",type="IP Address"} 3
windows_mscluster_resource_state{name="Cluster IP DC3",owner_group="Cluster",owner_node="node2",type="IP Address"} 2
windows_mscluster_resource_state{name="Cluster VIP",owner_group="Cluster",owner_node="node2",type="Network Name"} 2
windows_mscluster_resource_state{name="sql1",owner_group="sql1",owner_node="node2",type="SQL Server Availability Group"} 2
windows_mscluster_resource_state{name="sql1_ip_dc1",owner_group="sql1",owner_node="node2",type="IP Address"} 3
windows_mscluster_resource_state{name="sql1_ip_dc2",owner_group="sql1",owner_node="node2",type="IP Address"} 2
windows_mscluster_resource_state{name="sql1_ip_dc3",owner_group="sql1",owner_node="node2",type="IP Address"} 3
windows_mscluster_resource_state{name="sql1_sql1-ag1",owner_group="sql1",owner_node="node2",type="Network Name"} 2

Get-ClusterResource | Select Name,OwnerGroup,OwnerNode,ResourceType,State | ft
Name           OwnerGroup OwnerNode ResourceType                    State
----           ---------- --------- ------------                    -----
Cluster IP DC3 Cluster    node2     IP Address                    Offline
Cluster IP DC1 Cluster    node2     IP Address                    Offline
Cluster IP DC2 Cluster    node2     IP Address                     Online
Cluster VIP    Cluster    node2     Network Name                   Online
sql1           sql1       node2     SQL Server Availability Group  Online
sql1_ip_dc1    sql1       node2     IP Address                    Offline
sql1_ip_dc2    sql1       node2     IP Address                     Online
sql1_ip_dc3    sql1       node2     IP Address                    Offline
sql1_sql1-ag1  sql1       node2     Network Name                   Online

windows_mscluster_resourcegroup_state gauge

windows_mscluster_resourcegroup_state{name="Available Storage",owner_node="node2"} 1
windows_mscluster_resourcegroup_state{name="Cluster",owner_node="node2"} 0
windows_mscluster_resourcegroup_state{name="sql1",owner_node="node2"} 0

Get-ClusterGroup
Name              OwnerNode   State
----              ---------   -----
Available Storage node2     Offline
Cluster           node2      Online
sql1              node2      Online

@jkroepke
Copy link
Member

jkroepke commented Feb 4, 2024

I see, the owner node is something what could be changes from time to time, right?

Reading https://prometheus.io/docs/practices/naming/#labels

CAUTION: Remember that every unique combination of key-value label pairs represents a new time series, which can dramatically increase the amount of data stored. Do not use labels to store dimensions with high cardinality (many different label values), such as user IDs, email addresses, or other unbounded sets of values.

It's an sub-optimal use-case for an label value.

I can see, that owner node is attached to each metric. Is it sufficient to have the label only on windows_mscluster_resource_state and windows_mscluster_resourcegroup_state?

The label could be attach to results via operator, e.g.:

metric + on(name) group_left(owner_node) windows_mscluster_resource_state * 0

I'm aware that owner_group is attached everywhere, too. Thats also something, was is not strictly nessesary.

@webalexeu
Copy link
Contributor Author

Indeed, OwnerNode will change on every failover of ressources within the cluster

There is no need indeed to have that label on every metrics and have it on state should be enough

I will update then my PR

Thanks for the review

DiniFarb and others added 15 commits February 16, 2024 08:54
Signed-off-by: Dinifarb <andreas.vogt89@bluewin.ch>
Signed-off-by: Alexandre JARDON <28548335+webalexeu@users.noreply.github.com>
…rsFlag

Signed-off-by: Dinifarb <andreas.vogt89@bluewin.ch>
Signed-off-by: Alexandre JARDON <28548335+webalexeu@users.noreply.github.com>
Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.45.0 to 0.46.0.
- [Release notes](https://github.com/prometheus/common/releases)
- [Commits](prometheus/common@v0.45.0...v0.46.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/common
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Alexandre JARDON <28548335+webalexeu@users.noreply.github.com>
Bumps [github.com/containerd/containerd](https://github.com/containerd/containerd) from 1.7.0 to 1.7.11.
- [Release notes](https://github.com/containerd/containerd/releases)
- [Changelog](https://github.com/containerd/containerd/blob/main/RELEASES.md)
- [Commits](containerd/containerd@v1.7.0...v1.7.11)

---
updated-dependencies:
- dependency-name: github.com/containerd/containerd
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Alexandre JARDON <28548335+webalexeu@users.noreply.github.com>
Signed-off-by: Alexandre JARDON <28548335+webalexeu@users.noreply.github.com>
Signed-off-by: Alexandre JARDON <28548335+webalexeu@users.noreply.github.com>
Signed-off-by: Alexandre JARDON <28548335+webalexeu@users.noreply.github.com>
Signed-off-by: Alexandre JARDON <28548335+webalexeu@users.noreply.github.com>
Signed-off-by: dinifarb <andreas.vogt89@bluewin.ch>
Signed-off-by: Alexandre JARDON <28548335+webalexeu@users.noreply.github.com>
Bumps [github.com/yusufpapurcu/wmi](https://github.com/yusufpapurcu/wmi) from 1.2.3 to 1.2.4.
- [Release notes](https://github.com/yusufpapurcu/wmi/releases)
- [Commits](yusufpapurcu/wmi@v1.2.3...v1.2.4)

---
updated-dependencies:
- dependency-name: github.com/yusufpapurcu/wmi
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Alexandre JARDON <28548335+webalexeu@users.noreply.github.com>
Microsoft currently support Windows Server 2016 or newer, and Windows 10
and Windows 11 (21HR or later). Dropping support for end-of-life Windows
Server versions will reduce maintenance overhead for project
maintainers.

Signed-off-by: Ben Reedy <breed808@breed808.com>
Signed-off-by: Alexandre JARDON <28548335+webalexeu@users.noreply.github.com>
Co-authored-by: Jan-Otto Kröpke <github@jkroepke.de>
Signed-off-by: Ben Reedy <breed808@users.noreply.github.com>
Signed-off-by: Alexandre JARDON <28548335+webalexeu@users.noreply.github.com>
Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.16.0 to 0.17.0.
- [Commits](golang/sys@v0.16.0...v0.17.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Alexandre JARDON <28548335+webalexeu@users.noreply.github.com>
Signed-off-by: Alexandre JARDON <28548335+webalexeu@users.noreply.github.com>
Signed-off-by: Alexandre JARDON <28548335+webalexeu@users.noreply.github.com>
@jkroepke jkroepke marked this pull request as draft February 16, 2024 14:37
@jkroepke
Copy link
Member

I convert the PR to draft, please let me know, when you finished

@webalexeu
Copy link
Contributor Author

webalexeu commented Feb 16, 2024

Hello @jkroepke ,
I've review the PR based on our discussion
Finally, I've generate new metrics for owner node and not new labels
So based on the previous example, here is the new metrics generated:

HELP windows_mscluster_resource_owner_node The node hosting the resource. 0: Not hosted; 1: Hosted
TYPE windows_mscluster_resource_owner_node gauge

windows_mscluster_resource_owner_node{name="Cluster IP DC1",owner_group="Cluster",node_name="node1",type="IP Address"} 1
windows_mscluster_resource_owner_node{name="Cluster IP DC1",owner_group="Cluster",node_name="node2",type="IP Address"} 0
windows_mscluster_resource_owner_node{name="Cluster IP DC2",owner_group="Cluster",node_name="node1",type="IP Address"} 1
windows_mscluster_resource_owner_node{name="Cluster IP DC2",owner_group="Cluster",node_name="node2",type="IP Address"} 0
windows_mscluster_resource_owner_node{name="Cluster IP DC3",owner_group="Cluster",node_name="node1",type="IP Address"} 1
windows_mscluster_resource_owner_node{name="Cluster IP DC3",owner_group="Cluster",node_name="node2",type="IP Address"} 0
windows_mscluster_resource_owner_node{name="Cluster VIP",owner_group="Cluster",node_name="node1",type="Network Name"} 1
windows_mscluster_resource_owner_node{name="Cluster VIP",owner_group="Cluster",node_name="node2",type="Network Name"} 0
windows_mscluster_resource_owner_node{name="sql1",owner_group="sql1",node_name="node1",type="SQL Server Availability Group"} 0
windows_mscluster_resource_owner_node{name="sql1",owner_group="sql1",node_name="node2",type="SQL Server Availability Group"} 1
windows_mscluster_resource_owner_node{name="sql1_ip_dc1",owner_group="sql1",node_name="node1",type="IP Address"} 0
windows_mscluster_resource_owner_node{name="sql1_ip_dc1",owner_group="sql1",node_name="node2",type="IP Address"} 1
windows_mscluster_resource_owner_node{name="sql1_ip_dc2",owner_group="sql1",node_name="node1",type="IP Address"} 0
windows_mscluster_resource_owner_node{name="sql1_ip_dc2",owner_group="sql1",node_name="node2",type="IP Address"} 1
windows_mscluster_resource_owner_node{name="sql1_ip_dc3",owner_group="sql1",node_name="node1",type="IP Address"} 0
windows_mscluster_resource_owner_node{name="sql1_ip_dc3",owner_group="sql1",node_name="node2",type="IP Address"} 1
windows_mscluster_resource_owner_node{name="sql1_sql1-ag1",owner_group="sql1",node_name="node1",type="Network Name"} 0
windows_mscluster_resource_owner_node{name="sql1_sql1-ag1",owner_group="sql1",node_name="node2",type="Network Name"} 1

Get-ClusterResource | Select Name,OwnerGroup,OwnerNode,ResourceType,State | ft
Name           OwnerGroup OwnerNode ResourceType                    State
----           ---------- --------- ------------                    -----
Cluster IP DC3 Cluster    node1     IP Address                    Offline
Cluster IP DC1 Cluster    node1     IP Address                    Offline
Cluster IP DC2 Cluster    node1     IP Address                     Online
Cluster VIP    Cluster    node1     Network Name                   Online
sql1           sql1       node2     SQL Server Availability Group  Online
sql1_ip_dc1    sql1       node2     IP Address                    Offline
sql1_ip_dc2    sql1       node2     IP Address                     Online
sql1_ip_dc3    sql1       node2     IP Address                    Offline
sql1_sql1-ag1  sql1       node2     Network Name                   Online

HELP windows_mscluster_resourcegroup_owner_node The node hosting the resource group. 0: Not hosted; 1: Hosted
TYPE windows_mscluster_resourcegroup_owner_node gauge

windows_mscluster_resourcegroup_owner_node{name="Available Storage",node_name="node1"} 0
windows_mscluster_resourcegroup_owner_node{name="Available Storage",node_name="node2"} 1
windows_mscluster_resourcegroup_owner_node{name="Cluster",node_name="node1"} 1
windows_mscluster_resourcegroup_owner_node{name="Cluster",node_name="node2"} 0
windows_mscluster_resourcegroup_owner_node{name="sql1",node_name="node1"} 0
windows_mscluster_resourcegroup_owner_node{name="sql1",node_name="node2"} 1

Get-ClusterGroup
Name              OwnerNode   State
----              ---------   -----
Available Storage node2     Offline
Cluster           node1      Online
sql1              node2      Online

Let me know if you want additionnal details or some code review

Thank you

@webalexeu webalexeu marked this pull request as ready for review February 16, 2024 15:17
@jkroepke
Copy link
Member

@webalexeu It looks good to me here.

You last post explains the feature well. Could you please copy the content to the docs docs/collector.mscluster_resource.md? It explains it really well!

@webalexeu
Copy link
Contributor Author

@webalexeu It looks good to me here.

You last post explains the feature well. Could you please copy the content to the docs docs/collector.mscluster_resource.md? It explains it really well!

I already put example and useful queries

Under which section do you want me to put it ?

Copy link
Member

@jkroepke jkroepke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@jkroepke jkroepke merged commit b1c272a into prometheus-community:master Apr 21, 2024
5 checks passed
@webalexeu
Copy link
Contributor Author

Hello @jkroepke ,

Any idea when you will release a new version of the exporter ?

Have a nice weekend,
Alex

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants