HOSTEDCP-1044: Add NodePool Size/Replicas to Telemetry #2682

muraee · 2023-06-13T14:17:33Z

What this PR does / why we need it:
Add hypershift_nodepools_size and hypershift_nodepools_available_replicas metrics to Telemetry

Which issue(s) this PR fixes (optional; use the format fixes #<issue_number>(, fixes #<issue_number>, ...), where issue_number may be a GitHub issue or a Jira story):
Fixes #HOSTEDCP-1044

Checklist

Subject and description added to both, commit and PR.
Relevant issues have been referenced.
This change includes docs.
This change includes unit tests.

openshift-ci-robot · 2023-06-13T14:17:37Z

@muraee: This pull request references HOSTEDCP-1044 which is a valid jira issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci · 2023-06-13T14:24:39Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: muraee
Once this PR has been reviewed and has the lgtm label, please assign enxebre for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

enxebre · 2023-06-14T09:52:40Z

cmd/install/assets/recordingrules/hypershift.yaml

+    expr: max by(platform) (hypershift_nodepools)
+
+  - record: hypershift:nodepools:size


Would this record one time series per NodePool in a platform? E.g., in a management cluster with 80 HCs and 3 NodePools each, that would be 240 time series.
Do we know what our cardinality budget for telemetry is? cc @csrwng

muraee · 2023-06-14 (edited)

Right, the other alternative is to use sum by (platform), which would yield the total number of nodes per platform across all clusters/NodePools, but I am not sure that would be useful.
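
The cardinality trade-off in this thread can be sketched with a small simulation. This is a minimal illustration, not HyperShift code: the NodePool records, the `cluster`/`name`/`platform`/`size` fields, the single AWS platform, and the size of 2 nodes per NodePool are all hypothetical. Only the figure of 80 hosted clusters with 3 NodePools each comes from the review comment.

```python
from collections import Counter


def series_per_nodepool(nodepools):
    """Recording one series per NodePool: cardinality equals the NodePool count."""
    return len(nodepools)


def series_sum_by_platform(nodepools):
    """Mimic `sum by (platform)`: one series per platform, valued at the total size."""
    totals = Counter()
    for nodepool in nodepools:
        totals[nodepool["platform"]] += nodepool["size"]
    return dict(totals)


# Hypothetical management cluster: 80 hosted clusters, 3 NodePools each,
# all on one platform, 2 nodes per NodePool (assumed values).
nodepools = [
    {"cluster": f"hc-{c}", "name": f"np-{n}", "platform": "AWS", "size": 2}
    for c in range(80)
    for n in range(3)
]

print(series_per_nodepool(nodepools))     # 240
print(series_sum_by_platform(nodepools))  # {'AWS': 480}
```

Under these assumptions the per-NodePool rule grows with the fleet (240 series here), while the aggregated rule stays at one series per platform but loses the per-NodePool breakdown, which is the usefulness concern raised in the reply.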

openshift-ci · 2023-07-22T00:08:23Z

@muraee: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/e2e-aws	`2bf624e`	link	true	`/test e2e-aws`
ci/prow/e2e-kubevirt-aws-ovn	`2bf624e`	link	true	`/test e2e-kubevirt-aws-ovn`

openshift-bot · 2023-10-21T01:00:14Z

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot · 2023-11-20T08:30:59Z

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-bot · 2023-12-21T00:00:19Z

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

openshift-ci · 2023-12-21T00:00:52Z

@openshift-bot: Closed this PR.

muraee · 2024-02-16T13:40:25Z

/reopen

openshift-ci · 2024-02-16T13:42:42Z

@muraee: Failed to re-open PR: state cannot be changed. There is already an open pull request from muraee:nodepools-metrics-telemetry to openshift:main.

Commit 2bf624e: add nodepools metric for telemetry

openshift-ci-robot added the jira/valid-reference label Jun 13, 2023

openshift-ci bot added the do-not-merge/needs-area label Jun 13, 2023

openshift-ci bot requested review from enxebre and isco-rodriguez June 13, 2023 14:23

openshift-ci bot added the area/cli label and removed the do-not-merge/needs-area label Jun 13, 2023

enxebre reviewed Jun 14, 2023

openshift-ci bot added the lifecycle/stale label Oct 21, 2023

openshift-ci bot added the lifecycle/rotten label and removed the lifecycle/stale label Nov 20, 2023

openshift-ci bot closed this Dec 21, 2023
