-
Notifications
You must be signed in to change notification settings - Fork 605
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 1806518: Monitoring Dashboard metrics corrections #4323
Conversation
@abhi-kn: This pull request references Bugzilla bug 1802300, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/kind bug |
\cc @openshift/openshift-team-monitoring |
@@ -71,12 +71,21 @@ export const monitoringDashboardQueries: MonitoringQuery[] = [ | |||
humanize: humanizeBinaryBytes, | |||
byteDataType: ByteDataTypes.BinaryBytes, | |||
}, | |||
{ | |||
query: _.template( | |||
`topk(25, sort_desc(sum((kubelet_volume_stats_used_bytes {namespace='<%= namespace %>'}* on (namespace,persistentvolumeclaim) group_right() kube_pod_spec_volumes_persistentvolumeclaims_info)) by (namespace,pod)))`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as far as I know these metrics refer to persistent volumes, the title though states Local Storage Usage
. Is this intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping @abhi-kn
@@ -85,7 +94,7 @@ export const monitoringDashboardQueries: MonitoringQuery[] = [ | |||
`topk(25, sort_desc(sum(rate(container_network_receive_bytes_total{ container="POD", pod!= "", namespace = '<%= namespace %>'}[5m]) + rate(container_network_transmit_bytes_total{ container="POD", pod!= "", namespace = '<%= namespace %>'}[5m])) BY (namespace, pod)))`, | |||
), | |||
chartType: GraphTypes.line, | |||
title: 'Network Received ', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why change this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Query includes both Received & Transmitted network so changed the title as per query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure about that? If so then that seems like a bug because there is the separate metric container_network_transmit_bytes_total
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Separated received & transmitted bandwidth.
{ | ||
query: _.template( | ||
`topk(25, sort_desc(sum(pod:container_fs_usage_bytes:sum{container="",pod!="",namespace='<%= namespace %>'}) BY (pod, namespace)))`, | ||
), | ||
chartType: GraphTypes.line, | ||
title: 'Filesystem Usage', | ||
title: 'Attached Storage Usage', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the metric used in the query refers to the amount of space the containers rootfs uses, which I don't think is particularly interesting to have on display here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@serenamarie125 do you still want to keep it or shall I remove?
/bugzilla refresh |
@spadgett: This pull request references Bugzilla bug 1802300, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@abhi-kn: This pull request references Bugzilla bug 1806518, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@abhi-kn: This pull request references Bugzilla bug 1806518, which is valid. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@serenamarie125 please have a look |
@abhi-kn PR needs rebase cc @vikram-raj |
/retest |
@abhi-kn: This pull request references Bugzilla bug 1806518, which is valid. 3 validation(s) were run on this bug
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/lgtm |
/cherry-pick release-4.4 |
@abhi-kn: once the present PR merges, I will cherry-pick it on top of release-4.4 in a new PR and assign it to you. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: abhi-kn, christianvogt, invincibleJai The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest Please review the full test history for this PR and help us cut down flakes. |
12 similar comments
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
@abhi-kn: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@abhi-kn: All pull requests linked via external trackers have merged. Bugzilla bug 1806518 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/cherry-pick release-4.4 |
@andrewballantyne: new pull request created: #4566 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=1806518
https://issues.redhat.com/browse/ODC-2943
Analysis / Root cause:
Metrics queries for Dashboard & Metrics tab have been updated in design doc. Hence code need to be synced w.r.t queries / order / labels.
Step count in request was too low (6) & was leading to performance issue due to huge data set in response.
Solution Description:
Updated labels for OOB queries in Monitoring/Metrics.
Synchronised OOB queries in Dashboard & Workload metrics.
Increased step count to 60 by setting
defaultSamples
count which fixed performance issue.Screen shots / Gifs for design review:
@openshift/team-devconsole-ux
Test setup: N/A
Browser conformance: