Bug 1986061: Monitor openshift-network-diagnostics namespace #1190

creydr (Member) commented Aug 26, 2021

CNO (the Cluster Network Operator) deploys a ServiceMonitor named "network-check-source" that is never picked up by Prometheus.

This PR adds the openshift.io/cluster-monitoring="true" label to the openshift-network-diagnostics namespace and fixes the RBAC so that the service is picked up for monitoring.
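
For illustration only, the effect of the label can be sketched in Go. This is a minimal sketch, not code from this PR: it assumes that the cluster monitoring stack only considers namespaces labeled openshift.io/cluster-monitoring="true", and the helper name isClusterMonitored is hypothetical. It works on a plain label map and makes no API calls.

```go
package main

import "fmt"

// clusterMonitoringLabel is the namespace label this PR adds.
const clusterMonitoringLabel = "openshift.io/cluster-monitoring"

// isClusterMonitored is a hypothetical helper (not part of CNO). It reports
// whether a namespace's labels opt it in to cluster monitoring, assuming the
// selector matches the label value "true" exactly.
func isClusterMonitored(labels map[string]string) bool {
	return labels[clusterMonitoringLabel] == "true"
}

func main() {
	// Namespace labels before and after the change, reduced to the one
	// label that matters here.
	before := map[string]string{}
	after := map[string]string{clusterMonitoringLabel: "true"}

	fmt.Println("before:", isClusterMonitored(before))
	fmt.Println("after: ", isClusterMonitored(after))
}
```

Under that assumption, the ServiceMonitor is ignored until its namespace carries the label, which is why the label is needed alongside the RBAC fix.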

How to verify:

  1. Spin up a cluster with this patch.
  2. Log in to the cluster dashboard, go to Observe -> Metrics, and query for one of the metrics of the checkendpoints pod, e.g. pod_network_connectivity_check_count:
    endpointCheckCounter = metrics.NewCounterVec(&metrics.CounterOpts{
        Name: "pod_network_connectivity_check_count",
        Help: "Report status of pod network connectivity checks over time.",
    }, []string{"component", "checkName", "targetEndpoint", "tcpConnect", "dnsResolve"})
    tcpConnectLatencyGauge = metrics.NewGaugeVec(&metrics.GaugeOpts{
        Name: "pod_network_connectivity_check_tcp_connect_latency_gauge",
        Help: "Report latency of TCP connect to target endpoint over time.",
    }, []string{"component", "checkName", "targetEndpoint"})
    dnsResolveLatencyGauge = metrics.NewGaugeVec(&metrics.GaugeOpts{
        Name: "pod_network_connectivity_check_dns_resolve_latency_gauge",
        Help: "Report latency of DNS resolve of target endpoint over time.",
    }, []string{"component", "checkName", "targetEndpoint"})
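
The query from step 2 could also be checked outside the console. The sketch below is an assumption-laden illustration, not part of this PR: it assumes a Prometheus-compatible HTTP API at a placeholder base URL (the host is hypothetical and authentication is omitted), builds an instant-query URL for /api/v1/query, and counts the series in a response. The helper names buildQueryURL and seriesCount and the sample response body are invented for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// queryResponse models only the fields of a Prometheus instant-query
// response that this sketch needs.
type queryResponse struct {
	Status string `json:"status"`
	Data   struct {
		Result []struct {
			Metric map[string]string `json:"metric"`
		} `json:"result"`
	} `json:"data"`
}

// buildQueryURL is a hypothetical helper that builds an instant-query URL
// for the given metric name against a Prometheus-compatible base URL.
func buildQueryURL(base, metric string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/api/v1/query"
	q := url.Values{}
	q.Set("query", metric)
	u.RawQuery = q.Encode()
	return u.String(), nil
}

// seriesCount is a hypothetical helper that returns how many series a
// successful query response contains.
func seriesCount(body []byte) (int, error) {
	var r queryResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return 0, err
	}
	if r.Status != "success" {
		return 0, fmt.Errorf("query status %q", r.Status)
	}
	return len(r.Data.Result), nil
}

func main() {
	// Placeholder host: substitute the cluster's real query endpoint.
	u, err := buildQueryURL("https://prometheus.example.invalid", "pod_network_connectivity_check_count")
	if err != nil {
		panic(err)
	}
	fmt.Println(u)

	// Invented sample body, shaped like an instant-query response.
	sample := []byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"pod_network_connectivity_check_count"},"value":[0,"1"]}]}}`)
	n, err := seriesCount(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("series returned:", n)
}
```

A count of zero for a real response would suggest that the checkendpoints metrics are still not being scraped.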

@openshift-ci openshift-ci bot added the following labels Aug 26, 2021:
  • do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
  • bugzilla/severity-unspecified: Referenced Bugzilla bug's severity is unspecified for the PR.
  • bugzilla/invalid-bug: Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting.
openshift-ci bot (Contributor) commented Aug 26, 2021

@creydr: This pull request references Bugzilla bug 1986061, which is invalid:

  • expected the bug to target the "4.9.0" release, but it targets "---" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Bug 1986061: Monitor openshift-network-diagnostics namespace

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci bot (Contributor) commented Aug 26, 2021

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci bot requested review from aojea and trozet August 26, 2021 10:52
@creydr creydr force-pushed the monitor-openshift-network-diagnostics-namespace branch 2 times, most recently from 585dccf to 81089ce Compare August 27, 2021 07:46
Signed-off-by: Christoph Stäbler <cstabler@redhat.com>
@creydr creydr force-pushed the monitor-openshift-network-diagnostics-namespace branch 2 times, most recently from 517904e to 7829aaf Compare August 27, 2021 08:26
Signed-off-by: Christoph Stäbler <cstabler@redhat.com>
@creydr creydr force-pushed the monitor-openshift-network-diagnostics-namespace branch from 7829aaf to 71b5c3f Compare August 27, 2021 08:26
openshift-ci bot (Contributor) commented Aug 27, 2021

@creydr: This pull request references Bugzilla bug 1986061, which is invalid:

  • expected the bug to target the "4.9.0" release, but it targets "---" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Bug 1986061: Monitor openshift-network-diagnostics namespace

@creydr creydr marked this pull request as ready for review August 27, 2021 08:34
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Aug 27, 2021
creydr (Member, Author) commented Aug 27, 2021

/bugzilla refresh

@openshift-ci openshift-ci bot added the bugzilla/valid-bug label (indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting) and removed the bugzilla/invalid-bug label (indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting) Aug 27, 2021
openshift-ci bot (Contributor) commented Aug 27, 2021

@creydr: This pull request references Bugzilla bug 1986061, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.9.0) matches configured target release for branch (4.9.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @zhaozhanqi

In response to this:

/bugzilla refresh

@openshift-ci openshift-ci bot requested a review from zhaozhanqi August 27, 2021 08:35
creydr (Member, Author) commented Aug 27, 2021

/bugzilla refresh
/retest required

@openshift-ci openshift-ci bot removed the bugzilla/severity-unspecified label (referenced Bugzilla bug's severity is unspecified for the PR) Aug 27, 2021
openshift-ci bot (Contributor) commented Aug 27, 2021

@creydr: This pull request references Bugzilla bug 1986061, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.9.0) matches configured target release for branch (4.9.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @zhaozhanqi

In response to this:

/bugzilla refresh
/retest required

@openshift-ci openshift-ci bot added the bugzilla/severity-medium label (referenced Bugzilla bug's severity is medium for the branch this PR is targeting) Aug 27, 2021
openshift-ci bot (Contributor) commented Aug 27, 2021

@creydr: The /retest command does not accept any targets.
The following commands are available to trigger required jobs:

  • /test e2e-agnostic-upgrade
  • /test e2e-aws-ovn-windows
  • /test e2e-aws-sdn-multi
  • /test e2e-gcp
  • /test e2e-gcp-ovn
  • /test e2e-metal-ipi-ovn-ipv6
  • /test images
  • /test unit
  • /test verify

The following commands are available to trigger optional jobs:

  • /test e2e-azure-ovn
  • /test e2e-azure-ovn-dualstack
  • /test e2e-gcp-ovn-upgrade
  • /test e2e-metal-ipi-ovn-ipv6-ipsec
  • /test e2e-openstack
  • /test e2e-openstack-kuryr
  • /test e2e-openstack-ovn
  • /test e2e-ovn-hybrid-step-registry
  • /test e2e-ovn-ipsec-step-registry
  • /test e2e-ovn-step-registry
  • /test e2e-vsphere-ovn
  • /test e2e-vsphere-windows

Use /test all to run the following jobs that were automatically triggered:

  • pull-ci-openshift-cluster-network-operator-master-e2e-agnostic-upgrade
  • pull-ci-openshift-cluster-network-operator-master-e2e-aws-ovn-windows
  • pull-ci-openshift-cluster-network-operator-master-e2e-aws-sdn-multi
  • pull-ci-openshift-cluster-network-operator-master-e2e-azure-ovn
  • pull-ci-openshift-cluster-network-operator-master-e2e-gcp
  • pull-ci-openshift-cluster-network-operator-master-e2e-gcp-ovn
  • pull-ci-openshift-cluster-network-operator-master-e2e-gcp-ovn-upgrade
  • pull-ci-openshift-cluster-network-operator-master-e2e-metal-ipi-ovn-ipv6
  • pull-ci-openshift-cluster-network-operator-master-e2e-metal-ipi-ovn-ipv6-ipsec
  • pull-ci-openshift-cluster-network-operator-master-e2e-openstack-ovn
  • pull-ci-openshift-cluster-network-operator-master-e2e-ovn-hybrid-step-registry
  • pull-ci-openshift-cluster-network-operator-master-e2e-ovn-ipsec-step-registry
  • pull-ci-openshift-cluster-network-operator-master-e2e-ovn-step-registry
  • pull-ci-openshift-cluster-network-operator-master-e2e-vsphere-ovn
  • pull-ci-openshift-cluster-network-operator-master-e2e-vsphere-windows
  • pull-ci-openshift-cluster-network-operator-master-images
  • pull-ci-openshift-cluster-network-operator-master-unit
  • pull-ci-openshift-cluster-network-operator-master-verify

In response to this:

/bugzilla refresh
/retest required

creydr (Member, Author) commented Aug 27, 2021

/retest-required

creydr (Member, Author) commented Aug 27, 2021

/retest

creydr (Member, Author) commented Aug 27, 2021

/retest-required

openshift-bot (Contributor) commented

/retest-required

Please review the full test history for this PR and help us cut down flakes.

2 similar comments

abhat (Contributor) commented Sep 3, 2021

/override ci/prow/e2e-gcp-ovn
Fails because of: https://bugzilla.redhat.com/show_bug.cgi?id=2000589

openshift-ci bot (Contributor) commented Sep 3, 2021

@abhat: Overrode contexts on behalf of abhat: ci/prow/e2e-gcp-ovn

In response to this:

/override ci/prow/e2e-gcp-ovn
Fails because of: https://bugzilla.redhat.com/show_bug.cgi?id=2000589

openshift-bot (Contributor) commented

/retest-required

Please review the full test history for this PR and help us cut down flakes.

1 similar comment

dcbw (Member) commented Sep 3, 2021

/retest

openshift-bot (Contributor) commented

/retest-required

Please review the full test history for this PR and help us cut down flakes.

2 similar comments

dcbw (Member) commented Sep 3, 2021

/override ci/prow/e2e-gcp
openshift/origin#26444

openshift-bot (Contributor) commented

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-ci bot (Contributor) commented Sep 3, 2021

@dcbw: Overrode contexts on behalf of dcbw: ci/prow/e2e-gcp

In response to this:

/override ci/prow/e2e-gcp
openshift/origin#26444

openshift-bot (Contributor) commented

/retest-required

Please review the full test history for this PR and help us cut down flakes.

2 similar comments

dcbw (Member) commented Sep 3, 2021

Still openshift/origin#26444
/retest-required

openshift-bot (Contributor) commented

/retest-required

Please review the full test history for this PR and help us cut down flakes.

2 similar comments

dcbw (Member) commented Sep 4, 2021

/retest-required

openshift-bot (Contributor) commented

/retest-required

Please review the full test history for this PR and help us cut down flakes.

2 similar comments

openshift-ci bot (Contributor) commented Sep 4, 2021

@creydr: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                    Commit   Details  Rerun command
ci/prow/e2e-azure-ovn        71b5c3f  link     /test e2e-azure-ovn
ci/prow/e2e-gcp-ovn-upgrade  71b5c3f  link     /test e2e-gcp-ovn-upgrade
ci/prow/e2e-vsphere-ovn      71b5c3f  link     /test e2e-vsphere-ovn

Full PR test history. Your PR dashboard.

dcbw (Member) commented Sep 4, 2021

/override ci/prow/e2e-agnostic-upgrade
https://bugzilla.redhat.com/show_bug.cgi?id=2001231

openshift-ci bot (Contributor) commented Sep 4, 2021

@dcbw: Overrode contexts on behalf of dcbw: ci/prow/e2e-agnostic-upgrade

In response to this:

/override ci/prow/e2e-agnostic-upgrade
https://bugzilla.redhat.com/show_bug.cgi?id=2001231

@openshift-merge-robot openshift-merge-robot merged commit 8437b07 into openshift:master Sep 4, 2021
openshift-ci bot (Contributor) commented Sep 4, 2021

@creydr: All pull requests linked via external trackers have merged:

Bugzilla bug 1986061 has been moved to the MODIFIED state.

In response to this:

Bug 1986061: Monitor openshift-network-diagnostics namespace

Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • bugzilla/severity-medium: Referenced Bugzilla bug's severity is medium for the branch this PR is targeting.
  • bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
  • lgtm: Indicates that a PR is ready to be merged.

6 participants