Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OCPBUGS-3541: Don't create route metrics for ingress controllers that are not admitted #869

Merged

Conversation

gcs278
Copy link
Contributor

@gcs278 gcs278 commented Dec 13, 2022

To fix OCPBUGS-3541, don't create metrics for ingress controller that are not admitted. The Ingress Operator won't add the finalizer if the Ingress Controller was not admitted. Since there is no finalizer, we can't delete any metrics that we created for the Ingress Controller upon deletion. Only adding metrics for Ingress Controllers that are admitted will correct this bug.

Furthermore, if an ingress controller is not admitted, it can't have any routes, so there isn't a large point in creating the metrics for it yet.

@openshift-ci-robot openshift-ci-robot added the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Dec 13, 2022
@openshift-ci-robot
Copy link
Contributor

@gcs278: This pull request references Jira Issue OCPBUGS-3541, which is invalid:

  • expected the bug to target the "4.13.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

To fix OCPBUGS-3541, don't create metrics for ingress controller that are not admitted. The Ingress Operator won't add the finalizer if the Ingress Controller was not admitted. Since there is no finalizer, we can't delete any metrics that we created for the Ingress Controller upon deletion. Only adding metrics for Ingress Controllers that are admitted will correct this bug.

Furthermore, if an ingress controller is not admitted, it can't have any routes, so there isn't a large point in creating the metrics for it yet.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@gcs278
Copy link
Contributor Author

gcs278 commented Dec 14, 2022

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Dec 14, 2022
@openshift-ci-robot
Copy link
Contributor

@gcs278: This pull request references Jira Issue OCPBUGS-3541, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.13.0) matches configured target version for branch (4.13.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @lihongan

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot requested a review from lihongan December 14, 2022 00:03
@@ -3,6 +3,7 @@ package routemetrics
import (
"context"
"fmt"
"github.com/openshift/cluster-ingress-operator/pkg/util/ingresscontroller"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This import can be grouped with the logf import.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the "golang.org/x/time/rate" got moved. * grin *. Can you put "golang.org/x/time/rate" back where it was?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I moved it cause it didn't seem related to openshift packages, but moved it back. Not sure what the organization is supposed to be.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. I don't know where this is clearly documented, and we are inconsistent with the ordering of imports across the files in this repository (and other repositories). We usually group imports into the following groups and ordering:

  • Standard packages (e.g. "context", "fmt", "time").
  • General non-standard packages (e.g. "github.com/davecgh/go-spew/spew", "github.com/google/go-cmp/cmp", "golang.org/x/time/whatever").
  • OpenShift packages (e.g. "github.com/openshift/api/operator/v1", "github.com/openshift/library-go/pkg/crypto", "github.com/openshift/cluster-ingress-operator/whatever"), except ofttimes we put github.com/openshift/cluster-ingress-operator/* in its own group, usually before or after other OpenShift packages and in a couple cases as the last group of imports.
  • Kubernetes packages (e.g. "k8s.io/api/core/v1", "k8s.io/apimachinery/pkg/api/errors"), except sometimes we put k8s.io/api/* and k8s.io/apimachinery/* in separate groups, and we usually put sigs.k8s.io/controller-runtime/* in another group after that.

So the basic pattern is typically this: standard packages, non-standard, github.com/openshift/*, k8s.io/*, sigs.k8s.io/*. However, we have a lot of exceptions, and we have a lot of imports that just don't follow any rules (I often see people do weird things with their imports, some editors screw with import ordering, and I often let it go during reviews).

This closest thing I have found to an upstream guideline is this obscure GitHub comment: kubernetes/kubernetes#103981 (comment). One interesting thing about that comment is that it puts imports for the local repository last. We put github.com/openshift/cluster-ingress-operator/* imports last in only a few files in this repository.

Anyway, if I'm being picky, "golang.org/x/time/rate" should be moved to a new group. However, that import was already grouped with the github.com/openshift/* imports, so I prefer the commit that fixes the bug just leave it where it was originally so as not to distract from the important changes in the commit.

@Miciah
Copy link
Contributor

Miciah commented Dec 14, 2022

Why did you regenerate bindata.go? You might need to change Go versions.

@Miciah
Copy link
Contributor

Miciah commented Dec 14, 2022

/assign

@gcs278
Copy link
Contributor Author

gcs278 commented Dec 14, 2022

Why did you regenerate bindata.go? You might need to change Go versions.

Sorry, I fixed it. I'm using go 1.18.6, but this is the update it keeps trying to make a spacing update, but I just ignore and don't check it in.

diff --git a/pkg/manifests/bindata.go b/pkg/manifests/bindata.go
index 326b5b40..d36f38c7 100644
--- a/pkg/manifests/bindata.go
+++ b/pkg/manifests/bindata.go
@@ -946,13 +946,11 @@ var _bindata = map[string]func() (*asset, error){
 // directory embedded in the file by go-bindata.
 // For example if you run go-bindata on data/... and data contains the
 // following hierarchy:
-//
-//     data/
-//       foo.txt
-//       img/
-//         a.png
-//         b.png
-//
+//     data/
+//       foo.txt
+//       img/
+//         a.png
+//         b.png
 // then AssetDir("data") would return []string{"foo.txt", "img"},
 // AssetDir("data/img") would return []string{"a.png", "b.png"},
 // AssetDir("foo.txt") and AssetDir("notexist") would return an error, and

@gcs278
Copy link
Contributor Author

gcs278 commented Dec 14, 2022

/hold
writing unit tests

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 14, 2022
@gcs278 gcs278 force-pushed the route-metrics-ic-admitted branch 3 times, most recently from e3db693 to 22be71c Compare December 15, 2022 15:18
@gcs278
Copy link
Contributor Author

gcs278 commented Dec 15, 2022

/unhold
Unit tests are written. @arkadeepsen FYSA, I learnt you can mock K8S clients fairly easy, so added unit tests to route-metrics controller.

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 15, 2022
@arkadeepsen
Copy link
Contributor

Unit tests are written. @arkadeepsen FYSA, I learnt you can mock K8S clients fairly easy, so added unit tests to route-metrics controller.

@gcs278 This is pretty easy and very useful. Thanks for adding the unit tests 👍

@gcs278
Copy link
Contributor Author

gcs278 commented Dec 21, 2022

all appear to be known issues
/retest

@gcs278
Copy link
Contributor Author

gcs278 commented Dec 22, 2022

/retest
unrelated failures

@gcs278
Copy link
Contributor Author

gcs278 commented Jan 3, 2023

/retest

@candita
Copy link
Contributor

candita commented Jan 6, 2023

/test e2e-gcp-ovn-serial

@alebedev87
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 24, 2023
@gcs278
Copy link
Contributor Author

gcs278 commented Apr 25, 2023

Mostly TestClientTLS Failures
/retest-required

@gcs278
Copy link
Contributor Author

gcs278 commented May 1, 2023

Known flakes except TestRouteMetricsControllerOnlyNamespaceSelector flaked. I see it happened one other time with https://search.ci.openshift.org/?search=error+waiting+for+route+metrics&maxAge=336h&context=1&type=build-log&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job so I'll keep an eye on it, but I think it's a seperate issue.
/retest-required

@gcs278
Copy link
Contributor Author

gcs278 commented May 1, 2023

Cluster install fail
/test e2e-azure-operator

@gcs278
Copy link
Contributor Author

gcs278 commented May 26, 2023

/retest e2e-hypershift

@openshift-ci
Copy link
Contributor

openshift-ci bot commented May 26, 2023

@gcs278: The /retest command does not accept any targets.
The following commands are available to trigger required jobs:

  • /test e2e-aws-operator
  • /test e2e-aws-ovn
  • /test e2e-aws-ovn-serial
  • /test e2e-aws-ovn-upgrade
  • /test e2e-azure-operator
  • /test e2e-gcp-operator
  • /test e2e-hypershift
  • /test images
  • /test unit
  • /test verify

The following commands are available to trigger optional jobs:

  • /test e2e-aws-ovn-single-node
  • /test e2e-azure-ovn
  • /test e2e-gcp-ovn

Use /test all to run all jobs.

In response to this:

/retest e2e-hypershift

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@gcs278
Copy link
Contributor Author

gcs278 commented May 26, 2023

/test e2e-hypershift

@gcs278
Copy link
Contributor Author

gcs278 commented Jun 21, 2023

/retest-required

@gcs278
Copy link
Contributor Author

gcs278 commented Jul 5, 2023

/retest

… metrics for ingress controllers that are not admitted
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Sep 25, 2023
…e function.

Add builder functions for building routes, namespaces, and ingress controllers.
@gcs278
Copy link
Contributor Author

gcs278 commented Sep 25, 2023

@alebedev87 minor changes in https://github.com/openshift/cluster-ingress-operator/compare/004ebf75059b4db3834e0e4502bbaaceedfd9ab9..7ee37ab92e459af483c3c2b1989fdb9de0864e4b

import ordering, fixing Build() methods to refer to struct as a pointer, and some other minor clean up.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 25, 2023

@gcs278: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-gcp-ovn-serial 8c34806 link true /test e2e-gcp-ovn-serial

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@gcs278
Copy link
Contributor Author

gcs278 commented Sep 25, 2023

https://issues.redhat.com/browse/OCPBUGS-13106
/test e2e-gcp-operator

disruption
/test e2e-gcp-ovn

weird error: the server doesn't have a resource type \"volumesnapshotclass\"\n"}
/test e2e-hypershift

@alebedev87
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 26, 2023
@alebedev87
Copy link
Contributor

/approve

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 26, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alebedev87

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 26, 2023
@openshift-ci-robot
Copy link
Contributor

/retest-required

Remaining retests: 0 against base HEAD ed6306f and 2 for PR HEAD 7ee37ab in total

@openshift-merge-robot openshift-merge-robot merged commit a10d93e into openshift:master Sep 26, 2023
14 checks passed
@openshift-ci-robot
Copy link
Contributor

@gcs278: Jira Issue OCPBUGS-3541: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-3541 has been moved to the MODIFIED state.

In response to this:

To fix OCPBUGS-3541, don't create metrics for ingress controller that are not admitted. The Ingress Operator won't add the finalizer if the Ingress Controller was not admitted. Since there is no finalizer, we can't delete any metrics that we created for the Ingress Controller upon deletion. Only adding metrics for Ingress Controllers that are admitted will correct this bug.

Furthermore, if an ingress controller is not admitted, it can't have any routes, so there isn't a large point in creating the metrics for it yet.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-merge-robot
Copy link
Contributor

Fix included in accepted release 4.15.0-0.nightly-2023-09-27-073353

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. jira/severity-low Referenced Jira bug's severity is low for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. qe-approved Signifies that QE has signed off on this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants