Updating ose-cluster-ingress-operator images to be consistent with ART #834

Conversation

Miciah
Contributor

@Miciah Miciah commented Oct 3, 2022

Reconciling with https://github.com/openshift/ocp-build-data/tree/b44f15ec9e84d1831eac81f8c757b3bed985dbeb/images/ose-cluster-ingress-operator.yml

This PR is based on #802 but additionally includes changes to the formatting of pkg/manifests/bindata.go that result from using the updated Go 1.19 builder image, as well as formatting changes to several other files to appease gofmt. Including these changes is necessary in order for the verify job to pass.

@Miciah
Contributor Author

Miciah commented Oct 4, 2022

Since this is just #802 plus formatting changes and #802 doesn't need a tracker, this PR shouldn't need a tracker either.
/approve
/label bugzilla/valid-bug

@openshift-ci
Contributor

openshift-ci bot commented Oct 4, 2022

@Miciah: Can not set label bugzilla/valid-bug: Must be member in one of these teams: [openshift-patch-managers openshift-staff-engineers]

In response to this:

Since this is just #802 plus formatting changes and #802 doesn't need a tracker, this PR shouldn't need a tracker either.
/approve
/label bugzilla/valid-bug

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci
Contributor

openshift-ci bot commented Oct 4, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Miciah

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 4, 2022
@Miciah
Contributor Author

Miciah commented Oct 4, 2022

e2e-aws-operator failed because TestUnmanagedDNSToManagedDNSInternalIngressController failed and one of the control-plane nodes did not become ready.
/test e2e-aws-operator

e2e-aws-ovn-single-node failed because of the following failures:

[sig-arch][bz-kube-apiserver][Late] Alerts alert/KubePodNotReady should not be at or above info in ns/openshift-kube-apiserver [Suite:openshift/conformance/parallel]
[sig-arch][bz-kube-controller-manager][Late] Alerts alert/KubePodNotReady should not be at or above info in ns/openshift-kube-controller-manager [Suite:openshift/conformance/parallel]
[sig-arch][bz-kube-scheduler][Late] Alerts alert/KubePodNotReady should not be at or above info in ns/openshift-kube-scheduler [Suite:openshift/conformance/parallel]

/test e2e-aws-ovn-single-node

e2e-aws-ovn-upgrade failed because of another KubePodNotReady issue.
/test e2e-aws-ovn-upgrade

e2e-gcp-ovn-serial failed because of KubePodNotReady errors as well as OVNKubernetesNorthboundDatabaseClusterMemberError, OVNKubernetesNorthboundDatabaseInboundConnectionMissing, OVNKubernetesNorthboundDatabaseOutboundConnectionMissing, OVNKubernetesSouthboundDatabaseClusterMemberError, OVNKubernetesSouthboundDatabaseInboundConnectionMissing, and OVNKubernetesSouthboundDatabaseOutboundConnectionMissing alerts.
/test e2e-gcp-ovn-serial

@Miciah
Contributor Author

Miciah commented Oct 4, 2022

https://issues.redhat.com/browse/TRT-589 is tracking the KubePodNotReady issues.

@Miciah
Contributor Author

Miciah commented Oct 4, 2022

e2e-aws-operator failed because various tests failed: TestUnmanagedDNSToManagedDNSInternalIngressController, TestManagedDNSToUnmanagedDNSIngressController, TestUnmanagedDNSToManagedDNSIngressController, TestAWSLBTypeChange, and TestInternalLoadBalancer, probably because kube-controller-manager (which provisions LBs) failed to roll out, along with kube-apiserver, kube-scheduler, and ovnkube-master.
/test e2e-aws-operator

e2e-aws-ovn-single-node failed because prometheus-operator is overly watchful:

: [sig-arch][Late] operators should not create watch channels very often [apigroup:config.openshift.io] [Suite:openshift/conformance/parallel]
Run #0: Failed
{  fail [github.com/openshift/origin/test/extended/apiserver/api_requests.go:453]: Expected
    <[]string | len:1, cap:1>: [
        "Operator \"prometheus-operator\" produces more watch requests than expected: watchrequestcount=215, upperbound=200, ratio=1.075",
    ]
to be empty
Ginkgo exit error 1: exit with code 1}
Run #1: Failed
{  fail [github.com/openshift/origin/test/extended/apiserver/api_requests.go:453]: Expected
    <[]string | len:1, cap:1>: [
        "Operator \"prometheus-operator\" produces more watch requests than expected: watchrequestcount=215, upperbound=200, ratio=1.075",
    ]
to be empty
Ginkgo exit error 1: exit with code 1}

/test e2e-aws-ovn-single-node

e2e-gcp-ovn-serial failed because of OVNKubernetes alerts, auth and console disruption, and an unhealthy etcd member.
/test e2e-gcp-ovn-serial

@Miciah
Contributor Author

Miciah commented Oct 5, 2022

e2e-gcp-ovn-serial failed because of OAuth and console disruption and an unhealthy etcd member.
/test e2e-gcp-ovn-serial

@alebedev87
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 20, 2022
@Miciah
Contributor Author

Miciah commented Oct 21, 2022

/test all
now that #843 has merged.
/test e2e-azure-operator
/test e2e-gcp-operator

@Miciah
Contributor Author

Miciah commented Oct 21, 2022

e2e-aws-operator failed, and the ingress-operator logs show that the hosted zone could not be found:

2022-10-21T15:46:35.700Z	ERROR	operator.dns_controller	dns/controller.go:359	failed to publish DNS record to zone	{"record": {"dnsName":"*.apps.ci-op-s5ip724i-43abb.origin-ci-int-aws.dev.rhcloud.com.","targets":["af72772571fe849f88ae21931f161e7a-1427110984.us-west-1.elb.amazonaws.com"],"recordType":"CNAME","recordTTL":30,"dnsManagementPolicy":"Managed"}, "dnszone": {"tags":{"Name":"/errorci-op-s5ip724i-43abb-j4v5k-int","kubernetes.io/cluster/ci-op-s5ip724i-43abb-j4v5k":"owned"}}, "error": "failed to find hosted zone for record: no matching hosted zone found"}

Let's see whether this is an aberration.
/test e2e-aws-operator

@Miciah
Contributor Author

Miciah commented Oct 21, 2022

e2e-gcp-ovn-serial failed because disruption/ingress-to-oauth-server connection/new and disruption/ingress-to-console connection/new failed.
/test e2e-gcp-ovn-serial

@Miciah
Contributor Author

Miciah commented Oct 21, 2022

e2e-aws-operator failed because TestRouteMetricsControllerRouteAndNamespaceSelector failed and because kube-apiserver, kube-controller-manager, kube-scheduler, and network had difficulties rolling out pods.

TestRouteMetricsControllerRouteAndNamespaceSelector failed because an API call failed:

    route_metrics_test.go:325: failed to update route: Operation cannot be fulfilled on routes.route.openshift.io "route-rs-ns-foo-label": the object has been modified; please apply your changes to the latest version and try again

If this becomes a recurring issue, we could add some retry logic here:

// updateRouteAndWaitForMetricsUpdate updates the Route and waits for metric to be updated to the expected value.
func updateRouteAndWaitForMetricsUpdate(t *testing.T, route *routev1.Route, prometheusClient prometheusv1.API, shardName string, value int) {
	// Update the Route resource.
	if err := kclient.Update(context.TODO(), route); err != nil {
		t.Fatalf("failed to update route: %v", err)

Otherwise, we might add retry logic there as part of a more comprehensive effort to add retries and make the E2E tests more resilient overall.
/test e2e-aws-operator

@Miciah
Contributor Author

Miciah commented Oct 22, 2022

e2e-aws-operator failed because TestUpdateDefaultIngressController failed and because authentication, kube-apiserver, kube-controller-manager, kube-scheduler, and storage had difficulties rolling out pods.

TestUpdateDefaultIngressController failed because a Kube API call failed:

    operator_test.go:508: failed to delete test secret: Delete "https://api.ci-op-0njv3kcw-43abb.origin-ci-int-aws.dev.rhcloud.com:6443/api/v1/namespaces/openshift-ingress/secrets/test-l28g7": read tcp 10.131.227.107:55492->52.5.128.156:6443: read: connection reset by peer

/test e2e-aws-operator

@Miciah
Contributor Author

Miciah commented Oct 22, 2022

e2e-aws-operator failed because kube-apiserver reported NodeInstallerProgressing.
/override ci/prow/e2e-aws-operator

@openshift-ci
Contributor

openshift-ci bot commented Oct 22, 2022

@Miciah: Overrode contexts on behalf of Miciah: ci/prow/e2e-aws-operator

In response to this:

e2e-aws-operator failed because kube-apiserver reported NodeInstallerProgressing.
/override ci/prow/e2e-aws-operator


@openshift-ci
Contributor

openshift-ci bot commented Oct 22, 2022

@Miciah: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@Miciah
Contributor Author

Miciah commented Oct 22, 2022

#802 was automatically labeled so it could merge without a bug or epic. This PR likewise shouldn't require a bug or epic.
/label docs-approved
/label px-approved
/label qe-approved

@openshift-ci openshift-ci bot added docs-approved Signifies that Docs has signed off on this PR px-approved Signifies that Product Support has signed off on this PR qe-approved Signifies that QE has signed off on this PR labels Oct 22, 2022
@openshift-merge-robot openshift-merge-robot merged commit 232edb0 into openshift:master Oct 22, 2022
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. docs-approved Signifies that Docs has signed off on this PR lgtm Indicates that a PR is ready to be merged. px-approved Signifies that Product Support has signed off on this PR qe-approved Signifies that QE has signed off on this PR
3 participants