
vendor: Bump client-go and library-go to current 4.6 tips (2020-07-24) #420

Merged: 3 commits into openshift:master on Jul 31, 2020

Conversation

@wking (Member) commented Jul 29, 2020

Pulling in openshift/client-go@94e1065ab1 and openshift/library-go@1ed21c4fa8. Generated with:

$ go get -u github.com/openshift/client-go@83993cebb5aee533bf2f2dded2d87b9e32835f4a
go: github.com/openshift/client-go 94e1065ab1f8a34cd8a0e1edb9ba8be0326e77b5 => v0.0.0-20200723130357-94e1065ab1f8
$ go get -u github.com/openshift/library-go@cc498c355c99983057e9e01f3abcceb40ca5c298
go: github.com/openshift/library-go 1ed21c4fa86c2bf6034823231b8e52df4f068d58 => v0.0.0-20200724192307-1ed21c4fa86c
$ go mod tidy
$ go mod vendor
$ git add -A go.* vendor

using:

$ go version
go version go1.14.4 linux/amd64

This gets us up to 1.19 for the Kubernetes-based dependencies, and should hopefully allow me to bump openshift/api to pick up openshift/api@3ae6754513 without hitting:

  go: github.com/openshift/cluster-version-operator/lib imports
        k8s.io/client-go/kubernetes/scheme imports
        k8s.io/api/auditregistration/v1alpha1: package provided by k8s.io/api at latest version v0.18.6 but not at required version v0.19.0-rc.2

And the API bump in turn will allow us to start using the new Release type (#419).

@openshift-ci-robot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Jul 29, 2020
@LalatenduMohanty (Member) left a comment


/lgtm

@openshift-ci-robot added the lgtm label (Indicates that a PR is ready to be merged.) Jul 29, 2020
@openshift-bot (Contributor)

/retest

Please review the full test history for this PR and help us cut down flakes.

@wking (Member, Author) commented Jul 29, 2020

Most recent failures were both CI-registry 503s.

@wking (Member, Author) commented Jul 29, 2020

e2e:

level=info msg="Pulling debug logs from the bootstrap machine"
level=info msg="Bootstrap gather logs captured here \"/tmp/installer/log-bundle-20200729194059.tar.gz\""
level=fatal msg="Bootstrap failed to complete: failed to wait for bootstrapping to complete: timed out waiting for the condition"

From bootstrap/journals/bootkube.log:

Jul 29 19:28:36 localhost bootkube.sh[2299]: [#394] failed to create some manifests:
Jul 29 19:28:36 localhost bootkube.sh[2299]: "99_openshift-cluster-api_master-machines-0.yaml": unable to get REST mapping for "99_openshift-cluster-api_master-machines-0.yaml": no matches for kind "Machine" in version "machine.openshift.io/v1beta1"
...
Jul 29 19:28:39 localhost bootkube.sh[2299]: Failed to create "etcd-metric-signer-secret.yaml" secrets.v1./etcd-metric-signer -n openshift-config: namespaces "openshift-config" not found
Jul 29 19:28:39 localhost bootkube.sh[2299]: Error: error while checking pod status: timed out waiting for the condition
Jul 29 19:28:39 localhost bootkube.sh[2299]: Tearing down temporary bootstrap control plane...

From the CVO logs:

I0729 19:39:13.577749       1 sync_worker.go:669] Running sync for role "openshift-machine-api/cluster-autoscaler-operator" (211 of 606)
E0729 19:39:13.578329       1 runtime.go:78] Observed a panic: &errors.errorString{s:"converting (v1beta1.Role) to (v1.Role): unknown conversion"} (converting (v1beta1.Role) to (v1.Role): unknown conversion)
goroutine 294 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x1713b80, 0xc00105d730)
  /go/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0xa3
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
  /go/src/github.com/openshift/cluster-version-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x82
panic(0x1713b80, 0xc00105d730)
  /usr/local/go/src/runtime/panic.go:969 +0x166
github.com/openshift/cluster-version-operator/lib/resourceread.ReadRoleV1OrDie(0xc000969500, 0x2f8, 0x300, 0x0)
  /go/src/github.com/openshift/cluster-version-operator/lib/resourceread/rbac.go:55 +0x1b0
github.com/openshift/cluster-version-operator/lib/resourcebuilder.(*roleBuilder).Do(0xc001ec6780, 0x1c09920, 0xc0000a3580, 0xc001ec6780, 0xc001ec6780)
  /go/src/github.com/openshift/cluster-version-operator/lib/resourcebuilder/rbac.go:98 +0x45
github.com/openshift/cluster-version-operator/pkg/cvo.(*resourceBuilder).Apply(0xc000c6fdd0, 0x1c09920, 0xc0000a3580, 0xc001436ec0, 0x2, 0xc001d99be0, 0xa)
  /go/src/github.com/openshift/cluster-version-operator/pkg/cvo/cvo.go:687 +0xc2
github.com/openshift/cluster-version-operator/pkg/payload.(*Task).Run(0xc00097a7d0, 0x1c09920, 0xc0000a3580, 0xc000960500, 0x17, 0x1bc3880, 0xc000c6fdd0, 0x2, 0x0, 0x0)
  /go/src/github.com/openshift/cluster-version-operator/pkg/payload/task.go:75 +0xaf
github.com/openshift/cluster-version-operator/pkg/cvo.(*SyncWorker).apply.func2(0x1c09920, 0xc0000a3580, 0xc001004670, 0x16, 0x332, 0x2, 0x2)
  /go/src/github.com/openshift/cluster-version-operator/pkg/cvo/sync_worker.go:678 +0x377
github.com/openshift/cluster-version-operator/pkg/payload.RunGraph.func3(0xc001970280, 0xc000dcc3c0, 0xc00100b890, 0xc000dcc420, 0x1c09920, 0xc0000a3580, 0x39)
  /go/src/github.com/openshift/cluster-version-operator/pkg/payload/task_graph.go:487 +0xea
created by github.com/openshift/cluster-version-operator/pkg/payload.RunGraph
  /go/src/github.com/openshift/cluster-version-operator/pkg/payload/task_graph.go:477 +0x32a
panic: converting (v1beta1.Role) to (v1.Role): unknown conversion [recovered]
  panic: converting (v1beta1.Role) to (v1.Role): unknown conversion
...

The v1beta1.Role manifest in question is here. Hmm, looks like we have v1beta1 Roles until at least v1.22. And indeed, in the CVO as of ba32622:

$ git --no-pager grep 'type Role struct'
vendor/k8s.io/api/rbac/v1/types.go:type Role struct {
vendor/k8s.io/api/rbac/v1alpha1/types.go:type Role struct {
vendor/k8s.io/api/rbac/v1beta1/types.go:type Role struct {

Which has not changed in this PR?

$ git --no-pager log -1 --oneline origin/master
e49df43a (wking/master, origin/release-4.7, origin/release-4.6, origin/master, origin/HEAD, master) Merge pull request #418 from wking/RetriveOnce-typo
$ git --no-pager grep 'type Role struct' e49df43a
e49df43a:vendor/k8s.io/api/rbac/v1/types.go:type Role struct {
e49df43a:vendor/k8s.io/api/rbac/v1alpha1/types.go:type Role struct {
e49df43a:vendor/k8s.io/api/rbac/v1beta1/types.go:type Role struct {

So not clear on what's going on here yet...

@openshift-bot (Contributor)

/retest

Please review the full test history for this PR and help us cut down flakes.

@wking (Member, Author) commented Jul 29, 2020

From kubernetes/kubernetes#49642:

Promotes the rbac.authorization.k8s.io/v1beta1 API to v1 with no changes

But I can reproduce this in a new unit test, so I'll just stumble around until I figure out what magic we're missing to get automatic translation...
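
For context, a minimal sketch of the kind of reproducer meant here (assumed package and names, not the actual CVO test): decode a v1beta1 Role manifest and ask the scheme to convert it to v1.

package conversion_test

import (
    "testing"

    rbacv1 "k8s.io/api/rbac/v1"
    rbacv1beta1 "k8s.io/api/rbac/v1beta1"
    "k8s.io/apimachinery/pkg/runtime"
    "k8s.io/apimachinery/pkg/runtime/serializer"
)

func TestRoleV1beta1ToV1Conversion(t *testing.T) {
    // Register only the rbac types; no generated conversion functions are
    // vendored, so conversion relies on the (now removed) reflective fallback.
    scheme := runtime.NewScheme()
    if err := rbacv1.AddToScheme(scheme); err != nil {
        t.Fatal(err)
    }
    if err := rbacv1beta1.AddToScheme(scheme); err != nil {
        t.Fatal(err)
    }
    codecs := serializer.NewCodecFactory(scheme)

    manifest := []byte(`apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
  name: example
  namespace: example-ns`)

    obj, err := runtime.Decode(codecs.UniversalDeserializer(), manifest)
    if err != nil {
        t.Fatal(err)
    }

    // With the bumped apimachinery this surfaces:
    //   converting (v1beta1.Role) to (v1.Role): unknown conversion
    out := &rbacv1.Role{}
    if err := scheme.Convert(obj, out, nil); err != nil {
        t.Logf("conversion error: %v", err)
    }
}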

@wking (Member, Author) commented Jul 29, 2020

/hold

No need to spend CI on this while I debug the Role business.

@openshift-ci-robot added the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command.) Jul 29, 2020
@wking (Member, Author) commented Jul 29, 2020

Ah:

$ git --no-pager show --oneline ba32622 vendor/k8s.io/apimachinery/pkg/conversion/converter.go
ba32622c (HEAD -> client-go-bump, wking/client-go-bump, origin/pr/420) vendor: Bump client-go and library-go to current 4.6 tips (2020-07-24)
diff --git a/vendor/k8s.io/apimachinery/pkg/conversion/converter.go b/vendor/k8s.io/apimachinery/pkg/conversion/converter.go
index 2d7c8bd1..838d5b0a 100644
--- a/vendor/k8s.io/apimachinery/pkg/conversion/converter.go
+++ b/vendor/k8s.io/apimachinery/pkg/conversion/converter.go
@@ -442,20 +442,20 @@ func (c *Converter) doConversion(src, dest interface{}, flags FieldMatchingFlags
 	if fn, ok := c.generatedConversionFuncs.untyped[pair]; ok {
 		return fn(src, dest, scope)
 	}
-	// TODO: consider everything past this point deprecated - we want to support only point to point top level
-	// conversions
 
 	dv, err := EnforcePtr(dest)
 	if err != nil {
 		return err
 	}
-	if !dv.CanAddr() && !dv.CanSet() {
-		return fmt.Errorf("can't write to dest")
-	}
 	sv, err := EnforcePtr(src)
 	if err != nil {
 		return err
 	}
+	return fmt.Errorf("converting (%s) to (%s): unknown conversion", sv.Type(), dv.Type())
+
+	// TODO: Everything past this point is deprecated.
+	//  Remove in 1.20 once we're sure it didn't break anything.
+
 	// Leave something on the stack, so that calls to struct tag getters never fail.
 	scope.srcStack.push(scopeStackElem{})
 	scope.destStack.push(scopeStackElem{})

So that's kubernetes/kubernetes#90018.
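
In other words, once that reflective fallback is gone, any pair we still convert implicitly needs an explicitly registered conversion function. A rough sketch of what registering one would look like (illustrative only, with assumed names; the PR ultimately goes a different way, per the later commits):

package conversionexample

import (
    rbacv1 "k8s.io/api/rbac/v1"
    rbacv1beta1 "k8s.io/api/rbac/v1beta1"
    "k8s.io/apimachinery/pkg/conversion"
    "k8s.io/apimachinery/pkg/runtime"
)

func newSchemeWithRoleConversion() (*runtime.Scheme, error) {
    scheme := runtime.NewScheme()
    if err := rbacv1.AddToScheme(scheme); err != nil {
        return nil, err
    }
    if err := rbacv1beta1.AddToScheme(scheme); err != nil {
        return nil, err
    }
    // Hand-written conversion for the one pair we care about; the Role fields
    // are identical between v1beta1 and v1, so it is a straight field copy.
    err := scheme.AddConversionFunc((*rbacv1beta1.Role)(nil), (*rbacv1.Role)(nil),
        func(a, b interface{}, _ conversion.Scope) error {
            in, out := a.(*rbacv1beta1.Role), b.(*rbacv1.Role)
            out.ObjectMeta = in.ObjectMeta
            out.Rules = make([]rbacv1.PolicyRule, len(in.Rules))
            for i, r := range in.Rules {
                out.Rules[i] = rbacv1.PolicyRule{
                    Verbs:           r.Verbs,
                    APIGroups:       r.APIGroups,
                    Resources:       r.Resources,
                    ResourceNames:   r.ResourceNames,
                    NonResourceURLs: r.NonResourceURLs,
                }
            }
            return nil
        })
    return scheme, err
}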

wking added a commit to wking/cluster-version-operator that referenced this pull request Jul 30, 2020
Protecting us from [1,2]:

  converting (v1beta1.Role) to (v1.Role): unknown conversion

in unit tests.  The manual conversion and wash through JSON seems like
a terrible hack, but I haven't been able to figure out a more elegant
approach yet.  From [3]:

  Promotes the rbac.authorization.k8s.io/v1beta1 API to v1 with no changes

so all we really need is the apiVersion bump.

[1]: kubernetes/kubernetes#90018
[2]: openshift#420 (comment)
[3]: kubernetes/kubernetes#49642
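
For reference, a rough sketch of the JSON wash that commit message refers to (assumed helper name, not the actual lib/resourceread code): marshal the v1beta1 object and unmarshal it into the v1 type, which only works here because the promotion to v1 changed nothing on the wire.

package example

import (
    "encoding/json"

    rbacv1 "k8s.io/api/rbac/v1"
    rbacv1beta1 "k8s.io/api/rbac/v1beta1"
)

// roleV1FromV1beta1 is a hypothetical helper: round-trip through JSON instead
// of relying on scheme conversion, since the two versions are wire-compatible.
func roleV1FromV1beta1(in *rbacv1beta1.Role) (*rbacv1.Role, error) {
    data, err := json.Marshal(in)
    if err != nil {
        return nil, err
    }
    out := &rbacv1.Role{}
    if err := json.Unmarshal(data, out); err != nil {
        return nil, err
    }
    // The apiVersion is the only thing that actually changes.
    out.APIVersion = rbacv1.SchemeGroupVersion.String()
    out.Kind = "Role"
    return out, nil
}
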
@openshift-ci-robot removed the lgtm label (Indicates that a PR is ready to be merged.) Jul 30, 2020
wking added a commit to wking/cluster-version-operator that referenced this pull request Jul 30, 2020
Protecting us from [1,2]:

  converting (v1beta1.Role) to (v1.Role): unknown conversion

in unit tests.  The manual conversion and wash through JSON seems like
a terrible hack, but I haven't been able to figure out a more elegant
approach yet.  From [3]:

  Promotes the rbac.authorization.k8s.io/v1beta1 API to v1 with no changes

so all we really need is the apiVersion bump.

Also add similar logic for apiregistration, since that's the other
group where we register multiple schema versions.

[1]: kubernetes/kubernetes#90018
[2]: openshift#420 (comment)
[3]: kubernetes/kubernetes#49642
@wking (Member, Author) commented Jul 30, 2020

Ok, I think I have something that works in 6f247d4, although it's not particularly elegant.

/hold cancel

@openshift-ci-robot removed the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command.) Jul 30, 2020
@vrutkovs (Member)

/retest

@LalatenduMohanty (Member) commented Jul 30, 2020

@wking (Member, Author) commented Jul 30, 2020

e2e:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-version-operator/420/pull-ci-openshift-cluster-version-operator-master-e2e/1288770413639241728/artifacts/e2e/ipi-install-install/log-bundle-20200730102730.tar | tar -xOz log-bundle-20200730102730/bootstrap/containers/cluster-version-operator-3859ba2ba9e7592f8693b4ee362fda81b2c463b32f6970c7b1d2ea0f0a0226fd.log | grep 'Result of work' | tail -n2
I0730 10:23:46.632144       1 task_graph.go:546] Result of work: [Cluster operator etcd is still updating Cluster operator kube-apiserver is still updating Cluster operator kube-controller-manager is still updating Cluster operator kube-scheduler is still updating Cluster operator config-operator is still updating Cluster operator machine-api is still updating Cluster operator openshift-apiserver is still updating Cluster operator kube-storage-version-migrator is still updating Cluster operator cloud-credential is still updating Cluster operator authentication is still updating Could not update role "openshift-machine-api/cluster-autoscaler-operator" (211 of 606) Cluster operator csi-snapshot-controller is still updating Cluster operator image-registry is still updating Cluster operator ingress is still updating Cluster operator machine-approver is still updating Could not update clusterrole "cluster-node-tuning-operator" (294 of 606) Cluster operator openshift-controller-manager is still updating Could not update rolebinding "openshift/cluster-samples-operator-openshift-edit" (321 of 606): resource may have been deleted Cluster operator storage is still updating Could not update clusterrole "cluster-monitoring-operator" (361 of 606) Could not update oauthclient "console" (367 of 606): the server does not recognize this resource, check extension API servers Cluster operator insights is still updating Cluster operator operator-lifecycle-manager is still updating Could not update clusterrolebinding "marketplace-operator" (454 of 606) Cluster operator service-ca is still updating Could not update clusterrolebinding "default-account-cluster-network-operator" (477 of 606) Cluster operator dns is still updating Could not update clusterrolebinding "default-account-openshift-machine-config-operator" (503 of 606) Could not update role "openshift-authentication/prometheus-k8s" (514 of 606): resource may have been deleted]
I0730 10:25:16.724712       1 task_graph.go:546] Result of work: []

Dunno what's up with that yet. Ah, maybe somebody doesn't like our version-converted Roles and such?

@wking (Member, Author) commented Jul 30, 2020

Ah, version conversion still busted:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_cluster-version-operator/420/pull-ci-openshift-cluster-version-operator-master-e2e/1288770413639241728/artifacts/e2e/ipi-install-install/log-bundle-20200730102730.tar | tar -xOz log-bundle-20200730102730/bootstrap/containers/cluster-version-operator-3859ba2ba9e7592f8693b4ee362fda81b2c463b32f6970c7b1d2ea0f0a0226fd.log | grep 'role "' | tail -n1
E0730 10:27:31.386463       1 task.go:81] error running apply for clusterrole "cluster-node-tuning-operator" (294 of 606): resource name may not be empty

@wking (Member, Author) commented Jul 30, 2020

/hold

No need to retest while I'm taking another pass at the conversion failure.

@openshift-ci-robot added the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command.) Jul 30, 2020
Pulling in [1,2].  Generated with:

  $ go get -u github.com/openshift/client-go@83993cebb5aee533bf2f2dded2d87b9e32835f4a
  go: github.com/openshift/client-go 94e1065ab1f8a34cd8a0e1edb9ba8be0326e77b5 => v0.0.0-20200723130357-94e1065ab1f8
  $ go get -u github.com/openshift/library-go@cc498c355c99983057e9e01f3abcceb40ca5c298
  go: github.com/openshift/library-go 1ed21c4fa86c2bf6034823231b8e52df4f068d58 => v0.0.0-20200724192307-1ed21c4fa86c
  $ go mod tidy
  $ go mod vendor
  $ git add -A go.* vendor

using:

  $ go version
  go version go1.14.4 linux/amd64

This gets us up to 1.19 for the Kubernetes-based dependencies, and
should hopefully allow me to bump openshift/api to pick up [3] without hitting:

  go: github.com/openshift/cluster-version-operator/lib imports
        k8s.io/client-go/kubernetes/scheme imports
        k8s.io/api/auditregistration/v1alpha1: package provided by k8s.io/api at latest version v0.18.6 but not at required version v0.19.0-rc.2

And the API bump in turn will allow us to start using the new Release
type.

[1]: openshift/client-go@94e1065
[2]: openshift/library-go@1ed21c4
[3]: openshift/api@3ae6754
wking added a commit to wking/cluster-version-operator that referenced this pull request Jul 31, 2020
Protecting us from [1,2]:

  converting (v1beta1.Role) to (v1.Role): unknown conversion

Because merging and application are also version-dependent, it's hard
to paper over this in resourceread with automatic type conversion.
Although from [3]:

  Promotes the rbac.authorization.k8s.io/v1beta1 API to v1 with no changes

so all we really need is the apiVersion bump.  Anyhow, with this
commit, I'm doubling down on the approach from 4ee7b07 (Add
apiextensions.k8s.io/v1 support for CRDs, 2019-10-22, openshift#259) and
collapsing the readers into two helpers that support all of our types
and return runtime.Object.

From 0a255ab (cvo: Use protobuf for sending events and other basic
API commands, 2019-01-18, openshift#90), protobuf is more efficient, so we
should use it where possible.

And because all of this is very tedious to maintain by hand, there's
now a Python generator to spit out all the boilerplate.

[1]: kubernetes/kubernetes#90018
[2]: openshift#420 (comment)
[3]: kubernetes/kubernetes#49642
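
A hedged sketch of the shape this collapse takes (illustrative names; the real helpers are generated into lib/resourceread): one decoder over all registered types that hands back a runtime.Object for callers to type-switch on.

package readexample

import (
    rbacv1 "k8s.io/api/rbac/v1"
    "k8s.io/apimachinery/pkg/runtime"
    "k8s.io/apimachinery/pkg/runtime/serializer"
    utilruntime "k8s.io/apimachinery/pkg/util/runtime"
)

var (
    scheme = runtime.NewScheme()
    codecs = serializer.NewCodecFactory(scheme)
)

func init() {
    // Register every group/version the manifests use; rbac v1 shown here.
    utilruntime.Must(rbacv1.AddToScheme(scheme))
}

// readOrDie stands in for the generated helpers: decode any registered type
// and return it as a runtime.Object; callers type-switch on the result.
func readOrDie(manifest []byte) runtime.Object {
    obj, err := runtime.Decode(codecs.UniversalDeserializer(), manifest)
    if err != nil {
        panic(err)
    }
    return obj
}
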
Protecting us from [1,2]:

  converting (v1beta1.Role) to (v1.Role): unknown conversion

Because merging and application are also version-dependent, it's hard
to paper over this in resourceread with automatic type conversion.
Although from [3]:

  Promotes the rbac.authorization.k8s.io/v1beta1 API to v1 with no changes

so all we really need is the apiVersion bump.  Anyhow, with this
commit, I'm doubling down on the approach from 4ee7b07 (Add
apiextensions.k8s.io/v1 support for CRDs, 2019-10-22, openshift#259) and
collapsing the readers into two helpers that support all of our types
and return runtime.Object.

From 0a255ab (cvo: Use protobuf for sending events and other basic
API commands, 2019-01-18, openshift#90), protobuf is more efficient, so we
should use it where possible.

And because all of this is very tedious to maintain by hand, there's
now a Python generator to spit out all the boilerplate.

[1]: kubernetes/kubernetes#90018
[2]: openshift#420 (comment)
[3]: kubernetes/kubernetes#49642
Generated with:

  $ go mod tidy
  $ go mod vendor
  $ git add -A go.* vendor

using:

  $ go version
  go version go1.14.4 linux/amd64
@wking (Member, Author) commented Jul 31, 2020

e2e failed on "an end user can use OLM can subscribe to the operator", which is the unrelated rhbz#1862322.

/override ci/prow/e2e

@openshift-ci-robot (Contributor)

@wking: Overrode contexts on behalf of wking: ci/prow/e2e

In response to this:

e2e failed on "an end user can use OLM can subscribe to the operator", which is the unrelated rhbz#1862322.

/override ci/prow/e2e

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@wking (Member, Author) commented Jul 31, 2020

/hold cancel

@openshift-ci-robot removed the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command.) Jul 31, 2020
@vrutkovs (Member) left a comment


/lgtm

@openshift-ci-robot added the lgtm label (Indicates that a PR is ready to be merged.) Jul 31, 2020
@openshift-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: LalatenduMohanty, vrutkovs, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [LalatenduMohanty,vrutkovs,wking]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot (Contributor)

/retest

Please review the full test history for this PR and help us cut down flakes.

@wking (Member, Author) commented Jul 31, 2020

@openshift-merge-robot merged commit 5d06bfc into openshift:master Jul 31, 2020
@wking deleted the client-go-bump branch July 31, 2020 14:04
@wking (Member, Author) commented Jul 31, 2020

Rebased around #410 with cd2d701 -> 4e12333 now that both it and #420 have landed.

wking added a commit to wking/cluster-version-operator that referenced this pull request Sep 29, 2020
When we moved to 0.19 Kube tooling in ef236c3 (vendor: Bump
client-go and library-go to current 4.6 tips 2020-07-24, openshift#420), the
tooling moved to klog v2.  Port our local klog consumers so we don't
have to worry about configuring multiple klogs [1].

Generated with:

  $ sed -i 's|^go 1.13|go 1.15|' go.mod
  $ sed -i 's|k8s.io/klog v1.0.0|k8s.io/klog/v2 v2.3.0|' go.mod
  $ sed -i 's|"k8s.io/klog"|"k8s.io/klog/v2"|' $(git grep -l k8s.io/klog cmd lib pkg)
  $ sed -i 's/klog.V(\([0-9]\)) /klog.V(\1).Enabled() /' $(git grep -l klog.V cmd lib pkg)
  $ go mod tidy
  $ go mod vendor
  $ git add -A go.* vendor

using:

  $ go version
  go version go1.15.2 linux/amd64

The change to Go 1.15 catches us up with 1d9a319 (Dockerfile.rhel:
Bump to Go 1.15, 2020-09-16, openshift#457).

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1883705
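
For reference, the call-site change the last sed produces looks roughly like this (hedged example; syncNode and the message are made up, not a specific CVO call site):

package klogexample

import "k8s.io/klog/v2"

func syncNode(name string) {
    // klog v1 allowed `if klog.V(2) {` because Verbose was a bool; in klog v2
    // Verbose is a struct, so the guard becomes Enabled(), which is exactly
    // the rewrite the sed above performs.
    if klog.V(2).Enabled() {
        klog.Infof("syncing %s", name)
    }
}
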
wking added a commit to wking/cluster-version-operator that referenced this pull request Sep 29, 2020
When we moved to 0.19 Kube tooling in ef236c3 (vendor: Bump
client-go and library-go to current 4.6 tips 2020-07-24, openshift#420), the
tooling moved to klog v2.  Port our local klog consumers so we don't
have to worry about configuring multiple klogs [1].

Generated with:

  $ sed -i 's|^go 1.13|go 1.15|' go.mod
  $ sed -i 's|k8s.io/klog v1.0.0|k8s.io/klog/v2 v2.3.0|' go.mod
  $ sed -i 's|"k8s.io/klog"|"k8s.io/klog/v2"|' $(git grep -l k8s.io/klog cmd lib pkg)
  $ sed -i 's|klog.V(\([0-9]\)) |klog.V(\1).Enabled() |' $(git grep -l klog.V cmd lib pkg)
  $ go mod tidy
  $ go mod vendor
  $ git add -A go.* vendor

using:

  $ go version
  go version go1.15.2 linux/amd64

The change to Go 1.15 catches us up with 1d9a319 (Dockerfile.rhel:
Bump to Go 1.15, 2020-09-16, openshift#457).

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1883705
wking added a commit to wking/cluster-version-operator that referenced this pull request May 4, 2022
I'd accidentally broken this in e8ffccb (lib: Add autogeneration
for some resource* functionality, 2020-07-29, openshift#420), when I moved the
health checks out of the Do methods and began passing typedObject into
them.  typedObject is the release manifest, which lacks the status
attributes we want the health checks to cover.

This also catches the generation script up with some past manual
changes:

* 0afb8a8 (Add a manifest annotation to be used for object
  deletion, 2020-08-17, openshift#438).
* 05e1af7 (Log resource diffs on update only in reconcile mode,
  2021-07-13, openshift#628).

The always-fatal ApplyAPIServicev1 function gives us the required
symmetry with 0afb8a8's DeleteAPIServicev1, although I'm a bit
fuzzy on who would have been using an APIService manifest.  I'm not
excited about supporting a new manifest kind I know nothing about, and
failing all attempts at non-delete reconciling should protect us from
accidentally encouraging manifest authors to feed us this kind.
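
A hedged sketch of the distinction this commit message draws (assumed names, not the actual lib code): the health check has to inspect the live object returned by the API server, because the release manifest ("typedObject") carries spec but no status.

package healthexample

import (
    "context"
    "fmt"

    appsv1 "k8s.io/api/apps/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

func checkDeploymentHealth(ctx context.Context, client kubernetes.Interface, required *appsv1.Deployment) error {
    // "required" is the manifest object; fetch the live Deployment and look
    // at its status instead of the (always empty) status on the manifest.
    actual, err := client.AppsV1().Deployments(required.Namespace).Get(ctx, required.Name, metav1.GetOptions{})
    if err != nil {
        return err
    }
    if required.Spec.Replicas != nil && actual.Status.AvailableReplicas < *required.Spec.Replicas {
        return fmt.Errorf("deployment %s/%s has %d/%d available replicas", actual.Namespace, actual.Name, actual.Status.AvailableReplicas, *required.Spec.Replicas)
    }
    return nil
}
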
wking added a commit to wking/cluster-version-operator that referenced this pull request May 4, 2022
I'd dropped this in cc9292a (lib/resourcebuilder: Replace wait-for
with single-shot "is it alive now?", 2020-07-30, openshift#400), claiming:

  There's no object status for CRDs or DaemonSets that marks "we are
  really hurting".  The v1.18.0 Kubernetes CRD and DaemonSet
  controllers do not set any conditions in their operand status
  (although the API for those conditions exists [2,3]).  With this
  commit, we have very minimal wait logic for either.  Sufficiently
  unhealthy DaemonSet should be reported on via their associated
  ClusterOperator, and sufficiently unhealthy CRD should be reported
  on when we fail to push any custom resources consuming them
  (Task.Run retries will give the API server time to ready itself
  after accepting a CRD update before the CVO fails its sync cycle).

But from [1]:

  > It might take a few seconds for the endpoint to be created. You
  > can watch the Established condition of your
  > CustomResourceDefinition to be true or watch the discovery
  > information of the API server for your resource to show up.

So I was correct that we will hear about CRD issues when we fail to
push a dependent custom resource.  But I was not correct in claiming
that the CRD controller set no conditions.  I was probably confused by
e8ffccb (lib: Add autogeneration for some resource* functionality,
2020-07-29, openshift#420), which broke the health-check inputs as described in
002591d (lib/resourcebuilder: Use actual resource in check*Health
calls, 2022-05-03, openshift#771).  The code I removed in cc9292a was in
fact looking at the Established condition already.

This commit restores the Established check, but without the previous
PollImmediateUntil wait.

[1]: https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#create-a-customresourcedefinition
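
A hedged sketch of the single-shot Established check being restored (assumed function name; the real code lives in lib/resourcebuilder):

package crdexample

import (
    "fmt"

    apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
)

// checkCRDEstablished looks once at the live CRD's status conditions, with no
// PollImmediateUntil wait around it.
func checkCRDEstablished(crd *apiextensionsv1.CustomResourceDefinition) error {
    for _, cond := range crd.Status.Conditions {
        if cond.Type == apiextensionsv1.Established && cond.Status == apiextensionsv1.ConditionTrue {
            return nil
        }
    }
    return fmt.Errorf("customresourcedefinition %s is not yet Established", crd.Name)
}
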
wking added a commit to wking/cluster-version-operator that referenced this pull request May 9, 2022
I'd accidentally broken this in e8ffccb (lib: Add autogeneration
for some resource* functionality, 2020-07-29, openshift#420), when I moved the
health checks out of the Do methods and began passing typedObject into
them.  typedObject is the release manifest, which lacks the status
attributes we want the health checks to cover.

This also catches the generation script up with some past manual
changes:

* 0afb8a8 (Add a manifest annotation to be used for object
  deletion, 2020-08-17, openshift#438).
* 05e1af7 (Log resource diffs on update only in reconcile mode,
  2021-07-13, openshift#628).
wking added a commit to wking/cluster-version-operator that referenced this pull request May 9, 2022
I'd dropped this in cc9292a (lib/resourcebuilder: Replace wait-for
with single-shot "is it alive now?", 2020-07-30, openshift#400), claiming:

  There's no object status for CRDs or DaemonSets that marks "we are
  really hurting".  The v1.18.0 Kubernetes CRD and DaemonSet
  controllers do not set any conditions in their operand status
  (although the API for those conditions exists [2,3]).  With this
  commit, we have very minimal wait logic for either.  Sufficiently
  unhealthy DaemonSet should be reported on via their associated
  ClusterOperator, and sufficiently unhealthy CRD should be reported
  on when we fail to push any custom resources consuming them
  (Task.Run retries will give the API server time to ready itself
  after accepting a CRD update before the CVO fails its sync cycle).

But from [1]:

  > It might take a few seconds for the endpoint to be created. You
  > can watch the Established condition of your
  > CustomResourceDefinition to be true or watch the discovery
  > information of the API server for your resource to show up.

So I was correct that we will hear about CRD issues when we fail to
push a dependent custom resource.  But I was not correct in claiming
that the CRD controller set no conditions.  I was probably confused by
e8ffccb (lib: Add autogeneration for some resource* functionality,
2020-07-29, openshift#420), which broke the health-check inputs as described in
002591d (lib/resourcebuilder: Use actual resource in check*Health
calls, 2022-05-03, openshift#771).  The code I removed in cc9292a was in
fact looking at the Established condition already.

This commit restores the Established check, but without the previous
PollImmediateUntil wait.

[1]: https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#create-a-customresourcedefinition
Labels
  approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  lgtm: Indicates that a PR is ready to be merged.