Cluster upgrade internal error caused by self signed cert #572

viane · 2021-05-23T01:18:10Z

While upgrading from 4.7.0 to 4.7.9, an x509 error occurred in two places: calls to api-int.xxxx and calls to prometheus-operator.openshift-monitoring.svc:8080. I was able to fix the first one by manually adding the API server's certificate to the trust store with update-ca-trust. However, I'm not sure what to do about the second one, since it's a cluster-internal URI.

The cluster-version-operator log is shown below:

I0523 01:03:07.033591       1 cvo.go:481] Started syncing cluster version "openshift-cluster-version/version" (2021-05-23 01:03:07.033585342 +0000 UTC m=+82331.507064819)
I0523 01:03:07.041158       1 cvo.go:510] Desired version from spec is v1.Update{Version:"4.7.9", Image:"quay.io/openshift-release-dev/ocp-release@sha256:5a5433a5f82a10c78783d7aed7d556d26602295ee8e9dcfaba97ebc1ab0bc2ac", Force:false}
I0523 01:03:07.041264       1 sync_worker.go:227] Update work is equal to current target; no change required
I0523 01:03:07.041289       1 status.go:161] Synchronizing errs=field.ErrorList{} status=&cvo.SyncWorkerStatus{Generation:2, Step:"ApplyResources", Failure:error(nil), Done:8, Total:668, Completed:0, Reconciling:false, Initial:false, VersionHash:"qi_N6BhDM3k=", LastProgress:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}, Actual:v1.Release{Version:"4.7.9", Image:"quay.io/openshift-release-dev/ocp-release@sha256:5a5433a5f82a10c78783d7aed7d556d26602295ee8e9dcfaba97ebc1ab0bc2ac", URL:"https://access.redhat.com/errata/RHBA-2021:1365", Channels:[]string(nil)}, Verified:false}
I0523 01:03:07.041331       1 status.go:81] merge into existing history completed=false desired=v1.Release{Version:"4.7.9", Image:"quay.io/openshift-release-dev/ocp-release@sha256:5a5433a5f82a10c78783d7aed7d556d26602295ee8e9dcfaba97ebc1ab0bc2ac", URL:"https://access.redhat.com/errata/RHBA-2021:1365", Channels:[]string{"candidate-4.7", "candidate-4.8", "fast-4.7", "stable-4.7"}} last=&v1.UpdateHistory{State:"Partial", StartedTime:v1.Time{Time:time.Time{wall:0x0, ext:63757219493, loc:(*time.Location)(0x223c360)}}, CompletionTime:(*v1.Time)(nil), Version:"4.7.9", Image:"quay.io/openshift-release-dev/ocp-release@sha256:5a5433a5f82a10c78783d7aed7d556d26602295ee8e9dcfaba97ebc1ab0bc2ac", Verified:true}
I0523 01:03:07.041474       1 cvo.go:483] Finished syncing cluster version "openshift-cluster-version/version" (7.885146ms)
E0523 01:03:07.136035       1 task.go:112] error running apply for prometheusrule "openshift-cluster-version/cluster-version-operator" (9 of 668): Internal error occurred: failed calling webhook "prometheusrules.openshift.io": Post "https://prometheus-operator.openshift-monitoring.svc:8080/admission-prometheusrules/validate?timeout=5s": x509: certificate signed by unknown authority
I0523 01:03:07.193126       1 cvo.go:554] Finished syncing available updates "openshift-cluster-version/version" (162.643079ms
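
For triage, the failing webhook call can be pulled out of a log like the one above. The following is a minimal sketch, not part of the cluster-version-operator: the regular expression is an assumption based only on the single error line in this report, and `find_webhook_failures` is a hypothetical helper name.

```python
import re

# Assumed shape of the error line, inferred from the one example in this
# report; other operator versions may format the message differently.
WEBHOOK_ERROR = re.compile(
    r'failed calling webhook "(?P<webhook>[^"]+)": '
    r'Post "(?P<url>[^"]+)": (?P<reason>.+)$'
)


def find_webhook_failures(log_text):
    """Return one dict (webhook, url, reason) per matching log line."""
    failures = []
    for line in log_text.splitlines():
        match = WEBHOOK_ERROR.search(line)
        if match:
            failures.append(match.groupdict())
    return failures


# The error line from the log above, used as sample input.
SAMPLE_LOG = (
    'E0523 01:03:07.136035       1 task.go:112] error running apply for '
    'prometheusrule "openshift-cluster-version/cluster-version-operator" '
    '(9 of 668): Internal error occurred: failed calling webhook '
    '"prometheusrules.openshift.io": Post '
    '"https://prometheus-operator.openshift-monitoring.svc:8080/'
    'admission-prometheusrules/validate?timeout=5s": '
    'x509: certificate signed by unknown authority'
)

if __name__ == "__main__":
    for failure in find_webhook_failures(SAMPLE_LOG):
        print(failure["webhook"], failure["url"], failure["reason"], sep="\n")
```

Here the only failing call is the `prometheusrules.openshift.io` admission webhook, which the API server reaches through the in-cluster service URL, so it is a separate trust path from the api-int endpoint.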

openshift-bot · 2021-08-21T02:54:42Z

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot · 2021-09-20T08:43:23Z

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-bot · 2021-10-20T09:11:14Z

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

openshift-ci · 2021-10-20T09:11:39Z

@openshift-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci bot added the lifecycle/stale label Aug 21, 2021

openshift-ci bot added the lifecycle/rotten label and removed the lifecycle/stale label Sep 20, 2021

openshift-ci bot closed this as completed Oct 20, 2021
