CSR not getting approved for libvirt & baremetal platforms #1893
Comments
/priority critical-urgent |
/assign |
Some hopefully helpful logs:
$ oc get csr
NAME AGE REQUESTOR CONDITION
csr-q2ppk 12m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Approved,Issued
csr-xbdkl 11m system:node:pupu-tdcvn-master-0 Approved,Issued
csr-xbwjw 5m7s system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
$ oc logs pods/machine-approver-669dff69cb-rn2nj -n openshift-cluster-machine-approver
I0701 16:35:10.855320 1 config.go:33] using default as failed to load config /var/run/configmaps/config/config.yaml: open /var/run/configmaps/config/config.yaml: no such file or directory
I0701 16:35:10.855593 1 config.go:23] machine approver config: {NodeClientCert:{Disabled:false}}
E0701 16:35:10.856664 1 reflector.go:126] github.com/openshift/cluster-machine-approver/main.go:185: Failed to list *v1beta1.CertificateSigningRequest: Get https://127.0.0.1:6443/apis/certificates.k8s.io/v1beta1/certificatesigningrequests?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6443: connect: connection refused
E0701 16:35:11.858356 1 reflector.go:126] github.com/openshift/cluster-machine-approver/main.go:185: Failed to list *v1beta1.CertificateSigningRequest: Get https://127.0.0.1:6443/apis/certificates.k8s.io/v1beta1/certificatesigningrequests?limit=500&resourceVersion=0: dial tcp 127.0.0.1:6443: connect: connection refused
# same message over and over again. |
After a bit more testing, I believe the error goes away after a while (presumably once the API server is finally up), but then we have another issue:
So I think the main issue we're facing here is that the API server takes far too long to come up in the nested virtualization environment. The check for CSR creation time was added in mid-May. |
With iptables flushed, the installer now still fails (it didn't when I started debugging this last week), and the issue is:
So most likely we now have two issues. I'll test again with the full (default) firewall rules to confirm whether we still get this error. |
With the default iptables rules, it's exactly the same:
So maybe the actual issue we faced last week is somehow gone and now we have this one? I say that because last week flushing iptables allowed a successful cluster creation. |
Maybe related to https://bugzilla.redhat.com/show_bug.cgi?id=1723955 |
@cgwalters yeah, seems like the same issue to me. No upstream issue for this? |
Is it taking more than 10 minutes between when the Machine resource is created and when the CSR is created? If so, then the auto-approver will reject the request, and the user is required to manually approve the CSR. |
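The timing rule described in the comment above can be sketched as a comparison of two creation timestamps. This is only an illustration, not the approver's actual code (which lives in the Go project cluster-machine-approver): the function name `csr_within_window`, the epoch-second inputs, and the handling of a CSR that predates its Machine are all assumptions; the 600-second default simply restates the 10 minutes mentioned above.

```shell
# Hypothetical helper, not part of cluster-machine-approver.
# Usage: csr_within_window MACHINE_EPOCH CSR_EPOCH [MAX_SECONDS]
# Succeeds (exit 0) when the CSR was created no more than MAX_SECONDS
# after the Machine; fails (exit 1) otherwise.
csr_within_window() {
    machine_created=$1
    csr_created=$2
    max_delta=${3:-600}   # 10 minutes, as stated in the comment above

    delta=$((csr_created - machine_created))

    # Assumption: a CSR created before its Machine is also out of window.
    [ "$delta" -ge 0 ] && [ "$delta" -le "$max_delta" ]
}
```

To try it against a real cluster, the two timestamps would have to be read from each object's `.metadata.creationTimestamp` and converted to epoch seconds first (for example with GNU `date -u -d "$ts" +%s`, which may not be available on every system).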
As of now, because of openshift/installer#1893, CSR approval is not going through, so as a workaround we need to approve the CSRs ourselves.
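For the manual workaround, a minimal sketch of picking the pending requests out of `oc get csr` output follows. The helper name `pending_csrs` is hypothetical, and it assumes the table layout shown earlier in this thread, with the name in the first column and the condition in the last.

```shell
# Hypothetical helper: reads `oc get csr` table output on stdin and prints
# the name of every request whose last column is exactly "Pending".
# The header row (first line) is skipped.
pending_csrs() {
    awk 'NR > 1 && $NF == "Pending" { print $1 }'
}
```

A possible use is `oc get csr | pending_csrs | xargs -r oc adm certificate approve` (the `-r` flag, which skips the run when nothing is pending, is not supported by every `xargs`). Review the list before approving, since this approves every pending request without checking who made it.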
I'm removing libvirt label since it doesn't seem to be specific to libvirt any more (#1893 (comment)). /remove-label platform/libvirt |
/unassign @praveenkumar |
Hi @staebler, curious: where in the code do we have this 10-minute cutoff? |
Not sure why the bot relabeled this as libvirt. /remove-label platform/libvirt |
This seems like it belongs in https://github.com/openshift/cluster-machine-approver /close |
@abhinavdahiya: Closing this issue. |
@abhinavdahiya openshift/cluster-machine-approver#36 /remove-priority critical-urgent |
Hi. For humans like me who have no idea what is going on in this thread, see https://github.com/openshift/installer/tree/release-4.2/docs/dev/libvirt#console-doesnt-come-up |
Version: master
Platform: libvirt
What happened?
Our e2e-libvirt CI job currently doesn't succeed because the CSR is not getting approved. This is likely due to a misconfigured firewall but needs investigation. https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/1883/pull-ci-openshift-installer-master-e2e-libvirt/478/build-log.txt
/assign @praveenkumar
/label platform/libvirt