When installing Kubeflow from the latest master and from the rc2 release, cache-deployer-deployment fails and goes into CrashLoopBackOff:
kubeflow cache-deployer-deployment-6f4bcc969-jxq2f 1/2 CrashLoopBackOff 23 102m
kubeflow cache-server-575d97c95-md8bg 0/2 Init:0/1 0 102m
Error logs for cache-deployer:
+ for x in $(seq 10)
++ kubectl get csr cache-server.kubeflow -o 'jsonpath={.status.certificate}'
+ serverCert=
+ [[ '' != '' ]]
+ sleep 1
+ [[ '' == '' ]]
+ echo 'ERROR: After approving csr cache-server.kubeflow, the signed certificate did not appear on the resource. Giving up after 10 attempts.'
ERROR: After approving csr cache-server.kubeflow, the signed certificate did not appear on the resource. Giving up after 10 attempts.
+ exit 1
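For readers unfamiliar with the deployer script, the trace above is consistent with a polling loop roughly like the one below. This is a minimal sketch reconstructed from the trace only, not the actual deployer source; the function name `wait_for_cert` and the `KUBECTL`, `ATTEMPTS`, and `SLEEP_SECONDS` variables are hypothetical and exist so the loop can be tried against a stub.

```shell
# Sketch of the polling loop implied by the trace (assumption: not the real script).
# KUBECTL, CSR_NAME, ATTEMPTS and SLEEP_SECONDS are hypothetical knobs for illustration.
KUBECTL="${KUBECTL:-kubectl}"
CSR_NAME="${CSR_NAME:-cache-server.kubeflow}"
ATTEMPTS="${ATTEMPTS:-10}"
SLEEP_SECONDS="${SLEEP_SECONDS:-1}"

wait_for_cert() {
  attempt=1
  while [ "$attempt" -le "$ATTEMPTS" ]; do
    # Same query as in the trace: empty output means no certificate was issued yet.
    cert=$("$KUBECTL" get csr "$CSR_NAME" -o 'jsonpath={.status.certificate}')
    if [ -n "$cert" ]; then
      printf '%s\n' "$cert"
      return 0
    fi
    sleep "$SLEEP_SECONDS"
    attempt=$((attempt + 1))
  done
  echo "ERROR: the signed certificate did not appear on $CSR_NAME after $ATTEMPTS attempts." >&2
  return 1
}
```

In the failing case reported here, every iteration sees an empty `.status.certificate`, so the loop exhausts its attempts and the container exits with status 1.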
The cache-server.kubeflow CSR looks like this (approved, but with no certificate in its status):
$ kubectl get csr cache-server.kubeflow -o yaml
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  creationTimestamp: "2022-03-09T03:05:10Z"
  name: cache-server.kubeflow
  resourceVersion: "306173"
  selfLink: /apis/certificates.k8s.io/v1/certificatesigningrequests/cache-server.kubeflow
  uid: f6179659-fa58-4c3b-a9af-d69543667415
spec:
  groups:
  - system:serviceaccounts
  - system:serviceaccounts:kubeflow
  - system:authenticated
  request: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURSBSRVFVRVNULS0tLS0KTUlJREdEQ0NBZ0FDQVFBd1NERXZNQzBHQTFVRUF3d21jM2x6ZEdWdE9tNXZaR1U2WTJGamFHVXRjMlZ5ZG1WeQpMbXQxWW1WbWJHOTNMbk4yWXpzeEZUQVRCZ05WQkFvTURITjVjM1JsYlRwdWIyUmxjekNDQVNJd0RRWUpLb1pJCmh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBT0lHanhzbS83NmdqRy8yWC91OXJpdVR0TUV6b1VJdDVmZFUKQWdIK0xGbVB5Vll4V1VGSmpEbTVhbUNmcmo1aUpIRll1NmxYMitER2gxWmNsMEVBZXJvYi83dm1VYUFOTUJaRwpVaUExMXFRN1ZkMGVuNll1ZnBqZ29CUVN0MTZnWC9FNTNralhQK1FXQ0ErRUNyY1czZG1UdHN4T09kSW9BODdRCjB5U0VhdFhSZFI4ODU5TjF4V0VFTnhXcElzUDdoL1lJSXJ1ZERDV1RiOU5qWWN1bDJCaU9FRXpuMzE2NkREditzQ1VPNW5NTkdkY0JEaVlaMU9SmRJWS9ycTN1MGkrTDNBUFN3aCtoL3V5S3U2Y0Jrc1dMT1ZpdlJHOXFXRnZhQWtDQXdFQUFhQ0JpakNCCmh3WUpLb1pJaHZjTkFRa09NWG93ZURBSkJnTlZIUk1FQWpBQU1Bc0dBMVVkRHdRRUF3SUY0REFUQmdOVkhTVUUKRERBS0JnZ3JCZ0VGQlFjREFUQkpCZ05WSFJFRVFqQkFnZ3hqWVdOb1pTMXpaWEoyWlhLQ0ZXTmhZMmhsTFhObApjblpsY2k1cmRXSmxabXh2ZDRJWlkyRmphR1V0YzJWeWRtVnlMbXQxWW1WbWJHOTNMbk4yWXpBTkJna3Foa2lHCjl3MEJBUXNGQUFPQ0FRRUF3OCtoMjlWdkk1UjhIMVNvdFFiV0FDTXIrQ2puNGJIaHpwWmh2VDZXcHNuRWZUMWUKbjdRY2RFTi9RZzZHbzJ6bi90eXRoZ0lFb2FlNnc4QTM0Ly9RUjRzbW54U3ZianZIM1BoR0wwY25DV0ZIZEI1bwpVSGZaWlAyS3RDeENPYzFKSXpYYzRURzZCNEg1WWU4OE16NUXXXXXXXXXXTdIeEdpY2EvU0J4Ck01UFBKcU5aUmFVckd6YUt4aStaVjFBYlkzRXVPZ1MrRG8vTFMvNmVmN2I0Z3JWYlBWVkxCVlVEUUFpVU8wK3EKbnRaa1BxTm1sSmIzS0xoWmpKbTJjRGlDaVlOZDBiOGNsUEtwNEZHSnhjMWpmQU1QY3kwangwU05USVhkZ05sVQorTU14anl6Q2wrQkZmWDN3dVV2K0E2ekFiUFBIUjRJaHZCWWprdz09Ci0tLS0tRU5EIENFUlRJRklDQVRFIFJFUVVFU1QtLS0tLQo=
  signerName: kubernetes.io/kubelet-serving
  uid: e98f0dd7-1480-42dd-b621-3b5790f53ce0
  usages:
  - digital signature
  - key encipherment
  - server auth
  username: system:serviceaccount:kubeflow:kubeflow-pipelines-cache-deployer-sa
status:
  conditions:
  - lastTransitionTime: "2022-03-09T03:05:10Z"
    lastUpdateTime: "2022-03-09T03:05:10Z"
    message: This CSR was approved by kubectl certificate approve.
    reason: KubectlApprove
    status: "True"
    type: Approved
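The key observation in the CSR above is that it carries an Approved condition while `status.certificate` is absent, meaning it was approved but never signed. A small sketch for telling these states apart is shown below. The helper name `classify_csr` and its output labels are hypothetical; the condition types `Approved`, `Denied`, and `Failed` are the ones defined by the certificates.k8s.io/v1 API.

```shell
# classify_csr CONDITION_TYPES CERTIFICATE
# CONDITION_TYPES: space-separated condition types, as printed by
#   kubectl get csr <name> -o 'jsonpath={.status.conditions[*].type}'
# CERTIFICATE: the value of .status.certificate (empty if none was issued).
# The function name and the printed labels are hypothetical, for illustration only.
classify_csr() {
  conditions="$1"
  cert="$2"
  case " $conditions " in
    *" Denied "*) echo "denied"; return 0 ;;
    *" Failed "*) echo "failed"; return 0 ;;
  esac
  case " $conditions " in
    *" Approved "*)
      if [ -n "$cert" ]; then
        echo "issued"
      else
        echo "approved-not-issued"
      fi
      ;;
    *) echo "pending" ;;
  esac
}

# Example against a live cluster (not executed here):
# conditions=$(kubectl get csr cache-server.kubeflow -o 'jsonpath={.status.conditions[*].type}')
# cert=$(kubectl get csr cache-server.kubeflow -o 'jsonpath={.status.certificate}')
# classify_csr "$conditions" "$cert"
```

For the CSR shown in this report (one `Approved` condition, no certificate), the helper would print `approved-not-issued`.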
Because of this, the cache-server pod is stuck in the Init state:
en-xl2s8 webhook-tls-certs istiod-ca-cert istio-data istio-envoy]: timed out waiting for the condition
Warning FailedMount 99s (x107 over 3h23m) kubelet MountVolume.SetUp failed for volume "webhook-tls-certs" : secret "webhook-server-tls" not found
(This doesn't seem to affect the ability to run pipelines or notebooks.)
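To confirm that the FailedMount event above comes from the missing secret rather than from something else, one can check for the secret directly. The sketch below is only an illustration: the helper name `secret_exists` and the `KUBECTL` variable are hypothetical, while the namespace `kubeflow` and the secret name `webhook-server-tls` are taken from the output in this report.

```shell
# Hypothetical helper: succeeds if the named secret exists in the namespace.
# KUBECTL is an illustrative override so the check can be run against a stub.
KUBECTL="${KUBECTL:-kubectl}"

secret_exists() {
  namespace="$1"
  name="$2"
  "$KUBECTL" -n "$namespace" get secret "$name" >/dev/null 2>&1
}

# Example against a live cluster (not executed here):
# if secret_exists kubeflow webhook-server-tls; then
#   echo "secret present"
# else
#   echo "secret missing: cache-server cannot mount webhook-tls-certs"
# fi
```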
Tested on:
EKS 1.19, 1.20 and 1.21