Failed to mount configmap/secret volume because of "no such file or directory" #1736

Closed
GsssC opened this issue May 27, 2020 · 28 comments · Fixed by #1809
Labels: kind/bug (Categorizes issue or PR as related to a bug.)

GsssC (Member) commented May 27, 2020

What happened:
Mounting a configmap/secret volume fails with "no such file or directory", even though we can confirm that the related resources are present in the local SQLite store.

I0527 10:35:29.789719     660 edged_volumes.go:54] Using volume plugin "kubernetes.io/empty-dir" to mount wrapped_kube-proxy
I0527 10:35:29.800195     660 process.go:685] get a message {Header:{ID:8b57a409-25c9-454e-a9ae-b23f0b1861a9 ParentID: Timestamp:1590546929789 ResourceVersion: Sync:true} Router:{Source:edged Group:meta Operation:query Resource:kube-system/configmap/kube-proxy} Content:<nil>}
I0527 10:35:29.800543     660 metaclient.go:121] send sync message kube-system/configmap/kube-proxy successed and response: {{ab5f3aab-11ff-48cf-8c3b-c5ded97678db 8b57a409-25c9-454e-a9ae-b23f0b1861a9 1590546929800  false} {metaManager meta response kube-system/configmap/kube-proxy} [{"data":{"config.conf":"apiVersion: kubeproxy.config.k8s.io/v1alpha1\nbindAddress: 0.0.0.0\nclientConnection:\n  acceptContentTypes: \"\"\n  burst: 0\n  contentType: \"\"\n  kubeconfig: /var/lib/kube-proxy/kubeconfig.conf\n  qps: 0\nclusterCIDR: 192.168.0.0/16\nconfigSyncPeriod: 0s\nconntrack:\n  maxPerCore: null\n  min: null\n  tcpCloseWaitTimeout: null\n  tcpEstablishedTimeout: null\nenableProfiling: false\nhealthzBindAddress: \"\"\nhostnameOverride: \"\"\niptables:\n  masqueradeAll: false\n  masqueradeBit: null\n  minSyncPeriod: 0s\n  syncPeriod: 0s\nipvs:\n  excludeCIDRs: null\n  minSyncPeriod: 0s\n  scheduler: \"\"\n  strictARP: false\n  syncPeriod: 0s\nkind: KubeProxyConfiguration\nmetricsBindAddress: \"\"\nmode: \"\"\nnodePortAddresses: null\noomScoreAdj: null\nportRange: \"\"\nudpIdleTimeout: 0s\nwinkernel:\n  enableDSR: false\n  networkName: \"\"\n  sourceVip: \"\"","kubeconfig.conf":"apiVersion: v1\nkind: Config\nclusters:\n- cluster:\n    certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt\n    server: https://10.10.102.78:6443\n  name: default\ncontexts:\n- context:\n    cluster: default\n    namespace: default\n    user: default\n  name: default\ncurrent-context: default\nusers:\n- name: default\n  user:\n    tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token"},"metadata":{"creationTimestamp":"2020-04-21T14:50:46Z","labels":{"app":"kube-proxy"},"name":"kube-proxy","namespace":"kube-system","resourceVersion":"193","selfLink":"/api/v1/namespaces/kube-system/configmaps/kube-proxy","uid":"5651c863-c755-4da4-8039-b251efc82470"}}]}
E0527 10:35:29.800949     660 configmap.go:249] Error creating atomic writer: stat /var/lib/edged/pods/25e6f0ea-6364-4bcc-9937-9760b6ec956a/volumes/kubernetes.io~configmap/kube-proxy: no such file or directory
W0527 10:35:29.801070     660 empty_dir.go:392] Warning: Unmount skipped because path does not exist: /var/lib/edged/pods/25e6f0ea-6364-4bcc-9937-9760b6ec956a/volumes/kubernetes.io~configmap/kube-proxy
I0527 10:35:29.801109     660 record.go:24] Warning FailedMount MountVolume.SetUp failed for volume "kube-proxy" : stat /var/lib/edged/pods/25e6f0ea-6364-4bcc-9937-9760b6ec956a/volumes/kubernetes.io~configmap/kube-proxy: no such file or directory
E0527 10:35:29.801199     660 nestedpendingoperations.go:270] Operation for "\"kubernetes.io/configmap/25e6f0ea-6364-4bcc-9937-9760b6ec956a-kube-proxy\" (\"25e6f0ea-6364-4bcc-9937-9760b6ec956a\")" failed. No retries permitted until 2020-05-27 10:37:31.80112802 +0800 CST m=+2599.727653327 (durationBeforeRetry 2m2s). Error: "MountVolume.SetUp failed for volume \"kube-proxy\" (UniqueName: \"kubernetes.io/configmap/25e6f0ea-6364-4bcc-9937-9760b6ec956a-kube-proxy\") pod \"kube-proxy-gbdgw\" (UID: \"25e6f0ea-6364-4bcc-9937-9760b6ec956a\") : stat /var/lib/edged/pods/25e6f0ea-6364-4bcc-9937-9760b6ec956a/volumes/kubernetes.io~configmap/kube-proxy: no such file or directory"

What you expected to happen:
The volume mounts successfully.
How to reproduce it (as minimally and precisely as possible):
Sorry, I cannot provide reproduction steps yet.
Anything else we need to know?:

Environment:

  • KubeEdge version(e.g. cloudcore/edgecore --version): v1.3.0
GsssC added the kind/bug label May 27, 2020
GsssC (Member Author) commented May 27, 2020

/assign

GsssC (Member Author) commented May 27, 2020

Logs that may be related to this bug:

I0527 15:53:09.966617     660 edged.go:903] consumer: [0], worker get removed pod [kube-proxy-gbdgw]
I0527 15:53:09.966629     660 edged.go:975] start to consume removed pod [kube-proxy-gbdgw]
I0527 15:53:09.966652     660 edged.go:994] consume removed pod [kube-proxy-gbdgw] successfully
I0527 15:53:10.073450     660 reconciler.go:183] operationExecutor.UnmountVolume started for volume "kube-proxy-token-qzpzl" (UniqueName: "kubernetes.io/secret/25e6f0ea-6364-4bcc-9937-9760b6ec956a-kube-proxy-token-qzpzl") pod "25e6f0ea-6364-4bcc-9937-9760b6ec956a" (UID: "25e6f0ea-6364-4bcc-9937-9760b6ec956a") 
I0527 15:53:10.073537     660 reconciler.go:183] operationExecutor.UnmountVolume started for volume "xtables-lock" (UniqueName: "kubernetes.io/host-path/25e6f0ea-6364-4bcc-9937-9760b6ec956a-xtables-lock") pod "25e6f0ea-6364-4bcc-9937-9760b6ec956a" (UID: "25e6f0ea-6364-4bcc-9937-9760b6ec956a") 
I0527 15:53:10.073613     660 reconciler.go:183] operationExecutor.UnmountVolume started for volume "lib-modules" (UniqueName: "kubernetes.io/host-path/25e6f0ea-6364-4bcc-9937-9760b6ec956a-lib-modules") pod "25e6f0ea-6364-4bcc-9937-9760b6ec956a" (UID: "25e6f0ea-6364-4bcc-9937-9760b6ec956a") 
I0527 15:53:10.073887     660 operation_generator.go:713] UnmountVolume.TearDown succeeded for volume "kubernetes.io/host-path/25e6f0ea-6364-4bcc-9937-9760b6ec956a-lib-modules" (OuterVolumeSpecName: "lib-modules") pod "25e6f0ea-6364-4bcc-9937-9760b6ec956a" (UID: "25e6f0ea-6364-4bcc-9937-9760b6ec956a"). InnerVolumeSpecName "lib-modules". PluginName "kubernetes.io/host-path", VolumeGidValue ""
I0527 15:53:10.074642     660 operation_generator.go:713] UnmountVolume.TearDown succeeded for volume "kubernetes.io/host-path/25e6f0ea-6364-4bcc-9937-9760b6ec956a-xtables-lock" (OuterVolumeSpecName: "xtables-lock") pod "25e6f0ea-6364-4bcc-9937-9760b6ec956a" (UID: "25e6f0ea-6364-4bcc-9937-9760b6ec956a"). InnerVolumeSpecName "xtables-lock". PluginName "kubernetes.io/host-path", VolumeGidValue ""
I0527 15:53:10.102565     660 operation_generator.go:713] UnmountVolume.TearDown succeeded for volume "kubernetes.io/secret/25e6f0ea-6364-4bcc-9937-9760b6ec956a-kube-proxy-token-qzpzl" (OuterVolumeSpecName: "kube-proxy-token-qzpzl") pod "25e6f0ea-6364-4bcc-9937-9760b6ec956a" (UID: "25e6f0ea-6364-4bcc-9937-9760b6ec956a"). InnerVolumeSpecName "kube-proxy-token-qzpzl". PluginName "kubernetes.io/secret", VolumeGidValue ""
E0521 14:21:19.328467   19341 edged.go:939] Unable to mount volumes for pod "kube-proxy-gbdgw_kube-system(25e6f0ea-6364-4bcc-9937-9760b6ec956a)": unmounted volumes=[kube-proxy], unattached volumes=[xtables-lock lib-modules kube-proxy-token-qzpzl kube-proxy]: timed out waiting for the condition; skipping pod
I0521 14:21:19.328548   19341 edged.go:858] worker [4] get pod addition item [kube-proxy-gbdgw]
E0521 14:21:19.328570   19341 edged.go:861] consume pod addition backoff: Back-off consume pod [kube-proxy-gbdgw] addition  error, backoff: [5m0s]
I0521 14:21:19.328616   19341 edged.go:863] worker [4] backoff pod addition item [kube-proxy-gbdgw] failed, re-add to queue
E0521 14:21:19.328639   19341 edged.go:877] worker [4] handle pod addition item [kube-proxy-gbdgw] failed: unmounted volumes=[kube-proxy], unattached volumes=[xtables-lock lib-modules kube-proxy-token-qzpzl kube-proxy]: timed out waiting for the condition, re-add to queue
I0521 14:21:19.808038   19341 edged_status.go:186] Sync VolumesInUse: []

zzxgzgz (Contributor) commented May 27, 2020

What happened:
I encountered the same issue while trying to run Mizar with KubeEdge.

Part of EdgeCore's log:

E0527 04:21:51.734263   29955 configmap.go:249] Error creating atomic writer: stat /var/lib/edged/pods/b9c5a329-d276-4a70-a168-c413774aa937/volumes/kubernetes.io~configmap/kube-proxy: no such file or directory
W0527 04:21:51.734313   29955 empty_dir.go:392] Warning: Unmount skipped because path does not exist: /var/lib/edged/pods/b9c5a329-d276-4a70-a168-c413774aa937/volumes/kubernetes.io~configmap/kube-proxy
I0527 04:21:51.734341   29955 record.go:24] Warning FailedMount MountVolume.SetUp failed for volume "kube-proxy" : stat /var/lib/edged/pods/b9c5a329-d276-4a70-a168-c413774aa937/volumes/kubernetes.io~configmap/kube-proxy: no such file or directory

Images running on the Edge Node:

root@ip-172-31-12-211:/var/lib/edged/pods/97e36ff1-7a5e-4cc0-8158-4aba1ead53f0/volumes# docker ps
CONTAINER ID        IMAGE                  COMMAND                  CREATED             STATUS              PORTS               NAMES
b88dff023d67        fwnetworking/testpod   "/bin/sh -c /var/miz…"   9 hours ago         Up 9 hours                              k8s_pod0_pod0_default_97e36ff1-7a5e-4cc0-8158-4aba1ead53f0_0
40fe0451c55e        kubeedge/pause:3.1     "/pause"                 9 hours ago         Up 9 hours                              k8s_POD_pod0_default_97e36ff1-7a5e-4cc0-8158-4aba1ead53f0_0
c9b6dacc73ed        kubeedge/pause:3.1     "/pause"                 9 hours ago         Up 9 hours                              k8s_POD_mizar-operator-74b778447b-gwsrf_default_d8693fb8-1e49-4ba1-8daa-102692a19755_0
205636032aee        3439b7546f29           "/usr/local/bin/kube…"   11 hours ago        Up 11 hours                             k8s_kube-proxy_kube-proxy-5zbgt_kube-system_6aa19402-09ad-483d-905e-acc24a890b57_0
9248c9716362        kubeedge/pause:3.1     "/pause"                 11 hours ago        Up 11 hours                             k8s_POD_mizar-daemon-ntbkc_default_953935ee-7239-472e-993b-803761337b17_0
3e31578d543c        kubeedge/pause:3.1     "/pause"                 11 hours ago        Up 11 hours                             k8s_POD_kube-proxy-5zbgt_kube-system_6aa19402-09ad-483d-905e-acc24a890b57_0

What you expected to happen:
Mizar to be deployed successfully, which should look like this on a worker node:

140ec7b3e263        fwnetworking/endpointopr   "/bin/sh -c 'kopf ru…"   6 days ago          Up 6 days                               k8s_mizar-operator_mizar-operator-74b778447b-76z6j_default_077dfc71-e542-4b54-9c35-04891eb67356_0
4158514b1a22        fwnetworking/dropletd      "/bin/sh -c mizard"      6 days ago          Up 6 days                               k8s_mizar-daemon_mizar-daemon-6vzvj_default_a3752409-5541-4273-b6d6-67a6599af09c_0
c1a657836b01        k8s.gcr.io/pause:3.2       "/pause"                 6 days ago          Up 6 days                               k8s_POD_mizar-operator-74b778447b-76z6j_default_077dfc71-e542-4b54-9c35-04891eb67356_0
470b7468707d        k8s.gcr.io/pause:3.2       "/pause"                 6 days ago          Up 6 days                               k8s_POD_mizar-daemon-6vzvj_default_a3752409-5541-4273-b6d6-67a6599af09c_0
5c6e219c1d05        0d40868643c6               "/usr/local/bin/kube…"   6 days ago          Up 6 days                               k8s_kube-proxy_kube-proxy-rcxx6_kube-system_717c68e4-f703-4dfb-86b0-2a8d00f6047d_0
22b01d38e723        k8s.gcr.io/pause:3.2       "/pause"                 6 days ago          Up 6 days                               k8s_POD_kube-proxy-rcxx6_kube-system_717c68e4-f703-4dfb-86b0-2a8d00f6047d_0

How to reproduce it (as minimally and precisely as possible):

  1. Create a K8s cluster (without any of the CNIs listed here, since Mizar itself should act as the CNI once deployed).
  2. Run cloudcore on the master node and edgecore on the edge node.
  3. Run these commands in the /mizar folder to deploy it:
 ./install/create_service_account.sh
 ./install/create_crds.sh
 kubectl apply -f ./etc/deploy/daemon.deploy.yaml
 kubectl apply -f ./etc/deploy/operator.deploy.yaml

Anything else we need to know?
Below are some pre-steps to run Mizar successfully:

  1. Clone the Mizar repo from Here.
  2. Run all commands in https://github.com/futurewei-cloud/mizar/blob/dev-next/k8s/kind/Dockerfile on all worker nodes manually inside the mizar folder.
  3. In your worker nodes' mizar folder, find mizar/mizar/daemon/app.py and delete line 40:
    nsenter -t 1 -m -u -n -i rm /etc/cni/net.d/10-kindnet.conflist &&\
  4. Open ports 111 and 622 for RPC communication, e.g. with 'sudo ufw allow 111/udp', and open the UDP ports as well (see the sketch after this list).
  5. Make sure the security groups also allow ports 111 and 622.
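A possible sketch of steps 4 and 5, assuming ufw is the firewall in use; whether the TCP variants are also required is an assumption to verify against the Mizar documentation:

sudo ufw allow 111/udp     # RPC port from step 4
sudo ufw allow 111/tcp     # assumption: TCP may also be needed
sudo ufw allow 622/udp
sudo ufw allow 622/tcp
# and mirror ports 111 and 622 in the cloud provider's security group rules (step 5)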

Environment:

  • CloudCore: KubeEdge v1.2.0-beta.0.231+1c8bfa95ced0d3-dirty
  • EdgeCore: KubeEdge v1.2.2-1+343d12f867c86d-dirty

GsssC (Member Author) commented May 27, 2020

@zzxgzgz I run kube-proxy on purpose. Do you mean you did not intend to run it, but it still appears on the edge node?

zzxgzgz (Contributor) commented May 27, 2020

@GsssC That's right. I did not deploy kube-proxy; maybe Mizar did. But according to the Mizar team, there is no proxy in the Mizar program.

GsssC (Member Author) commented May 27, 2020

@zzxgzgz Is there any kube-proxy pod deployed to the edge node, as shown by 'kubectl get pod -nkube-system -owide'?

zzxgzgz (Contributor) commented May 27, 2020

@GsssC It looks like there is one for the cloud node and one for the edge node:

root@ip-172-31-10-89:/home/ubuntu/keadm# kubectl get pod -nkube-system -owide
NAME                                      READY   STATUS    RESTARTS   AGE   IP              NODE               NOMINATED NODE   READINESS GATES
coredns-66bff467f8-h8bbf                  1/1     Running   0          21h   10.244.0.2      ip-172-31-10-89    <none>           <none>
coredns-66bff467f8-w72zd                  1/1     Running   0          21h   10.244.0.3      ip-172-31-10-89    <none>           <none>
etcd-ip-172-31-10-89                      1/1     Running   0          21h   172.31.10.89    ip-172-31-10-89    <none>           <none>
kube-apiserver-ip-172-31-10-89            1/1     Running   0          21h   172.31.10.89    ip-172-31-10-89    <none>           <none>
kube-controller-manager-ip-172-31-10-89   1/1     Running   0          21h   172.31.10.89    ip-172-31-10-89    <none>           <none>
kube-proxy-5zbgt                          1/1     Running   0          20h   172.31.12.211   ip-172-31-12-211   <none>           <none>
kube-proxy-rfx56                          1/1     Running   0          21h   172.31.10.89    ip-172-31-10-89    <none>           <none>
kube-scheduler-ip-172-31-10-89            1/1     Running   0          21h   172.31.10.89    ip-172-31-10-89    <none>           <none>

GsssC (Member Author) commented May 27, 2020

@zzxgzgz Yes, because kube-proxy runs as a DaemonSet; you can see it with 'kubectl get all -o wide -nkube-system'. That is why kube-proxy is deployed to the edge node automatically.

But it is not related to this issue.

zzxgzgz (Contributor) commented May 27, 2020

@GsssC Thank you for explaining, that makes more sense now. Does that mean it is normal for kube-proxy to run on the edge node? Is this the reason the other containers cannot mount their volumes successfully?

Just FYI: when I deployed a test pod on the edge node it succeeded, but the Mizar-related pods are still failing.

GsssC (Member Author) commented May 28, 2020

@zzxgzgz
Does that mean it is normal for kube-proxy to run on the edge node?

No. In an edge scenario, kube-proxy should not run on the edge node; you can modify the kube-proxy DaemonSet YAML to keep it off the edge nodes (a sketch follows).
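A possible way to do that, sketched as a kubectl patch; the label key node-role.kubernetes.io/edge is an assumption about how edge nodes are labeled in your cluster, so verify it with 'kubectl get nodes --show-labels' first:

kubectl -n kube-system patch daemonset kube-proxy --patch '
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node-role.kubernetes.io/edge   # assumed edge-node label
                operator: DoesNotExist
'

This keeps kube-proxy pods from being scheduled onto nodes carrying the edge label while leaving the cloud-side nodes untouched.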

As for Mizar, it sounds like a CNI plugin. Does it need to connect to the API server via a Service ClusterIP, like Calico does? If so, it may need kube-proxy to be running correctly.

Is this the reason the other containers cannot mount their volumes successfully?

No. I am still investigating why the mount fails.

Just FYI: when I deployed a test pod on the edge node it succeeded, but the Mizar-related pods are still failing.

There are many reasons a pod may not run well on an edge node. I suggest checking the logs of the Mizar containers.

zzxgzgz (Contributor) commented May 29, 2020

@GsssC Thank you for your reply.

Yes, one of the functionalities of Mizar is to work as a CNI, and it requires kube-proxy. I suspect that's the reason why it doesn't run well with KubeEdge.

By the way, are you on KubeEdge's Slack channel? I'd like to ask you more questions about KubeEdge and it is more convenient to communicate on Slack.

Thank you.

GsssC (Member Author) commented Jun 1, 2020

@zzxgzgz I am on the KubeEdge Slack channel, also under the name GsssC, but I do not use it often.

GsssC (Member Author) commented Jun 4, 2020

Hey guys! I have been tracking this bug for the last few days. For now, I can draw some conclusions:

  • Reason:
    I still do not know what operation causes the bug. If anyone else hits it, please share the details.

  • Phenomena:
    Every pod (not only kube-proxy) that mounts a configmap hits this bug and stays Pending because the configmap mount fails, with a log like:

nestedpendingoperations.go:270] Operation for "\"kubernetes.io/configmap/25e6f0ea-6364-4bcc-9937-9760b6ec956a-kube-proxy\" (\"25e6f0ea-6364-4bcc-9937-9760b6ec956a\")" failed. No retries permitted until 2020-05-27 10:37:31.80112802 +0800 CST m=+2599.727653327 (durationBeforeRetry 2m2s). Error: "MountVolume.SetUp failed for volume \"kube-proxy\" (UniqueName: \"kubernetes.io/configmap/25e6f0ea-6364-4bcc-9937-9760b6ec956a-kube-proxy\") pod \"kube-proxy-gbdgw\" (UID: \"25e6f0ea-6364-4bcc-9937-9760b6ec956a\") : stat /var/lib/edged/pods/25e6f0ea-6364-4bcc-9937-9760b6ec956a/volumes/kubernetes.io~configmap/kube-proxy: no such file or directory"
  • Temporary solution:
    Based on reading the code, we can delete the "ready" file under /var/lib/edged/pods/<pod_uid>/plugins to trigger re-creation of the directory. After waiting a few seconds, everything works again for me (see the sketch below; screenshot omitted).
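A minimal sketch of this workaround, assuming the layout is /var/lib/edged/pods/<pod_uid>/plugins/kubernetes.io~configmap/<volume-name>/ready; check the actual paths on your node before deleting anything:

POD_UID=25e6f0ea-6364-4bcc-9937-9760b6ec956a   # UID of the affected pod, taken from the logs above
ls /var/lib/edged/pods/${POD_UID}/plugins/kubernetes.io~configmap/
# remove the "ready" marker so edged recreates the configmap volume directory
rm /var/lib/edged/pods/${POD_UID}/plugins/kubernetes.io~configmap/*/ready
# after a few seconds, the volume directory should reappear
ls /var/lib/edged/pods/${POD_UID}/volumes/kubernetes.io~configmap/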

GsssC (Member Author) commented Jun 4, 2020

/cc @fisherxu @kevin-wangzefeng

zzxgzgz (Contributor) commented Jun 4, 2020

Hey guys! I have been tracking this bug for the last few days. For now, I can draw some conclusions:

  • Reason:
    I still do not know what operation causes the bug. If anyone else hits it, please share the details.
  • Phenomena:
    Every pod (not only kube-proxy) that mounts a configmap hits this bug and stays Pending because the configmap mount fails, with a log like:
nestedpendingoperations.go:270] Operation for "\"kubernetes.io/configmap/25e6f0ea-6364-4bcc-9937-9760b6ec956a-kube-proxy\" (\"25e6f0ea-6364-4bcc-9937-9760b6ec956a\")" failed. No retries permitted until 2020-05-27 10:37:31.80112802 +0800 CST m=+2599.727653327 (durationBeforeRetry 2m2s). Error: "MountVolume.SetUp failed for volume \"kube-proxy\" (UniqueName: \"kubernetes.io/configmap/25e6f0ea-6364-4bcc-9937-9760b6ec956a-kube-proxy\") pod \"kube-proxy-gbdgw\" (UID: \"25e6f0ea-6364-4bcc-9937-9760b6ec956a\") : stat /var/lib/edged/pods/25e6f0ea-6364-4bcc-9937-9760b6ec956a/volumes/kubernetes.io~configmap/kube-proxy: no such file or directory"
  • Temporary solution:
    Based on reading the code, we can delete the "ready" file under /var/lib/edged/pods/<pod_uid>/plugins to trigger re-creation of the directory. After restarting edgecore, everything works again for me (screenshot omitted).

Thank you for your effort!

One question for the temporary solution:
Should we delete the ready files for the failing pods, or for kube-proxy? Also, should we delete only the 'ready' files in those folders, or the folders as well?

Thank you again.

GsssC (Member Author) commented Jun 4, 2020

Hey guys! I found a way to 100% reproduce the bug.

  1. Start from the normal state (screenshot omitted).

  2. Stop edgecore: systemctl stop edgecore

  3. Remove all containers: docker rm -f `docker ps -aq` (screenshot omitted)

  4. Restart edgecore (screenshot omitted).

We can see that edgecore deletes the secret and configmap directories when it restarts. The secret directory is recreated by edgecore after a few seconds, but the configmap directory is not recreated automatically. The bug may show up in kubelet as well; I will do more testing later. A condensed reproduction script follows.
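The same reproduction condensed into a sketch; it removes every container on the edge node, so only run it on a disposable test node (the pod UID is just the example from the logs above):

systemctl stop edgecore
docker rm -f $(docker ps -aq)
systemctl start edgecore
# watch whether the secret/configmap volume directories come back
ls /var/lib/edged/pods/25e6f0ea-6364-4bcc-9937-9760b6ec956a/volumes/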

GsssC (Member Author) commented Jun 4, 2020

@zzxgzgz Just delete the configmap "ready" file for the specific pod. Deleting the whole plugins directory is also a convenient way, because the secret reconcile has no problem.

zzxgzgz (Contributor) commented Jun 4, 2020

@GsssC I tried your solution by deleting the whole plugins folder. Now the kube-proxy container is giving me a different error:

E0604 16:28:25.286180       1 event.go:214] Unable to write event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"ip-172-31-12-211.161564027a866f3d", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"ip-172-31-12-211", UID:"ip-172-31-12-211", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"Starting", Message:"Starting kube-proxy.", Source:v1.EventSource{Component:"kube-proxy", Host:"ip-172-31-12-211"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbfae66114928fd3d, ext:36109497681, loc:(*time.Location)(0x28a6880)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbfae66114928fd3d, ext:36109497681, loc:(*time.Location)(0x28a6880)}}, Count:1, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}' (retry limit exceeded!)
E0604 16:28:38.254086       1 reflector.go:178] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.Service: Get https://172.31.10.89:6443/api/v1/services?labelSelector=%21service.kubernetes.io%2Fheadless%2C%21service.kubernetes.io%2Fservice-proxy-name&limit=500&resourceVersion=0: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
E0604 16:28:51.375479       1 reflector.go:178] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.Endpoints: Get https://172.31.10.89:6443/api/v1/endpoints?labelSelector=%21service.kubernetes.io%2Fheadless%2C%21service.kubernetes.io%2Fservice-proxy-name&limit=500&resourceVersion=0: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")

This is what the container folder looks like currently (screenshot omitted).

Do you know what the reason for this could be, or how to fix it?

Thank you.

GsssC (Member Author) commented Jun 4, 2020

@zzxgzgz Have you reset the Kubernetes cluster at some point? It looks like your kube-proxy's kubeconfig does not match the current cluster (API server).

zzxgzgz (Contributor) commented Jun 4, 2020

@GsssC I created a new cluster (and then ran cloudcore & edgecore) before trying your solution. Does that count?

I also looked into cloudcore's log; it shows a strange error as well:

I0604 16:59:43.753441    3505 messagehandler.go:217] event received for node ip-172-31-12-211 id: 784e4d5e-648a-48eb-abd4-28ddaec04a7b, parent_id: 0aa8b807-fde8-4f29-bef4-5cb3a8ccdedb, group: resource, source: twin, resource: node/ip-172-31-12-211/membership/detail, operation: get, content: {"event_type":"group_membership_event","event_id":"123","group_id":"ip-172-31-12-211","operation":"detail","timestamp":1591287463751}
I0604 16:59:43.753465    3505 upstream.go:86] Dispatch message: 784e4d5e-648a-48eb-abd4-28ddaec04a7b
W0604 16:59:43.753487    3505 upstream.go:90] Parse message: 784e4d5e-648a-48eb-abd4-28ddaec04a7b resource type with error: unknown resource

I looked into the code, and this error is triggered because the resource is "node/ip-172-31-12-211/membership/detail". In this function:
https://github.com/kubeedge/kubeedge/blob/master/cloud/pkg/devicecontroller/controller/upstream.go#L88
passing in "node/ip-172-31-12-211/membership/detail" will always return an error, because the code simply checks whether the string contains deviceconstants.ResourceTypeTwinEdgeUpdated ("twin/edge_updated").

Do you think it might be related to this error?

GsssC (Member Author) commented Jun 4, 2020

@zzxgzgz No, I think it is not related to your problem.

I suggest checking the sync CRDs to confirm that the new cluster's kube-proxy configmap has been sent to the edge; otherwise edgecore will keep using the last kube-proxy configmap stored in the edge database:

kubectl get crds
and then kubectl get the kubeedge.io-related CRD.

GsssC (Member Author) commented Jun 4, 2020

@zzxgzgz If so, you need to delete the configmap record in /var/lib/kubeedge/edgecore.db to trigger edgecore to query for the new configmap (a sketch follows).
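A hedged sketch of that step; the table name "meta" and the key format are assumptions about the edgecore metamanager schema, so confirm them with .schema first and stop edgecore while editing the database:

systemctl stop edgecore
sqlite3 /var/lib/kubeedge/edgecore.db '.schema'   # confirm the actual table layout
sqlite3 /var/lib/kubeedge/edgecore.db \
  "DELETE FROM meta WHERE key = 'kube-system/configmap/kube-proxy';"   # assumed table and key format
systemctl start edgecore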

GsssC closed this as completed Jun 4, 2020
GsssC reopened this Jun 4, 2020
zzxgzgz (Contributor) commented Jun 4, 2020

@GsssC I am able to get the crds:

root@ip-172-31-10-89:/home/go/src/github.com/mizar# kubectl  get crds
NAME                                           CREATED AT
bouncers.mizar.com                             2020-06-04T16:19:36Z
clusterobjectsyncs.reliablesyncs.kubeedge.io   2020-06-04T16:11:48Z
devicemodels.devices.kubeedge.io               2020-06-04T16:11:48Z
devices.devices.kubeedge.io                    2020-06-04T16:11:48Z
dividers.mizar.com                             2020-06-04T16:19:36Z
droplets.mizar.com                             2020-06-04T16:19:36Z
endpoints.mizar.com                            2020-06-04T16:19:36Z
nets.mizar.com                                 2020-06-04T16:19:36Z
objectsyncs.reliablesyncs.kubeedge.io          2020-06-04T16:11:49Z
vpcs.mizar.com                                 2020-06-04T16:19:37Z

However, I'm not sure how to check whether the configmap has been sent to the edge.

Also, should I just delete edgecore.db and the CRD records? Should I also restart edgecore? How do I delete the CRD records? Is it by running:

kubectl delete crds

?

Thank you.

GsssC (Member Author) commented Jun 4, 2020

@zzxgzgz try
kubectl get objectsyncs.reliablesyncs.kubeedge.io

zzxgzgz (Contributor) commented Jun 4, 2020

Yes, I got something:

root@ip-172-31-10-89:/home/go/src/github.com/mizar# kubectl get objectsyncs.reliablesyncs.kubeedge.io
NAME                                                    AGE
ip-172-31-12-211.0c1faeb7-9a69-454a-92db-19ca3a33e106   80m
ip-172-31-12-211.6ebf7a85-81f1-4570-8411-b484e33dccd9   80m
ip-172-31-12-211.e3343f31-8284-447e-81af-0b58d8ed28dc   80m

Shall we continue this conversation on Slack? My username on Slack is Rio Zhu and I sent you a few messages before.

GsssC (Member Author) commented Jun 4, 2020

@zzxgzgz try
kubectl get objectsyncs.reliablesyncs.kubeedge.io -o yaml

If there is no entry for the kube-proxy configmap, that means it was never sent to the edge (see the one-liner below).
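A quick way to do that check, assuming the ObjectSync objects list the synced resources under spec.objectKind/spec.objectName (as in the output shown later in this thread):

kubectl get objectsyncs.reliablesyncs.kubeedge.io -o yaml | grep -E 'objectKind|objectName'
# if no configmap entry for kube-proxy appears, it was never synced to the edge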

GsssC (Member Author) commented Jun 4, 2020

@zzxgzgz Sorry, I am not used to using Slack. Also, I think keeping our conversation on GitHub can help more people fix this bug.

zzxgzgz (Contributor) commented Jun 4, 2020

You're right, I don't see anything related to the configmap:

apiVersion: v1
items:
- apiVersion: reliablesyncs.kubeedge.io/v1alpha1
  kind: ObjectSync
  metadata:
    creationTimestamp: "2020-06-04T16:19:43Z"
    generation: 1
    managedFields:
    - apiVersion: reliablesyncs.kubeedge.io/v1alpha1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          .: {}
          f:objectKind: {}
          f:objectName: {}
        f:status:
          .: {}
          f:objectResourceVersion: {}
      manager: cloudcore
      operation: Update
      time: "2020-06-04T16:59:53Z"
    name: ip-172-31-12-211.0c1faeb7-9a69-454a-92db-19ca3a33e106
    namespace: default
    resourceVersion: "9957"
    selfLink: /apis/reliablesyncs.kubeedge.io/v1alpha1/namespaces/default/objectsyncs/ip-172-31-12-211.0c1faeb7-9a69-454a-92db-19ca3a33e106
    uid: 6aa3bfe0-e751-4764-9a0e-e0537bd056fa
  spec:
    objectKind: pod
    objectName: mizar-daemon-hmsj9
  status:
    objectResourceVersion: "9956"
- apiVersion: reliablesyncs.kubeedge.io/v1alpha1
  kind: ObjectSync
  metadata:
    creationTimestamp: "2020-06-04T16:19:44Z"
    generation: 1
    managedFields:
    - apiVersion: reliablesyncs.kubeedge.io/v1alpha1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          .: {}
          f:objectKind: {}
          f:objectName: {}
        f:status:
          .: {}
          f:objectResourceVersion: {}
      manager: cloudcore
      operation: Update
      time: "2020-06-04T16:19:44Z"
    name: ip-172-31-12-211.6ebf7a85-81f1-4570-8411-b484e33dccd9
    namespace: default
    resourceVersion: "1943"
    selfLink: /apis/reliablesyncs.kubeedge.io/v1alpha1/namespaces/default/objectsyncs/ip-172-31-12-211.6ebf7a85-81f1-4570-8411-b484e33dccd9
    uid: 12b7a1f7-9abe-4a01-b2e6-735e6432cac0
  spec:
    objectKind: secret
    objectName: mizar-operator-token-q2rf2
  status:
    objectResourceVersion: "1872"
- apiVersion: reliablesyncs.kubeedge.io/v1alpha1
  kind: ObjectSync
  metadata:
    creationTimestamp: "2020-06-04T16:19:49Z"
    generation: 1
    managedFields:
    - apiVersion: reliablesyncs.kubeedge.io/v1alpha1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          .: {}
          f:objectKind: {}
          f:objectName: {}
        f:status:
          .: {}
          f:objectResourceVersion: {}
      manager: cloudcore
      operation: Update
      time: "2020-06-04T16:57:53Z"
    name: ip-172-31-12-211.e3343f31-8284-447e-81af-0b58d8ed28dc
    namespace: default
    resourceVersion: "9560"
    selfLink: /apis/reliablesyncs.kubeedge.io/v1alpha1/namespaces/default/objectsyncs/ip-172-31-12-211.e3343f31-8284-447e-81af-0b58d8ed28dc
    uid: 0b4dab0c-89bd-4c5d-a4ba-203939f7c655
  spec:
    objectKind: pod
    objectName: mizar-operator-74b778447b-l5s9f
  status:
    objectResourceVersion: "9556"
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

Should each pod shown in this YAML (mizar-operator/daemon in this case) include a configMap in its own YAML file?
