I run Origin 3.7 with the master and the node on the same server.
After a server reboot, all pods fail to start, and the origin-node logs show the errors below.
The only workaround I have found is to reset and recreate the Docker storage. Is there a way to avoid or fix this bug?
Apr 23 22:24:20 os-test3 origin-node[2542]: E0423 22:24:20.140764 2616 cni.go:304] Error deleting network when building cni runtime conf: could not retrieve port mappings: checkpoint is not found.
Apr 23 22:24:20 os-test3 origin-node[2542]: E0423 22:24:20.141255 2616 remote_runtime.go:114] StopPodSandbox "d39009bb25733767895364d47c4c7b156df4c58b6e3f512a3894a08890987f11" from runtime service failed: rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod "hawkular-cassandra-1-p6lzt_openshift-infra" network: could not retrieve port mappings: checkpoint is not found.
Apr 23 22:24:20 os-test3 origin-node[2542]: E0423 22:24:20.141340 2616 kuberuntime_manager.go:775] Failed to stop sandbox {"docker" "d39009bb25733767895364d47c4c7b156df4c58b6e3f512a3894a08890987f11"}
Apr 23 22:24:20 os-test3 origin-node[2542]: E0423 22:24:20.141402 2616 remote_runtime.go:114] StopPodSandbox "75112e3ea11bdd0d2714fcd5afb1dd1f8ab69561ca2a980f4d6053fa073d27f7" from runtime service failed: rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod "controller-manager-fs9br_kube-service-catalog" network: could not retrieve port mappings: checkpoint is not found.
Apr 23 22:24:20 os-test3 origin-node[2542]: E0423 22:24:20.141440 2616 kuberuntime_manager.go:775] Failed to stop sandbox {"docker" "75112e3ea11bdd0d2714fcd5afb1dd1f8ab69561ca2a980f4d6053fa073d27f7"}
Apr 23 22:24:20 os-test3 origin-node[2542]: E0423 22:24:20.141496 2616 kuberuntime_manager.go:570] killPodWithSyncResult failed: failed to "KillPodSandbox" for "7ddf9c90-472a-11e8-a67e-005056827a01" with KillPodSandboxError: "rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod \"controller-manager-fs9br_kube-service-catalog\" network: could not retrieve port mappings: checkpoint is not found."
Apr 23 22:24:20 os-test3 origin-node[2542]: E0423 22:24:20.141539 2616 pod_workers.go:186] Error syncing pod 7ddf9c90-472a-11e8-a67e-005056827a01 ("controller-manager-fs9br_kube-service-catalog(7ddf9c90-472a-11e8-a67e-005056827a01)"), skipping: failed to "KillPodSandbox" for "7ddf9c90-472a-11e8-a67e-005056827a01" with KillPodSandboxError: "rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod \"controller-manager-fs9br_kube-service-catalog\" network: could not retrieve port mappings: checkpoint is not found."
Apr 23 22:24:20 os-test3 origin-node[2542]: E0423 22:24:20.141491 2616 kuberuntime_manager.go:570] killPodWithSyncResult failed: failed to "KillPodSandbox" for "db907be0-46f8-11e8-b5d9-005056827a01" with KillPodSandboxError: "rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod \"hawkular-cassandra-1-p6lzt_openshift-infra\" network: could not retrieve port mappings: checkpoint is not found."
Apr 23 22:24:20 os-test3 origin-node[2542]: E0423 22:24:20.141675 2616 pod_workers.go:186] Error syncing pod db907be0-46f8-11e8-b5d9-005056827a01 ("hawkular-cassandra-1-p6lzt_openshift-infra(db907be0-46f8-11e8-b5d9-005056827a01)"), skipping: failed to "KillPodSandbox" for "db907be0-46f8-11e8-b5d9-005056827a01" with KillPodSandboxError: "rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod \"hawkular-cassandra-1-p6lzt_openshift-infra\" network: could not retrieve port mappings: checkpoint is not found."
Apr 23 22:24:20 os-test3 origin-node[2542]: E0423 22:24:20.141749 2616 remote_runtime.go:114] StopPodSandbox "e8573132f565875633fc034ca16296030f989350290800271b210400f1b8212b" from runtime service failed: rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod "nodejs-mongodb-example-8-5n2jz_test1" network: could not retrieve port mappings: checkpoint is not found.
Apr 23 22:24:20 os-test3 origin-node[2542]: E0423 22:24:20.141845 2616 kuberuntime_manager.go:775] Failed to stop sandbox {"docker" "e8573132f565875633fc034ca16296030f989350290800271b210400f1b8212b"}
Apr 23 22:24:20 os-test3 origin-node[2542]: E0423 22:24:20.141896 2616 kuberuntime_manager.go:570] killPodWithSyncResult failed: failed to "KillPodSandbox" for "22fe7e4e-46fb-11e8-b5d9-005056827a01" with KillPodSandboxError: "rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod \"nodejs-mongodb-example-8-5n2jz_test1\" network: could not retrieve port mappings: checkpoint is not found."
Apr 23 22:24:20 os-test3 origin-node[2542]: E0423 22:24:20.141941 2616 pod_workers.go:186] Error syncing pod 22fe7e4e-46fb-11e8-b5d9-005056827a01 ("nodejs-mongodb-example-8-5n2jz_test1(22fe7e4e-46fb-11e8-b5d9-005056827a01)"), skipping: failed to "KillPodSandbox" for "22fe7e4e-46fb-11e8-b5d9-005056827a01" with KillPodSandboxError: "rpc error: code = 2 desc = NetworkPlugin cni failed to teardown pod \"nodejs-mongodb-example-8-5n2jz_test1\" network: could not retrieve port mappings: checkpoint is not found."
Apr 23 22:24:21 os-test3 origin-node[2542]: I0423 22:24:21.152335 2616 kuberuntime_manager.go:389] No ready sandbox for pod "hawkular-metrics-qlwst_openshift-infra(f1ab2b33-46f8-11e8-b5d9-005056827a01)" can be found. Need to start a new one
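
To see at a glance which pods are hit by the teardown error, a small script along these lines might help. It is only a diagnostic sketch, not a fix: the function name `affected_pods` is made up for illustration, and it assumes the log text is passed in as unwrapped lines (one log entry per line), with the error message worded exactly as in the output above.

```python
import re
from collections import Counter

# Captures the "<pod>_<namespace>" name from the CNI teardown error.
# The optional backslashes cover the escaped-quote form (\"...\") that
# appears inside KillPodSandboxError messages.
POD_RE = re.compile(
    r'failed to teardown pod \\?"([^"\\]+)\\?" network: '
    r'could not retrieve port mappings: checkpoint is not found'
)


def affected_pods(log_text):
    """Count 'checkpoint is not found' teardown failures per pod.

    Hypothetical helper: log_text is the origin-node log as one string.
    Returns a Counter keyed by the pod name as printed in the log.
    """
    return Counter(POD_RE.findall(log_text))
```

Calling `affected_pods(text).most_common()` on the saved log lists each affected pod with the number of matching error lines, which makes it easier to confirm whether every pod on the node is affected or only some.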
Version
openshift v3.7.1+282e43f-42
kubernetes v1.7.6+a08f5eeb62
etcd 3.2.8
Steps To Reproduce
- Reboot the server.
Current Result
Expected Result
Additional Information