kube-system/kube-dns configmap miss #67633

Closed
daixiang0 opened this issue Aug 21, 2018 · 5 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/network Categorizes an issue or PR as relevant to SIG Network.

Comments

@daixiang0 (Contributor) commented Aug 21, 2018

/kind bug

What happened:
I installed a fresh Kubernetes cluster and found that the kube-dns ConfigMap is missing; below are the ConfigMaps and the pods in the cluster:

[root@dx-app deploy]# k get configmap --all-namespaces
NAMESPACE     NAME                                 DATA      AGE
kube-public   cluster-info                         3         6m
kube-system   calico-config                        6         5m
kube-system   extension-apiserver-authentication   6         6m
kube-system   kube-proxy                           2         6m
kube-system   kubeadm-config                       1         6m
NAMESPACE     NAME                                       READY     STATUS              RESTARTS   AGE       IP                NODE
kube-system   calico-kube-controllers-5b85d756c6-wkrj2   1/1       Running             0          5m        192.168.60.45     dx-app.novalocal
kube-system   calico-node-j7w7f                          2/2       Running             0          5m        192.168.60.45     dx-app.novalocal
kube-system   calico-node-w7rpc                          2/2       Running             0          4m        192.168.60.27     dx-computing.novalocal
kube-system   heapster-fc5d4f768-gt5hh                   1/1       Running             0          3m        196.171.106.196   dx-app.novalocal
kube-system   kube-apiserver-dx-app.novalocal            1/1       Running             0          3m        192.168.60.45     dx-app.novalocal
kube-system   kube-controller-manager-dx-app.novalocal   1/1       Running             0          5m        192.168.60.45     dx-app.novalocal
kube-system   kube-dns-86f4d74b45-2s82g                  3/3       Running             0          6m        196.171.106.193   dx-app.novalocal
kube-system   kube-proxy-452vt                           1/1       Running             0          6m        192.168.60.45     dx-app.novalocal
kube-system   kube-proxy-58gh8                           1/1       Running             0          4m        192.168.60.27     dx-computing.novalocal
kube-system   kube-scheduler-dx-app.novalocal            1/1       Running             1          5m        192.168.60.45     dx-app.novalocal
kube-system   kubernetes-dashboard-67f69f88b7-bzgkb      1/1       Running             0          3m        196.171.106.197   dx-app.novalocal

I know that the kube-dns pod uses the kube-dns ConfigMap as a volume, but I cannot find the ConfigMap.

The kube-dns pod's events show that all volume mounts succeeded.

Events:
  Type     Reason                 Age              From                       Message
  ----     ------                 ----             ----                       -------
  Warning  FailedScheduling       8m (x8 over 9m)  default-scheduler          0/1 nodes are available: 1 node(s) were not ready.
  Normal   SuccessfulMountVolume  7m               kubelet, dx-app.novalocal  MountVolume.SetUp succeeded for volume "kube-dns-config"
  Normal   SuccessfulMountVolume  7m               kubelet, dx-app.novalocal  MountVolume.SetUp succeeded for volume "kube-dns-token-vbgnx"
  Normal   Pulled                 7m               kubelet, dx-app.novalocal  Container image "k8s.gcr.io/k8s-dns-kube-dns-amd64:1.14.8" already present on machine
  Normal   Scheduled              7m               default-scheduler          Successfully assigned kube-dns-86f4d74b45-2s82g to dx-app.novalocal
  Normal   Created                7m               kubelet, dx-app.novalocal  Created container
  Normal   Created                7m               kubelet, dx-app.novalocal  Created container
  Normal   Started                7m               kubelet, dx-app.novalocal  Started container
  Normal   Pulled                 7m               kubelet, dx-app.novalocal  Container image "k8s.gcr.io/k8s-dns-dnsmasq-nanny-amd64:1.14.8" already present on machine
  Normal   Started                7m               kubelet, dx-app.novalocal  Started container
  Normal   Pulled                 7m               kubelet, dx-app.novalocal  Container image "k8s.gcr.io/k8s-dns-sidecar-amd64:1.14.8" already present on machine
  Normal   Created                7m               kubelet, dx-app.novalocal  Created container
  Normal   Started                7m               kubelet, dx-app.novalocal  Started container
  Normal   SuccessfulMountVolume  6m               kubelet, dx-app.novalocal  MountVolume.SetUp succeeded for volume "kube-dns-token-vbgnx"
  Normal   SuccessfulMountVolume  6m               kubelet, dx-app.novalocal  MountVolume.SetUp succeeded for volume "kube-dns-config"
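
For context on why the volume mounts succeed anyway: the kube-dns Deployment marks the ConfigMap volume as optional, so MountVolume.SetUp succeeds even when kube-system/kube-dns does not exist. A minimal sketch of the relevant volume stanza, reproduced from memory of the upstream add-on manifest:

# Excerpt (from memory) of the kube-dns Deployment spec. Because the
# ConfigMap is marked optional, the volume mounts cleanly even when
# the kube-system/kube-dns ConfigMap is absent.
volumes:
  - name: kube-dns-config
    configMap:
      name: kube-dns
      optional: true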

Environment:

  • Kubernetes version (use kubectl version):
    Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.5", GitCommit:"32ac1c9073b132b8ba18aa830f46b77dcceb0723", GitTreeState:"clean", BuildDate:"2018-06-21T11:46:00Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.5", GitCommit:"32ac1c9073b132b8ba18aa830f46b77dcceb0723", GitTreeState:"clean", BuildDate:"2018-06-21T11:34:22Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
@k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. kind/bug Categorizes issue or PR as related to a bug. labels Aug 21, 2018
@daixiang0 (Contributor, Author)

/sig network

@k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 21, 2018
@daixiang0 (Contributor, Author)

And after a reboot, Kubernetes cannot start as expected:

Aug 21 13:36:47 dx-app.novalocal kubelet[13641]: 2018-08-21 13:36:47.482 [INFO][17331] calico.go 431: Extracted identifiers ContainerID="c2214c767f8035467adac66feb33cbd2654679842c097f4ae8988e96e007a32b" Node="dx-app.novalocal" Orchestrator="k8s" WorkloadEndpoint="skyaxe--app--0.localdomain-k8s-kube--dns--86f4d74b45--b8n7h-eth0"
Aug 21 13:36:47 dx-app.novalocal kubelet[13641]: 2018-08-21 13:36:47.482 [WARNING][17331] k8s.go 328: WorkloadEndpoint does not exist in the datastore, moving forward with the clean up ContainerID="c2214c767f8035467adac66feb33cbd2654679842c097f4ae8988e96e007a32b" WorkloadEndpoint="skyaxe--app--0.localdomain-k8s-kube--dns--86f4d74b45--b8n7h-eth0"
Aug 21 13:36:47 dx-app.novalocal kubelet[13641]: Calico CNI releasing IP address
Aug 21 13:36:47 dx-app.novalocal kubelet[13641]: 2018-08-21 13:36:47.541 [INFO][17362] utils.go 379: Configured environment: [CNI_COMMAND=DEL CNI_CONTAINERID=c2214c767f8035467adac66feb33cbd2654679842c097f4ae8988e96e007a32b CNI_NETNS= CNI_ARGS=IgnoreUnknown=1;IgnoreUnknown=1;K8S_POD_NAMESPACE=kube-system;K8S_POD_NAME=kube-dns-86f4d74b45-b8n7h;K8S_POD_INFRA_CONTAINER_ID=c2214c767f8035467adac66feb33cbd2654679842c097f4ae8988e96e007a32b CNI_IFNAME=eth0 CNI_PATH=/opt/calico/bin:/opt/cni/bin LANG=en_US.UTF-8 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin KUBELET_EXTRA_ARGS=--feature-gates=DevicePlugins=true KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin KUBELET_DNS_ARGS=--cluster-dns=12.96.0.10 --cluster-domain=cluster.local KUBELET_AUTHZ_ARGS=--authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt KUBELET_CADVISOR_ARGS=--cadvisor-port=8686 KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs KUBELET_CERTIFICATE_ARGS=--rotate-certificates=true --cert-dir=/var/lib/kubelet/pki ETCD_ENDPOINTS=https://10.0.0.10:2379 ETCD_KEY_FILE=/etc/cni/net.d/calico-tls/etcd-key ETCD_CERT_FILE=/etc/cni/net.d/calico-tls/etcd-cert ETCD_CA_CERT_FILE=/etc/cni/net.d/calico-tls/etcd-ca KUBECONFIG=/etc/cni/net.d/calico-kubeconfig]
Aug 21 13:36:47 dx-app.novalocal kubelet[13641]: 2018-08-21 13:36:47.552 [INFO][17362] calico-ipam.go 258: Releasing address using handleID ContainerID="c2214c767f8035467adac66feb33cbd2654679842c097f4ae8988e96e007a32b" HandleID="k8s-pod-network.c2214c767f8035467adac66feb33cbd2654679842c097f4ae8988e96e007a32b" Workload="skyaxe--app--0.localdomain-k8s-kube--dns--86f4d74b45--b8n7h-eth0"
Aug 21 13:36:47 dx-app.novalocal kubelet[13641]: 2018-08-21 13:36:47.552 [INFO][17362] ipam.go 1002: Releasing all IPs with handle 'k8s-pod-network.c2214c767f8035467adac66feb33cbd2654679842c097f4ae8988e96e007a32b'
Aug 21 13:36:47 dx-app.novalocal kubelet[13641]: 2018-08-21 13:36:47.552 [WARNING][17362] calico-ipam.go 265: Asked to release address but it doesn't exist. Ignoring ContainerID="c2214c767f8035467adac66feb33cbd2654679842c097f4ae8988e96e007a32b" HandleID="k8s-pod-network.c2214c767f8035467adac66feb33cbd2654679842c097f4ae8988e96e007a32b" Workload="skyaxe--app--0.localdomain-k8s-kube--dns--86f4d74b45--b8n7h-eth0"
Aug 21 13:36:47 dx-app.novalocal kubelet[13641]: 2018-08-21 13:36:47.552 [INFO][17362] calico-ipam.go 276: Releasing address using workloadID ContainerID="c2214c767f8035467adac66feb33cbd2654679842c097f4ae8988e96e007a32b" HandleID="k8s-pod-network.c2214c767f8035467adac66feb33cbd2654679842c097f4ae8988e96e007a32b" Workload="skyaxe--app--0.localdomain-k8s-kube--dns--86f4d74b45--b8n7h-eth0"
Aug 21 13:36:47 dx-app.novalocal kubelet[13641]: 2018-08-21 13:36:47.553 [INFO][17362] ipam.go 1002: Releasing all IPs with handle 'kube-system.kube-dns-86f4d74b45-b8n7h'
Aug 21 13:36:47 dx-app.novalocal kubelet[13641]: 2018-08-21 13:36:47.554 [INFO][17331] k8s.go 382: Teardown processing complete. ContainerID="c2214c767f8035467adac66feb33cbd2654679842c097f4ae8988e96e007a32b"

I think the missing kube-dns ConfigMap led to the reboot failure.

@krmayankk

@daixiang0 It doesn't look like this is related to the kube-dns ConfigMap being missing. Note that the ConfigMap is optional for kube-dns. It could be related to how Calico is configured; @caseydavenport might have some pointers.
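
For completeness: since the ConfigMap is optional, its absence by itself is not an error; it can be checked for, and created empty if desired, with standard kubectl commands:

# Check whether the optional ConfigMap exists
kubectl -n kube-system get configmap kube-dns
# Optionally create an empty one; kube-dns runs either way
kubectl -n kube-system create configmap kube-dns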

@daixiang0 (Contributor, Author)

@caseydavenport Hi, here is my calico config:

# Calico Version v3.1.1
# https://docs.projectcalico.org/v3.1/releases#v3.1.1
# This manifest includes the following component versions:
#   calico/node:v3.1.1
#   calico/cni:v3.1.1
#   calico/kube-controllers:v3.1.1

# This ConfigMap is used to configure a self-hosted Calico installation.
kind: ConfigMap
apiVersion: v1
metadata:
  name: calico-config
  namespace: kube-system
data:
  # Configure this with the location of your etcd cluster.
  etcd_endpoints: "https://192.168.50.10:2379"

  # Configure the Calico backend to use.
  calico_backend: "bird"

  # The CNI network configuration to install on each node.
  cni_network_config: |-
    {
      "name": "k8s-pod-network",
      "cniVersion": "0.3.0",
      "plugins": [
        {
          "type": "calico",
          "etcd_endpoints": "__ETCD_ENDPOINTS__",
          "etcd_key_file": "__ETCD_KEY_FILE__",
          "etcd_cert_file": "__ETCD_CERT_FILE__",
          "etcd_ca_cert_file": "__ETCD_CA_CERT_FILE__",
          "log_level": "info",
          "mtu": 1500,
          "ipam": {
              "type": "calico-ipam"
          },
          "policy": {
              "type": "k8s"
          },
          "kubernetes": {
              "kubeconfig": "__KUBECONFIG_FILEPATH__"
          }
        },
        {
          "type": "portmap",
          "snat": true,
          "capabilities": {"portMappings": true}
        }
      ]
    }

  # If you're using TLS enabled etcd uncomment the following.
  # You must also populate the Secret below with these files.
  etcd_ca: "/calico-secrets/etcd-ca"
  etcd_cert: "/calico-secrets/etcd-cert"
  etcd_key: "/calico-secrets/etcd-key"

---

# The following contains k8s Secrets for use with a TLS enabled etcd cluster.
# For information on populating Secrets, see http://kubernetes.io/docs/user-guide/secrets/
apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: calico-etcd-secrets
  namespace: kube-system
data:
  # Populate the following files with etcd TLS configuration if desired, but leave blank if
  # not using TLS for etcd.
  # This self-hosted install expects three files with the following names.  The values
  # should be base64 encoded strings of the entire contents of each file.
  etcd-key: "xxx"
  etcd-cert: "xxx"
  etcd-ca: "xxx"

---

# This manifest installs the calico/node container, as well
# as the Calico CNI plugins and network config on
# each master and worker node in a Kubernetes cluster.
kind: DaemonSet
apiVersion: extensions/v1beta1
metadata:
  name: calico-node
  namespace: kube-system
  labels:
    k8s-app: calico-node
spec:
  selector:
    matchLabels:
      k8s-app: calico-node
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        k8s-app: calico-node
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      hostNetwork: true
      tolerations:
        # Make sure calico/node gets scheduled on all nodes.
        - effect: NoSchedule
          operator: Exists
        # Mark the pod as a critical add-on for rescheduling.
        - key: CriticalAddonsOnly
          operator: Exists
        - effect: NoExecute
          operator: Exists
      serviceAccountName: calico-node
      # Minimize downtime during a rolling upgrade or deletion; tell Kubernetes to do a "force
      # deletion": https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods.
      terminationGracePeriodSeconds: 0
      containers:
        # Runs calico/node container on each Kubernetes node.  This
        # container programs network policy and routes on each
        # host.
        - name: calico-node
          image: quay.io/calico/node:v3.1.1
          env:
            # The location of the Calico etcd cluster.
            - name: ETCD_ENDPOINTS
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_endpoints
            # Choose the backend to use.
            - name: CALICO_NETWORKING_BACKEND
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: calico_backend
            # Cluster type to identify the deployment type
            - name: CLUSTER_TYPE
              value: "k8s,bgp"
            # Disable file logging so `kubectl logs` works.
            - name: CALICO_DISABLE_FILE_LOGGING
              value: "true"
            # Set noderef for node controller.
            - name: CALICO_K8S_NODE_REF
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            # Set Felix endpoint to host default action to ACCEPT.
            - name: FELIX_DEFAULTENDPOINTTOHOSTACTION
              value: "ACCEPT"
            # The default IPv4 pool to create on startup if none exists. Pod IPs will be
            # chosen from this range. Changing this value after installation will have
            # no effect. This should fall within `--cluster-cidr`.
            - name: CALICO_IPV4POOL_CIDR
              value: "196.168.0.0/12"
            - name: CALICO_IPV4POOL_IPIP
              value: "Always"
            # Disable IPv6 on Kubernetes.
            - name: FELIX_IPV6SUPPORT
              value: "false"
            # Set Felix logging to "info"
            - name: FELIX_LOGSEVERITYSCREEN
              value: "info"
            # Set MTU for tunnel device used if ipip is enabled
            - name: FELIX_IPINIPMTU
              value: "1440"
            # Location of the CA certificate for etcd.
            - name: ETCD_CA_CERT_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_ca
            # Location of the client key for etcd.
            - name: ETCD_KEY_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_key
            # Location of the client certificate for etcd.
            - name: ETCD_CERT_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_cert
            # Auto-detect the BGP IP address.
            - name: IP
              value: "autodetect"
            - name: IP_AUTODETECTION_METHOD
              value: "can-reach=192.168.50.1"
            - name: FELIX_HEALTHENABLED
              value: "true"
          securityContext:
            privileged: true
          resources:
            requests:
              cpu: 250m
          livenessProbe:
            httpGet:
              path: /liveness
              port: 9099
            periodSeconds: 10
            initialDelaySeconds: 10
            failureThreshold: 6
          readinessProbe:
            httpGet:
              path: /readiness
              port: 9099
            periodSeconds: 10
          volumeMounts:
            - mountPath: /lib/modules
              name: lib-modules
              readOnly: true
            - mountPath: /var/run/calico
              name: var-run-calico
              readOnly: false
            - mountPath: /var/lib/calico
              name: var-lib-calico
              readOnly: false
            - mountPath: /calico-secrets
              name: etcd-certs
        # This container installs the Calico CNI binaries
        # and CNI network config file on each node.
        - name: install-cni
          image: quay.io/calico/cni:v3.1.1
          command: ["/install-cni.sh"]
          env:
            # Name of the CNI config file to create.
            - name: CNI_CONF_NAME
              value: "10-calico.conflist"
            # The location of the Calico etcd cluster.
            - name: ETCD_ENDPOINTS
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_endpoints
            # The CNI network config to install on each node.
            - name: CNI_NETWORK_CONFIG
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: cni_network_config
          volumeMounts:
            - mountPath: /host/opt/cni/bin
              name: cni-bin-dir
            - mountPath: /host/etc/cni/net.d
              name: cni-net-dir
            - mountPath: /calico-secrets
              name: etcd-certs
      volumes:
        # Used by calico/node.
        - name: lib-modules
          hostPath:
            path: /lib/modules
        - name: var-run-calico
          hostPath:
            path: /var/run/calico
        - name: var-lib-calico
          hostPath:
            path: /var/lib/calico
        # Used to install CNI.
        - name: cni-bin-dir
          hostPath:
            path: /opt/cni/bin
        - name: cni-net-dir
          hostPath:
            path: /etc/cni/net.d
        # Mount in the etcd TLS secrets with mode 400.
        # See https://kubernetes.io/docs/concepts/configuration/secret/
        - name: etcd-certs
          secret:
            secretName: calico-etcd-secrets
            defaultMode: 0400

---

# This manifest deploys the Calico Kubernetes controllers.
# See https://github.com/projectcalico/kube-controllers
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: calico-kube-controllers
  namespace: kube-system
  labels:
    k8s-app: calico-kube-controllers
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
  # The controllers can only have a single active instance.
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      name: calico-kube-controllers
      namespace: kube-system
      labels:
        k8s-app: calico-kube-controllers
    spec:
      # The controllers must run in the host network namespace so that
      # it isn't governed by policy that would prevent it from working.
      hostNetwork: true
      tolerations:
        # Mark the pod as a critical add-on for rescheduling.
        - key: CriticalAddonsOnly
          operator: Exists
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      serviceAccountName: calico-kube-controllers
      containers:
        - name: calico-kube-controllers
          image: quay.io/calico/kube-controllers:v3.1.1
          env:
            # The location of the Calico etcd cluster.
            - name: ETCD_ENDPOINTS
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_endpoints
            # Location of the CA certificate for etcd.
            - name: ETCD_CA_CERT_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_ca
            # Location of the client key for etcd.
            - name: ETCD_KEY_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_key
            # Location of the client certificate for etcd.
            - name: ETCD_CERT_FILE
              valueFrom:
                configMapKeyRef:
                  name: calico-config
                  key: etcd_cert
            # Choose which controllers to run.
            - name: ENABLED_CONTROLLERS
              value: policy,profile,workloadendpoint,node
          volumeMounts:
            # Mount in the etcd TLS secrets.
            - mountPath: /calico-secrets
              name: etcd-certs
      volumes:
        # Mount in the etcd TLS secrets with mode 400.
        # See https://kubernetes.io/docs/concepts/configuration/secret/
        - name: etcd-certs
          secret:
            secretName: calico-etcd-secrets
            defaultMode: 0400

---

apiVersion: v1
kind: ServiceAccount
metadata:
  name: calico-kube-controllers
  namespace: kube-system

---

apiVersion: v1
kind: ServiceAccount
metadata:
  name: calico-node
  namespace: kube-system

I know little about Calico, so I just followed the docs to configure it. I am not sure whether this is related to the nodes having multiple network cards.
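
If the multiple-NIC question turns out to matter, the calico-node container can be told which address to use. A hedged sketch of the relevant env entries: IP_AUTODETECTION_METHOD is a real calico/node setting that also accepts "can-reach=<address>" (as used in the manifest above) and "first-found", but "interface=eth0" here is only an assumed interface name.

# Sketch: pin the address Calico autodetects on multi-NIC hosts.
# "interface=eth0" is an assumed interface name; adjust to the real NIC.
- name: IP
  value: "autodetect"
- name: IP_AUTODETECTION_METHOD
  value: "interface=eth0"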

@daixiang0 (Contributor, Author) commented Aug 27, 2018

I found the root cause:
I use an LVM volume to store the etcd data, but the etcd service starts before the LVM volume is mounted, so the Calico-related data is missing.
But even after restarting the etcd and kubelet services, the Calico-related data is still missing.
So this introduces a new issue.
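
One possible fix for the ordering problem, sketched under assumptions: that etcd runs as a systemd service named etcd.service and that its data directory (assumed here to be /var/lib/etcd) lives on the LVM volume; adjust both names to match the real setup.

# Sketch only: make etcd wait for its data mount before starting
# (unit name and data path are assumptions; adjust to the real setup).
mkdir -p /etc/systemd/system/etcd.service.d
cat > /etc/systemd/system/etcd.service.d/10-wait-for-data.conf <<'EOF'
[Unit]
RequiresMountsFor=/var/lib/etcd
EOF
systemctl daemon-reload
systemctl restart etcd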
