
kubelet can't mount the volume with --cloud-provider=external which is required by CCM #71018

Closed
yifan-gu opened this issue Nov 14, 2018 · 10 comments
Labels
kind/bug, lifecycle/rotten, sig/storage

Comments


yifan-gu commented Nov 14, 2018

What happened:

  • Running CCM with --cloud-provider=aws.
  • According to the doc, the kubelet needs to run with --cloud-provider=external.

However, the kubelet then failed to mount the EBS volumes:

Mounting command: mount
Mounting arguments: -o bind /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/us-west-1c/vol-0c1b6f36d694c79f2 /var/lib/kubelet/pods/227ce278-e7c0-11e8-ae8b-06346f5010a2/volumes/kubernetes.io~aws-ebs/pvc-227b754f-e7c0-11e8-ae8b-06346f5010a2
Output: mount: special device /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/us-west-1c/vol-0c1b6f36d694c79f2 does not exist
E1114 03:49:28.077708    1548 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/aws-ebs/227ce278-e7c0-11e8-ae8b-06346f5010a2-pvc-227b754f-e7c0-11e8-ae8b-06346f5010a2\" (\"227ce278-e7c0-11e8-ae8b-06346f5010a2\")" failed. No retries permitted until 2018-11-14 03:50:32.077674047 +0000 UTC m=+481.316888022 (durationBeforeRetry 1m4s). Error: "MountVolume.SetUp failed for volume \"pvc-227b754f-e7c0-11e8-ae8b-06346f5010a2\" (UniqueName: \"kubernetes.io/aws-ebs/227ce278-e7c0-11e8-ae8b-06346f5010a2-pvc-227b754f-e7c0-11e8-ae8b-06346f5010a2\") pod \"prometheus-prometheus-0\" (UID: \"227ce278-e7c0-11e8-ae8b-06346f5010a2\") : mount failed: exit status 32\nMounting command: mount\nMounting arguments: -o bind /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/us-west-1c/vol-0c1b6f36d694c79f2 /var/lib/kubelet/pods/227ce278-e7c0-11e8-ae8b-06346f5010a2/volumes/kubernetes.io~aws-ebs/pvc-227b754f-e7c0-11e8-ae8b-06346f5010a2\nOutput: mount: special device /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/us-west-1c/vol-0c1b6f36d694c79f2 does not exist\n\n"
E1114 03:50:27.196235    1548 kubelet.go:1616] Unable to mount volumes for pod "prometheus-prometheus-0_kube-system(227ce278-e7c0-11e8-ae8b-06346f5010a2)": timeout expired waiting for volumes to attach or mount for pod "kube-system"/"prometheus-prometheus-0". list of unmounted volumes=[prometheus-storage]. list of unattached volumes=[prometheus-storage config config-out prometheus-prometheus-rulefiles-0 prometheus-token-kqf4l]; skipping pod
E1114 03:50:27.196294    1548 pod_workers.go:186] Error syncing pod 227ce278-e7c0-11e8-ae8b-06346f5010a2 ("prometheus-prometheus-0_kube-system(227ce278-e7c0-11e8-ae8b-06346f5010a2)"), skipping: timeout expired waiting for volumes to attach or mount for pod "kube-system"/"prometheus-prometheus-0". list of unmounted volumes=[prometheus-storage]. list of unattached volumes=[prometheus-storage config config-out prometheus-prometheus-rulefiles-0 prometheus-token-kqf4l]
E1114 03:50:32.178373    1548 mount_linux.go:151] Mount failed: exit status 32
Mounting command: mount
Mounting arguments: -o bind /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/us-west-1c/vol-0c1b6f36d694c79f2 /var/lib/kubelet/pods/227ce278-e7c0-11e8-ae8b-06346f5010a2/volumes/kubernetes.io~aws-ebs/pvc-227b754f-e7c0-11e8-ae8b-06346f5010a2
Output: mount: special device /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/us-west-1c/vol-0c1b6f36d694c79f2 does not exist
E1114 03:50:32.178464    1548 aws_ebs.go:419] Mount of disk /var/lib/kubelet/pods/227ce278-e7c0-11e8-ae8b-06346f5010a2/volumes/kubernetes.io~aws-ebs/pvc-227b754f-e7c0-11e8-ae8b-06346f5010a2 failed: mount failed: exit status 32

What you expected to happen:
kubelet should still be able to mount volumes.

How to reproduce it (as minimally and precisely as possible):

  • Run CCM with --cloud-provider=aws
  • Run kubelet with --cloud-provider=external
  • Create an EBS storage class
  • Create a pod that requires a persistent volume (a minimal sketch of the storage class, claim, and pod follows).
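
For reference, a minimal sketch of the objects from the last two steps might look like the following (the names, volume size, and image are hypothetical; the StorageClass uses the in-tree kubernetes.io/aws-ebs provisioner):

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ebs-gp2                      # hypothetical class name
provisioner: kubernetes.io/aws-ebs   # in-tree EBS provisioner
parameters:
  type: gp2
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc                     # hypothetical claim name
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: ebs-gp2
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: test-pod                     # hypothetical pod name
spec:
  containers:
  - name: test
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: data
      mountPath: /data               # the EBS-backed volume is mounted here
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: test-pvc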

Anything else we need to know?:
My CCM yaml:

apiVersion: v1
kind: Pod
metadata:
  name: cloud-controller-manager
  namespace: kube-system
  labels:
    k8s-app: cloud-controller-manager
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
  containers:
  - name: cloud-controller-manager
    image: k8s.gcr.io/hyperkube:v1.11.2
    command:
    - ./hyperkube
    - cloud-controller-manager
    - --kubeconfig=/etc/kubernetes/kubeconfig
    - --leader-elect=true
    - --use-service-account-credentials
    - --profiling=false
    - --cloud-provider=aws
    - --cloud-config=/etc/kubernetes/cloud-config.ini
    - --configure-cloud-routes=false
    - --allocate-node-cidrs=true
    - --cluster-cidr=172.16.0.0/16
    - --feature-gates=ExpandPersistentVolumes=true,ExpandInUsePersistentVolumes=true,ExperimentalCriticalPodAnnotation=true,Initializers=true
    livenessProbe:
      httpGet:
        path: /healthz
        port: 10253
      initialDelaySeconds: 15
      timeoutSeconds: 1
    volumeMounts:
    - name: etc-kubernetes
      mountPath: /etc/kubernetes
      readOnly: true
  hostNetwork: true
  priorityClassName: system-node-critical
  volumes:
  - name: etc-kubernetes
    hostPath:
      path: /etc/kubernetes

Environment:

  • Kubernetes version (use kubectl version):
kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.2", GitCommit:"17c77c7898218073f14c8d573582e8d2313dc740", GitTreeState:"clean", BuildDate:"2018-10-30T21:39:16Z", GoVersion:"go1.11.1", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-07T23:08:19Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
    AWS

  • OS (e.g. from /etc/os-release):

cat /etc/os-release
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1855.4.0
VERSION_ID=1855.4.0
BUILD_ID=2018-09-11-0003
PRETTY_NAME="Container Linux by CoreOS 1855.4.0 (Rhyolite)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"
  • Kernel (e.g. uname -a):
uname -a
Linux ip-10-3-7-91.us-west-1.compute.internal 4.14.67-coreos #1 SMP Mon Sep 10 23:14:26 UTC 2018 x86_64 Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz GenuineIntel GNU/Linux

@kubernetes/sig-aws-misc @kubernetes/sig-storage-bugs
/cc @Quentin-M

/kind bug

@k8s-ci-robot added the sig/aws, kind/bug, and sig/storage labels Nov 14, 2018

yifan-gu commented Nov 14, 2018

Kinda similar to #70921, but in my case the --cloud-provider flag needs to be set to external according to the doc.


yifan-gu commented Dec 4, 2018

This issue is preventing us from switching to the cloud controller manager. Is anyone looking into this?
I also confirmed that this happens on 1.12.2.


gnufied commented Dec 4, 2018

I do not think typical volume features will work with an external cloud controller manager. Unless you are using the CSI EBS driver (https://github.com/kubernetes-sigs/aws-ebs-csi-driver), in-tree EBS volumes won't work without a cloud provider configured in the controller-manager.
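
As a rough sketch (assuming the driver from the linked repo is installed; the class name is hypothetical), a StorageClass backed by the external EBS CSI driver would look something like:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ebs-csi-gp2            # hypothetical class name
provisioner: ebs.csi.aws.com   # CSI driver name registered by aws-ebs-csi-driver
parameters:
  type: gp2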


gnufied commented Dec 4, 2018

In a nutshell, it is a known issue: if you are using an external CCM and don't have a cloud provider configured in the controller-manager, none of the volume features will work as expected.

That is why sig-storage is working on CSI, which allows external drivers to support attach/detach/provisioning etc.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Mar 4, 2019
@feiskyer
Copy link
Member

feiskyer commented Mar 20, 2019

In a nutshell, it is a known issue: if you are using an external CCM and don't have a cloud provider configured in the controller-manager, none of the volume features will work as expected.

@andrewsykim I think this is still true today? Is CSI still the only solution for this?

@msau42
Copy link
Member

msau42 commented Mar 20, 2019

I'm not sure I understand why mount/unmount would be dependent on the cloud provider. @gnufied can you clarify?

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Apr 19, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
