
cloudprovider/aws: EBS attachment fails: /dev/sdba is not a valid EBS device name (v1.3.0-beta.0) #27534

Closed
simonswine opened this issue Jun 16, 2016 · 5 comments · Fixed by #27628

@simonswine (Contributor) commented Jun 16, 2016

I am having trouble attaching EBS volumes on my AWS 1.3.0-beta.0 cluster with the controller-manager based attacher.

2016-06-16T10:50:46.081028537Z I0616 10:50:46.080908       1 reconciler.go:126] Started AttachVolume for volume "kubernetes.io/aws-ebs/aws://eu-west-1a/vol-9aa4532b" to node "ip-172-20-130-252.eu-west-1.compute.internal"
2016-06-16T10:50:46.594535809Z E0616 10:50:46.594230       1 attacher.go:78] Error attaching volume "aws://eu-west-1a/vol-9aa4532b": Error attaching EBS volume: InvalidParameterValue: Value (/dev/sdba) for parameter device is invalid. /dev/sdba is not a valid EBS device name.
2016-06-16T10:50:46.594571007Z  status code: 400, request id: 
2016-06-16T10:50:46.594578574Z E0616 10:50:46.594286       1 attacher_detacher.go:124] Attach operation for device "kubernetes.io/aws-ebs/aws://eu-west-1a/vol-9aa4532b" to node "ip-172-20-130-252.eu-west-1.compute.internal" failed with: Error attaching EBS volume: InvalidParameterValue: Value (/dev/sdba) for parameter device is invalid. /dev/sdba is not a valid EBS device name.
2016-06-16T10:50:46.594585498Z  status code: 400, request id: 

The problem is here: https://github.com/kubernetes/kubernetes/blob/release-1.3/pkg/cloudprovider/providers/aws/aws.go#L1302

This fails because the kube-controller-manager is running in a pod via kubelet manifests. I think a cluster with mixed instance types runs into trouble as well, since the device names will be either sdX or xvdX and the controller-manager picks the naming scheme based on the state of the master.
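For illustration, here is a minimal sketch (not the actual aws.go code; the function name and exact paths are assumptions) of the kind of host-inspection heuristic described above. When the controller-manager runs in a container that does not mount the host's /dev, neither device node is visible and the guess can come out wrong:

package main

import (
	"fmt"
	"os"
)

// guessDevicePrefix illustrates a prefix heuristic: inspect the local /dev
// to choose between "/dev/xvd" (HVM-style) and "/dev/sd" (paravirtual-style)
// names. Inside a container without the host's /dev, both stat calls fail
// and the fallback may not match the node that actually attaches the volume.
func guessDevicePrefix() string {
	if _, err := os.Stat("/dev/xvda"); err == nil {
		return "/dev/xvd"
	}
	if _, err := os.Stat("/dev/sda"); err == nil {
		return "/dev/sd"
	}
	// Neither device is visible (e.g. KCM in a pod without a /dev hostPath
	// mount); fall back to a default that may be wrong for this cluster.
	return "/dev/sd"
}

func main() {
	fmt.Println("guessed prefix:", guessDevicePrefix())
}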

My workaround for now is to add a host mount of /dev to the controller-manager pod, but this needs to be addressed properly at some point.

    volumeMounts:
    - mountPath: /dev
      name: dev-host
  volumes:
  - hostPath:
      path: /dev
    name: dev-host
  • Is there any way we can get the correct name from the AWS API?
  • I think whether it is xvdX or sdX depends on the kernel version. Am I right about this?
  • An uglier solution could be to annotate the node during registration with whether it has xvdX or sdX device names.

@justinsb maybe you can help me find a proper solution for this. I am happy to contribute a PR.

@simonswine (Contributor, Author):
I have just found this; I think we could get this information from AWS:

aws ec2 describe-images --image-ids image_id --query Images[].RootDeviceName
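If that works from the CLI, the same lookup could presumably be done programmatically; here is a rough sketch with aws-sdk-go (the region and AMI ID are just example values, not taken from a working setup):

package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	sess, err := session.NewSession(&aws.Config{Region: aws.String("eu-west-1")})
	if err != nil {
		log.Fatal(err)
	}
	svc := ec2.New(sess)

	// Ask EC2 for the root device name of a given AMI (placeholder ID).
	out, err := svc.DescribeImages(&ec2.DescribeImagesInput{
		ImageIds: []*string{aws.String("ami-706cfd03")},
	})
	if err != nil {
		log.Fatal(err)
	}
	for _, img := range out.Images {
		fmt.Printf("%s: root device %s\n",
			aws.StringValue(img.ImageId), aws.StringValue(img.RootDeviceName))
	}
}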

@justinsb (Member):
That code is supposed to check, but looking at it, it is quite possible I got it backwards. I also agree that the logic is going to be problematic anyway with mixed instance types etc. now that we've moved attachment to KCM.

What AMI are you using? I may have just been "lucky" so far in the AMIs I've used, in that the heuristics have worked correctly.

Also, if you have time, it would be great to know:

  1. What does /dev/ look like on one of the problematic nodes (in particular, is there /dev/sda and/or /dev/xvda)?
  2. What is the output of the describe-images command for that AMI (particularly if it is private)?

I like the describe-images approach!

I'm marking this as 1.3 P1 (sorry @davidopp), as for now my working hypothesis is that this was introduced when we moved disk attachment to KCM (I believe the kubelet previously did it, but I want to double-check).

@justinsb justinsb added this to the v1.3 milestone Jun 16, 2016
@justinsb justinsb added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. area/platform/aws labels Jun 16, 2016
simonswine added a commit to simonswine/kubernetes that referenced this issue Jun 16, 2016
@simonswine (Contributor, Author):

@justinsb I tried to set up such an environment with CoreOS AMIs (all in eu-west-1):

  • ami-706cfd03 (virtualization type PV)
  • ami-c36effb0 (virtualization type HVM)

describe-instances returns different RootDeviceNames:

aws ec2 describe-instances | jq '.Reservations[].Instances[] | .RootDeviceName , .ImageId'
"/dev/xvda"
"ami-c36effb0"
"/dev/sda"
"ami-706cfd03"

But if I ssh into the two instances, the device is named the same on both:

ip-172-20-128-52 ~ # ls -l /dev/xvda
brw-rw---- 1 root disk 202, 0 Jun 16 13:41 /dev/xvda
core@ip-172-20-130-251 ~ $ ls -l /dev/xvda
brw-rw---- 1 root disk 202, 0 Jun 16 10:14 /dev/xvda

I think we have to somehow signal the device name from the kubelet to the KCM to solve this properly.

My understanding is that to get the sdX names you have to use a pretty old kernel. Maybe it's a good idea to change the default behaviour and fall back to /dev/xvdX; see PR #27545.

@erictune (Member):
I don't see any indication that this is a regression, so I am kicking it out of the 1.3 milestone. Explain how it is a regression, or otherwise super-important, to get it back on the milestone.

@erictune erictune removed this from the v1.3 milestone Jun 17, 2016
@justinsb justinsb added this to the v1.3 milestone Jun 17, 2016
@justinsb (Member):
It is a regression: volume mounting is broken on AWS. My belief is that it happens because we moved volume mounting to KCM, and/or because KCM runs in a container. Between those two, volume mounting just doesn't work right now.

justinsb added a commit to justinsb/kubernetes that referenced this issue Jun 17, 2016
We are using HVM style names, which cannot be paravirtual style names.

See
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/device_naming.html

This also fixes problems introduced when moving volume mounting to KCM.

Fix kubernetes#27534
k8s-github-robot pushed a commit that referenced this issue Jun 19, 2016
Automatic merge from submit-queue

AWS volumes: Use /dev/xvdXX names with EC2

We are using HVM style names, which cannot be paravirtual style names.

See
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/device_naming.html

This also fixes problems introduced when moving volume mounting to KCM.

Fix #27534
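As a rough sketch of what the change amounts to (not the merged code itself; the suffix ranges follow the EC2 device-naming documentation linked above and are otherwise an assumption), generating HVM-style /dev/xvdXX names with a fixed prefix, instead of deriving the prefix from the controller-manager's own environment, might look like this:

package main

import "fmt"

// nextDeviceNames is a simplified illustration: candidate EBS attach names
// are built with a fixed HVM-style "/dev/xvd" prefix and two-letter suffixes
// ("ba", "bb", ...). A name like "/dev/sdba", produced by combining the "sd"
// prefix with a two-letter suffix, is rejected by EC2, which is the error
// reported in this issue.
func nextDeviceNames(n int) []string {
	names := make([]string, 0, n)
	for first := 'b'; first <= 'c' && len(names) < n; first++ {
		for second := 'a'; second <= 'z' && len(names) < n; second++ {
			names = append(names, fmt.Sprintf("/dev/xvd%c%c", first, second))
		}
	}
	return names
}

func main() {
	fmt.Println(nextDeviceNames(5)) // [/dev/xvdba /dev/xvdbb /dev/xvdbc /dev/xvdbd /dev/xvdbe]
}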