
Failed to mount Azure Disk as a PV when ADE is enabled #66443

Closed
tanaka-takayoshi opened this issue Jul 20, 2018 · 22 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@tanaka-takayoshi

/kind bug

What happened:
Kubernetes fails to mount an Azure Disk as a PV when Azure Disk Encryption (ADE) is enabled on the node host.

What you expected to happen:
The disk is mounted successfully.

How to reproduce it (as minimally and precisely as possible):
I confirmed the issue in AKS (kubernetes 1.9.9) and OpenShift Enterprise 3.9 on Azure. Here is how to reproduce with AKS.

Create a single-node AKS cluster:

az group create -n aks-ade -l westeurope
az aks create -g aks-ade -n aks-ade-cluster  --node-count  1

Then enable ADE on the VM created by AKS:

AZGROUP=<resource_group_of_created_VM>
az ad app create --display-name aks-ade-vault-app --identifier-uris https://aks-ade-vault-app --password Password
az ad sp create --id <app_id>
az keyvault create -n aks-ade-vault -g $AZGROUP --enabled-for-disk-encryption True
az keyvault set-policy --name aks-ade-vault --spn <app_id> --key-permissions wrapKey  --secret-permissions set
az keyvault key create --vault-name aks-ade-vault --name aks-ade-node1-key --protection software
az vm encryption enable -g $AZGROUP -n <created_VM_name> --aad-client-id <app_id> --aad-client-secret Password --disk-encryption-keyvault aks-ade-vault  --key-encryption-key aks-ade-node1-key  --volume-type DATA

AKS creates the default storage class automatically.

$ kubectl get sc default -o yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"storage.k8s.io/v1beta1","kind":"StorageClass","metadata":{"annotations":{"storageclass.beta.kubernetes.io/is-default-class":"true"},"labels":{"kubernetes.io/cluster-service":"true"},"name":"default","namespace":""},"parameters":{"kind":"Managed","storageaccounttype":"Standard_LRS"},"provisioner":"kubernetes.io/azure-disk"}
    storageclass.beta.kubernetes.io/is-default-class: "true"
  creationTimestamp: 2018-07-20T06:36:42Z
  labels:
    kubernetes.io/cluster-service: "true"
  name: default
  resourceVersion: "290"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/default
  uid: ...
parameters:
  kind: Managed
  storageaccounttype: Standard_LRS
provisioner: kubernetes.io/azure-disk
reclaimPolicy: Delete

Then create a PVC and pod.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-managed-disk
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
kind: Pod
apiVersion: v1
metadata:
  name: mypod
spec:
  containers:
    - name: myfrontend
      image: nginx
      volumeMounts:
      - mountPath: "/data1"
        name: volume
  volumes:
    - name: volume
      persistentVolumeClaim:
        claimName: azure-managed-disk

Then I see the following FailedMount events:

$ kubectl get event
LAST SEEN   FIRST SEEN   COUNT     NAME                     KIND      SUBOBJECT   TYPE      REASON        SOURCE                              MESSAGE
16m         3h           86        mypod.15430209f8eb86ef   Pod                   Warning   FailedMount   kubelet, aks-nodepool1-nnnnnnn-0   Unable to mount volumes for pod "mypod_default(uuid)": timeout expired waiting for volumes to attach/mount for pod "default"/"mypod". list of unattached/unmounted volumes=[volume]
1m          3h           116       mypod.1543021e8b8c38f3   Pod                   Warning   FailedMount   kubelet, aks-nodepool1-nnnnnnn-0   (combined from similar events): MountVolume.MountDevice failed for volume "pvc-(uuid)" : azureDisk - mountDevice:FormatAndMount failed with mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/m3045819910 --scope -- mount -o defaults /dev/disk/by-id/scsi-(uuid) /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/m3045819910
Output: Running scope as unit run-r42ea8854f2e04ee7a534ff50f3f858dc.scope.
mount: /dev/sdc is already mounted or /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/m3045819910 busy

Anything else we need to know?:
I confirmed the issue only on AKS and OpenShift, but I suspect it could happen on any Kubernetes cluster on Azure with the Azure cloud provider enabled.
I checked the closed issue #57070. I suspect this issue occurs because Kubernetes doesn't take into account that /dev/sdc is already mounted as the BEK volume by ADE.

Kubernetes excludes the disks Azure uses for the resource disk and the OS root by looking at /dev/disk/azure. However, the BEK volume doesn't show up under /dev/disk/azure, and I expect this is the reason.

https://github.com/kubernetes/kubernetes/blob/v1.9.9/pkg/volume/azure_dd/azure_common_linux.go#L31-L47
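
For context, the linked code builds the list of devices already used by Azure by resolving the symlinks under /dev/disk/azure; any other /dev/sdX is treated as a candidate data disk. Below is a rough, self-contained sketch of that exclusion logic (illustrative only, assuming Go 1.16+, not the actual kubelet source); since the BEK volume has no symlink under /dev/disk/azure, /dev/sdc is never excluded:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// listAzureManagedDevices is an illustrative sketch (hypothetical helper, not
// the real kubelet function): it resolves the symlinks under /dev/disk/azure
// into kernel device names (sda, sdb1, ...). Any /dev/sdX not in this list is
// treated as a candidate data disk, so a BEK volume that has no symlink here
// (e.g. /dev/sdc) is never excluded.
func listAzureManagedDevices(dir string) ([]string, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	var devices []string
	for _, e := range entries {
		if e.IsDir() {
			continue // skip subdirectories such as scsi1
		}
		target, err := filepath.EvalSymlinks(filepath.Join(dir, e.Name()))
		if err != nil {
			continue // skip entries that are not resolvable symlinks
		}
		devices = append(devices, filepath.Base(target)) // e.g. "sda", "sdb1"
	}
	return devices, nil
}

func main() {
	devices, err := listAzureManagedDevices("/dev/disk/azure")
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		return
	}
	fmt.Println("devices excluded as Azure OS/resource disks:", devices)
}
```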

Environment:

  • Kubernetes version (use kubectl version):
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T20:17:28Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.9", GitCommit:"57729ea3d9a1b75f3fc7bbbadc597ba707d47c8a", GitTreeState:"clean", BuildDate:"2018-06-29T01:07:01Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
    Azure (AKS)

  • OS (e.g. from /etc/os-release):

$ cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04.4 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.4 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
  • Kernel (e.g. uname -a):
$ uname -a
Linux cc-5e6a90d6-1543821320-96hfn 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
    No

  • Others:

@k8s-ci-robot added the needs-sig and kind/bug labels Jul 20, 2018
@tanaka-takayoshi
Author

/sig azure

@k8s-ci-robot added the sig/azure label and removed the needs-sig label Jul 20, 2018
@andyzhangx
Member

@tanaka-takayoshi thanks for the deep investigation. How is the PV disk currently mounted on the VM? Could you run the following commands to check? Thanks.

sudo apt install tree -y
tree /dev/disk/

/assign

@andyzhangx
Member

andyzhangx commented Jul 20, 2018

@tanaka-takayoshi I found the root cause: encrypting the VM disk adds a new disk, /dev/sdc, and the Azure cloud provider then recognizes /dev/sdc instead of /dev/sdd as the data disk, which is wrong.

$ sudo ls -lt  /sys/bus/scsi/devices/2\:0\:1\:0/block
drwxr-xr-x 10 root root 0 Jul 20 13:14 sdc

$ sudo ls -lt  /sys/bus/scsi/devices/4\:0\:0\:0/block
drwxr-xr-x 10 root root 0 Jul 20 13:14 sdd

$ sudo tree /dev/disk/azure/
/dev/disk/azure/
├── resource -> ../../sdb
├── resource-part1 -> ../../sdb1
├── root -> ../../sda
├── root-part1 -> ../../sda1
└── scsi1
    └── lun0 -> ../../../sdd

@andyzhangx
Member

andyzhangx commented Jul 20, 2018

Since a new disk /dev/sdc appears after disk encryption, there are two ways to fix this issue and support disk encryption:

  • simple way
    always pick devices with a SCSI target greater than 3 under /sys/bus/scsi/devices; this looks correct per our experience on multiple OSes, e.g. Ubuntu, CoreOS, Red Hat...

    // as observed, targets 0-3 are used by OS disks. Skip them
    if target <= 3 {
        continue
    }

  • rewrite the findDiskByLunWithConstraint func to get the disk path by LUN number x directly from /dev/disk/azure/scsi1/lunx (a sketch of this approach follows after the list)

    func findDiskByLunWithConstraint(lun int, io ioHandler, azureDisks []string) (string, error) {

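A minimal, self-contained sketch of the second option, resolving the /dev/disk/azure/scsi1/lun<N> symlink directly; the function name and error handling here are illustrative, not the actual kubelet implementation:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findDiskByLunViaSymlink is a hypothetical helper sketching the second
// option: instead of scanning /sys/bus/scsi/devices and guessing SCSI
// targets, resolve the /dev/disk/azure/scsi1/lun<N> symlink that the Azure
// udev rules create for each attached data disk.
func findDiskByLunViaSymlink(lun int) (string, error) {
	link := fmt.Sprintf("/dev/disk/azure/scsi1/lun%d", lun)
	device, err := filepath.EvalSymlinks(link)
	if err != nil {
		return "", fmt.Errorf("could not resolve %s: %v", link, err)
	}
	return device, nil // e.g. "/dev/sdd"
}

func main() {
	device, err := findDiskByLunViaSymlink(0)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("data disk at LUN 0:", device)
}
```

This relies on the scsi1/lun<N> symlinks created by the Azure udev rules, the same mechanism shown in the tree output above.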
I would like to know your thoughts (@rootfs @karataliu) before fixing this issue, thanks.
FYI @jsafrane @khenidak @brendandburns @feiskyer

@tanaka-takayoshi
Author

@andyzhangx Yes, that's exactly what I wanted to explain.

@pdwyer

pdwyer commented Jul 20, 2018

If you change it to always find devices with a target greater than 3, aren't you going to break situations where ADE is not involved?

If the BEK volume were labelled in a similar manner to the root and resource disks, a udev rule could be created and the volume could be excluded in the same way as is done in func listAzureDiskPath.

@andyzhangx
Member

If you change it to always find devices with a target greater than 3, aren't you going to break situations where ADE is not involved?

No, the new data disk device's target is always greater than 3, whether or not ADE is enabled.

The data disk can always be found under /dev/disk/azure/scsi1:

$ sudo tree /dev/disk/azure
├── resource -> ../../sdb
├── resource-part1 -> ../../sdb1
├── root -> ../../sda
├── root-part1 -> ../../sda1
└── scsi1
    ├── lun0 -> ../../../sdd
    └── lun1 -> ../../../sde

@andyrd2405

To prove the original issue, I manually added BEK as a symlink in /dev/disk/azure, as shown below.

lrwxrwxrwx. 1 root root 9 Jul 17 15:32 root -> ../../sda
drwxr-xr-x. 7 root root 140 Jul 17 15:32 ..
lrwxrwxrwx. 1 root root 10 Jul 17 15:32 root-part2 -> ../../sda2
lrwxrwxrwx. 1 root root 10 Jul 17 15:32 root-part1 -> ../../sda1
lrwxrwxrwx. 1 root root 9 Jul 17 15:33 resource -> ../../sdb
lrwxrwxrwx. 1 root root 10 Jul 17 15:33 resource-part1 -> ../../sdb1
lrwxrwxrwx. 1 root root 9 Jul 20 15:24 bek -> ../../sdc
lrwxrwxrwx. 1 root root 10 Jul 20 15:25 bek-part1 -> ../../sdc1
drwxr-xr-x. 2 root root 180 Jul 20 15:25 .

When starting a pod with attached storage, it now correctly uses /dev/sdd as the next device for the mounted storage.

Setting the udev environment variable fabric_name=BEK via one additional udev rule could fix this, which is exactly how the existing root and resource disks are handled.

@andyzhangx
Member

andyzhangx commented Jul 21, 2018

@andyrd2405 Thanks, that would be a better solution since it doesn't require changing k8s code. I just verified that adding one more udev rule (see below; may require a reboot) in /etc/udev/rules.d/66-azure-storage.rules fixes this issue:

ATTRS{device_id}=="?00000000-0000-*", ENV{fabric_name}="root", GOTO="azure_names"
ATTRS{device_id}=="?00000000-0001-*", ENV{fabric_name}="resource", GOTO="azure_names"
ATTRS{device_id}=="?00000001-0001-*", ENV{fabric_name}="BEK", GOTO="azure_names"

$ sudo tree /dev/disk/azure
├── BEK -> ../../sdc
├── resource -> ../../sdb
├── resource-part1 -> ../../sdb1
├── root -> ../../sda
├── root-part1 -> ../../sda1
└── scsi1
    └── lun0 -> ../../../sdd

I will contact the Azure team and try to get this rule made a default in the Linux distro images.

@andyrd2405

@andyzhangx Thank you. Can I just confirm that you tested this with just the additional UDEV rule? Does the device presented from Azure not need to have the fabric_name set first?
In the meantime I will do some further tests.

@andyzhangx
Member

andyzhangx commented Jul 23, 2018

@andyrd2405 In my testing, inserting the following UDEV rule (may require a reboot) works:

ATTRS{device_id}=="?00000001-0001-*", ENV{fabric_name}="BEK", GOTO="azure_names"

The BEK name is set by the disk encryption tool in my environment. You don't need to set fabric_name yourself; just inserting that UDEV rule and rebooting works.

@andyrd2405

@andyzhangx Yes, I can confirm I added the UDEV rule to my VMs, stopped them to deallocate, and then started them again. Once the machine is up I can see the following under /dev/disk/azure:

lrwxrwxrwx. 1 root root 9 Jul 23 08:17 root -> ../../sda
drwxr-xr-x. 7 root root 140 Jul 23 08:17 ..
lrwxrwxrwx. 1 root root 9 Jul 23 08:17 BEK -> ../../sdc
lrwxrwxrwx. 1 root root 10 Jul 23 08:17 root-part2 -> ../../sda2
lrwxrwxrwx. 1 root root 10 Jul 23 08:17 BEK-part1 -> ../../sdc1
lrwxrwxrwx. 1 root root 10 Jul 23 08:17 root-part1 -> ../../sda1
lrwxrwxrwx. 1 root root 9 Jul 23 08:17 resource -> ../../sdb
drwxr-xr-x. 2 root root 180 Jul 23 08:17 .
lrwxrwxrwx. 1 root root 10 Jul 23 08:17 resource-part1 -> ../../sdb1

I will try a further mount / pod test but I think it should be good.

@andyrd2405

@andyzhangx Just to confirm, I have retested with pods and mounted storage, and everything now works as expected after the UDEV rule change. Thank you for the help and support.

@andyzhangx
Member

@andyrd2405 That's great. Is it possible to file an issue to get that UDEV rule into the RHEL image on Azure? @pdwyer I am not sure who the right contact is at the moment.

@pdwyer

pdwyer commented Jul 27, 2018

@andyzhangx It looks like the rules are provided in WALinuxAgent:
https://github.com/Azure/WALinuxAgent/blob/master/config/66-azure-storage.rules

So I assume that is where we need to get the issue filed.

andyzhangx added a commit to andyzhangx/WALinuxAgent that referenced this issue Aug 7, 2018
Recently we have been trying Azure disk encryption on Ubuntu & RedHat and found that a UDEV rule is missing; the new UDEV rule for `/etc/udev/rules.d/66-azure-storage.rules` looks like the following:
```
ATTRS{device_id}=="?00000001-0001-*", ENV{fabric_name}="BEK", GOTO="azure_names"
```
After disk encryption:
 - Without this UDEV rule
```
$ sudo tree /dev/disk/azure
├── resource -> ../../sdb
├── resource-part1 -> ../../sdb1
├── root -> ../../sda
├── root-part1 -> ../../sda1
└── scsi1
    └── lun0 -> ../../../sdd
```
 - With this UDEV rule
```
$ sudo tree /dev/disk/azure
├── BEK -> ../../sdc
├── resource -> ../../sdb
├── resource-part1 -> ../../sdb1
├── root -> ../../sda
├── root-part1 -> ../../sda1
└── scsi1
    └── lun0 -> ../../../sdd
```

All Azure OS, resource, and BEK disks should be under the `/dev/disk/azure` directory; that's the rule.
The `BEK -> ../../sdc` entry is very important for Kubernetes: without it, the new data disk cannot be recognized correctly. Details can be found here: kubernetes/kubernetes#66443 (comment)
@andyzhangx
Member

@pdwyer Thanks, I created a PR there: Azure/WALinuxAgent#1287

hglkrijger pushed a commit to Azure/WALinuxAgent that referenced this issue Aug 16, 2018
@andyzhangx
Member

andyzhangx commented Sep 18, 2018

The new UDEV rule has been added to the WALinuxAgent rules, so I'm closing this issue now.
/close

Fixed in WALinuxAgent-2.2.32.

@k8s-ci-robot
Contributor

@andyzhangx: Closing this issue.

In response to this:

The new UDEV rule has been added to the WALinuxAgent rules, so I'm closing this issue now.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@gitphill

gitphill commented Nov 6, 2018

@tanaka-takayoshi @andyzhangx
Did you find that the new data disks were encrypted once attached?
We have encryption enabled with --volume-type ALL but the new data disks attached by Kubernetes never show up in the ADE log - {"os": "Encrypted", "data": "NotMounted"}

@andyzhangx
Member

The new data disk won't be encrypted even if you specify --volume-type ALL; you must run the az vm encryption ... command to encrypt that disk explicitly.

@gitphill

gitphill commented Nov 6, 2018

OK, thanks @andyzhangx. What's your experience with that? When I run az vm encryption enable it reboots the box, which takes a while to come back. In the meantime, Kubernetes relocates the containers and volumes that were on the node.

@andyzhangx
Member

@gitphill If you use --volume-type OS or --volume-type ALL, az vm encryption may put the VM into a failed state; --volume-type DATA would be better. In any case, disk encryption is not very stable if you encrypt a running k8s agent node.
