Topology awareness: CSI Controller creates volumes in its own location #42

Closed
bascht opened this issue Jul 23, 2019 · 5 comments · Fixed by #43

Comments

@bascht

bascht commented Jul 23, 2019

This is a follow-up to my comment in #11. I can reproduce the behaviour with hetznercloud/hcloud-csi-driver:1.1.4 on Kubernetes v1.14.3.

The CSI controller is deployed on a node located in fsn1:

$ kubectl -n kube-system get pod hcloud-csi-controller-0 -o wide
NAME                      READY   STATUS    RESTARTS   AGE   IP           NODE                          NOMINATED NODE   READINESS GATES
hcloud-csi-controller-0   4/4     Running   0          8h    10.42.4.20   k8s-infrastructure-worker-1   <none>           <none>

$ kubectl get node --selector csi.hetzner.cloud/location=fsn1,role=worker
NAME                          STATUS   ROLES                      AGE   VERSION
k8s-infrastructure-worker-1   Ready    controlplane,etcd,worker   23h   v1.14.3

I used the following config to create 3 deployments of 1 pod each in fsn1, nbg1 and hel1:

---
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: ubuntu-fsn1
  namespace: default
spec:
  selector:
    matchLabels:
      app: csi-test
  replicas: 1
  template:
    metadata:
      labels:
        app: csi-test
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: location
                operator: In
                values:
                - fsn1
      containers:
      - image: ubuntu:xenial
        name: ubuntu-fsn1
        stdin: true
        tty: true
        volumeMounts:
        - mountPath: /mnt
          name: ubuntu-fsn1
      volumes:
      - name: ubuntu-fsn1
        persistentVolumeClaim:
          claimName: ubuntu-fsn1
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ubuntu-fsn1
  labels:
    app: csi-test
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

---
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: ubuntu-nbg1
  namespace: default
spec:
  selector:
    matchLabels:
      app: csi-test
  replicas: 1
  template:
    metadata:
      labels:
        app: csi-test
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: location
                operator: In
                values:
                - nbg1
      containers:
      - image: ubuntu:xenial
        name: ubuntu-nbg1
        stdin: true
        tty: true
        volumeMounts:
        - mountPath: /mnt
          name: ubuntu-nbg1
      volumes:
      - name: ubuntu-nbg1
        persistentVolumeClaim:
          claimName: ubuntu-nbg1
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ubuntu-nbg1
  labels:
    app: csi-test
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

---
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: ubuntu-hel1
  namespace: default
spec:
  selector:
    matchLabels:
      app: csi-test
  replicas: 1
  template:
    metadata:
      labels:
        app: csi-test
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: location
                operator: In
                values:
                - hel1
      containers:
      - image: ubuntu:xenial
        name: ubuntu-hel1
        stdin: true
        tty: true
        volumeMounts:
        - mountPath: /mnt
          name: ubuntu-hel1
      volumes:
      - name: ubuntu-hel1
        persistentVolumeClaim:
          claimName: ubuntu-hel1
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ubuntu-hel1
  labels:
    app: csi-test
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

$ kubectl apply -f deployment.yaml
deployment.apps/ubuntu-fsn1 created
persistentvolumeclaim/ubuntu-fsn1 created
deployment.apps/ubuntu-nbg1 created
persistentvolumeclaim/ubuntu-nbg1 created
deployment.apps/ubuntu-hel1 created
persistentvolumeclaim/ubuntu-hel1 created

Only the fsn1 pod will actually start:

$ kubectl get pods -o wide
NAME                           READY   STATUS    RESTARTS   AGE   IP           NODE                       NOMINATED NODE   READINESS GATES
ubuntu-fsn1-6d5c54d48c-p7k7q   1/1     Running   0          86s   10.42.3.42   k8s-infrastructure-web-1   <none>           <none>
ubuntu-hel1-7596c45f45-q7fxx   0/1     Pending   0          86s   <none>       <none>                     <none>           <none>
ubuntu-nbg1-84bb6b947c-wkh7j   0/1     Pending   0          86s   <none>       <none>                     <none>           <none>

Checking, for example, the hel1 pod:

$ kubectl describe pod ubuntu-hel1-7596c45f45-q7fxx
Name:               ubuntu-hel1-7596c45f45-q7fxx
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               <none>
Labels:             app=csi-test
                    pod-template-hash=7596c45f45
Annotations:        <none>
Status:             Pending
IP:
Controlled By:      ReplicaSet/ubuntu-hel1-7596c45f45
Containers:
  ubuntu-hel1:
    Image:        ubuntu:xenial
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:
      /mnt from ubuntu-hel1 (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-xyz (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  ubuntu-hel1:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  ubuntu-hel1
    ReadOnly:   false
  default-token-dbzkq:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-xyz
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                    From               Message
  ----     ------            ----                   ----               -------
  Warning  FailedScheduling  2m48s (x2 over 2m48s)  default-scheduler  persistentvolumeclaim "ubuntu-hel1" not found
  Warning  FailedScheduling  2m43s                  default-scheduler  pv "pvc-9e87df90-ad86-11e9-896e-9600002bd1f9" node affinity doesn't match node "k8s-infrastructure-web-3": No matching NodeSelectorTerms
  Warning  FailedScheduling  5s (x3 over 2m43s)     default-scheduler  0/6 nodes are available: 2 node(s) had volume node affinity conflict, 4 node(s) didn't match node selector.

Checking via the hcloud cli shows the volumes in fsn1:

$ hcloud volume list
ID        NAME                                       SIZE    SERVER    LOCATION
2956880   pvc-9e16261e-ad86-11e9-896e-9600002bd1f9   10 GB   2998843   fsn1
2956881   pvc-9e87df90-ad86-11e9-896e-9600002bd1f9   10 GB   -         fsn1
2956882   pvc-9e4d1749-ad86-11e9-896e-9600002bd1f9   10 GB   -         fsn1

And it looks like we are already on to something: checking the csi-provisioner log, I find the server could not find the requested resource (get csinodeinfos.csi.storage.k8s.io k8s-infrastructure-web-1):

I0723 20:15:31.485128       1 controller.go:926] provision "default/ubuntu-fsn1" class "hcloud-volumes": started
I0723 20:15:31.557402       1 controller.go:188] GRPC call: /csi.v1.Identity/GetPluginCapabilities
I0723 20:15:31.557769       1 event.go:221] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"ubuntu-fsn1", UID:"9e16261e-ad86-11e9-896e-9600002bd1f9", APIVersion:"v1", ResourceVersion:"221469", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/ubuntu-fsn1"
I0723 20:15:31.557436       1 controller.go:189] GRPC request: {}
I0723 20:15:31.560208       1 controller.go:191] GRPC response: {"capabilities":[{"Type":{"Service":{"type":1}}},{"Type":{"Service":{"type":2}}}]}
I0723 20:15:31.563301       1 controller.go:192] GRPC error: <nil>
I0723 20:15:31.563533       1 controller.go:188] GRPC call: /csi.v1.Controller/ControllerGetCapabilities
I0723 20:15:31.563756       1 controller.go:189] GRPC request: {}
I0723 20:15:31.566767       1 controller.go:191] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}}]}
I0723 20:15:31.570955       1 controller.go:192] GRPC error: <nil>
I0723 20:15:31.571159       1 controller.go:188] GRPC call: /csi.v1.Identity/GetPluginInfo
I0723 20:15:31.571184       1 controller.go:189] GRPC request: {}
I0723 20:15:31.574111       1 controller.go:191] GRPC response: {"name":"csi.hetzner.cloud","vendor_version":"1.1.4"}
I0723 20:15:31.575313       1 controller.go:192] GRPC error: <nil>
W0723 20:15:31.579357       1 topology.go:171] error getting CSINodeInfo for selected node "k8s-infrastructure-web-1": the server could not find the requested resource (get csinodeinfos.csi.storage.k8s.io k8s-infrastructure-web-1); proceeding to provision without topology information
I0723 20:15:31.579408       1 controller.go:544] CreateVolumeRequest {Name:pvc-9e16261e-ad86-11e9-896e-9600002bd1f9 CapacityRange:required_bytes:10737418240  VolumeCapabilities:[mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I0723 20:15:31.579827       1 controller.go:188] GRPC call: /csi.v1.Controller/CreateVolume
I0723 20:15:31.579961       1 controller.go:189] GRPC request: {"capacity_range":{"required_bytes":10737418240},"name":"pvc-9e16261e-ad86-11e9-896e-9600002bd1f9","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}

Note that the CreateVolumeRequest above goes out with AccessibilityRequirements:<nil>, which would explain why all volumes end up in the controller's own location. Without digging into the code I am not sure what triggers the error; do you have any hints on how I could debug this further?

@bascht bascht mentioned this issue Jul 23, 2019
@thcyron
Contributor

thcyron commented Jul 24, 2019

Looks like we need to release a new version of the driver for Kubernetes 1.14, where CSINodeInfo was renamed to CSINode.
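
For context: on 1.14 the renamed object is served by the built-in storage API, while the old alpha CRD name (the one the error message asks for) typically no longer exists. A rough way to confirm which of the two a cluster actually serves, assuming kubectl access:

# the built-in 1.14 object the updated sidecars should use
$ kubectl get csinodes.storage.k8s.io

# the old alpha CRD name the current sidecar is still requesting;
# expected to fail on this cluster if that CRD was never installed
$ kubectl get csinodeinfos.csi.storage.k8s.io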

@costela
Contributor

costela commented Jul 28, 2019

I have another issue down the line which might share a root cause. When trying to use the CSI driver on a 1.14 cluster with --authorization-mode=Node and --enable-admission-plugins=...,NodeRestriction,..., we get:

E0727 14:03:55.220699    2205 csi_attacher.go:93] kubernetes.io/csi: attacher.Attach failed: volumeattachments.storage.k8s.io is forbidden: User "system:node:master1" cannot create resource "volumeattachments" in API group "storage.k8s.io" at the cluster scope: can only get individual resources of this type

I suspect this might be related, because the call to create the VolumeAttachment happens from the master node instead of the worker node where the pod is scheduled, so the request is denied by the node authorizer (AFAICT, it only allows VolumeAttachment create requests to be made by the node that is actually running the pod using the volume).
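
One hedged way to confirm this is an authorization decision (and not something inside the driver) would be to replay the check via impersonation, assuming your own user is allowed to impersonate node identities:

# ask the API server whether the master's node identity may create VolumeAttachments;
# given the error above, this should answer "no"
$ kubectl auth can-i create volumeattachments.storage.k8s.io \
    --as=system:node:master1 --as-group=system:nodes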

@costela
Contributor

costela commented Jul 30, 2019

@bascht could you please try the new deploy YAML at #43?
It unfortunately doesn't solve my problem, but I suspect it might solve yours.

@thcyron please correct me if I'm mistaken, but doesn't the k8s built-in node manager create the CSINode objects? (supposedly in response to the node-driver-registrar). And aren't these objects only used internally by the CSI machinery (provisioner, attacher, etc.)? If that's true, then the simple image updates in #43 should be enough for this, right? No changes to the driver should be needed.
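
If that reasoning holds, then after applying the updated manifests the kubelet-managed CSINode objects should exist and list csi.hetzner.cloud as a registered driver; a quick sanity check might look like this (node name taken from the report above):

$ kubectl get csinodes
$ kubectl describe csinode k8s-infrastructure-web-1 | grep -A2 csi.hetzner.cloud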

@bascht
Author

bascht commented Jul 31, 2019

@costela Looking good:

the new CSIDriver object is created and the remaining resources are updated:

$ kubectl apply -f https://raw.githubusercontent.com/costela/hetzner-csi-driver/kubernetes_1.14_deploy/deploy/kubernetes/hcloud-csi.yml
csidriver.storage.k8s.io/csi.hetzner.cloud created
storageclass.storage.k8s.io/hcloud-volumes unchanged
serviceaccount/hcloud-csi unchanged
clusterrole.rbac.authorization.k8s.io/hcloud-csi configured
clusterrolebinding.rbac.authorization.k8s.io/hcloud-csi unchanged
statefulset.apps/hcloud-csi-controller configured
daemonset.apps/hcloud-csi-node configured

And after applying my test YAML, all volumes are bound:

$ kubectl get pvc
NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS     AGE
ubuntu-fsn1   Bound    pvc-56c0d7c5-b32d-11e9-b010-9600002bd1f9   10Gi       RWO            hcloud-volumes   59s
ubuntu-hel1   Bound    pvc-56fdff95-b32d-11e9-b010-9600002bd1f9   10Gi       RWO            hcloud-volumes   59s
ubuntu-nbg1   Bound    pvc-56e22871-b32d-11e9-b010-9600002bd1f9   10Gi       RWO            hcloud-volumes   59s

all neatly placed in their respective locations:

[Screenshot: the three volumes shown in their respective locations]
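
The CLI equivalent of the screenshot check is simply the listing from before; the LOCATION column should now show fsn1, nbg1 and hel1 instead of fsn1 three times:

$ hcloud volume list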

@costela Thanks a bunch for the quick update! <3

Should I keep this issue open for the follow-up?

@costela
Contributor

costela commented Jul 31, 2019

@bascht nice to hear!
We can probably keep the issue open until my PR is merged (and as for my issue: I'm not even sure it's the driver's fault, TBH; still poking around).
