csi-controller v2.0.0-rc.1 does not start when no vSAN file service is enabled #193

Closed
larhauga opened this issue Apr 23, 2020 · 1 comment · Fixed by #194
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@larhauga
Contributor

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:
The vSphere CSI controller deployment does not start when the controller fails to find a file service backed by vSAN. In an environment where vSAN is not present, the getDsToFileServiceEnabledMap function fails, and the controller never finishes initializing.

{"level":"error","time":"2020-04-22T09:06:11.285949514Z","caller":"common/vsphereutil.go:488","msg":"failed to get Datastore managed objects from datastore objects. dsObjList: [], properties: [info summary], err: object references is empty"..."stacktrace":"sigs.k8s.io/vsphere-csi-driver/pkg/csi/service/common.getDsToFileServiceEnable
dMap\n\t/build/pkg/csi/service/common/vsphereutil.go:488...
{"level":"error","time":"2020-04-22T09:06:11.286051279Z","caller":"common/vsphereutil.go:420","msg":"failed to query if file service is enabled on vsan datastores or not. error: object refere
nces is empty","TraceId":"e02723e7-6e50-41a0-b37d-28dae787f39b","stacktrace":"sigs.k8s.io/vsphere-csi-driver/pkg/csi/service/common.IsFileServiceEnabled\n
{"level":"error","time":"2020-04-22T09:06:11.286086428Z","caller":"vanilla/controller.go:124","msg":"file service enablement check failed for datastore specified in TargetvSANFileShareDatast$
reURLs. err=object references is empty","TraceId":"e02723e7-6e50-41a0-b37d-28dae787f39b","stacktrace":"sigs.k8s.io/vsphere-csi-driver/pkg/csi/service/vanilla.(*controller).Init\n\t/build/pkg$
csi/service/vanilla/controller.go:124
{"level":"error","time":"2020-04-22T09:06:11.286116701Z","caller":"service/service.go:122","msg":"failed to init controller. Error: file service enablement check failed for datastore specifi$
d in TargetvSANFileShareDatastoreURLs

This happens when TargetvSANFileShareDatastoreURLs is not configured.
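
For context, one way to avoid this failure is to skip the file-service enablement check entirely when no target datastores are configured. Below is a minimal Go sketch of that guard; checkFileServiceEnabled and its signature are hypothetical illustrations of the idea, not the driver's actual API:

    package main

    import (
        "errors"
        "fmt"
    )

    // checkFileServiceEnabled is a hypothetical stand-in for the driver's vSAN
    // file-service enablement check, which fails with "object references is
    // empty" when there are no vSAN datastores to query.
    func checkFileServiceEnabled(targetDatastoreURLs []string) error {
        // The guard this issue argues for: with no
        // TargetvSANFileShareDatastoreURLs configured, skip the check
        // instead of failing controller init.
        if len(targetDatastoreURLs) == 0 {
            fmt.Println("no TargetvSANFileShareDatastoreURLs configured; skipping file service check")
            return nil
        }
        // The real driver queries vCenter at this point; on a cluster without
        // vSAN the property collector reports "object references is empty".
        return errors.New("object references is empty")
    }

    func main() {
        // TargetvSANFileShareDatastoreURLs is unset, as in this report;
        // controller init should still succeed.
        if err := checkFileServiceEnabled(nil); err != nil {
            fmt.Println("failed to init controller:", err)
            return
        }
        fmt.Println("controller initialized")
    }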

What you expected to happen:
Expected the controller to start successfully and to be able to provision volumes through the storage class.

How to reproduce it (as minimally and precisely as possible):
In an environment without vSAN or file services (vSphere 6.7u3), the controller does not start.

  csi-vsphere.conf: |
    [Global]
    cluster-id = "clusterid"

    insecure-flag = "true"
    datacenters = "dc1,dc2"

    secret-namespace = "vsphere" # overridden by env vars from secrets
    secret-name = "cpi-global-secret" # overridden by env vars from secrets

    [VirtualCenter "<replaced>"]

    [Labels]
    region = k8s-region
    zone = k8s-zone

The relevant container spec from the vsphere-csi-controller Deployment:

        - name: vsphere-csi-controller
          image: gcr.io/cloud-provider-vsphere/csi/release/driver:v2.0.0-rc.1
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "rm -rf /var/lib/kubelet/plugins_registry/csi.vsphere.vmware.com"]
          imagePullPolicy: "Always"
          env:
            - name: CSI_ENDPOINT
              value: unix:///var/lib/kubelet/plugins_registry/csi.sock
            - name: X_CSI_MODE
              value: "controller"
            - name: VSPHERE_CSI_CONFIG
              value: "/etc/cloud/csi-vsphere.conf"
            - name: LOGGER_LEVEL
              value: "DEVELOPMENT" # "PRODUCTION" # Options: DEVELOPMENT, PRODUCTION
            - name: VSPHERE_USER
              valueFrom:
                secretKeyRef:
                  name: cpi-global-secret
                  key: username
            - name: VSPHERE_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: cpi-global-secret
                  key: password
          volumeMounts:
            - mountPath: /etc/cloud
              name: vsphere-config-volume
              readOnly: true
            - mountPath: /var/lib/kubelet/plugins_registry/
              name: socket-dir
          ports:
            - name: healthz
              containerPort: 9808
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /healthz
              port: healthz
            initialDelaySeconds: 10
            timeoutSeconds: 3
            periodSeconds: 5
            failureThreshold: 3

Anything else we need to know?:
Commit 760b9ab appears to have introduced this bug; previously, errors from this check were ignored.
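
Schematically, the regression is an error that used to be logged and swallowed becoming fatal during controller Init. The snippet below contrasts the two behaviors; it is a simplified sketch, not the actual driver source:

    package main

    import (
        "errors"
        "fmt"
        "log"
    )

    // errEmptyRefs mimics the failure seen on clusters without vSAN datastores.
    var errEmptyRefs = errors.New("object references is empty")

    // isFileServiceEnabled is a hypothetical stand-in for the driver's check.
    func isFileServiceEnabled() error { return errEmptyRefs }

    // initBefore sketches the earlier behavior: the error is logged and
    // ignored, so controller init still succeeds.
    func initBefore() error {
        if err := isFileServiceEnabled(); err != nil {
            log.Printf("file service check failed, continuing: %v", err)
        }
        return nil
    }

    // initAfter sketches the behavior after 760b9ab: the same error now aborts
    // controller init, which is what produces the CrashLoopBackOff shown in
    // the comment below.
    func initAfter() error {
        if err := isFileServiceEnabled(); err != nil {
            return fmt.Errorf("file service enablement check failed: %w", err)
        }
        return nil
    }

    func main() {
        fmt.Println("before:", initBefore()) // before: <nil>
        fmt.Println("after:", initAfter())   // after: ... object references is empty
    }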

Environment:

  • csi-vsphere version: v2.0.0-rc.1
  • vsphere-cloud-controller-manager version:
  • Kubernetes version: 1.16.2
  • vSphere version: 6.7u3
  • OS (e.g. from /etc/os-release): Red Hat Enterprise Linux CoreOS 43.81.202003310153.0 (Ootpa)
  • Kernel (e.g. uname -a): 4.18.0-147.5.1.el8_1.x86_64
  • Install tools:
  • Others:
@symbian4sj

Hello! The csi-controller v2.0.0-rc.1 container doesn't start, with the following error:

kubectl describe pod vsphere-csi-node-wbnn4 -n=kube-system
Events:
Type     Reason       Age     From                          Message
----     ------       ---     ----                          -------
Normal Scheduled 29m default-scheduler Successfully assigned kube-system/vsphere-csi-controller-76b8d7d97b-gz4j9 to seliicbl01526-k8sw
Normal Pulling 29m kubelet, seliicbl01526-k8sw Pulling image "quay.io/k8scsi/csi-attacher:v2.0.0"
Normal Pulled 29m kubelet, seliicbl01526-k8sw Successfully pulled image "quay.io/k8scsi/csi-attacher:v2.0.0"
Normal Created 29m kubelet, seliicbl01526-k8sw Created container csi-attacher
Normal Started 29m kubelet, seliicbl01526-k8sw Started container csi-attacher
Normal Pulling 29m kubelet, seliicbl01526-k8sw Pulling image "quay.io/k8scsi/csi-resizer:v0.3.0"
Normal Created 29m kubelet, seliicbl01526-k8sw Created container csi-resizer
Normal Pulled 29m kubelet, seliicbl01526-k8sw Successfully pulled image "quay.io/k8scsi/csi-resizer:v0.3.0"
Normal Started 29m kubelet, seliicbl01526-k8sw Started container csi-resizer
Normal Pulling 29m kubelet, seliicbl01526-k8sw Pulling image "gcr.io/cloud-provider-vsphere/csi/release/driver:v2.0.0-rc.1"
Normal Pulled 29m kubelet, seliicbl01526-k8sw Successfully pulled image "gcr.io/cloud-provider-vsphere/csi/release/driver:v2.0.0-rc.1"
Normal Created 29m kubelet, seliicbl01526-k8sw Created container vsphere-csi-controller
Normal Started 29m kubelet, seliicbl01526-k8sw Started container vsphere-csi-controller
Normal Pulling 29m kubelet, seliicbl01526-k8sw Pulling image "quay.io/k8scsi/livenessprobe:v1.1.0"
Normal Pulling 29m kubelet, seliicbl01526-k8sw Pulling image "gcr.io/cloud-provider-vsphere/csi/release/syncer:v2.0.0-rc.1"
Normal Created 29m kubelet, seliicbl01526-k8sw Created container liveness-probe
Normal Started 29m kubelet, seliicbl01526-k8sw Started container liveness-probe
Normal Pulled 29m kubelet, seliicbl01526-k8sw Successfully pulled image "quay.io/k8scsi/livenessprobe:v1.1.0"
Normal Pulled 29m kubelet, seliicbl01526-k8sw Successfully pulled image "gcr.io/cloud-provider-vsphere/csi/release/syncer:v2.0.0-rc.1"
Normal Created 29m kubelet, seliicbl01526-k8sw Created container vsphere-syncer
Normal Started 29m kubelet, seliicbl01526-k8sw Started container vsphere-syncer
Normal Pulling 29m kubelet, seliicbl01526-k8sw Pulling image "quay.io/k8scsi/csi-provisioner:v1.4.0"
Normal Pulled 29m kubelet, seliicbl01526-k8sw Successfully pulled image "quay.io/k8scsi/csi-provisioner:v1.4.0"
Normal Created 29m kubelet, seliicbl01526-k8sw Created container csi-provisioner
Normal Started 29m kubelet, seliicbl01526-k8sw Started container csi-provisioner
Warning Unhealthy 24m (x19 over 29m) kubelet, seliicbl01526-k8sw Liveness probe failed: Get http://10.1.248.1:9808/healthz: dial tcp 10.1.248.1:9808: connect: connection refused
Warning BackOff 14m (x80 over 28m) kubelet, seliicbl01526-k8sw Back-off restarting failed container
Warning BackOff 9m16s (x102 over 28m) kubelet, seliicbl01526-k8sw Back-off restarting failed container
Warning BackOff 4m17s (x106 over 27m) kubelet, seliicbl01526-k8sw Back-off restarting failed container

kubectl get pods -A -o wide
kube-system vsphere-csi-controller-76b8d7d97b-gz4j9 2/6 CrashLoopBackOff 45 32m 10.1.248.1 seliicbl01526-k8sw

curl http://10.1.248.1:9808
curl: (7) Failed to connect to 10.1.248.1 port 9808: Connection refused

ping 10.1.248.1
PING 10.1.248.1 (10.1.248.1) 56(84) bytes of data.
64 bytes from 10.1.248.1: icmp_seq=1 ttl=64 time=0.032 ms
64 bytes from 10.1.248.1: icmp_seq=2 ttl=64 time=0.021 ms
^C
--- 10.1.248.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1006ms
rtt min/avg/max/mdev = 0.021/0.026/0.032/0.007 ms
