Trident Daemonset Appears to Ignore Node Taints

**Describe the bug**
Trident appears to ignore node taints: its `daemonset` deploys pods on all nodes rather than only on nodes without a `NoSchedule` taint such as:

```yaml
spec:
  taints:
  - effect: NoSchedule
    key: juju.is/kubernetes-control-plane
    value: "true"
```

The issue appeared when upgrading Charmed Kubernetes from v1.23 to v1.24. The puzzling bit is that the `daemonset` definition has the following node selector:

```yaml
Node-Selector:  kubernetes.io/arch=amd64,kubernetes.io/os=linux
```

Regardless of the node selector, I would have expected the taint to keep the pods off these nodes. As a test, I tried to schedule an ordinary pod onto one of them, and as expected it failed because the pod lacked a matching toleration.

**Environment**
Provide accurate information about the environment to help us reproduce the issue.

- Trident version: 21.04.1 and 22.07.0
- Trident installation flags used: `./tridentctl -n trident install`
- Container runtime: containerd 1.5.5-0ubuntu3~18.04.2 via apt install
- Kubernetes version: v1.24.3
- Kubernetes orchestrator: [Charmed Kubernetes](https://ubuntu.com/kubernetes/docs/1.24/components) 
- Kubernetes enabled feature gates: N/A
- OS: Ubuntu 18.04.1
- NetApp backend types: `ontap-nas`
- Other:

**To Reproduce**
Node has the following key elements:

```yaml
apiVersion: v1
kind: Node
metadata:
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/os: linux
spec:
  taints:
  - effect: NoSchedule
    key: juju.is/kubernetes-control-plane
    value: "true"
```

The trident `daemonset` looks like this:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    deprecated.daemonset.template.generation: "2"
  creationTimestamp: "2022-08-04T18:57:36Z"
  generation: 2
  labels:
    app: node.csi.trident.netapp.io
    kubectl.kubernetes.io/default-container: trident-main
  name: trident-csi
  namespace: trident
  resourceVersion: "171442562"
  uid: 9a20849e-24ae-407a-a9bc-889113ecfdd9
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: node.csi.trident.netapp.io
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: node.csi.trident.netapp.io
    spec:
      containers:
      - args:
        - --no_persistence
        - --rest=false
        - --csi_node_name=$(KUBE_NODE_NAME)
        - --csi_endpoint=$(CSI_ENDPOINT)
        - --csi_role=node
        - --log_format=text
        - --http_request_timeout=1m30s
        - --https_rest
        - --https_port=17546
        command:
        - /trident_orchestrator
        env:
        - name: KUBE_NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: CSI_ENDPOINT
          value: unix://plugin/csi.sock
        - name: PATH
          value: /netapp:/bin
        image: netapp/trident:22.07.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /liveness
            port: 17546
            scheme: HTTPS
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: trident-main
        readinessProbe:
          failureThreshold: 5
          httpGet:
            path: /readiness
            port: 17546
            scheme: HTTPS
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources: {}
        securityContext:
          allowPrivilegeEscalation: true
          privileged: true
        startupProbe:
          failureThreshold: 5
          httpGet:
            path: /liveness
            port: 17546
            scheme: HTTPS
          periodSeconds: 5
          successThreshold: 1
          timeoutSeconds: 1
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /plugin
          name: plugin-dir
        - mountPath: /var/lib/kubelet/plugins
          mountPropagation: Bidirectional
          name: plugins-mount-dir
        - mountPath: /var/lib/kubelet/pods
          mountPropagation: Bidirectional
          name: pods-mount-dir
        - mountPath: /dev
          name: dev-dir
        - mountPath: /sys
          name: sys-dir
        - mountPath: /host
          mountPropagation: Bidirectional
          name: host-dir
        - mountPath: /var/lib/trident/tracking
          mountPropagation: Bidirectional
          name: trident-tracking-dir
        - mountPath: /certs
          name: certs
          readOnly: true
      - args:
        - --v=2
        - --csi-address=$(ADDRESS)
        - --kubelet-registration-path=$(REGISTRATION_PATH)
        env:
        - name: ADDRESS
          value: /plugin/csi.sock
        - name: REGISTRATION_PATH
          value: /var/lib/kubelet/plugins/csi.trident.netapp.io/csi.sock
        - name: KUBE_NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        image: registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.5.1
        imagePullPolicy: IfNotPresent
        name: driver-registrar
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /plugin
          name: plugin-dir
        - mountPath: /registration
          name: registration-dir
      dnsPolicy: ClusterFirstWithHostNet
      hostIPC: true
      hostNetwork: true
      hostPID: true
      nodeSelector:
        juju-application: kubernetes-worker
        kubernetes.io/arch: amd64
        kubernetes.io/os: linux
      priorityClassName: system-node-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: trident-csi
      serviceAccountName: trident-csi
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoExecute
        operator: Exists
      - effect: NoSchedule
        operator: Exists
      volumes:
      - hostPath:
          path: /var/lib/kubelet/plugins/csi.trident.netapp.io/
          type: DirectoryOrCreate
        name: plugin-dir
      - hostPath:
          path: /var/lib/kubelet/plugins_registry/
          type: Directory
        name: registration-dir
      - hostPath:
          path: /var/lib/kubelet/plugins
          type: DirectoryOrCreate
        name: plugins-mount-dir
      - hostPath:
          path: /var/lib/kubelet/pods
          type: DirectoryOrCreate
        name: pods-mount-dir
      - hostPath:
          path: /dev
          type: Directory
        name: dev-dir
      - hostPath:
          path: /sys
          type: Directory
        name: sys-dir
      - hostPath:
          path: /
          type: Directory
        name: host-dir
      - hostPath:
          path: /var/lib/trident/tracking
          type: DirectoryOrCreate
        name: trident-tracking-dir
      - name: certs
        projected:
          defaultMode: 420
          sources:
          - secret:
              name: trident-csi
          - secret:
              name: trident-encryption-keys
  updateStrategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
    type: RollingUpdate
status:
  currentNumberScheduled: 4
  desiredNumberScheduled: 4
  numberAvailable: 4
  numberMisscheduled: 0
  numberReady: 4
  observedGeneration: 2
  updatedNumberScheduled: 4
```
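For reference, the part of the spec above that interacts with node taints is the `tolerations` list. The excerpt below repeats it with comments, assuming standard Kubernetes toleration semantics, where a toleration with `operator: Exists` and no `key` matches every taint that has the given effect. This is my reading of the spec, not something confirmed by the Trident maintainers.

```yaml
# Excerpt of spec.template.spec from the DaemonSet above, annotated.
tolerations:
- effect: NoExecute    # no key given: matches every NoExecute taint
  operator: Exists
- effect: NoSchedule   # no key given: matches every NoSchedule taint,
  operator: Exists     # which would include juju.is/kubernetes-control-plane
```

If that reading is right, the `nodeSelector` is the only field left in the pod template that restricts where these pods can land.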

**Expected behavior**
My expectation is that the `daemonset` would respect the node taints and *not* schedule pods on those nodes. To prove to myself that the taints work, I tried to schedule a pod using the following YAML, and as expected it failed:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu-juju-dns
spec:
  containers:
  - name: ubuntu
    image: docker.io/ubuntu
    imagePullPolicy: IfNotPresent
    command:
    - "/bin/sh"
    args:
    - "-c"
    - "sleep 100000"
  nodeSelector:
    kubernetes.io/hostname: juju-afc56e-21-lxd-2
```
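For contrast, here is a sketch of the same test pod with a toleration for the taint shown earlier. I have not run it. It assumes that `juju-afc56e-21-lxd-2` carries the `juju.is/kubernetes-control-plane` taint, and the pod name is made up for illustration. With a matching toleration, the scheduler should no longer reject the pod because of that taint.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu-juju-dns-tolerated   # hypothetical name, not from my cluster
spec:
  containers:
  - name: ubuntu
    image: docker.io/ubuntu
    imagePullPolicy: IfNotPresent
    command:
    - "/bin/sh"
    args:
    - "-c"
    - "sleep 100000"
  nodeSelector:
    kubernetes.io/hostname: juju-afc56e-21-lxd-2
  tolerations:
  # Matches only the control-plane taint from the node spec above.
  - key: juju.is/kubernetes-control-plane
    operator: Equal
    value: "true"
    effect: NoSchedule
```

The only difference from the failing pod is the `tolerations` block, which is what separates a plain pod from one that is allowed onto a tainted node.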

**Additional context**
As I upgraded from v1.23 to v1.24 with an existing Trident installation in place, it's possible the upgrade process is a factor, but in my case that would be impossible to reproduce.

I want to say, otherwise I've had a flawless experience with NetApp Trident - keep up the good work :smile: !

Thanks!


Trident Daemonset Appears to Ignore Node Taints #753
