Trident Daemonset Appears to Ignore Node Taints #753

@ghost

Description

Describe the bug
Trident appears to ignore node taints: its daemonset deploys pods on all nodes, rather than only on nodes without a NoSchedule taint such as:

spec:
  taints:
  - effect: NoSchedule
    key: juju.is/kubernetes-control-plane
    value: "true"

The issue appeared when upgrading Charmed Kubernetes from v1.23 to v1.24. The puzzling bit is the daemonset definition has the following node selector:

Node-Selector:  kubernetes.io/arch=amd64,kubernetes.io/os=linux

Regardless, I would have expected the taint to apply, as it did in my own test: trying to schedule a pod onto these nodes failed, as expected, because the pod had no matching toleration.

Environment
Provide accurate information about the environment to help us reproduce the issue.

  • Trident version: 21.04.1 and 22.07.0
  • Trident installation flags used: ./tridentctl -n trident install
  • Container runtime: containerd 1.5.5-0ubuntu3~18.04.2 via apt install
  • Kubernetes version: v1.24.3
  • Kubernetes orchestrator: Charmed Kubernetes
  • Kubernetes enabled feature gates: N/A
  • OS: Ubuntu 18.04.1
  • NetApp backend types: ontap-nas
  • Other:

To Reproduce
Node has the following key elements:

apiVersion: v1
kind: Node
metadata:
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/os: linux
spec:
  taints:
  - effect: NoSchedule
    key: juju.is/kubernetes-control-plane
    value: "true"
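
As a sanity check on the node selector, here is a minimal Python sketch of how a `nodeSelector` is compared with node labels: every key/value pair in the selector must be present on the node. The helper name `node_selector_matches` is hypothetical and only illustrates the documented rule; it is not Kubernetes or Trident code. The labels and selectors are copied from this report.

```python
# Illustrative only: node_selector_matches is a hypothetical helper,
# not part of any Kubernetes or Trident API.

def node_selector_matches(node_labels: dict, node_selector: dict) -> bool:
    """Return True when every nodeSelector pair is present in the node's labels."""
    return all(node_labels.get(key) == value for key, value in node_selector.items())


# Labels from the node excerpt in this report.
node_labels = {
    "beta.kubernetes.io/arch": "amd64",
    "beta.kubernetes.io/os": "linux",
    "kubernetes.io/arch": "amd64",
    "kubernetes.io/os": "linux",
}

# Node selector quoted in the bug description.
daemonset_selector = {
    "kubernetes.io/arch": "amd64",
    "kubernetes.io/os": "linux",
}

selector_allows_node = node_selector_matches(node_labels, daemonset_selector)
print(selector_allows_node)
```

Under these assumptions the selector matches the tainted node, so the node selector by itself would not keep the daemonset pods off it; only the taint could.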

The trident daemonset looks like this:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    deprecated.daemonset.template.generation: "2"
  creationTimestamp: "2022-08-04T18:57:36Z"
  generation: 2
  labels:
    app: node.csi.trident.netapp.io
    kubectl.kubernetes.io/default-container: trident-main
  name: trident-csi
  namespace: trident
  resourceVersion: "171442562"
  uid: 9a20849e-24ae-407a-a9bc-889113ecfdd9
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: node.csi.trident.netapp.io
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: node.csi.trident.netapp.io
    spec:
      containers:
      - args:
        - --no_persistence
        - --rest=false
        - --csi_node_name=$(KUBE_NODE_NAME)
        - --csi_endpoint=$(CSI_ENDPOINT)
        - --csi_role=node
        - --log_format=text
        - --http_request_timeout=1m30s
        - --https_rest
        - --https_port=17546
        command:
        - /trident_orchestrator
        env:
        - name: KUBE_NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: CSI_ENDPOINT
          value: unix://plugin/csi.sock
        - name: PATH
          value: /netapp:/bin
        image: netapp/trident:22.07.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /liveness
            port: 17546
            scheme: HTTPS
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: trident-main
        readinessProbe:
          failureThreshold: 5
          httpGet:
            path: /readiness
            port: 17546
            scheme: HTTPS
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources: {}
        securityContext:
          allowPrivilegeEscalation: true
          privileged: true
        startupProbe:
          failureThreshold: 5
          httpGet:
            path: /liveness
            port: 17546
            scheme: HTTPS
          periodSeconds: 5
          successThreshold: 1
          timeoutSeconds: 1
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /plugin
          name: plugin-dir
        - mountPath: /var/lib/kubelet/plugins
          mountPropagation: Bidirectional
          name: plugins-mount-dir
        - mountPath: /var/lib/kubelet/pods
          mountPropagation: Bidirectional
          name: pods-mount-dir
        - mountPath: /dev
          name: dev-dir
        - mountPath: /sys
          name: sys-dir
        - mountPath: /host
          mountPropagation: Bidirectional
          name: host-dir
        - mountPath: /var/lib/trident/tracking
          mountPropagation: Bidirectional
          name: trident-tracking-dir
        - mountPath: /certs
          name: certs
          readOnly: true
      - args:
        - --v=2
        - --csi-address=$(ADDRESS)
        - --kubelet-registration-path=$(REGISTRATION_PATH)
        env:
        - name: ADDRESS
          value: /plugin/csi.sock
        - name: REGISTRATION_PATH
          value: /var/lib/kubelet/plugins/csi.trident.netapp.io/csi.sock
        - name: KUBE_NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        image: registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.5.1
        imagePullPolicy: IfNotPresent
        name: driver-registrar
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /plugin
          name: plugin-dir
        - mountPath: /registration
          name: registration-dir
      dnsPolicy: ClusterFirstWithHostNet
      hostIPC: true
      hostNetwork: true
      hostPID: true
      nodeSelector:
        juju-application: kubernetes-worker
        kubernetes.io/arch: amd64
        kubernetes.io/os: linux
      priorityClassName: system-node-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: trident-csi
      serviceAccountName: trident-csi
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoExecute
        operator: Exists
      - effect: NoSchedule
        operator: Exists
      volumes:
      - hostPath:
          path: /var/lib/kubelet/plugins/csi.trident.netapp.io/
          type: DirectoryOrCreate
        name: plugin-dir
      - hostPath:
          path: /var/lib/kubelet/plugins_registry/
          type: Directory
        name: registration-dir
      - hostPath:
          path: /var/lib/kubelet/plugins
          type: DirectoryOrCreate
        name: plugins-mount-dir
      - hostPath:
          path: /var/lib/kubelet/pods
          type: DirectoryOrCreate
        name: pods-mount-dir
      - hostPath:
          path: /dev
          type: Directory
        name: dev-dir
      - hostPath:
          path: /sys
          type: Directory
        name: sys-dir
      - hostPath:
          path: /
          type: Directory
        name: host-dir
      - hostPath:
          path: /var/lib/trident/tracking
          type: DirectoryOrCreate
        name: trident-tracking-dir
      - name: certs
        projected:
          defaultMode: 420
          sources:
          - secret:
              name: trident-csi
          - secret:
              name: trident-encryption-keys
  updateStrategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
    type: RollingUpdate
status:
  currentNumberScheduled: 4
  desiredNumberScheduled: 4
  numberAvailable: 4
  numberMisscheduled: 0
  numberReady: 4
  observedGeneration: 2
  updatedNumberScheduled: 4

Expected behavior
My expectation is that the daemonset would respect the node taints and not schedule pods on those nodes. To confirm that the taints work, I tried to schedule a pod onto one of these nodes using the following YAML and, as expected, it failed:

apiVersion: v1
kind: Pod
metadata:
  name: ubuntu-juju-dns
spec:
  containers:
  - name: ubuntu
    image: docker.io/ubuntu
    imagePullPolicy: IfNotPresent
    command:
    - "/bin/sh"
    args:
    - "-c"
    - "sleep 100000"
  nodeSelector:
    kubernetes.io/hostname: juju-afc56e-21-lxd-2
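
For comparison, here is a minimal Python sketch of the taint/toleration matching rules as I understand them from the Kubernetes documentation, applied to the taint and the two pod specs in this report. The function names `tolerates` and `pod_tolerates_taints` are hypothetical, and this is a simplification rather than actual scheduler code. One point worth noting: under those rules a toleration with `operator: Exists` and no `key` matches every taint that has the same effect.

```python
# Illustrative only: a simplified model of the documented matching rules.
# tolerates() and pod_tolerates_taints() are hypothetical helper names.

def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True when a single toleration matches a single taint."""
    effect = toleration.get("effect", "")
    if effect and effect != taint.get("effect"):
        return False  # an empty effect matches all effects
    key = toleration.get("key", "")
    if key and key != taint.get("key"):
        return False  # an empty key matches all keys (valid with Exists)
    operator = toleration.get("operator", "Equal")
    if operator == "Exists":
        return True
    if operator == "Equal":
        return toleration.get("value", "") == taint.get("value", "")
    return False


def pod_tolerates_taints(tolerations: list, taints: list) -> bool:
    """Return True when every hard taint is matched by at least one toleration."""
    hard = [t for t in taints if t.get("effect") in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in tolerations) for taint in hard)


# Taint from the node in this report.
node_taints = [
    {"effect": "NoSchedule", "key": "juju.is/kubernetes-control-plane", "value": "true"},
]

# Tolerations from the trident-csi daemonset in this report.
daemonset_tolerations = [
    {"effect": "NoExecute", "operator": "Exists"},
    {"effect": "NoSchedule", "operator": "Exists"},
]

# The test pod declares no tolerations.
test_pod_tolerations = []

daemonset_ok = pod_tolerates_taints(daemonset_tolerations, node_taints)
test_pod_ok = pod_tolerates_taints(test_pod_tolerations, node_taints)
print(daemonset_ok, test_pod_ok)
```

If this model is right, the test pod is rejected because it has no toleration for the taint, while the daemonset pod template carries a blanket `NoSchedule` toleration that covers the `juju.is/kubernetes-control-plane` taint.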

Additional context
As I upgraded from v1.23 to v1.24 with an existing Trident installation in place, it's possible the upgrade process is a factor, but in my case that would be impossible to reproduce.

I want to say, otherwise I've had a flawless experience with NetApp Trident - keep up the good work 😄 !

Thanks!
