
Systemd: Failed to set up mount unit: Invalid argument #11575

Closed
DjVinnii opened this issue Jan 21, 2023 · 3 comments
DjVinnii commented Jan 21, 2023

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior:

NOTE: This behavior started after I upgraded Kubernetes from 1.23 to 1.24

As soon as my nodes get any pod assigned with an RBD PVC, it starts spamming the following lines in /var/log/syslog:

Jan 21 09:19:17 am5-k8s-node-01 systemd[3013626]: message repeated 8 times: [ Failed to set up mount unit: Invalid argument]
Jan 21 09:19:17 am5-k8s-node-01 systemd[1]: Failed to set up mount unit: Invalid argument

In December I did some investigation, and it looks like it all starts with the following logs:

...
Dec 31 13:15:14 am5-k8s-node-01 kernel: [  876.509387] libceph: mon0 (1)10.152.183.102:6789 session established
Dec 31 13:15:14 am5-k8s-node-01 kernel: [  876.510483] libceph: client56622929 fsid b6f6bb22-151b-4bc1-bf92-1b2b68eee1d3
Dec 31 13:15:15 am5-k8s-node-01 kernel: [  876.610420] rbd: rbd1: capacity 8589934592 features 0x1
Dec 31 13:15:15 am5-k8s-node-01 kernel: [  876.614436] rbd: rbd4: capacity 268435456000 features 0x1
Dec 31 13:15:15 am5-k8s-node-01 kernel: [  876.618442] rbd: rbd2: capacity 8589934592 features 0x1
Dec 31 13:15:15 am5-k8s-node-01 kernel: [  876.642357] rbd: rbd0: capacity 8589934592 features 0x1
Dec 31 13:15:15 am5-k8s-node-01 kernel: [  876.642362] rbd: rbd3: capacity 107374182400 features 0x1
Dec 31 13:15:15 am5-k8s-node-01 multipathd[1146]: rbd1: HDIO_GETGEO failed with 25
Dec 31 13:15:15 am5-k8s-node-01 multipathd[1146]: rbd1: failed to get udev uid: Invalid argument
Dec 31 13:15:15 am5-k8s-node-01 multipathd[1146]: rbd1: failed to get unknown uid: Invalid argument
Dec 31 13:15:15 am5-k8s-node-01 microk8s.daemon-kubelite[1400]: I1231 13:15:15.165343    1400 operation_generator.go:1555] "Controller attach succeeded for volume \"pvc-f3482fb2-a883-46ad-b47b-07d3cc6973cf\" (UniqueName: \"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000002-23e9562c-6bfd-11ed-aa40-c67613b008e6\") pod \"redis-replicas-0\" (UID: \"9cff0ba0-b4a2-4ba6-8c76-e95d8c800658\") device path: \"\"" pod="<redacted>/redis-replicas-0"
Dec 31 13:15:15 am5-k8s-node-01 kernel: [  876.756571] EXT4-fs (rbd1): mounted filesystem with ordered data mode. Opts: (null)
Dec 31 13:15:15 am5-k8s-node-01 systemd[2443]: Failed to set up mount unit: Invalid argument
Dec 31 13:15:15 am5-k8s-node-01 kernel: [  876.791752] EXT4-fs (rbd0): mounted filesystem with ordered data mode. Opts: (null)
Dec 31 13:15:15 am5-k8s-node-01 kernel: [  876.831471] EXT4-fs (rbd4): mounted filesystem with ordered data mode. Opts: (null)
Dec 31 13:15:15 am5-k8s-node-01 kernel: [  876.872417] EXT4-fs (rbd3): mounted filesystem with ordered data mode. Opts: (null)
Dec 31 13:15:15 am5-k8s-node-01 kernel: [  876.894296] EXT4-fs (rbd2): mounted filesystem with ordered data mode. Opts: (null)
Dec 31 13:15:15 am5-k8s-node-01 systemd[1]: Failed to set up mount unit: Invalid argument
...

However, it looks like all PVCs are correctly mounted to the pods.

I have a feeling the mount path might be too long. For example, /dev/rbd0 is mounted as follows (mount | grep rbd):

/dev/rbd0 on /var/snap/microk8s/common/var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/32030ca958fe47c1e0da9fd8df2c7727edec6f59a544669ed4e4835bc15e1566/globalmount/0001-0009-rook-ceph-0000000000000002-0251e137-0a3e-11ec-99db-724c03a610e7 type ext4 (rw,relatime,stripe=16,_netdev)
/dev/rbd0 on /var/snap/microk8s/common/var/lib/kubelet/pods/deaf89a8-e96a-46d6-bb17-7a21702c049a/volumes/kubernetes.io~csi/pvc-2759dd8c-8a9c-455d-8b80-95bb5c40ecf6/mount type ext4 (rw,relatime,stripe=16,_netdev)

If I'm correct, the max length is 255, but the length of the first mount path is 276 according to the following command: systemd-escape /var/snap/microk8s/common/var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/32030ca958fe47c1e0da9fd8df2c7727edec6f59a544669ed4e4835bc15e1566/globalmount/0001-0009-rook-ceph-0000000000000002-0251e137-0a3e-11ec-99db-724c03a610e7 | wc -c
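The length check above can be approximated without systemd installed. The sketch below is an assumption-laden stand-in for systemd-escape --path, covering only the characters that actually occur in these kubelet paths (letters, digits, ".", "-", "/"); systemd's real escaping also \xXX-encodes other special characters, and the sample path is a shortened, hypothetical one:

```shell
# Rough stand-in for `systemd-escape --path` for paths made of [A-Za-z0-9._/-]:
# drop the leading "/", escape literal "-" as \x2d first, then map "/" to "-".
systemd_escape_path() {
  printf '%s' "$1" | sed -e 's|^/||' -e 's|-|\\x2d|g' -e 's|/|-|g'
}

# Hypothetical short path for illustration; substitute a real globalmount path
# from `mount | grep rbd` to see how quickly the unit name grows.
path=/var/lib/kubelet/plugins/kubernetes.io/csi/example/globalmount
name="$(systemd_escape_path "$path").mount"
printf '%s (%d chars)\n' "$name" "${#name}"
```

Because every "-" in the original path expands to the four characters \x2d, CSI volume handles full of hyphens inflate the escaped unit name well beyond the raw path length.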

Expected behavior:
Syslog should not be spammed with the above-mentioned errors.

How to reproduce it (minimal and precise):

  1. Start with a node without any workloads mounting RBD PVCs (drain the node). /var/log/syslog is clean and 'mount | grep rbd' shows nothing.
  2. Schedule a workload with an RBD PVC on the node. /var/log/syslog gets spammed with "Failed to set up mount unit: Invalid argument".
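The spam in step 2 can be quantified with a simple grep. Shown here against an inline two-line excerpt (with a placeholder hostname) so the command is self-contained; on an affected node, point grep at /var/log/syslog instead:

```shell
# Count occurrences of the systemd error in a log excerpt; replace the inline
# sample with `grep -c ... /var/log/syslog` on a real node.
log='Jan 21 09:19:17 node systemd[1]: Failed to set up mount unit: Invalid argument
Jan 21 09:19:17 node systemd[3013626]: Failed to set up mount unit: Invalid argument'
printf '%s\n' "$log" | grep -c 'Failed to set up mount unit'
```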

File(s) to submit:

  • Cluster CR (custom resource), typically called cluster.yaml, if necessary
#################################################################################################################
# Define the settings for the rook-ceph cluster with common settings for a production cluster.
# All nodes with available raw devices will be used for the Ceph cluster. At least three nodes are required
# in this example. See the documentation for more details on storage settings available.

# For example, to create the cluster:
#   kubectl create -f crds.yaml -f common.yaml -f operator.yaml
#   kubectl create -f cluster.yaml
#################################################################################################################

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph # namespace:cluster
spec:
  cephVersion:
    # The container image used to launch the Ceph daemon pods (mon, mgr, osd, mds, rgw).
    # v16 is Pacific, and v17 is Quincy.
    # RECOMMENDATION: In production, use a specific version tag instead of the general v17 flag, which pulls the latest release and could result in different
    # versions running within the cluster. See tags available at https://hub.docker.com/r/ceph/ceph/tags/.
    # If you want to be more precise, you can always use a timestamp tag such as quay.io/ceph/ceph:v17.2.3-20220805
    # This tag might not contain a new Ceph version, just security fixes from the underlying operating system, which will reduce vulnerabilities
    image: quay.io/ceph/ceph:v17.2.5
    # Whether to allow unsupported versions of Ceph. Currently `pacific` and `quincy` are supported.
    # Future versions such as `reef` (v18) would require this to be set to `true`.
    # Do not set to true in production.
    allowUnsupported: false
  # The path on the host where configuration files will be persisted. Must be specified.
  # Important: if you reinstall the cluster, make sure you delete this directory from each host or else the mons will fail to start on the new cluster.
  # In Minikube, the '/data' directory is configured to persist across reboots. Use "/data/rook" in Minikube environment.
  dataDirHostPath: /var/lib/rook
  # Whether or not upgrade should continue even if a check fails
  # This means Ceph's status could be degraded and we don't recommend upgrading but you might decide otherwise
  # Use at your OWN risk
  # To understand Rook's upgrade process of Ceph, read https://rook.io/docs/rook/latest/ceph-upgrade.html#ceph-version-upgrades
  skipUpgradeChecks: false
  # Whether or not continue if PGs are not clean during an upgrade
  continueUpgradeAfterChecksEvenIfNotHealthy: false
  # WaitTimeoutForHealthyOSDInMinutes defines the time (in minutes) the operator would wait before an OSD can be stopped for upgrade or restart.
  # If the timeout is exceeded and the OSD is not ok to stop, then the operator would skip the upgrade for the current OSD and proceed with the next one
  # if `continueUpgradeAfterChecksEvenIfNotHealthy` is `false`. If `continueUpgradeAfterChecksEvenIfNotHealthy` is `true`, then the operator would
  # continue with the upgrade of an OSD even if it's not ok to stop after the timeout. This timeout won't be applied if `skipUpgradeChecks` is `true`.
  # The default wait timeout is 10 minutes.
  waitTimeoutForHealthyOSDInMinutes: 10
  mon:
    # Set the number of mons to be started. Generally recommended to be 3.
    # For highest availability, an odd number of mons should be specified.
    count: 3
    # The mons should be on unique nodes. For production, at least 3 nodes are recommended for this reason.
    # Mons should only be allowed on the same node for test environments where data loss is acceptable.
    allowMultiplePerNode: false
  mgr:
    # When higher availability of the mgr is needed, increase the count to 2.
    # In that case, one mgr will be active and one in standby. When Ceph updates which
    # mgr is active, Rook will update the mgr services to match the active mgr.
    count: 2
    allowMultiplePerNode: false
    modules:
      # Several modules should not need to be included in this list. The "dashboard" and "monitoring" modules
      # are already enabled by other settings in the cluster CR.
      - name: pg_autoscaler
        enabled: true
      - name: rook
        enabled: true
  # enable the ceph dashboard for viewing cluster status
  dashboard:
    enabled: true
    # serve the dashboard under a subpath (useful when you are accessing the dashboard via a reverse proxy)
    # urlPrefix: /ceph-dashboard
    # serve the dashboard at the given port.
    # port: 8443
    # serve the dashboard using SSL
    ssl: true
  # enable prometheus alerting for cluster
  monitoring:
    # requires Prometheus to be pre-installed
    enabled: true
  network:
    connections:
      # Whether to encrypt the data in transit across the wire to prevent eavesdropping the data on the network.
      # The default is false. When encryption is enabled, all communication between clients and Ceph daemons, or between Ceph daemons will be encrypted.
      # When encryption is not enabled, clients still establish a strong initial authentication and data integrity is still validated with a crc check.
      # IMPORTANT: Encryption requires the 5.11 kernel for the latest nbd and cephfs drivers. Alternatively for testing only,
      # you can set the "mounter: rbd-nbd" in the rbd storage class, or "mounter: fuse" in the cephfs storage class.
      # The nbd and fuse drivers are *not* recommended in production since restarting the csi driver pod will disconnect the volumes.
      encryption:
        enabled: false
      # Whether to compress the data in transit across the wire. The default is false.
      # Requires Ceph Quincy (v17) or newer. Also see the kernel requirements above for encryption.
      compression:
        enabled: false
    # enable host networking
    #provider: host
    # enable the Multus network provider
    #provider: multus
    #selectors:
      # The selector keys are required to be `public` and `cluster`.
      # Based on the configuration, the operator will do the following:
      #   1. if only the `public` selector key is specified both public_network and cluster_network Ceph settings will listen on that interface
      #   2. if both `public` and `cluster` selector keys are specified the first one will point to 'public_network' flag and the second one to 'cluster_network'
      #
      # In order to work, each selector value must match a NetworkAttachmentDefinition object in Multus
      #
      #public: public-conf --> NetworkAttachmentDefinition object name in Multus
      #cluster: cluster-conf --> NetworkAttachmentDefinition object name in Multus
    # Provide internet protocol version. IPv6, IPv4 or empty string are valid options. Empty string would mean IPv4
    #ipFamily: "IPv6"
    # Ceph daemons to listen on both IPv4 and IPv6 networks
    #dualStack: false
  # enable the crash collector for ceph daemon crash collection
  crashCollector:
    disable: false
    # Uncomment daysToRetain to prune ceph crash entries older than the
    # specified number of days.
    #daysToRetain: 30
  # enable log collector, daemons will log on files and rotate
  logCollector:
    enabled: true
    periodicity: daily # one of: hourly, daily, weekly, monthly
    maxLogSize: 500M # SUFFIX may be 'M' or 'G'. Must be at least 1M.
  # automate [data cleanup process](https://github.com/rook/rook/blob/master/Documentation/Storage-Configuration/ceph-teardown.md#delete-the-data-on-hosts) in cluster destruction.
  cleanupPolicy:
    # Since cluster cleanup is destructive to data, confirmation is required.
    # To destroy all Rook data on hosts during uninstall, confirmation must be set to "yes-really-destroy-data".
    # This value should only be set when the cluster is about to be deleted. After the confirmation is set,
    # Rook will immediately stop configuring the cluster and only wait for the delete command.
    # If the empty string is set, Rook will not destroy any data on hosts during uninstall.
    confirmation: ""
    # sanitizeDisks represents settings for sanitizing OSD disks on cluster deletion
    sanitizeDisks:
      # method indicates if the entire disk should be sanitized or simply ceph's metadata
      # in both cases, re-install is possible
      # possible choices are 'complete' or 'quick' (default)
      method: quick
      # dataSource indicate where to get random bytes from to write on the disk
      # possible choices are 'zero' (default) or 'random'
      # using random sources will consume entropy from the system and will take much more time than the zero source
      dataSource: zero
      # iteration overwrite N times instead of the default (1)
      # takes an integer value
      iteration: 1
    # allowUninstallWithVolumes defines how the uninstall should be performed
    # If set to true, cephCluster deletion does not wait for the PVs to be deleted.
    allowUninstallWithVolumes: false
  # To control where various services will be scheduled by kubernetes, use the placement configuration sections below.
  # The example under 'all' would have all services scheduled on kubernetes nodes labeled with 'role=storage-node' and
  # tolerate taints with a key of 'storage-node'.
#  placement:
#    all:
#      nodeAffinity:
#        requiredDuringSchedulingIgnoredDuringExecution:
#          nodeSelectorTerms:
#          - matchExpressions:
#            - key: role
#              operator: In
#              values:
#              - storage-node
#      podAffinity:
#      podAntiAffinity:
#      topologySpreadConstraints:
#      tolerations:
#      - key: storage-node
#        operator: Exists
# The above placement information can also be specified for mon, osd, and mgr components
#    mon:
# Monitor deployments may contain an anti-affinity rule for avoiding monitor
# collocation on the same node. This is a required rule when host network is used
# or when AllowMultiplePerNode is false. Otherwise this anti-affinity rule is a
# preferred rule with weight: 50.
#    osd:
#    prepareosd:
#    mgr:
#    cleanup:
  annotations:
#    all:
#    mon:
#    osd:
#    cleanup:
#    prepareosd:
# clusterMetadata annotations will be applied to only `rook-ceph-mon-endpoints` configmap and the `rook-ceph-mon` and `rook-ceph-admin-keyring` secrets.
# And clusterMetadata annotations will not be merged with `all` annotations.
#    clusterMetadata:
#       kubed.appscode.com/sync: "true"
# If no mgr annotations are set, prometheus scrape annotations will be set by default.
#    mgr:
  labels:
#    all:
#    mon:
#    osd:
#    cleanup:
#    mgr:
#    prepareosd:
# monitoring is a list of key-value pairs. It is injected into all the monitoring resources created by operator.
# These labels can be passed as LabelSelector to Prometheus
#    monitoring:
#    crashcollector:
  resources:
# The requests and limits set here, allow the mgr pod to use half of one CPU core and 1 gigabyte of memory
#    mgr:
#      limits:
#        cpu: "500m"
#        memory: "1024Mi"
#      requests:
#        cpu: "500m"
#        memory: "1024Mi"
# The above example requests/limits can also be added to the other components
#    mon:
#    osd:
# For OSD it also is a possible to specify requests/limits based on device class
#    osd-hdd:
#    osd-ssd:
#    osd-nvme:
#    prepareosd:
#    mgr-sidecar:
#    crashcollector:
#    logcollector:
#    cleanup:
  # The option to automatically remove OSDs that are out and are safe to destroy.
  removeOSDsIfOutAndSafeToRemove: false
  priorityClassNames:
    #all: rook-ceph-default-priority-class
    mon: system-node-critical
    osd: system-node-critical
    mgr: system-cluster-critical
    #crashcollector: rook-ceph-crashcollector-priority-class
  storage: # cluster level storage configuration and selection
    useAllNodes: true
    useAllDevices: true
    #deviceFilter:
    config:
      # crushRoot: "custom-root" # specify a non-default root label for the CRUSH map
      # metadataDevice: "md0" # specify a non-rotational storage so ceph-volume will use it as block db device of bluestore.
      # databaseSizeMB: "1024" # uncomment if the disks are smaller than 100 GB
      # journalSizeMB: "1024"  # uncomment if the disks are 20 GB or smaller
      # osdsPerDevice: "1" # this value can be overridden at the node or device level
      # encryptedDevice: "true" # the default value for this option is "false"
# Individual nodes and their config can be specified as well, but 'useAllNodes' above must be set to false. Then, only the named
# nodes below will be used as storage resources.  Each node's 'name' field should match their 'kubernetes.io/hostname' label.
    # nodes:
    #   - name: "172.17.4.201"
    #     devices: # specific devices to use for storage can be specified for each node
    #       - name: "sdb"
    #       - name: "nvme01" # multiple osds can be created on high performance devices
    #         config:
    #           osdsPerDevice: "5"
    #       - name: "/dev/disk/by-id/ata-ST4000DM004-XXXX" # devices can be specified using full udev paths
    #     config: # configuration can be specified at the node level which overrides the cluster level config
    #   - name: "172.17.4.301"
    #     deviceFilter: "^sd."
    # when onlyApplyOSDPlacement is false, will merge both placement.All() and placement.osd
    onlyApplyOSDPlacement: false
  # The section for configuring management of daemon disruptions during upgrade or fencing.
  disruptionManagement:
    # If true, the operator will create and manage PodDisruptionBudgets for OSD, Mon, RGW, and MDS daemons. OSD PDBs are managed dynamically
    # via the strategy outlined in the [design](https://github.com/rook/rook/blob/master/design/ceph/ceph-managed-disruptionbudgets.md). The operator will
    # block eviction of OSDs by default and unblock them safely when drains are detected.
    managePodBudgets: true
    # A duration in minutes that determines how long an entire failureDomain like `region/zone/host` will be held in `noout` (in addition to the
    # default DOWN/OUT interval) when it is draining. This is only relevant when  `managePodBudgets` is `true`. The default value is `30` minutes.
    osdMaintenanceTimeout: 30
    # A duration in minutes that the operator will wait for the placement groups to become healthy (active+clean) after a drain was completed and OSDs came back up.
    # The operator will continue with the next drain if the timeout is exceeded. It only works if `managePodBudgets` is `true`.
    # No values or 0 means that the operator will wait until the placement groups are healthy before unblocking the next drain.
    pgHealthCheckTimeout: 0
    # If true, the operator will create and manage MachineDisruptionBudgets to ensure OSDs are only fenced when the cluster is healthy.
    # Only available on OpenShift.
    manageMachineDisruptionBudgets: false
    # Namespace in which to watch for the MachineDisruptionBudgets.
    machineDisruptionBudgetNamespace: openshift-machine-api

  # healthChecks
  # Valid values for daemons are 'mon', 'osd', 'status'
  healthCheck:
    daemonHealth:
      mon:
        disabled: false
        interval: 45s
      osd:
        disabled: false
        interval: 60s
      status:
        disabled: false
        interval: 60s
    # Change pod liveness probe timing or threshold values. Works for all mon,mgr,osd daemons.
    livenessProbe:
      mon:
        disabled: false
      mgr:
        disabled: false
      osd:
        disabled: false
    # Change pod startup probe timing or threshold values. Works for all mon,mgr,osd daemons.
    startupProbe:
      mon:
        disabled: false
      mgr:
        disabled: false
      osd:
        disabled: false

Cluster Status to submit:

  • Output of krew commands, if necessary

kubectl rook-ceph health:

Info:  Checking if at least three mon pods are running on different nodes
rook-ceph-mon-m-7765746b58-vfgvh                            2/2     Running     0             20d
rook-ceph-mon-n-7769cf6bc8-2h6dr                            2/2     Running     0             20d
rook-ceph-mon-k-585754cb5c-hssmx                            2/2     Running     0             20d

Info:  Checking mon quorum and ceph health details
HEALTH_OK

Info:  Checking if at least three osd pods are running on different nodes
rook-ceph-osd-0-5989dc5cc9-6ffcw                            2/2     Running     0             16h
rook-ceph-osd-5-866f59d556-kmc4v                            2/2     Running     0             16h
rook-ceph-osd-6-7f749856d5-bsr9x                            2/2     Running     0             16h
rook-ceph-osd-2-7c6fc4cdd9-r4qh4                            2/2     Running     0             16h
rook-ceph-osd-7-585765569d-spztf                            2/2     Running     0             16h
rook-ceph-osd-4-6dcc586fd8-tdgcn                            2/2     Running     0             16h
rook-ceph-osd-3-864b745f5f-n5c42                            2/2     Running     0             16h
rook-ceph-osd-8-66879b79dd-vh49l                            2/2     Running     0             16h

Info:  Pods that are in 'Running' status
NAME                                                        READY   STATUS    RESTARTS      AGE
rook-ceph-mon-m-7765746b58-vfgvh                            2/2     Running   0             20d
rook-ceph-mds-myfs-b-6fb7c564c5-9dp82                       2/2     Running   0             20d
rook-ceph-mon-n-7769cf6bc8-2h6dr                            2/2     Running   0             20d
rook-ceph-rgw-my-store-a-745657c48c-5gvf5                   2/2     Running   0             20d
rook-ceph-mds-myfs-a-d8c857cd-vfgzh                         2/2     Running   1 (20d ago)   20d
rook-ceph-mon-k-585754cb5c-hssmx                            2/2     Running   0             20d
rook-ceph-operator-5dcccd4b4c-x62xm                         1/1     Running   0             16h
rook-ceph-crashcollector-am5-k8s-node-04-6f4d5d6746-czw4f   1/1     Running   0             16h
rook-ceph-crashcollector-am5-k8s-node-01-56f5c86cd7-54sq4   1/1     Running   0             16h
rook-ceph-crashcollector-am5-k8s-node-03-bccd7fbd9-sttxq    1/1     Running   0             16h
rook-ceph-crashcollector-am5-k8s-node-02-6f44584ddb-jqdf4   1/1     Running   0             16h
rook-discover-plr4p                                         1/1     Running   0             16h
csi-cephfsplugin-vpkhf                                      2/2     Running   0             16h
csi-rbdplugin-jgz6g                                         2/2     Running   0             16h
rook-discover-zgbf8                                         1/1     Running   0             16h
csi-cephfsplugin-6f2pw                                      2/2     Running   0             16h
csi-rbdplugin-provisioner-99dd6c4c6-f627t                   5/5     Running   0             16h
csi-cephfsplugin-provisioner-7c594f8cf-m5z5s                5/5     Running   0             16h
csi-cephfsplugin-provisioner-7c594f8cf-vtvnr                5/5     Running   0             16h
csi-rbdplugin-provisioner-99dd6c4c6-8w7gd                   5/5     Running   0             16h
csi-cephfsplugin-p8f5l                                      2/2     Running   0             16h
csi-rbdplugin-clvvz                                         2/2     Running   0             16h
csi-rbdplugin-p7l5h                                         2/2     Running   0             16h
rook-discover-rl5js                                         1/1     Running   0             16h
csi-cephfsplugin-szm79                                      2/2     Running   0             16h
csi-rbdplugin-2wkfl                                         2/2     Running   0             16h
rook-discover-7nw6t                                         1/1     Running   0             16h
rook-ceph-mgr-a-584c6c6647-wtbbk                            3/3     Running   0             16h
rook-ceph-mgr-b-677bd6944-2tjnb                             3/3     Running   0             16h
rook-ceph-osd-0-5989dc5cc9-6ffcw                            2/2     Running   0             16h
rook-ceph-osd-5-866f59d556-kmc4v                            2/2     Running   0             16h
rook-ceph-osd-6-7f749856d5-bsr9x                            2/2     Running   0             16h
rook-ceph-osd-2-7c6fc4cdd9-r4qh4                            2/2     Running   0             16h
rook-ceph-osd-7-585765569d-spztf                            2/2     Running   0             16h
rook-ceph-osd-4-6dcc586fd8-tdgcn                            2/2     Running   0             16h
rook-ceph-osd-3-864b745f5f-n5c42                            2/2     Running   0             16h
rook-ceph-osd-8-66879b79dd-vh49l                            2/2     Running   0             16h

Warning:  Pods that are 'Not' in 'Running' status
NAME                                          READY   STATUS      RESTARTS   AGE

Info:  checking placement group status
Info:  169 pgs: 169 active+clean; 229 GiB data, 691 GiB used, 4.6 TiB / 5.3 TiB avail; 9.6 KiB/s rd, 640 KiB/s wr, 18 op/s

Info:  checking if at least one mgr pod is running
rook-ceph-mgr-a-584c6c6647-wtbbk                            Running     am5-k8s-node-01
rook-ceph-mgr-b-677bd6944-2tjnb                             Running     am5-k8s-node-02

kubectl rook-ceph ceph status

cluster:
    id:     b6f6bb22-151b-4bc1-bf92-1b2b68eee1d3
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum k,m,n (age 2w)
    mgr: b(active, since 16h), standbys: a
    mds: 1/1 daemons up, 1 hot standby
    osd: 8 osds: 8 up (since 16h), 8 in (since 3w)
    rgw: 1 daemon active (1 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   12 pools, 169 pgs
    objects: 152.68k objects, 229 GiB
    usage:   691 GiB used, 4.6 TiB / 5.3 TiB avail
    pgs:     169 active+clean

  io:
    client:   27 KiB/s rd, 452 KiB/s wr, 15 op/s rd, 16 op/s wr

Environment:

  • OS (e.g. from /etc/os-release): Ubuntu 20.04.5 LTS
  • Kernel (e.g. uname -a): 5.4.0-135-generic
  • Cloud provider or hardware configuration: -
  • Rook version (use rook version inside of a Rook Pod): v1.10.10
  • Storage backend version (e.g. for ceph do ceph -v): v17.2.5
  • Kubernetes version (use kubectl version): v1.24.8
  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): MicroK8s
  • Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox): HEALTH_OK
@DjVinnii DjVinnii added the bug label Jan 21, 2023

Madhu-1 commented Jan 23, 2023

If I'm correct the max length is 255, but length of the first mount is 276 according to the following command: systemd-escape /var/snap/microk8s/common/var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/32030ca958fe47c1e0da9fd8df2c7727edec6f59a544669ed4e4835bc15e1566/globalmount/0001-0009-rook-ceph-0000000000000002-0251e137-0a3e-11ec-99db-724c03a610e7 | wc -c
256

@DjVinnii yes, I think you are correct. This happens with most of the clusters which use a custom kubelet path, e.g. MicroK8s. In a standard cluster it's /var/lib/kubelet, and in MicroK8s it's /var/snap/microk8s/common/var/lib/kubelet/. More than Rook, this is a systemd issue, and it was fixed in systemd/systemd#18077; AFAIK nothing can be fixed in Rook for this one.

@DjVinnii

@Madhu-1 First of all, thanks for the quick reply.

In a standard cluster it's /var/lib/kubelet, and in MicroK8s it's /var/snap/microk8s/common/var/lib/kubelet/. More than Rook, this is a systemd issue, and it was fixed in systemd/systemd#18077; AFAIK nothing can be fixed in Rook for this one.

I already suspected this wasn't something Rook could fix. After some further investigation of the issue you mentioned, I found that it should be fixed with systemd 249 and higher. Ubuntu 20.04 LTS uses systemd 245, so it looks like I need to upgrade my nodes to Ubuntu 22.04 LTS, which uses systemd 249 according to Ubuntu Packages.
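Given the 249 threshold above, a quick fleet check might look like the sketch below. needs_systemd_upgrade is a hypothetical helper name, and the threshold simply encodes the version mentioned in this thread:

```shell
# Flag nodes whose systemd predates the fix discussed above (fixed in >= 249;
# Ubuntu 20.04 ships 245, Ubuntu 22.04 ships 249).
needs_systemd_upgrade() {
  [ "$1" -lt 249 ]
}

# On a real node you would feed it the installed version, e.g.:
#   needs_systemd_upgrade "$(systemctl --version | awk 'NR==1 {print $2}')"
if needs_systemd_upgrade 245; then
  echo "systemd 245: affected, plan an OS upgrade"
fi
```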

@Madhu-1
Copy link
Member

Madhu-1 commented Jan 24, 2023

@DjVinnii Thank you for checking. As it's not a Rook issue, I'm closing this one.

Madhu-1 closed this as not planned Jan 24, 2023