
[IMPROVEMENT] Support both NFS hard and soft with custom timeo and retrans options for RWX volumes #6655

Closed
derekbit opened this issue Sep 11, 2023 · 12 comments
Assignees
Labels
area/volume-rwx Volume RWX related backport/1.4.4 backport/1.5.2 component/longhorn-share-manager Longhorn share manager (control plane for NFS server, RWX) kind/improvement Request for improvement of existing function priority/0 Must be fixed in this release (managed by PO) require/auto-e2e-test Require adding/updating auto e2e test cases if they can be automated require/doc Require updating the longhorn.io documentation
Milestone

Comments

@derekbit
Member

derekbit commented Sep 11, 2023

Is your improvement request related to a feature? Please describe (👍 if you like this request)

When an RWX volume is attached, a share-manager pod embedding a userspace NFS server is created and the volume is exported. The remote exported share is hard mounted by Longhorn and then provided to the workload. If the share-manager pod or the embedded NFS server crashes or becomes unreachable, the 'hard' mount option keeps the client retrying the connection to the NFS server and prevents data loss.

It has been reported that a node reboot hangs when the NFS share is hard mounted and the connection to the NFS server is lost during I/O operations.

The root cause is that the Linux kernel tries to maintain filesystem stability: it will not allow a filesystem to be unmounted until all of its pending IO has been written back to storage, and the system cannot shut down until all filesystems are unmounted. This kernel behavior is currently unresolved.

A feasible workaround is to use the soft mount option instead. The resulting risk of data loss could be mitigated with the sync mount option, but sync makes RWX volumes unusable in most practical applications because of its poor IO performance.

As discussed with @innobead, to balance the trade-off between potential data loss and IO performance, Longhorn can keep using the "hard" option by default and also allow users to opt into the "soft" option with long custom timeo and retrans values.

Describe the solution you'd like

Describe alternatives you've considered

Additional context

cc @james-munson

@derekbit derekbit added require/auto-e2e-test Require adding/updating auto e2e test cases if they can be automated require/doc Require updating the longhorn.io documentation area/volume-rwx Volume RWX related kind/improvement Request for improvement of existing function component/longhorn-share-manager Longhorn share manager (control plane for NFS server, RWX) labels Sep 11, 2023
@derekbit derekbit added this to the v1.6.0 milestone Sep 11, 2023
@innobead innobead added priority/0 Must be fixed in this release (managed by PO) backport/1.4.4 backport/1.5.2 labels Sep 11, 2023
@james-munson
Contributor

This is for use as a mounted volume, but a similar ticket exists for backup target mount options: #6608

They are separate cases, and would not share a common configuration, but is the answer in either case to allow a list of mount options (or maybe even a customized mount command) and if specified, use that string instead of the default set of mount options?

@derekbit
Member Author

They are separate cases, and would not share a common configuration, but is the answer in either case to allow a list of mount options (or maybe even a customized mount command) and if specified, use that string instead of the default set of mount options?

Yes, I think we can allow a list of mount options.

@derekbit
Member Author

derekbit commented Sep 20, 2023

Options

There are two mount-option settings in a storage class:

mountOptions

The options are recorded in pv.spec.mountOptions. The share-manager controller fetches them from the PV and appends them to the share-manager pod's container args, so the Longhorn block device is mounted with the custom options and exported as an NFS share. (code) A minimal sketch is shown below.
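For context, mountOptions is the standard Kubernetes StorageClass field, and the provisioner copies its values into pv.spec.mountOptions. A minimal sketch, assuming an illustrative class name and option (neither is a Longhorn default):

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-custom-export   # illustrative name
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "3"
# Copied into pv.spec.mountOptions and applied when the share-manager pod
# mounts the Longhorn block device before exporting it over NFS.
mountOptions:
  - noatime   # illustrative filesystem mount option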

parameters.nfsOptions

For an RWX volume, the CSI NodeStageVolume call retrieves nfsOptions from the storage class parameters. (code)

  • If the value is empty, the NFS client uses the default options:

    	mountOptions = []string{
    		"vers=4.1",
    		"noresvport",
    		"intr",
    		"hard",
    	}

  • If a value is given, the NFS client uses those options instead.

v1.4.0-v1.4.3 and v1.5.0-v1.5.1

Users can create a storage class with nfsOptions if they want to switch to soft mode to avoid the reboot hang issue, for example:

vers=4.1,noresvport,soft,timeo=150,retrans=3

Here, timeo=150 means a 15-second timeout (timeo is specified in deciseconds) and retrans=3 allows three retransmissions.

timeo and retrans should be large enough that, if the share-manager pod comes back or a replacement is created, the client can reconnect to it without data loss.
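A storage class for this workaround could look like the following sketch (the class name is illustrative; the structure follows the verification examples later in this issue):

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-soft-nfs   # illustrative name
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"
  fsType: "ext4"
  # Long timeo/retrans values so the client can ride out a share-manager restart.
  nfsOptions: "vers=4.1,noresvport,soft,timeo=150,retrans=3"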

v1.4.4+, v1.5.2+ and v1.6+

The reboot hang issue in the Linux kernel is known but currently has no fix. To improve stability, soft mode with a long timeout will be used by default for mounting an NFS share.

@innobead
Member

innobead commented Sep 20, 2023

@derekbit Can we make soft mode the default in the upcoming 1.4.4 and 1.5.2 if there are no compatibility concerns?

It seems this happens at runtime when mounting the volume served by the share-manager pod, so we would expect the NFS mount (share manager) to switch to soft mode after the share-manager pod is restarted, correct?

@derekbit
Member Author

Can we make soft mode the default in the upcoming 1.4.4 and 1.5.2 if there are no compatibility concerns?

Sure. There is no compatibility concern.
The setting will be applied after the share-manager pod is restarted.

@innobead
Member

After discussing with @derekbit, we will investigate the feasibility of reintroducing hard mode when share-manager HA is introduced in 1.7. #6205

@longhorn-io-github-bot

longhorn-io-github-bot commented Sep 20, 2023

Pre Ready-For-Testing Checklist

  • Where is the reproduce steps/test steps documented?
    The reproduce steps/test steps are at:

  • Is there a workaround for the issue? If so, where is it documented?
    The workaround is at:

#6655 (comment)

  • Does the PR include the explanation for the fix or the feature?

  • Has the backend code been merged (Manager, Engine, Instance Manager, BackupStore etc) (including backport-needed/*)?
    The PR is at

longhorn/longhorn-manager#2167
longhorn/longhorn-manager#2170

  • Which areas/issues this PR might have potential impacts on?
    Area: RWX volume
    Issues

  • If labeled: require/LEP Has the Longhorn Enhancement Proposal PR submitted?
    The LEP PR is at

  • If labeled: area/ui Has the UI issue filed or ready to be merged (including backport-needed/*)?
    The UI issue/PR is at

  • If labeled: require/doc Has the necessary document PR submitted or merged (including backport-needed/*)?
    The documentation issue/PR is at

longhorn/website#777

  • If labeled: require/automation-e2e Has the end-to-end test plan been merged? Have QAs agreed on the automation test case? If only test case skeleton w/o implementation, have you created an implementation issue (including backport-needed/*)
    The automation skeleton PR is at
    The automation test case PR is at
    The issue of automation test case implementation is at (please create by the template)

  • If labeled: require/automation-engine Has the engine integration test been merged (including backport-needed/*)?
    The engine automation PR is at

  • If labeled: require/manual-test-plan Has the manual test plan been documented?
    The updated manual test plan is at

@roger-ryao

Hi @derekbit

  1. Create a directory to mount the NFS share: mkdir -p /mnt/nfs_share
  2. Mount the NFS share from the server with this command: mount -t nfs XX.XX.XX.XX:/var/nfs /mnt/nfs_share
  3. Start a data writing process in the background: dd if=/dev/zero of=/mnt/nfs_share/a &
  4. Shutdown the NFS server.
  5. Reboot the client machine.
  6. After rebooting the client, check the syslog on the Ubuntu 22.04 machine for any relevant information.
     We did not see the "umount.nfs: /mnt: device is busy" message in the syslog.

@innobead
Member

Hi @derekbit

  1. Create a directory to mount the NFS share: mkdir -p /mnt/nfs_share
  2. Mount the NFS share from the server with this command: mount -t nfs XX.XX.XX.XX:/var/nfs /mnt/nfs_share
  3. Start a data writing process in the background: dd if=/dev/zero of=/mnt/nfs_share/a &
  4. Shutdown the NFS server.
  5. Reboot the client machine.

Before rebooting the client machine, did you wait until the IO was stuck? I assume what you tested is the current implementation, hard mode, correct?

@derekbit should we include the above criteria when testing? It seems this case is not always reproducible.

@derekbit
Member Author

derekbit commented Sep 20, 2023

should we include the above criteria when testing? It seems this case is not always reproducible.

On-the-fly IO is required; if the server goes down while IO is in flight, the IO gets stuck. Yes, it is not easy to reproduce...

@innobead
Member

@roger-ryao you could probably use a script that runs the steps n times to see whether it reproduces, if you are running this locally.

@roger-ryao

roger-ryao commented Sep 26, 2023

Verified on master-head 20230926

The test steps

Scenario 1: RWX Volume with Hard Mount

  1. Deploy an RWX volume with the following storage class settings (nfsOptions: "hard,timeo=50,retrans=1")
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-test
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"
  fromBackup: ""
  fsType: "ext4"
  nfsOptions: "hard,timeo=50,retrans=1"
  2. Create a Pod that mounts this volume.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-volv-pvc
#  annotations:
#    volume.beta.kubernetes.io/storage-class: longhorn-test
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: longhorn-test
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: volume-test
  namespace: default
spec:
  restartPolicy: Always
  containers:
  - name: volume-test
    image: nginx
    imagePullPolicy: IfNotPresent
    livenessProbe:
      exec:
        command:
          - ls
          - /data/lost+found
      initialDelaySeconds: 5
      periodSeconds: 5
    volumeMounts:
    - name: volv
      mountPath: /data
    ports:
    - containerPort: 80
  volumes:
  - name: volv
    persistentVolumeClaim:
      claimName: longhorn-volv-pvc
  3. Verify that the remote export mounted by Longhorn is in hard mode.
kubectl exec -it volume-test -- /bin/bash -c "mount | grep nfs"
10.43.78.52:/pvc-433cb644-d339-4158-a477-6e1b82a903c1 on /data type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=50,retrans=1,sec=sys,clientaddr=10.0.2.43,local_lock=none,addr=10.43.78.52)

Scenario 2: RWX Volume with Soft Mount

  1. Deploy an RWX volume with the following storage class settings (nfsOptions: "soft,timeo=250,retrans=5")
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-test
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"
  fromBackup: ""
  fsType: "ext4"
  nfsOptions: "soft,timeo=250,retrans=5"
  2. Create a Pod that mounts this volume.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-volv-pvc
#  annotations:
#    volume.beta.kubernetes.io/storage-class: longhorn-test
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: longhorn-test
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: volume-test
  namespace: default
spec:
  restartPolicy: Always
  containers:
  - name: volume-test
    image: nginx
    imagePullPolicy: IfNotPresent
    livenessProbe:
      exec:
        command:
          - ls
          - /data/lost+found
      initialDelaySeconds: 5
      periodSeconds: 5
    volumeMounts:
    - name: volv
      mountPath: /data
    ports:
    - containerPort: 80
  volumes:
  - name: volv
    persistentVolumeClaim:
      claimName: longhorn-volv-pvc
  3. Verify that the remote export mounted by Longhorn is in soft mode.
kubectl exec -it volume-test -- /bin/bash -c "mount | grep nfs"
10.43.22.27:/pvc-93725304-9c5f-46e6-bb04-ab2ae6bfc6dd on /data type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,timeo=250,retrans=5,sec=sys,clientaddr=10.0.2.43,local_lock=none,addr=10.43.22.27)

Result: Passed

@derekbit derekbit reopened this Oct 3, 2023
@derekbit derekbit closed this as completed Oct 3, 2023
roger-ryao added a commit to roger-ryao/longhorn-tests that referenced this issue Oct 6, 2023