[BUG] Executing fstrim while rebuilding causes IO errors #7103

Closed
ejweber opened this issue Nov 14, 2023 · 11 comments

Labels: area/stability, area/volume-trim, backport/1.4.5, backport/1.5.4, backport/1.6.1, kind/bug, priority/0, reproduce/always, require/auto-e2e-test, require/backport, require/qa-review-coverage
Milestone: v1.6.0

@ejweber
Contributor

ejweber commented Nov 14, 2023

Describe the bug (🐛 if you encounter this issue)

Executing fstrim from a privileged container that mounts a Longhorn volume results in IO errors in the kernel ring buffer if the volume is rebuilding. This would almost certainly also happen if fstrim were executed outside the container.

To Reproduce

  1. Deploy Longhorn on a three node cluster.
  2. Deploy the below manifest, which runs a privileged Ubuntu 22.04 container that mounts a Longhorn volume to /data.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
  namespace: default
spec:
  restartPolicy: Always
  containers:
  - name: test-container
    image: ubuntu:22.04
    securityContext:
      privileged: true
    args: 
      - sleep
      - infinity
    imagePullPolicy: IfNotPresent
    livenessProbe:
      exec:
        command:
          - ls
          - /data/lost+found
      initialDelaySeconds: 5
      periodSeconds: 5
    volumeMounts:
    - name: data
      mountPath: /data
    ports:
    - containerPort: 80
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: test-pvc
  3. Write approximately 5GB of data to a file on the volume using dd twice. (Doing it twice may not be necessary, but it leaves a large number of blocks that can later be trimmed.)
eweber@laptop:~/issue-files/SURE-6935> k exec -it test-pod -- bash
root@test-pod:/# dd if=/dev/urandom iflag=count_bytes count=5G bs=1M of=/data/large
5120+0 records in
5120+0 records out
5368709120 bytes (5.4 GB, 5.0 GiB) copied, 47.4062 s, 113 MB/s
root@test-pod:/# dd if=/dev/urandom iflag=count_bytes count=5G bs=1M of=/data/large
5120+0 records in
5120+0 records out
5368709120 bytes (5.4 GB, 5.0 GiB) copied, 58.9104 s, 91.1 MB/s
  4. Verify that the volume's actual size is approximately 10GB as expected.
eweber@laptop:~/longhorn> kl get volume -oyaml pvc-c244bf53-4319-4f77-82cc-e7cf28b31cc9 | grep actual
  actualSize: 10978639872
  5. Verify that fstrim executed from the container can clean up approximately 5GB as expected.
root@test-pod:/# fstrim -va
/data: 4.3 GiB (4569669632 bytes) trimmed on /dev/longhorn/pvc-c244bf53-4319-4f77-82cc-e7cf28b31cc9
  6. Write 5GB again to get the volume's actual size back to approximately 10GB.
root@test-pod:/# dd if=/dev/urandom iflag=count_bytes count=5G bs=1M of=/data/large
5120+0 records in
5120+0 records out
5368709120 bytes (5.4 GB, 5.0 GiB) copied, 57.7038 s, 93.0 MB/s
root@test-pod:/# fstrim -va
/data: 0 B (0 bytes) trimmed on /dev/longhorn/pvc-c244bf53-4319-4f77-82cc-e7cf28b31cc9
  7. Verify that the volume's actual size is approximately 10GB.
eweber@laptop:~/longhorn> kl get volume -oyaml pvc-c244bf53-4319-4f77-82cc-e7cf28b31cc9 | grep actual
  actualSize: 10978639872
  8. Delete any replica.
  9. Once the replica is rebuilding (snapshots have been purged and there is noticeable rebuilding progress), execute fstrim from the container again. No blocks are deleted.
root@test-pod:/# fstrim -va
/data: 0 B (0 bytes) trimmed on /dev/longhorn/pvc-c244bf53-4319-4f77-82cc-e7cf28b31cc9
  10. Check the kernel ring buffer for IO errors. Verify that they occurred during the fstrim operation.
root@test-pod:/# dmesg --ctime | tail -n 200
...
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#83 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#83 Sense Key : Hardware Error [current] 
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#83 Add. Sense: Internal target failure
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#83 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
[Tue Nov 14 21:46:42 2023] blk_update_request: critical target error, dev sdc, sector 270360 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#84 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#84 Sense Key : Hardware Error [current] 
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#84 Add. Sense: Internal target failure
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#84 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
[Tue Nov 14 21:46:42 2023] blk_update_request: critical target error, dev sdc, sector 794648 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#71 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#71 Sense Key : Hardware Error [current] 
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#71 Add. Sense: Internal target failure
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#71 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
[Tue Nov 14 21:46:42 2023] blk_update_request: critical target error, dev sdc, sector 1318936 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#72 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#72 Sense Key : Hardware Error [current] 
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#72 Add. Sense: Internal target failure
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#72 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
[Tue Nov 14 21:46:42 2023] blk_update_request: critical target error, dev sdc, sector 1843224 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#73 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#73 Sense Key : Hardware Error [current] 
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#73 Add. Sense: Internal target failure
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#73 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
[Tue Nov 14 21:46:42 2023] blk_update_request: critical target error, dev sdc, sector 2326528 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#76 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#76 Sense Key : Hardware Error [current] 
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#76 Add. Sense: Internal target failure
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#76 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
[Tue Nov 14 21:46:42 2023] blk_update_request: critical target error, dev sdc, sector 2367512 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#85 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#85 Sense Key : Hardware Error [current] 
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#85 Add. Sense: Internal target failure
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#85 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
[Tue Nov 14 21:46:42 2023] blk_update_request: critical target error, dev sdc, sector 2621440 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#86 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#86 Sense Key : Hardware Error [current] 
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#86 Add. Sense: Internal target failure
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#86 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
[Tue Nov 14 21:46:42 2023] blk_update_request: critical target error, dev sdc, sector 2883584 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#64 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#64 Sense Key : Hardware Error [current] 
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#64 Add. Sense: Internal target failure
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#64 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
[Tue Nov 14 21:46:42 2023] blk_update_request: critical target error, dev sdc, sector 3145728 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#65 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#65 Sense Key : Hardware Error [current] 
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#65 Add. Sense: Internal target failure
[Tue Nov 14 21:46:42 2023] sd 3:0:0:1: [sdc] tag#65 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
[Tue Nov 14 21:46:42 2023] blk_update_request: critical target error, dev sdc, sector 3407872 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
  11. Check the instance-manager logs for associated errors.
eweber@laptop:~/longhorn> kl logs instance-manager-de0a44c527b5c06489d7a009124e4fd3
...
[pvc-c244bf53-4319-4f77-82cc-e7cf28b31cc9-e-0] time="2023-11-14T21:46:42Z" level=info msg="Finished to snapshot: 10.42.86.152:10000 5cea841f-f0b0-44a3-913e-228f38c36c06 UserCreated false Created at 2023-11-14T21:46:42Z, Labels map[]"
time="2023-11-14T21:46:42Z" level=info msg="Adding backend: tcp://10.42.86.152:10000"
time="2023-11-14T21:46:42Z" level=info msg="Start monitoring tcp://10.42.86.152:10000"
[longhorn-instance-manager] time="2023-11-14T21:46:42Z" level=info msg="Using replica tcp://10.42.27.34:10000 as the source for rebuild"
[longhorn-instance-manager] time="2023-11-14T21:46:42Z" level=info msg="Using replica tcp://10.42.86.152:10000 as the target for rebuild"
[pvc-c244bf53-4319-4f77-82cc-e7cf28b31cc9-e-0] time="2023-11-14T21:46:42Z" level=info msg="Set revision counter of 10.42.86.152:10000 to : 0"
[pvc-c244bf53-4319-4f77-82cc-e7cf28b31cc9-e-0] time="2023-11-14T21:46:42Z" level=info msg="Set backend tcp://10.42.86.152:10000 revision counter to 0"
[pvc-c244bf53-4319-4f77-82cc-e7cf28b31cc9-e-0] time="2023-11-14T21:46:42Z" level=info msg="Synchronizing volume-head-002.img.meta to volume-head-001.img.meta:10.42.86.152:10003"
[pvc-c244bf53-4319-4f77-82cc-e7cf28b31cc9-r-418301ed] time="2023-11-14T21:46:42Z" level=info msg="Syncing file volume-head-002.img.meta to 10.42.86.152:10003"
time="2023-11-14T21:46:42Z" level=info msg="Syncing file volume-head-002.img.meta to 10.42.86.152:10003: size 178, directIO false, fastSync false"
[pvc-c244bf53-4319-4f77-82cc-e7cf28b31cc9-r-418301ed] time="2023-11-14T21:46:42Z" level=info msg="Finished sync for the source file volume-head-002.img.meta, size 178, elapsed 0.02s"
[pvc-c244bf53-4319-4f77-82cc-e7cf28b31cc9-r-418301ed] time="2023-11-14T21:46:42Z" level=info msg="Done syncing file volume-head-002.img.meta to 10.42.86.152:10003"
[pvc-c244bf53-4319-4f77-82cc-e7cf28b31cc9-e-0] time="2023-11-14T21:46:42Z" level=info msg="Done synchronizing volume-head-002.img.meta to volume-head-001.img.meta:10.42.86.152:10003"
response_process: Receive error for response 3 of seq 19169
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:138424320 length:57344
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19170
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:406859776 length:4182016
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19171
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:675295232 length:4182016
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19172
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:943730688 length:4182016
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19173
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:1191182336 length:16777216
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19174
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:1212166144 length:130011136
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19175
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:1342177280 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19176
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:1476395008 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19177
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:1610612736 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19178
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:1744830464 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19179
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:1879048192 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19180
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:2013265920 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19181
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:2281701376 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19182
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:2415919104 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19183
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:2550136832 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19184
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:2684354560 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19185
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:2818572288 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19186
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:2952790016 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19187
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:3087007744 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19188
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:3221225472 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19189
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:3359649792 length:130011136
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19190
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:3489660928 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19191
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:3628085248 length:130011136
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19192
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:3758096384 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19193
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:3892314112 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19194
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:4026531840 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19195
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:4160749568 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19196
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:4496293888 length:67108864
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19197
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:4563402752 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19198
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:4697620480 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19199
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:4831838208 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19200
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:4966055936 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19201
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:5100273664 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19202
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:5234491392 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19203
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:5368709120 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19204
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:5502926848 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19205
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:5637144576 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19206
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:5771362304 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19207
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:5905580032 length:134217728
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success
response_process: Receive error for response 3 of seq 19208
tgtd: bs_longhorn_request(183) Failed to punch hole for UNMAP at offset:6039797760 length:75497472
tgtd: bs_longhorn_request(210) io error 0x1ca3d10 42 0 16 0, Success

Expected behavior

I'm not 100% sure. Maybe there is no way Longhorn can respond differently in this situation? However, when this occurs, Harvester VMs running on top of a Longhorn volume pause until the rebuild finishes (any IO error causes the VM to pause, and every time it unpauses, it hits the same IO error again), so we need to investigate whether better behavior is possible.

@longhorn/dev-data-plane, is this behavior expected?

Environment

  • Longhorn version: v1.4.3 and master
  • Installation method (e.g. Rancher Catalog App/Helm/Kubectl): kubectl
  • Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: v1.25.12+rke2r1
    • Number of management node in the cluster: 1
    • Number of worker node in the cluster: 3
  • Node config
    • OS type and version: Ubuntu 22.04.2 LTS
    • Kernel version: 5.15.0-88-generic
    • Disk type(e.g. SSD/NVMe/HDD): SSD
  • Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): DigitalOcean
  • Number of Longhorn volumes in the cluster: 1
@ejweber ejweber added kind/bug require/qa-review-coverage Require QA to review coverage require/backport Require backport. Only used when the specific versions to backport have not been definied. reproduce/always 100% reproducible labels Nov 14, 2023
@innobead innobead added this to the v1.6.0 milestone Nov 15, 2023
@innobead innobead added priority/0 Must be fixed in this release (managed by PO) area/volume-trim Volume trim related labels Nov 15, 2023
@ejweber
Contributor Author

ejweber commented Nov 17, 2023

After digging deeper with the supported customer who originally ran into this issue, we found that their VM had the root filesystem mounted with the discard option (in /etc/fstab). This caused them to hit the issue virtually every time Longhorn rebuilt a replica. Most users will only hit the issue periodically, as many Linux distributions enable a weekly fstrim. For Ubuntu, it generally runs early on Monday mornings.

@ejweber
Contributor Author

ejweber commented Jan 23, 2024

I recall from a previous discussion with @shuo-wu that we return an I/O error when we receive an UNMAP while rebuilding as a precautionary measure. We probably don't need to (since the UNMAP only affects the volume head, which is RW for all replicas during a rebuild). Testing whether this is the case is probably the best path forward for this issue.

However, from reading some discussions online and the SCSI spec, it occurs to me that we can probably legally ignore UNMAP commands if we want to (instead of returning an error). Doing so may limit the effectiveness of our trim support (e.g. a file system may choose not to send an UNMAP again until reboot for a particular block if it has already done so), but it would be much better than causing VMs to pause.

Page 28 of Information technology - SCSI Block Commands – 4 (SBC-4)
https://standards.incits.org/apps/group_public/download.php/124286/livelink

4.7.3.4.2 Processing unmap requests

Application clients use unmap commands (see 4.2.2) to request that LBAs be unmapped. For each LBA that
is requested to be unmapped, the device server shall:
    
    a) perform an unmap operation (see 4.7.3.4.3) on the LBA; or
    b) make no change to the logical block provisioning state of the LBA.

Page 232 of SCSI Commands Reference Manual
https://www.seagate.com/files/staticfiles/support/docs/manual/Interface%20manuals/100293068j.pdf

3.54.2 UNMAP parameter list

The UNMAP parameter list (see table 205) contains the data sent by an application client along with an UNMAP command.
Included in the data are an UNMAP parameter list header and block descriptors for LBA extents to be processed by the device
server for the UNMAP command. The LBAs specified in the block descriptors may contain overlapping extents, and may be in
any order.

For each specified LBA:

    a) a mapped LBA should be unmapped (see SBC-4), or may remain mapped; and
    b) an unmapped LBA shall remain unmapped.
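
To make this concrete, below is a minimal, hypothetical Go sketch (not the actual longhorn-engine code; the controller and backend names are invented for illustration) of what "legally ignoring" an UNMAP could look like: while a rebuild is in progress, the request is acknowledged as successful without changing any logical block, which is one of the two responses the SBC-4 excerpt above permits.

package main

import (
	"fmt"
	"sync"
)

// Hypothetical types for illustration only; they do not match the real
// longhorn-engine controller.
type unmapper interface {
	UnmapAt(length uint32, off int64) (int, error)
}

type controller struct {
	mu         sync.RWMutex
	rebuilding bool
	replicas   unmapper
}

// UnmapAt acknowledges an UNMAP without acting on it while a rebuild is in
// progress (SBC-4 4.7.3.4.2 allows "make no change to the logical block
// provisioning state of the LBA"); otherwise it forwards the request.
func (c *controller) UnmapAt(length uint32, off int64) (int, error) {
	c.mu.RLock()
	rebuilding := c.rebuilding
	c.mu.RUnlock()

	if rebuilding {
		// Report success without punching any holes instead of failing the
		// command with the I/O error that currently pauses Harvester VMs.
		return int(length), nil
	}
	return c.replicas.UnmapAt(length, off)
}

func main() {
	c := &controller{rebuilding: true}
	n, err := c.UnmapAt(4096, 0)
	fmt.Println(n, err) // 4096 <nil>: acknowledged, but nothing was unmapped
}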

@shuo-wu
Contributor

shuo-wu commented Jan 24, 2024

We probably don't need to (since the UNMAP only affects the volume head, which is RW for all replicas during a rebuild). Testing whether this is the case is probably the best path forward for this issue.

Agreed. Based on the description of the SCSI commands, it seems that Longhorn should do nothing rather than return an error when the unmap command cannot be processed.

@ejweber
Contributor Author

ejweber commented Jan 30, 2024

Testing with not returning an error first.

This minor change in longhorn-engine is promising.
longhorn/longhorn-engine@master...ejweber:7103-return-success-when-cannot-unmap
fstrim while rebuilding does not actually unmap blocks and returns with no error. There is also no error in instance-manager or in the kernel ring buffer. After rebuilding is complete, ext4 remembers which blocks it already trimmed and refuses to try to trim them again, so no space can be reclaimed.

# During rebuilding.
root@test-pod:/# fstrim -va
/data: 4.7 GiB (5093191680 bytes) trimmed on /dev/longhorn/pvc-bc6cc450-72ad-46b2-b2c2-d7fb5d0733ae

# After rebuilding.
root@test-pod:/# fstrim -va
/data: 0 B (0 bytes) trimmed on /dev/longhorn/pvc-bc6cc450-72ad-46b2-b2c2-d7fb5d0733ae

However, after remounting the file system, fstrim issues unmaps again, and space is freed.

eweber@laptop:~/issue-files/7103> kubectl delete pod test-pod
pod "test-pod" deleted

eweber@laptop:~/issue-files/7103> k apply -f test.yaml 
persistentvolumeclaim/test-pvc unchanged
pod/test-pod created

eweber@laptop:~/issue-files/7103> k exec -it test-pod -- bash

root@test-pod:/# fstrim -va
/data: 4.7 GiB (5093191680 bytes) trimmed on /dev/longhorn/pvc-bc6cc450-72ad-46b2-b2c2-d7fb5d0733ae

# Actual size reduces from 9.xx GiB to 5.22 GiB.

@shuo-wu
Contributor

shuo-wu commented Jan 31, 2024

There is also no error in instance-manager or in the kernel ring buffer. After rebuilding is complete, ext4 remembers which blocks it already trimmed and refuses to try to trim them again, so no space can be reclaimed.

IIRC, we encountered this issue before but I am not sure if that's the reason why we return an error when the trim cannot be handled.

@ejweber
Contributor Author

ejweber commented Feb 5, 2024

@shuo-wu, can you help to clarify the following point?

I was working off the assumption that UNMAP only affected the volume head. However, after digging deeper, I now understand that it affects the volume head AND all removed snapshots.

If I understand correctly, this means allowing an UNMAP during a rebuild would be dangerous.

  • First the volume-head is synchronized. All replicas (including the rebuilding one) have the same contents in volume-head, so it is safe to punch holes in volume-head while a rebuild is ongoing.
  • Then, all other differencing disks are synchronized. Some of these differencing disks may be removed snapshots. RW replicas will have holes punched in all removed disks.
    • Maybe holes will be punched at the same time a replica is sending one of these removed disks.
    • Maybe holes will be punched after one of these removed disks is synchronized (in which case I THINK the holes won't be punched in the rebuilding replica's copy?). See the sketch below.
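
To illustrate the risk in the last two bullets, here is a deliberately simplified, hypothetical Go sketch (the function, parameter names, and chain layout are invented and are not the actual replica code) of hole-punching applied across a disk chain. The key detail is the "file does not exist" branch: a rebuilding replica that has not yet received a removed snapshot silently skips the punch, so its copy of that snapshot ends up larger than the copies on the healthy replicas.

package sketch

import (
	"fmt"
	"os"

	"golang.org/x/sys/unix"
)

// punchHoleInChain punches a hole at [off, off+length) in the volume head and
// in every disk in the chain that is marked removed. Disks that are missing
// locally (e.g. not yet synchronized to a rebuilding replica) are skipped,
// which is how replicas can diverge if UNMAP is allowed during a rebuild.
func punchHoleInChain(diskPaths []string, removed map[string]bool, off, length int64) error {
	for i, path := range diskPaths {
		isHead := i == 0 // assume index 0 is the volume head
		if !isHead && !removed[path] {
			continue // live snapshots are never trimmed
		}
		f, err := os.OpenFile(path, os.O_RDWR, 0)
		if os.IsNotExist(err) {
			continue // a rebuilding replica may not have this removed disk yet
		}
		if err != nil {
			return err
		}
		err = unix.Fallocate(int(f.Fd()),
			unix.FALLOC_FL_PUNCH_HOLE|unix.FALLOC_FL_KEEP_SIZE, off, length)
		f.Close()
		if err != nil {
			return fmt.Errorf("punch hole in %s: %w", path, err)
		}
	}
	return nil
}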

So I am thinking we were correct to disallow UNMAP during rebuild in our initial implementation, and we should not attempt to change that behavior now.

I'm fairly new to working with this aspect of Longhorn. Is there something I'm missing?

@ejweber
Contributor Author

ejweber commented Feb 5, 2024

I tried the change from #7103 (comment) with a longhorn-engine build that allows UNMAP during rebuilding in two scenarios (basic steps in #7103 (comment)):

Scenario 1 - remove-snapshots-during-filesystem-trim == false

Nothing particularly interesting happened. The snapshot just behind volume-head is NOT marked as removed during the rebuild, so UNMAP only affects volume-head. In my case, volume-head was empty (because I was not writing during the rebuild), so no blocks were unmapped.

Scenario 2 - remove-snapshots-during-filesystem-trim == true

When I issued the fstrim command, the snapshot behind volume head was marked as removed. The engine's instance manager logged the following, because the rebuilding replica did not yet have the removed snapshot to trim from:

[instance-manager-082ed51c98f6227cd7ad985a3a912171] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-r-4ccb4214] time="2024-02-05T16:03:43Z" level=info msg="Running ssync server for file volume-snap-13d2d691-14ef-4956-b944-fcb663380592.img at port 10004" func="rpc.(*SyncAgentServer).launchReceiver.func1" file="server.go:407"
[instance-manager-082ed51c98f6227cd7ad985a3a912171] time="2024-02-05T16:03:43Z" level=info msg="Creating Ssync service" func=rest.Server file="server.go:79"
[instance-manager-082ed51c98f6227cd7ad985a3a912171] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-r-4ccb4214] time="2024-02-05T16:03:43Z" level=info msg="file volume-snap-13d2d691-14ef-4956-b944-fcb663380592.img does not exist" func=rest.Server file="server.go:92"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-r-094ccdf7] time="2024-02-05T16:03:43Z" level=info msg="Syncing file volume-snap-13d2d691-14ef-4956-b944-fcb663380592.img to 10.42.1.220:10004" func="rpc.(*SyncAgentServer).FileSend" file="server.go:343"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] time="2024-02-05T16:03:43Z" level=info msg="Syncing file volume-snap-13d2d691-14ef-4956-b944-fcb663380592.img to 10.42.1.220:10004: size 10737418240, directIO true, fastSync true" func=sparse.SyncFile file="client.go:110"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] time="2024-02-05T16:03:43Z" level=warning msg="Failed to get change time and checksum of local file volume-snap-13d2d691-14ef-4956-b944-fcb663380592.img" func=sparse.SyncContent file="client.go:149" error="failed to open checksum file: open volume-snap-13d2d691-14ef-4956-b944-fcb663380592.img.checksum: no such file or directory"
[instance-manager-082ed51c98f6227cd7ad985a3a912171] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-r-4ccb4214] time="2024-02-05T16:03:43Z" level=info msg="Receiving volume-snap-13d2d691-14ef-4956-b944-fcb663380592.img: size 10737418240, directIO true" func="rest.(*SyncServer).doOpen" file="handlers.go:103"
[instance-manager-082ed51c98f6227cd7ad985a3a912171] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-r-4ccb4214] time="2024-02-05T16:03:43Z" level=info msg="Ssync server is opened and ready for receiving file volume-snap-13d2d691-14ef-4956-b944-fcb663380592.img" func="rest.(*SyncServer).open" file="handlers.go:84"
[instance-manager-082ed51c98f6227cd7ad985a3a912171] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-r-4ccb4214] time="2024-02-05T16:04:12Z" level=info msg="Reloading the revision counter before processing the first write, the current revision cache is 0, the latest revision counter in file is 0" func="replica.(*Replica).increaseRevisionCounter" file="revision_counter.go:133"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:12Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 4096" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:12Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 117440512" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:12Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 125829120" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 125829120" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 125829120" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 125829120" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 125829120" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-e-0] time="2024-02-05T16:04:13Z" level=warning msg="One of the MultiUnmapper get different size 0 from others 134217728" func="controller.(*MultiUnmapperAt).UnmapAt.func1" file="multi_unmapper_at.go:60"
[instance-manager-082ed51c98f6227cd7ad985a3a912171] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-r-4ccb4214] time="2024-02-05T16:05:42Z" level=info msg="Closing Ssync server" func="rest.(*SyncServer).close" file="handlers.go:165"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-r-094ccdf7] time="2024-02-05T16:05:42Z" level=info msg="Finished sync for the source file volume-snap-13d2d691-14ef-4956-b944-fcb663380592.img, size 10737418240, elapsed 119.35s" func=sparse.SyncContent file="client.go:171"
[instance-manager-082ed51c98f6227cd7ad985a3a912171] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-r-4ccb4214] time="2024-02-05T16:05:42Z" level=info msg="Done running ssync server for file volume-snap-13d2d691-14ef-4956-b944-fcb663380592.img at port 10004" func="rpc.(*SyncAgentServer).launchReceiver.func1" file="server.go:412"
[instance-manager-082ed51c98f6227cd7ad985a3a912171] time="2024-02-05T16:05:42Z" level=info msg="Running ssync server for file volume-snap-13d2d691-14ef-4956-b944-fcb663380592.img.meta at port 10005" func="rpc.(*SyncAgentServer).launchReceiver.func1" file="server.go:407"
[instance-manager-413dddc5cbb0881de748c6d4eb6b5553] [pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-r-094ccdf7] time="2024-02-05T16:05:42Z" level=info msg="Done syncing file volume-snap-13d2d691-14ef-4956-b944-fcb663380592.img to 10.42.1.220:10004" func="rpc.(*SyncAgentServer).FileSend" file="server.go:347"

Once the rebuild was complete, I checked the replica files on disk for a mismatch.

eweber@laptop:~/liblonghorn> kl get replica -oyaml | grep -ie creationTimestamp -e directory -e " name:"
    creationTimestamp: "2024-02-05T15:42:47Z"
    name: pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-r-094ccdf7
      name: pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e
    dataDirectoryName: pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-6bb570a4
    creationTimestamp: "2024-02-05T15:42:47Z"
    name: pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-r-12f62800
      name: pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e
    dataDirectoryName: pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-ab058c6b
    creationTimestamp: "2024-02-05T16:03:31Z"
    name: pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-r-4ccb4214
      name: pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e
    dataDirectoryName: pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-cfcd557c

eweber@laptop:~/liblonghorn> kl exec -it instance-manager-082ed51c98f6227cd7ad985a3a912171 -- bash -c "du -h /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-*/*"
4.0K    /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-cfcd557c/revision.counter
0       /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-cfcd557c/volume-head-001.img
4.0K    /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-cfcd557c/volume-head-001.img.meta
9.5G    /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-cfcd557c/volume-snap-7ca3f397-904c-4925-a30c-291a324e38a5.img
4.0K    /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-cfcd557c/volume-snap-7ca3f397-904c-4925-a30c-291a324e38a5.img.meta
4.0K    /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-cfcd557c/volume.meta
eweber@laptop:~/liblonghorn> kl exec -it instance-manager-413dddc5cbb0881de748c6d4eb6b5553 -- bash -c "du -h /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-*/*"
4.0K    /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-6bb570a4/revision.counter
0       /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-6bb570a4/volume-head-003.img
4.0K    /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-6bb570a4/volume-head-003.img.meta
5.3G    /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-6bb570a4/volume-snap-7ca3f397-904c-4925-a30c-291a324e38a5.img
4.0K    /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-6bb570a4/volume-snap-7ca3f397-904c-4925-a30c-291a324e38a5.img.meta
4.0K    /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-6bb570a4/volume.meta
eweber@laptop:~/liblonghorn> kl exec -it instance-manager-699da83c0e9d22726e667344227e096b -- bash -c "du -h /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-*/*"
4.0K    /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-ab058c6b/revision.counter
0       /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-ab058c6b/volume-head-003.img
4.0K    /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-ab058c6b/volume-head-003.img.meta
5.3G    /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-ab058c6b/volume-snap-7ca3f397-904c-4925-a30c-291a324e38a5.img
4.0K    /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-ab058c6b/volume-snap-7ca3f397-904c-4925-a30c-291a324e38a5.img.meta
4.0K    /host/var/lib/longhorn/replicas/pvc-bae950ef-b641-4fcc-82c2-243348ff0b2e-ab058c6b/volume.meta

The snapshot file for the newly rebuilt replica uses much more disk space than the others (which were successfully trimmed). As far as I can tell so far, this is not actually problematic from a data integrity perspective, but I still don't think we should allow replicas to become out-of-sync like this.
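
For reference, the du numbers above reflect allocated blocks rather than apparent file size. A small sketch like the one below (a hypothetical helper, not part of Longhorn) shows how the same divergence could be checked programmatically: the trimmed copies and the rebuilt copy report the same apparent size for the snapshot file, but very different allocated sizes.

package main

import (
	"fmt"
	"os"
	"syscall"
)

// allocatedBytes reports how much disk space a (possibly sparse) file actually
// occupies, which is what du shows: st_blocks is counted in 512-byte units
// regardless of the filesystem block size.
func allocatedBytes(path string) (apparent, allocated int64, err error) {
	info, err := os.Stat(path)
	if err != nil {
		return 0, 0, err
	}
	st, ok := info.Sys().(*syscall.Stat_t)
	if !ok {
		return 0, 0, fmt.Errorf("no raw stat available for %s", path)
	}
	return info.Size(), st.Blocks * 512, nil
}

func main() {
	// Pass one replica's copy of the snapshot file as the first argument.
	apparent, allocated, err := allocatedBytes(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("apparent=%d bytes, allocated=%d bytes\n", apparent, allocated)
}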

@shuo-wu
Contributor

shuo-wu commented Feb 6, 2024

Nice catch! Setting remove-snapshots-during-filesystem-trim does make allowing an UNMAP during a rebuild a dangerous operation. And I agree: let's keep our initial implementation.

@ejweber
Contributor Author

ejweber commented Feb 6, 2024

Given this discussion, I think there are now two goals for this issue:

  • Modify the longhorn-engine code so that we do NOT return an error when an UNMAP is issued during a rebuild, but we DO refuse to actually perform the UNMAP. This is legal according to the SCSI standard and will prevent really annoying issues in Harvester.
  • Update our documentation to make the effects of this change clear. There is already a mention that the user may need to remount the filesystem to make a trim effective in some situations. This is another situation in which that might be the case.

@longhorn-io-github-bot

longhorn-io-github-bot commented Feb 6, 2024

Pre Ready-For-Testing Checklist

@innobead innobead added the area/stability System or volume stability label Feb 21, 2024
@yangchiu
Member

yangchiu commented Mar 1, 2024

Verified passed on master-head (longhorn-engine 5bd31c0) following the test steps.
