BTRFS on SMR HDD switches to Read-only mode #415

diawayeu · 2021-10-29T13:32:03Z

We tested host managed SMR drives using the FIO utility. When testing 2 drives at the same time, the file system switches to read-only mode.
Please note that we could reproduce this error ONLY with a minimum of two drives. For one HDD only we could not reproduce it.

We used the script:

bs=64m
jobs=16
rw="read"
size=50G
mkdir -p /root/fio_infinite_testing_BTRFS
for disk in {a..b}; do
    dd if=/dev/zero of=/dev/sd${disk} bs=1 count=1000000
    mkfs.btrfs /dev/sd${disk} -d single -m single
    mkdir -p /mnt/sd${disk}
    mount -t btrfs /dev/sd${disk} /mnt/sd${disk}
done

for disk in {a..b}; do
    echo 3 >/proc/sys/vm/drop_caches
    fio --directory /mnt/sd${disk} --name fio.md_${disk}.${size}.test.file --rw ${rw} -bs ${bs} --size ${size} --numjobs ${jobs} --time_based --ramp_time 5 --runtime 480 |& tee /root/fio_infinite_testing_BTRFS/sd${disk}_${size}_${bs}_${rw}_${jobs}_${dir}.fio.log &
    echo 3 >/proc/sys/vm/drop_caches
done
wait

mount | grep /dev/sd

/dev/sda on /mnt/sda type btrfs (ro,relatime,seclabel,nospace_cache,subvolid=5,subvol=/)
/dev/sdb on /mnt/sdb type btrfs (rw,relatime,seclabel,nospace_cache,subvolid=5,subvol=/)
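
With more drives it becomes tedious to scan the mount list by eye. The following is a minimal sketch, not part of the original report, that filters `mount` output for btrfs filesystems whose options include `ro`. The function name `ro_btrfs_mounts` is hypothetical, and the parsing assumes device and mount point paths contain no spaces.

```sh
# List btrfs mounts whose option string contains "ro".
# Reads lines in the format printed by `mount` from stdin:
#   <device> on <mountpoint> type <fstype> (<opt>,<opt>,...)
ro_btrfs_mounts() {
    awk '$2 == "on" && $4 == "type" && $5 == "btrfs" {
        opts = $6
        gsub(/[()]/, "", opts)            # strip the surrounding parentheses
        n = split(opts, o, ",")
        for (i = 1; i <= n; i++)
            if (o[i] == "ro") { print $1, $3; break }
    }'
}
```

It would be used as `mount | ro_btrfs_mounts`, which prints one "device mountpoint" pair per read-only btrfs mount.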

FIO output:
fio: pid=0, err=30/file:filesetup.c:150, func=unlink, error=Read-only file system

uname -a
Linux fs 5.14.10-300.fc35.x86_64 #1 SMP Thu Oct 7 20:48:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

btrfs --version
btrfs-progs v5.14.1

btrfs fi show

Label: none  uuid: 132f3ec1-5ff3-4c7c-bf50-da53378c8bde
        Total devices 1 FS bytes used 650.76GiB
        devid    1 size 18.19TiB used 652.50GiB path /dev/sda

Label: none  uuid: aca81f46-0c4c-4a1f-a47e-5058953ac790
        Total devices 1 FS bytes used 800.94GiB
        devid    1 size 18.19TiB used 802.25GiB path /dev/sdb

btrfs fi df /mnt/sda

Data, single: total=651.00GiB, used=650.00GiB
System, single: total=256.00MiB, used=288.00KiB
Metadata, single: total=1.25GiB, used=774.00MiB
GlobalReserve, single: total=512.00MiB, used=0.00B

btrfs fi df /mnt/sdb

Data, single: total=800.75GiB, used=800.00GiB
System, single: total=256.00MiB, used=352.00KiB
Metadata, single: total=1.25GiB, used=962.27MiB
GlobalReserve, single: total=512.00MiB, used=0.00B

Attached is the output of dmesg, smartctl and fio.

Tests were conducted on Fedora Release 35 and Fedora Release 34.

Please advise.

sda_50G_64m_read_16_.fio.log
smartctl_sdb.log
smartctl_sda.log
dmesg.log
sdb_50G_64m_read_16_.fio.log


kdave · 2021-10-29T13:39:06Z

[  244.185084] BTRFS: device fsid 132f3ec1-5ff3-4c7c-bf50-da53378c8bde devid 1 transid 5 /dev/sda scanned by mkfs.btrfs (1624)
[  244.235727] BTRFS info (device sda): flagging fs with big metadata feature
[  244.235732] BTRFS info (device sda): has skinny extents
[  246.302680] BTRFS info (device sda): host-managed zoned block device /dev/sda, 74508 zones of 268435456 bytes
[  246.302695] BTRFS info (device sda): zoned mode enabled with zone size 268435456
[  246.303418] BTRFS info (device sda): checking UUID tree
[  255.290854] BTRFS: device fsid aca81f46-0c4c-4a1f-a47e-5058953ac790 devid 1 transid 5 /dev/sdb scanned by mkfs.btrfs (1654)
[  255.368241] BTRFS info (device sdb): flagging fs with big metadata feature
[  255.368247] BTRFS info (device sdb): has skinny extents
[  257.515744] BTRFS info (device sdb): host-managed zoned block device /dev/sdb, 74508 zones of 268435456 bytes
[  257.515761] BTRFS info (device sdb): zoned mode enabled with zone size 268435456
[  257.516457] BTRFS info (device sdb): checking UUID tree
[  257.577860] test_one_v3.sh (1621): drop_caches: 3
[  257.585697] test_one_v3.sh (1621): drop_caches: 3
[  257.591546] test_one_v3.sh (1621): drop_caches: 3
[  257.596444] test_one_v3.sh (1621): drop_caches: 3
[ 3310.967991] BTRFS: error (device sda) in btrfs_commit_transaction:2341: errno=-11 unknown (Error while writing out transaction)
[ 3310.968060] BTRFS info (device sda): forced readonly
[ 3310.968064] BTRFS warning (device sda): Skipping commit of aborted transaction.
[ 3310.968065] ------------[ cut here ]------------
[ 3310.968066] BTRFS: Transaction aborted (error -11)
[ 3310.968074] WARNING: CPU: 14 PID: 1684 at fs/btrfs/transaction.c:1946 btrfs_commit_transaction.cold+0x209/0x2c8
[ 3310.968083] Modules linked in: nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink qrtr ns sunrpc vfat fat ipmi_ssif intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm irqbypass rapl wmi_bmof i2c_piix4 k10temp acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_cpufreq fuse zram ip_tables xfs raid1 crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ast i2c_algo_bit drm_vram_helper drm_kms_helper cec drm_ttm_helper ttm nvme drm ccp tg3 sp5100_tco nvme_core wmi
[ 3310.968131] CPU: 14 PID: 1684 Comm: fio Not tainted 5.14.10-300.fc35.x86_64 #1
[ 3310.968135] Hardware name: DIAWAY Tartu/Tartu, BIOS V2.01.B10 04/08/2021
[ 3310.968137] RIP: 0010:btrfs_commit_transaction.cold+0x209/0x2c8
[ 3310.968141] Code: e9 f3 b4 8d ff 49 8b 54 24 28 49 8b 44 24 30 48 89 42 08 48 89 10 e9 2d ff ff ff 44 89 f6 48 c7 c7 48 65 60 84 e8 fb 89 fe ff <0f> 0b e9 5e fe ff ff 48 8b 7d 50 44 89 f2 48 c7 c6 78 65 60 84 e8
[ 3310.968144] RSP: 0018:ffffb284ce393e10 EFLAGS: 00010282
[ 3310.968147] RAX: 0000000000000026 RBX: ffff973f147b0f60 RCX: 0000000000000027
[ 3310.968149] RDX: ffff974ecf098a08 RSI: 0000000000000001 RDI: ffff974ecf098a00
[ 3310.968150] RBP: ffff973f147b0f08 R08: 0000000000000000 R09: ffffb284ce393c48
[ 3310.968151] R10: ffffb284ce393c40 R11: ffffffff84f47468 R12: ffff973f101bfc00
[ 3310.968153] R13: ffff971f20cf2000 R14: 00000000fffffff5 R15: ffff973f147b0e58
[ 3310.968154] FS:  00007efe65468740(0000) GS:ffff974ecf080000(0000) knlGS:0000000000000000
[ 3310.968157] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3310.968158] CR2: 000055691bcbe260 CR3: 000000105cfa4001 CR4: 0000000000770ee0
[ 3310.968160] PKRU: 55555554
[ 3310.968161] Call Trace:
[ 3310.968167]  ? dput+0xd4/0x300
[ 3310.968174]  btrfs_sync_file+0x3f1/0x490
[ 3310.968180]  __x64_sys_fsync+0x33/0x60
[ 3310.968185]  do_syscall_64+0x3b/0x90
[ 3310.968190]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 3310.968194] RIP: 0033:0x7efe6557329b
[ 3310.968198] Code: 4a 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 73 1e f8 ff 8b 7c 24 0c 41 89 c0 b8 4a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 89 44 24 0c e8 c1 1e f8 ff 8b 44
[ 3310.968200] RSP: 002b:00007ffe0236ebc0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
[ 3310.968203] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007efe6557329b
[ 3310.968204] RDX: 0000000000000000 RSI: 00007efe58d77010 RDI: 0000000000000006
[ 3310.968205] RBP: 0000000004000000 R08: 0000000000000000 R09: 00007efe58d77010
[ 3310.968207] R10: 0000000016cacc0c R11: 0000000000000293 R12: 00007efe5ce95980
[ 3310.968208] R13: 0000000000000000 R14: 00007efe6447c790 R15: 0000000c80000000
[ 3310.968212] ---[ end trace 1a346f4d3c0d96ba ]---
[ 3310.968214] BTRFS: error (device sda) in cleanup_transaction:1946: errno=-11 unknown

kdave · 2021-10-29T13:40:54Z

CC @morbidrsa @naota
Error -11 is EAGAIN.
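
Because the kernel prints these codes as "errno=-N unknown", a small decoder can make a long dmesg easier to triage. This is only a sketch: the function names `errno_name` and `btrfs_errors` are hypothetical, and the table covers just the Linux errno values that appear in this thread (11, 22 and 30).

```sh
# Map the errno values seen in this thread to their Linux names.
errno_name() {
    case "${1#-}" in
        11) echo EAGAIN ;;
        22) echo EINVAL ;;
        30) echo EROFS ;;
        *)  echo "errno ${1#-}" ;;
    esac
}

# Read kernel log text on stdin and print "<device> <function> <name>" for
# every "BTRFS: error (device X) in func:line: errno=-N" line.
btrfs_errors() {
    sed -n 's/.*BTRFS: error (device \([^)]*\)) in \([A-Za-z0-9_]*\):[0-9]*: errno=\(-[0-9]*\).*/\1 \2 \3/p' |
    while read -r dev func code; do
        echo "$dev $func $(errno_name "$code")"
    done
}
```

For example, `dmesg | btrfs_errors` would condense the trace above to one line per abort.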

I wonder why "fi df" does not show the "zone_unusable" stats, it should be there in version 5.14.1:

Data, single: total=800.75GiB, used=800.00GiB
System, single: total=256.00MiB, used=352.00KiB
Metadata, single: total=1.25GiB, used=962.27MiB
GlobalReserve, single: total=512.00MiB, used=0.00B
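
A quick way to check for the missing field across several mounts is a plain string search. The sketch below only looks for the literal text "zone_unusable" that is named above and makes no assumption about the exact layout of the field. The function name `has_zone_unusable` is hypothetical.

```sh
# Succeeds if the `btrfs filesystem df` text on stdin mentions zone_unusable.
has_zone_unusable() {
    grep -q 'zone_unusable'
}
```

Usage would look like `btrfs filesystem df /mnt/sdb | has_zone_unusable || echo "no zone_unusable stats"`.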

diawayeu · 2021-10-29T13:53:39Z

List of installed packages:

# rpm -qa "*btr*|*kernel*"
kernel-core-5.14.10-300.fc35.x86_64
kernel-modules-5.14.10-300.fc35.x86_64
libreport-plugin-kerneloops-2.15.2-6.fc35.x86_64
abrt-addon-kerneloops-2.14.6-9.fc35.x86_64
kernel-5.14.10-300.fc35.x86_64
btrfs-progs-5.14.1-1.fc35.x86_64

kdave · 2021-10-29T15:03:15Z

Does mkfs properly detect the device as zoned? It's in the summary.
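
Independently of the mkfs summary, the kernel's own view of the device can be read from sysfs. The sketch below assumes the standard `queue/zoned` block attribute, which reports `none`, `host-aware` or `host-managed`. The function name `zoned_model` and its optional second argument (an alternative sysfs root, useful for testing) are hypothetical.

```sh
# Print the zoned model the kernel reports for a block device.
# $1: device name such as "sda"
# $2: sysfs root (defaults to /sys; overridable for testing)
zoned_model() {
    zm_attr="${2:-/sys}/block/$1/queue/zoned"
    if [ -r "$zm_attr" ]; then
        cat "$zm_attr"
    else
        echo "unknown"
        return 1
    fi
}
```

For the drives in this report, `zoned_model sda` would be expected to print `host-managed`, matching the dmesg lines above.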

diawayeu · 2021-11-01T07:25:36Z

The output of the mkfs.btrfs command:

mkfs.btrfs /dev/sda -d single -m single
btrfs-progs v5.14.1
See http://btrfs.wiki.kernel.org for more information.

Zoned: /dev/sda: host-managed device detected, setting zoned feature
Resetting device zones /dev/sda (74508 zones) ...
Label:              (null)
UUID:               8dcf01fd-f25b-4cbb-970c-54366cd8fbbd
Node size:          16384
Sector size:        4096
Filesystem size:    18.19TiB
Block group profiles:
  Data:             single          256.00MiB
  Metadata:         single          256.00MiB
  System:           single          256.00MiB
SSD detected:       no
Zoned device:       yes
  Zone size:        256.00MiB
Incompat features:  extref, skinny-metadata, zoned
Runtime features:
Checksum:           crc32c
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1    18.19TiB  /dev/sda

mkfs.btrfs /dev/sdb -d single -m single
btrfs-progs v5.14.1
See http://btrfs.wiki.kernel.org for more information.

Zoned: /dev/sdb: host-managed device detected, setting zoned feature
Resetting device zones /dev/sdb (74508 zones) ...
Label:              (null)
UUID:               712aeea5-6306-4725-b3e8-17afe6449269
Node size:          16384
Sector size:        4096
Filesystem size:    18.19TiB
Block group profiles:
  Data:             single          256.00MiB
  Metadata:         single          256.00MiB
  System:           single          256.00MiB
SSD detected:       no
Zoned device:       yes
  Zone size:        256.00MiB
Incompat features:  extref, skinny-metadata, zoned
Runtime features:
Checksum:           crc32c
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1    18.19TiB  /dev/sdb
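
As a sanity check, the zone count and zone size reported above can be multiplied to confirm the filesystem size. This is a small arithmetic sketch; the function name `zones_to_tib` is hypothetical.

```sh
# Convert a zone count and a zone size in bytes to TiB with two decimals.
zones_to_tib() {
    awk -v n="$1" -v sz="$2" 'BEGIN { printf "%.2f\n", n * sz / (1024 ^ 4) }'
}

# 74508 zones of 268435456 bytes (256 MiB), as reported by mkfs and dmesg.
zones_to_tib 74508 268435456   # prints 18.19
```

The result agrees with the 18.19TiB shown in the mkfs summary, so the whole device is covered by zones of the reported size.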

morbidrsa · 2021-11-02T07:38:00Z

Let me try to reproduce it.

morbidrsa · 2021-11-02T11:10:25Z

I haven't managed to reproduce the problem on v5.15 (yet), but from code inspection this:

[ 3310.967991] BTRFS: error (device sda) in btrfs_commit_transaction:2341: errno=-11 unknown (Error while writing out transaction)

I think is coming from this call chain:

btrfs_sync_file()
`-> btrfs_commit_transaction()
      `-> btrfs_write_and_wait_transaction()
            `-> btrfs_write_marked_extents()

@naota in btrfs_sync_log() we check for -EAGAIN on btrfs_write_marked_extents() and set the return to 0 if we encounter it. The comment above says that this is happening because the concurrent transaction created a hole and we're hoping the hole gets filled in this transaction so we're not falling back to a full commit here.

        /* we start IO on  all the marked extents here, but we don't actually
         * wait for them until later.
         */
        blk_start_plug(&plug);
        ret = btrfs_write_marked_extents(fs_info, &log->dirty_log_pages, mark);
        /*
         * -EAGAIN happens when someone, e.g., a concurrent transaction
         *  commit, writes a dirty extent in this tree-log commit. This
         *  concurrent write will create a hole writing out the extents,
         *  and we cannot proceed on a zoned filesystem, requiring
         *  sequential writing. While we can bail out to a full commit
         *  here, but we can continue hoping the concurrent writing fills
         *  the hole.
         */
        if (ret == -EAGAIN && btrfs_is_zoned(fs_info))
                ret = 0;

Could it be that this comment does not hold true and we're not filling the hole created?

morbidrsa · 2021-11-02T11:34:35Z

I haven't managed to reproduce the problem on v5.15

OK managed to reproduce the issue on v5.14 (Linus' tree so 5.14.0). @diawayeu can you check David's misc-next branch or Linus v5.15 on your setup and see if the problem is gone there as well?

For reference here's the reproducer I've used:

#!/bin/bash
# bash is required: "|&" (pipe both stdout and stderr) is not POSIX sh.

# Create a fresh filesystem with single profiles on each zoned drive.
for d in sda sdb; do
        mkfs.btrfs -d single -m single -f /dev/${d}
done

mount /dev/sda /mnt/test
mount /dev/sdb /mnt/scratch

# Run the fio read workload on each mount, dropping caches before and after.
for dir in test scratch; do
        echo 3 >/proc/sys/vm/drop_caches
        fio --directory=/mnt/${dir} --name=fio.${dir} --rw=read --size=50G --bs=64m \
                --numjobs=$(nproc) --time_based --ramp_time=5 --runtime=480 \
                --group_reporting |& tee /dev/shm/fio.${dir}
        echo 3 >/proc/sys/vm/drop_caches
done

for d in sda sdb; do
        umount /dev/${d}
done

diawayeu · 2021-11-04T11:35:01Z

We have updated the kernel to version 5.15. It did not help.

# uname -a
Linux fs 5.15.0-60.fc36.x86_64 #1 SMP Mon Nov 1 15:11:25 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

attached dmesg.log file
dmesg.log

morbidrsa · 2021-11-04T11:44:16Z

OK thanks for testing, I'll continue investigating.

morbidrsa · 2021-11-08T15:02:57Z

Just a quick update, I've managed to reproduce it once as well on Linus' master branch. The bug is still there but maybe harder to hit.

kdave · 2021-11-08T15:36:59Z

I'll keep the issue open until it's fixed though it's for kernel and not for progs.

diawayeu · 2021-11-08T17:26:16Z

@morbidrsa Please let us know if you need our remote lab to reproduce/investigate the issue and we'll provide the access credentials shortly.
Thank you.

morbidrsa · 2021-11-09T08:46:35Z

@diawayeu thanks for the offer, but I don't think it is necessary. I am able to reproduce the issue although it takes a while.

naota · 2021-11-15T05:03:15Z

I could also reproduce your issue. I investigated the problem and found that it happens when the log tree's level becomes >= 2.
There is a bug when freeing the log tree: intermediate nodes are not properly re-dirtied.
The patch below fixed the issue for me. Could you please try it and check whether it also fixes your problem?

Why this happens only when running 2 devices is still under investigation. It seems a bit strange to me that the tree becomes so large (level = 2). There might be something else going on here.

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 17b8b3ebcd9f..13d2104e61c3 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -2886,6 +2886,8 @@ static noinline int walk_up_log_tree(struct btrfs_trans_handle *trans,
 						     path->nodes[*level]->len);
 					if (ret)
 						return ret;
+					btrfs_redirty_list_add(trans->transaction,
+							       next);
 				} else {
 					if (test_and_clear_bit(EXTENT_BUFFER_DIRTY, &next->bflags))
 						clear_extent_buffer_dirty(next);

diawayeu · 2021-11-17T08:00:48Z

We have applied your patch and the bug no longer reproduces with 2 drives.
We used this script for 8 drives.

mkdir -p /root/fio_infinite_testing_BTRFS
for disk in {a..h}; do
    dd if=/dev/zero of=/dev/sd${disk} bs=1 count=1000000
    mkfs.btrfs /dev/sd${disk} -d single -m single
    mkdir -p /mnt/sd${disk}
    mount -t btrfs /dev/sd${disk} /mnt/sd${disk}
done

for rw in read write randread randwrite; do
        for bs in 4k 64k 4m 64m; do
                for jobs in 1 2 8; do
                    for disk in {a..h}; do
                        echo 3 > /proc/sys/vm/drop_caches;
                        size=50G;
                        fio --directory /mnt/sd${disk} --name fio.md_${disk}.${size}.test.file --rw ${rw} -bs ${bs} --size ${size} --numjobs ${jobs} --time_based --ramp_time 5 --runtime 60 |& tee /root/fio_infinite_testing_BTRFS/sd${disk}_${size}_${bs}_${rw}_${jobs}_${dir}.fio.log &
                        echo 3 > /proc/sys/vm/drop_caches;
                    done
                    wait; #wait for the background job to finish.
                done
        done
done

The issue still reproduces (3 of 8 drives failed).

Drive /dev/sdc failed in loop step:
rw=write, bs=64m, jobs=1

Drive /dev/sde failed in loop step:
rw=randwrite, bs=4k, jobs=8

Drive /dev/sdh failed in loop step:
rw=write, bs=64k, jobs=8

# mount | grep /dev/sd
/dev/sda on /mnt/sda type btrfs (rw,relatime,seclabel,nospace_cache,subvolid=5,subvol=/)
/dev/sdb on /mnt/sdb type btrfs (rw,relatime,seclabel,nospace_cache,subvolid=5,subvol=/)
/dev/sdc on /mnt/sdc type btrfs (ro,relatime,seclabel,nospace_cache,subvolid=5,subvol=/)
/dev/sdd on /mnt/sdd type btrfs (rw,relatime,seclabel,nospace_cache,subvolid=5,subvol=/)
/dev/sde on /mnt/sde type btrfs (ro,relatime,seclabel,nospace_cache,subvolid=5,subvol=/)
/dev/sdf on /mnt/sdf type btrfs (rw,relatime,seclabel,nospace_cache,subvolid=5,subvol=/)
/dev/sdg on /mnt/sdg type btrfs (rw,relatime,seclabel,nospace_cache,subvolid=5,subvol=/)
/dev/sdh on /mnt/sdh type btrfs (ro,relatime,seclabel,nospace_cache,subvolid=5,subvol=/)

Let us know what else we can do.
Thanks!

dmesg.log
sdh_50G_64k_write_8_.fio.log
sde_50G_4k_randwrite_8_.fio.log
sdc_50G_64m_write_1_.fio.log

naota · 2021-11-17T11:27:54Z

Thank you for the testing.

At least there are no more -EAGAIN (-11) errors, but there is another error in the log.

$ grep errno dmesg.log
[16422.837968] BTRFS: error (device sdc) in __btrfs_cow_block:449: errno=-22 unknown
[17940.951232] BTRFS: error (device sde) in __btrfs_cow_block:449: errno=-22 unknown

Drive /dev/sdc failed in loop step: rw=write, bs=64m, jobs=1

sdc aborted here:

[16422.837898] Call Trace:
[16422.837904] btrfs_cow_block+0xf5/0x180
[16422.837909] do_relocation+0x48d/0x5b0
[16422.837915] ? build_backref_tree+0x2b8/0x3f0
[16422.837920] relocate_tree_blocks+0x2b3/0x650
[16422.837925] relocate_block_group+0x1ef/0x550
[16422.837928] btrfs_relocate_block_group+0x16f/0x330
[16422.837930] btrfs_relocate_chunk+0x27/0xe0
[16422.837935] btrfs_reclaim_bgs_work.cold+0x55/0xb9
[16422.837941] process_one_work+0x1ec/0x390
[16422.837947] worker_thread+0x53/0x3e0
[16422.837950] ? process_one_work+0x390/0x390
[16422.837953] kthread+0x127/0x150
[16422.837956] ? set_kthread_struct+0x40/0x40
[16422.837959] ret_from_fork+0x22/0x30
[16422.837965] ---[ end trace 20ff043c9e05dee6 ]---
[16422.837968] BTRFS: error (device sdc) in __btrfs_cow_block:449: errno=-22 unknown
[16422.838007] BTRFS info (device sdc): forced readonly

Drive /dev/sde failed in loop step: rw=randwrite, bs=4k, jobs=8

and sde aborted here:

[17940.951186] Call Trace:
[17940.951191] btrfs_cow_block+0xf5/0x180
[17940.951194] do_relocation+0x48d/0x5b0
[17940.951197] ? build_backref_tree+0x2b8/0x3f0
[17940.951200] relocate_tree_blocks+0x2b3/0x650
[17940.951204] relocate_block_group+0x1ef/0x550
[17940.951207] btrfs_relocate_block_group+0x16f/0x330
[17940.951209] btrfs_relocate_chunk+0x27/0xe0
[17940.951213] btrfs_reclaim_bgs_work.cold+0x55/0xb9
[17940.951216] process_one_work+0x1ec/0x390
[17940.951220] worker_thread+0x53/0x3e0
[17940.951222] ? process_one_work+0x390/0x390
[17940.951223] kthread+0x127/0x150
[17940.951225] ? set_kthread_struct+0x40/0x40
[17940.951227] ret_from_fork+0x22/0x30
[17940.951231] ---[ end trace 20ff043c9e05dee7 ]---
[17940.951232] BTRFS: error (device sde) in __btrfs_cow_block:449: errno=-22 unknown
[17940.951258] BTRFS info (device sde): forced readonly

This errno=-22 (-EINVAL) error during relocation is a known issue and is already fixed in 5.16-rc1. Since the fix consists of several patches, you can try v5.16-rc1 plus my previous patch. Alternatively, @morbidrsa is working on backporting the fix patches to a stable kernel, so you can try that instead.
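
To decide quickly whether a test machine already runs a 5.16-series kernel, the release string from `uname -r` can be compared numerically. This is a rough sketch with a hypothetical function name, `kernel_at_least`. It looks only at the major and minor numbers, so it treats release candidates such as 5.16-rc1 as 5.16 and cannot detect fixes that were backported into an older stable or distribution kernel.

```sh
# Succeeds if a kernel release string is at least <major>.<minor>.
# $1: release as printed by `uname -r`, $2: wanted major, $3: wanted minor
kernel_at_least() {
    kal_major=${1%%.*}                 # text before the first dot
    kal_minor=${1#*.}                  # text after the first dot
    kal_minor=${kal_minor%%[!0-9]*}    # keep only the leading digits
    [ "$kal_major" -gt "$2" ] ||
        { [ "$kal_major" -eq "$2" ] && [ "$kal_minor" -ge "$3" ]; }
}
```

Usage would be along the lines of `kernel_at_least "$(uname -r)" 5 16 || echo "kernel older than 5.16"`.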

Drive /dev/sdh failed in loop step: rw=write, bs=64k, jobs=8

I can't find any message regarding sdh in the dmesg log. Could you check if sdh failed with a similar issue?
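
One way to cross-check this is to list every device for which the kernel logged a "forced readonly" message and compare that list with the read-only mounts. The sketch below relies only on the message format visible in the traces above. The function name `forced_ro_devices` is hypothetical.

```sh
# Print the unique device names that btrfs reports as "forced readonly"
# in kernel log text read from stdin.
forced_ro_devices() {
    sed -n 's/.*BTRFS info (device \([^)]*\)): forced readonly.*/\1/p' | sort -u
}
```

Running `dmesg | forced_ro_devices` (or feeding it a saved dmesg.log) on the attached log would be expected to list sdc and sde. A drive that is mounted read-only but missing from this list, as sdh appears to be, would need a look at a longer or later log.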

morbidrsa · 2021-11-18T13:10:17Z

FTR here's the back port for the relocation fixes to 5.15 https://lore.kernel.org/linux-btrfs/cover.1637225333.git.johannes.thumshirn@wdc.com/T/#m7e231e3c4bef69bde450ea10b0e67875efa8c87b

diawayeu · 2021-11-19T08:28:55Z

I installed the kernel + patch

# uname -a
Linux btrfs 5.16.0-0.rc1.20211115git8ab774587903.14.btrfs.fc36.x86_64 #1 SMP PREEMPT Thu Nov 18 04:26:41 EST 2021 x86_64 x86_64 x86_64 GNU/Linux

I ran the test on 8 drives and the test was completed without errors.

Kernel 5.16.0-0.rc1 plus the patch fixes our problem.

Thanks!

morbidrsa · 2021-11-19T11:53:00Z

@diawayeu thanks for the confirmation.

For zoned btrfs, we re-dirty a freeing tree node to ensure btrfs write the region and not to leave a write hole on a zoned device. Current code failed to re-dirty a node when the tree-log tree's depth >= 2. This leads to a transaction abort with -EAGAIN. Fix the issue by properly re-dirty a node on walking up the tree.

Link: kdave/btrfs-progs#415
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>

kdave · 2021-11-30T14:31:46Z

Thanks for the report and fix, patch is on the way to kernel and stable trees.

There is a report of a transaction abort of -EAGAIN with the following script.

#!/bin/sh

for d in sda sdb; do
        mkfs.btrfs -d single -m single -f /dev/\${d}
done

mount /dev/sda /mnt/test
mount /dev/sdb /mnt/scratch

for dir in test scratch; do
        echo 3 >/proc/sys/vm/drop_caches
        fio --directory=/mnt/\${dir} --name=fio.\${dir} --rw=read --size=50G --bs=64m \
                --numjobs=$(nproc) --time_based --ramp_time=5 --runtime=480 \
                --group_reporting |& tee /dev/shm/fio.\${dir}
        echo 3 >/proc/sys/vm/drop_caches
done

for d in sda sdb; do
        umount /dev/\${d}
done

The stack trace is shown in below.

[3310.967991] BTRFS: error (device sda) in btrfs_commit_transaction:2341: errno=-11 unknown (Error while writing out transaction)
[3310.968060] BTRFS info (device sda): forced readonly
[3310.968064] BTRFS warning (device sda): Skipping commit of aborted transaction.
[3310.968065] ------------[ cut here ]------------
[3310.968066] BTRFS: Transaction aborted (error -11)
[3310.968074] WARNING: CPU: 14 PID: 1684 at fs/btrfs/transaction.c:1946 btrfs_commit_transaction.cold+0x209/0x2c8
[3310.968131] CPU: 14 PID: 1684 Comm: fio Not tainted 5.14.10-300.fc35.x86_64 #1
[3310.968135] Hardware name: DIAWAY Tartu/Tartu, BIOS V2.01.B10 04/08/2021
[3310.968137] RIP: 0010:btrfs_commit_transaction.cold+0x209/0x2c8
[3310.968144] RSP: 0018:ffffb284ce393e10 EFLAGS: 00010282
[3310.968147] RAX: 0000000000000026 RBX: ffff973f147b0f60 RCX: 0000000000000027
[3310.968149] RDX: ffff974ecf098a08 RSI: 0000000000000001 RDI: ffff974ecf098a00
[3310.968150] RBP: ffff973f147b0f08 R08: 0000000000000000 R09: ffffb284ce393c48
[3310.968151] R10: ffffb284ce393c40 R11: ffffffff84f47468 R12: ffff973f101bfc00
[3310.968153] R13: ffff971f20cf2000 R14: 00000000fffffff5 R15: ffff973f147b0e58
[3310.968154] FS: 00007efe65468740(0000) GS:ffff974ecf080000(0000) knlGS:0000000000000000
[3310.968157] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[3310.968158] CR2: 000055691bcbe260 CR3: 000000105cfa4001 CR4: 0000000000770ee0
[3310.968160] PKRU: 55555554
[3310.968161] Call Trace:
[3310.968167] ? dput+0xd4/0x300
[3310.968174] btrfs_sync_file+0x3f1/0x490
[3310.968180] __x64_sys_fsync+0x33/0x60
[3310.968185] do_syscall_64+0x3b/0x90
[3310.968190] entry_SYSCALL_64_after_hwframe+0x44/0xae
[3310.968194] RIP: 0033:0x7efe6557329b
[3310.968200] RSP: 002b:00007ffe0236ebc0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
[3310.968203] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007efe6557329b
[3310.968204] RDX: 0000000000000000 RSI: 00007efe58d77010 RDI: 0000000000000006
[3310.968205] RBP: 0000000004000000 R08: 0000000000000000 R09: 00007efe58d77010
[3310.968207] R10: 0000000016cacc0c R11: 0000000000000293 R12: 00007efe5ce95980
[3310.968208] R13: 0000000000000000 R14: 00007efe6447c790 R15: 0000000c80000000
[3310.968212] ---[ end trace 1a346f4d3c0d96ba ]---
[3310.968214] BTRFS: error (device sda) in cleanup_transaction:1946: errno=-11 unknown

The abort occurs because of a write hole while writing out freeing tree nodes of a tree-log tree. For zoned btrfs, we re-dirty a freed tree node to ensure btrfs can write the region and does not leave a hole on write on a zoned device. The current code fails to re-dirty a node when the tree-log tree's depth is greater or equal to 2. That leads to a transaction abort with -EAGAIN.

Fix the issue by properly re-dirtying a node on walking up the tree.

Fixes: d357515 ("btrfs: zoned: redirty released extent buffers")
Cc: stable@vger.kernel.org # 5.12+
Link: kdave/btrfs-progs#415
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>

commit 84c2544 upstream.

There is a report of a transaction abort of -EAGAIN with the following script.

    #!/bin/sh

    for d in sda sdb; do
        mkfs.btrfs -d single -m single -f /dev/\${d}
    done

    mount /dev/sda /mnt/test
    mount /dev/sdb /mnt/scratch

    for dir in test scratch; do
        echo 3 >/proc/sys/vm/drop_caches
        fio --directory=/mnt/\${dir} --name=fio.\${dir} --rw=read --size=50G --bs=64m \
            --numjobs=$(nproc) --time_based --ramp_time=5 --runtime=480 \
            --group_reporting |& tee /dev/shm/fio.\${dir}
        echo 3 >/proc/sys/vm/drop_caches
    done

    for d in sda sdb; do
        umount /dev/\${d}
    done

The stack trace is shown in below.

    [3310.967991] BTRFS: error (device sda) in btrfs_commit_transaction:2341: errno=-11 unknown (Error while writing out transaction)
    [3310.968060] BTRFS info (device sda): forced readonly
    [3310.968064] BTRFS warning (device sda): Skipping commit of aborted transaction.
    [3310.968065] ------------[ cut here ]------------
    [3310.968066] BTRFS: Transaction aborted (error -11)
    [3310.968074] WARNING: CPU: 14 PID: 1684 at fs/btrfs/transaction.c:1946 btrfs_commit_transaction.cold+0x209/0x2c8
    [3310.968131] CPU: 14 PID: 1684 Comm: fio Not tainted 5.14.10-300.fc35.x86_64 #1
    [3310.968135] Hardware name: DIAWAY Tartu/Tartu, BIOS V2.01.B10 04/08/2021
    [3310.968137] RIP: 0010:btrfs_commit_transaction.cold+0x209/0x2c8
    [3310.968144] RSP: 0018:ffffb284ce393e10 EFLAGS: 00010282
    [3310.968147] RAX: 0000000000000026 RBX: ffff973f147b0f60 RCX: 0000000000000027
    [3310.968149] RDX: ffff974ecf098a08 RSI: 0000000000000001 RDI: ffff974ecf098a00
    [3310.968150] RBP: ffff973f147b0f08 R08: 0000000000000000 R09: ffffb284ce393c48
    [3310.968151] R10: ffffb284ce393c40 R11: ffffffff84f47468 R12: ffff973f101bfc00
    [3310.968153] R13: ffff971f20cf2000 R14: 00000000fffffff5 R15: ffff973f147b0e58
    [3310.968154] FS: 00007efe65468740(0000) GS:ffff974ecf080000(0000) knlGS:0000000000000000
    [3310.968157] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [3310.968158] CR2: 000055691bcbe260 CR3: 000000105cfa4001 CR4: 0000000000770ee0
    [3310.968160] PKRU: 55555554
    [3310.968161] Call Trace:
    [3310.968167] ? dput+0xd4/0x300
    [3310.968174] btrfs_sync_file+0x3f1/0x490
    [3310.968180] __x64_sys_fsync+0x33/0x60
    [3310.968185] do_syscall_64+0x3b/0x90
    [3310.968190] entry_SYSCALL_64_after_hwframe+0x44/0xae
    [3310.968194] RIP: 0033:0x7efe6557329b
    [3310.968200] RSP: 002b:00007ffe0236ebc0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
    [3310.968203] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007efe6557329b
    [3310.968204] RDX: 0000000000000000 RSI: 00007efe58d77010 RDI: 0000000000000006
    [3310.968205] RBP: 0000000004000000 R08: 0000000000000000 R09: 00007efe58d77010
    [3310.968207] R10: 0000000016cacc0c R11: 0000000000000293 R12: 00007efe5ce95980
    [3310.968208] R13: 0000000000000000 R14: 00007efe6447c790 R15: 0000000c80000000
    [3310.968212] ---[ end trace 1a346f4d3c0d96ba ]---
    [3310.968214] BTRFS: error (device sda) in cleanup_transaction:1946: errno=-11 unknown

The abort occurs because of a write hole while writing out freeing tree nodes of a tree-log tree. For zoned btrfs, we re-dirty a freed tree node to ensure btrfs can write the region and does not leave a hole on write on a zoned device. The current code fails to re-dirty a node when the tree-log tree's depth is greater or equal to 2. That leads to a transaction abort with -EAGAIN.

Fix the issue by properly re-dirtying a node on walking up the tree.

Fixes: d357515 ("btrfs: zoned: redirty released extent buffers")
CC: stable@vger.kernel.org # 5.12+
Link: kdave/btrfs-progs#415
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

BugLink: https://bugs.launchpad.net/bugs/1957832

commit 84c2544 upstream.

There is a report of a transaction abort of -EAGAIN with the following script.

    #!/bin/sh

    for d in sda sdb; do
        mkfs.btrfs -d single -m single -f /dev/\${d}
    done

    mount /dev/sda /mnt/test
    mount /dev/sdb /mnt/scratch

    for dir in test scratch; do
        echo 3 >/proc/sys/vm/drop_caches
        fio --directory=/mnt/\${dir} --name=fio.\${dir} --rw=read --size=50G --bs=64m \
            --numjobs=$(nproc) --time_based --ramp_time=5 --runtime=480 \
            --group_reporting |& tee /dev/shm/fio.\${dir}
        echo 3 >/proc/sys/vm/drop_caches
    done

    for d in sda sdb; do
        umount /dev/\${d}
    done

The stack trace is shown in below.

    [3310.967991] BTRFS: error (device sda) in btrfs_commit_transaction:2341: errno=-11 unknown (Error while writing out transaction)
    [3310.968060] BTRFS info (device sda): forced readonly
    [3310.968064] BTRFS warning (device sda): Skipping commit of aborted transaction.
    [3310.968065] ------------[ cut here ]------------
    [3310.968066] BTRFS: Transaction aborted (error -11)
    [3310.968074] WARNING: CPU: 14 PID: 1684 at fs/btrfs/transaction.c:1946 btrfs_commit_transaction.cold+0x209/0x2c8
    [3310.968131] CPU: 14 PID: 1684 Comm: fio Not tainted 5.14.10-300.fc35.x86_64 #1
    [3310.968135] Hardware name: DIAWAY Tartu/Tartu, BIOS V2.01.B10 04/08/2021
    [3310.968137] RIP: 0010:btrfs_commit_transaction.cold+0x209/0x2c8
    [3310.968144] RSP: 0018:ffffb284ce393e10 EFLAGS: 00010282
    [3310.968147] RAX: 0000000000000026 RBX: ffff973f147b0f60 RCX: 0000000000000027
    [3310.968149] RDX: ffff974ecf098a08 RSI: 0000000000000001 RDI: ffff974ecf098a00
    [3310.968150] RBP: ffff973f147b0f08 R08: 0000000000000000 R09: ffffb284ce393c48
    [3310.968151] R10: ffffb284ce393c40 R11: ffffffff84f47468 R12: ffff973f101bfc00
    [3310.968153] R13: ffff971f20cf2000 R14: 00000000fffffff5 R15: ffff973f147b0e58
    [3310.968154] FS: 00007efe65468740(0000) GS:ffff974ecf080000(0000) knlGS:0000000000000000
    [3310.968157] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [3310.968158] CR2: 000055691bcbe260 CR3: 000000105cfa4001 CR4: 0000000000770ee0
    [3310.968160] PKRU: 55555554
    [3310.968161] Call Trace:
    [3310.968167] ? dput+0xd4/0x300
    [3310.968174] btrfs_sync_file+0x3f1/0x490
    [3310.968180] __x64_sys_fsync+0x33/0x60
    [3310.968185] do_syscall_64+0x3b/0x90
    [3310.968190] entry_SYSCALL_64_after_hwframe+0x44/0xae
    [3310.968194] RIP: 0033:0x7efe6557329b
    [3310.968200] RSP: 002b:00007ffe0236ebc0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
    [3310.968203] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007efe6557329b
    [3310.968204] RDX: 0000000000000000 RSI: 00007efe58d77010 RDI: 0000000000000006
    [3310.968205] RBP: 0000000004000000 R08: 0000000000000000 R09: 00007efe58d77010
    [3310.968207] R10: 0000000016cacc0c R11: 0000000000000293 R12: 00007efe5ce95980
    [3310.968208] R13: 0000000000000000 R14: 00007efe6447c790 R15: 0000000c80000000
    [3310.968212] ---[ end trace 1a346f4d3c0d96ba ]---
    [3310.968214] BTRFS: error (device sda) in cleanup_transaction:1946: errno=-11 unknown

The abort occurs because of a write hole while writing out freeing tree nodes of a tree-log tree. For zoned btrfs, we re-dirty a freed tree node to ensure btrfs can write the region and does not leave a hole on write on a zoned device. The current code fails to re-dirty a node when the tree-log tree's depth is greater or equal to 2. That leads to a transaction abort with -EAGAIN.

Fix the issue by properly re-dirtying a node on walking up the tree.

Fixes: d357515 ("btrfs: zoned: redirty released extent buffers")
CC: stable@vger.kernel.org # 5.12+
Link: kdave/btrfs-progs#415
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

BugLink: https://bugs.launchpad.net/bugs/1954931

commit 84c2544 upstream.

There is a report of a transaction abort of -EAGAIN with the following script.

    #!/bin/sh

    for d in sda sdb; do
        mkfs.btrfs -d single -m single -f /dev/\${d}
    done

    mount /dev/sda /mnt/test
    mount /dev/sdb /mnt/scratch

    for dir in test scratch; do
        echo 3 >/proc/sys/vm/drop_caches
        fio --directory=/mnt/\${dir} --name=fio.\${dir} --rw=read --size=50G --bs=64m \
            --numjobs=$(nproc) --time_based --ramp_time=5 --runtime=480 \
            --group_reporting |& tee /dev/shm/fio.\${dir}
        echo 3 >/proc/sys/vm/drop_caches
    done

    for d in sda sdb; do
        umount /dev/\${d}
    done

The stack trace is shown in below.

    [3310.967991] BTRFS: error (device sda) in btrfs_commit_transaction:2341: errno=-11 unknown (Error while writing out transaction)
    [3310.968060] BTRFS info (device sda): forced readonly
    [3310.968064] BTRFS warning (device sda): Skipping commit of aborted transaction.
    [3310.968065] ------------[ cut here ]------------
    [3310.968066] BTRFS: Transaction aborted (error -11)
    [3310.968074] WARNING: CPU: 14 PID: 1684 at fs/btrfs/transaction.c:1946 btrfs_commit_transaction.cold+0x209/0x2c8
    [3310.968131] CPU: 14 PID: 1684 Comm: fio Not tainted 5.14.10-300.fc35.x86_64 #1
    [3310.968135] Hardware name: DIAWAY Tartu/Tartu, BIOS V2.01.B10 04/08/2021
    [3310.968137] RIP: 0010:btrfs_commit_transaction.cold+0x209/0x2c8
    [3310.968144] RSP: 0018:ffffb284ce393e10 EFLAGS: 00010282
    [3310.968147] RAX: 0000000000000026 RBX: ffff973f147b0f60 RCX: 0000000000000027
    [3310.968149] RDX: ffff974ecf098a08 RSI: 0000000000000001 RDI: ffff974ecf098a00
    [3310.968150] RBP: ffff973f147b0f08 R08: 0000000000000000 R09: ffffb284ce393c48
    [3310.968151] R10: ffffb284ce393c40 R11: ffffffff84f47468 R12: ffff973f101bfc00
    [3310.968153] R13: ffff971f20cf2000 R14: 00000000fffffff5 R15: ffff973f147b0e58
    [3310.968154] FS: 00007efe65468740(0000) GS:ffff974ecf080000(0000) knlGS:0000000000000000
    [3310.968157] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [3310.968158] CR2: 000055691bcbe260 CR3: 000000105cfa4001 CR4: 0000000000770ee0
    [3310.968160] PKRU: 55555554
    [3310.968161] Call Trace:
    [3310.968167] ? dput+0xd4/0x300
    [3310.968174] btrfs_sync_file+0x3f1/0x490
    [3310.968180] __x64_sys_fsync+0x33/0x60
    [3310.968185] do_syscall_64+0x3b/0x90
    [3310.968190] entry_SYSCALL_64_after_hwframe+0x44/0xae
    [3310.968194] RIP: 0033:0x7efe6557329b
    [3310.968200] RSP: 002b:00007ffe0236ebc0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
    [3310.968203] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007efe6557329b
    [3310.968204] RDX: 0000000000000000 RSI: 00007efe58d77010 RDI: 0000000000000006
    [3310.968205] RBP: 0000000004000000 R08: 0000000000000000 R09: 00007efe58d77010
    [3310.968207] R10: 0000000016cacc0c R11: 0000000000000293 R12: 00007efe5ce95980
    [3310.968208] R13: 0000000000000000 R14: 00007efe6447c790 R15: 0000000c80000000
    [3310.968212] ---[ end trace 1a346f4d3c0d96ba ]---
    [3310.968214] BTRFS: error (device sda) in cleanup_transaction:1946: errno=-11 unknown

The abort occurs because of a write hole while writing out freeing tree nodes of a tree-log tree. For zoned btrfs, we re-dirty a freed tree node to ensure btrfs can write the region and does not leave a hole on write on a zoned device. The current code fails to re-dirty a node when the tree-log tree's depth is greater or equal to 2. That leads to a transaction abort with -EAGAIN.

Fix the issue by properly re-dirtying a node on walking up the tree.

Fixes: d357515 ("btrfs: zoned: redirty released extent buffers")
CC: stable@vger.kernel.org # 5.12+
Link: kdave/btrfs-progs#415
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
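
The symptom described in the commit message above (a btrfs mount on a zoned drive that is forced read-only after a transaction abort) can be checked for from a script. The following is only a minimal sketch and is not part of the original report or of the kernel fix. The function names and the `SYSFS_ROOT` override are hypothetical, introduced purely for illustration. It assumes the kernel exposes the zoned model in `/sys/block/<dev>/queue/zoned` and that mount options are read from `/proc/mounts`.

```shell
#!/bin/sh
# Sketch only: helpers for spotting the "forced readonly" symptom.
# SYSFS_ROOT is a hypothetical override so the helper can be exercised
# without real hardware; it defaults to /sys.

# Print the zoned model of a block device (for example "sda"):
# none, host-aware or host-managed, or "unknown" if sysfs has no answer.
zoned_model() {
    attr="${SYSFS_ROOT:-/sys}/block/$1/queue/zoned"
    if [ -r "$attr" ]; then
        cat "$attr"
    else
        echo unknown
    fi
}

# Succeed if a comma-separated mount option string contains "ro".
is_readonly_opts() {
    case ",$1," in
        *,ro,*) return 0 ;;
        *)      return 1 ;;
    esac
}

# List every btrfs mount that is currently read-only.
# Reads a mounts table in /proc/mounts format:
#   device mountpoint fstype options dump pass
report_ro_btrfs() {
    while read -r dev mnt fstype opts rest; do
        [ "$fstype" = btrfs ] || continue
        if is_readonly_opts "$opts"; then
            echo "$dev on $mnt is read-only"
        fi
    done < "${1:-/proc/mounts}"
}
```

After a test run, calling `report_ro_btrfs` with no argument scans the live mount table, and `zoned_model sda` shows whether the affected device is a zoned one. A read-only result alone does not prove this particular bug; the `errno=-11` transaction abort in the kernel log is still the identifying evidence.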

kdave added the "bug" and "kernel" (something in kernel has to be done too) labels Oct 29, 2021

kdave closed this as completed Nov 30, 2021
