
Slow mount times due to enormous ZFS delete queue object #5916

Closed
DeHackEd opened this issue Mar 23, 2017 · 2 comments

@DeHackEd
Contributor

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | CentOS |
| Distribution Version | 7 |
| Linux Kernel | 4.9.11-1.el7.elrepo.x86_64 |
| Architecture | x86_64 |
| ZFS Version | v0.7.0-rc3_119_g912e2ba |
| SPL Version | v0.7.0-rc3_3_g8d5feec |

Describe the problem you're observing

When mounting one specific filesystem, the mount command appears to hang for an extended period. This is very reproducible on this system. Even after sending/receiving the dataset to a remote system, the remote experiences long mount times, though not as long as this one (15 seconds vs. 10 minutes).

root     27857  0.0  0.0 125640  1820 ?        S    06:34   0:00 /bin/mount -t zfs -o defaults,noatime,dev,exec,rw,suid,nomand,zfsutil pool/fs /srv/fs
root     27858  0.3  0.0 153640  4744 ?        D    06:34   0:00 /sbin/mount.zfs pool/fs /srv/fs -o rw,noatime,zfsutil

Stack trace of mount command:

[<ffffffffa0483133>] cv_wait_common+0xb3/0x130 [spl]
[<ffffffffa0483208>] __cv_wait_io+0x18/0x20 [spl]
[<ffffffffa105f84f>] zio_wait+0xef/0x190 [zfs]
[<ffffffffa0fb0934>] dbuf_read+0x2f4/0x8e0 [zfs]
[<ffffffffa0fbbc40>] dmu_buf_hold_by_dnode+0x50/0x80 [zfs]
[<ffffffffa10245eb>] zap_get_leaf_byblk.isra.11+0x8b/0x240 [zfs]
[<ffffffffa1024869>] zap_deref_leaf+0xc9/0xe0 [zfs]
[<ffffffffa10268eb>] fzap_cursor_retrieve+0x13b/0x260 [zfs]
[<ffffffffa102a374>] zap_cursor_retrieve+0x64/0x290 [zfs]
[<ffffffffa10337ee>] zfs_unlinked_drain+0x5e/0xe0 [zfs]
[<ffffffffa10471b5>] zfs_sb_setup+0x115/0x150 [zfs]
[<ffffffffa1048b7f>] zfs_domount+0x29f/0x330 [zfs]
[<ffffffffa10679dc>] zpl_fill_super+0x2c/0x40 [zfs]
[<ffffffff8122e23d>] mount_nodev+0x4d/0xa0
[<ffffffffa1067e52>] zpl_mount+0x52/0x80 [zfs]
[<ffffffff8122edd9>] mount_fs+0x39/0x160
[<ffffffff8124c277>] vfs_kern_mount+0x67/0x100
[<ffffffff8124e862>] do_mount+0x1e2/0xca0
[<ffffffff8124f633>] SyS_mount+0x83/0xd0
[<ffffffff81003a47>] do_syscall_64+0x67/0x180
[<ffffffff81757cab>] entry_SYSCALL64_slow_path+0x25/0x25
[<ffffffffffffffff>] 0xffffffffffffffff
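The trace shows the mount blocked in `zfs_unlinked_drain`, which cursors over the delete-queue ZAP (`zap_cursor_retrieve` → `dbuf_read` → `zio_wait`), so each leaf block can cost a synchronous disk read. A toy cost model (plain Python, not ZFS code; the 20 ms average random-read latency is an assumption, not a measurement) reproduces the observed magnitude:

```python
# Toy cost model (not ZFS code): zfs_unlinked_drain() must read every
# leaf block of the delete-queue ZAP, even leaves holding zero entries,
# so drain time scales with allocated leaves rather than live entries.
# read_ms is an assumed average random-read latency, not a measurement.

def drain_cost_seconds(leaf_blocks, read_ms=20.0):
    """Estimated time to cursor over a ZAP with `leaf_blocks` leaves."""
    return leaf_blocks * read_ms / 1000.0

# Leaf count from the zdb output below: 24982 leaves, 0 entries.
print(drain_cost_seconds(24982))  # ~500 s, the same order of magnitude
                                  # as the multi-minute stall seen here
```

Under that assumption, the minutes-long mount follows directly from the ~25,000 leaf blocks, independent of the entry count.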
# time zdb -ddddd pool/fs 3|head -n 35
Dataset pool/fs [ZPL], ID 110, cr_txg 696, 48.7T, 64318820 objects, rootbp DVA[0]=<2:6a451ace000:3000> DVA[1]=<3:1ab19b35d000:3000> [L0 DMU objset] fletcher4 lz4 LE contiguous unique double size=800L/200P birth=5094441L/5094441P fill=64318820 cksum=f2994723e:5f373a0c800:13583087450e5:2b4e1e6496d98b

    Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
         3    3    16K    16K   360M     512   391M   99.94  ZFS delete queue
	dnode flags: USED_BYTES USERUSED_ACCOUNTED 
	dnode maxblkid: 25013
	Fat ZAP stats:
		Pointer table:
			32768 elements
			zt_blk: 16400
			zt_numblks: 16
			zt_shift: 15
			zt_blks_copied: 0
			zt_nextblk: 0
		ZAP entries: 0
		Leaf blocks: 24982
		Total blocks: 25014
		zap_block_type: 0x8000000000000001
		zap_magic: 0x2f52ab2ab
		zap_salt: 0x7a559d52f
		Leafs with 2^n pointers:
			  0:  17196 ****************************************
			  1:   7786 *******************
		Blocks with n*5 entries:
			  0:  24982 ****************************************
		Blocks n/10 full:
			  1:  24982 ****************************************
		Entries with n chunks:
		Buckets with n entries:
			  0: 12790784 ****************************************

Indirect blocks:
               0 L2   4:ff246f97200:2a00 4000L/1c00P F=16369 B=3864811/3864811
               0  L1  1:cdde52fd000:6000 4000L/1e00P F=128 B=3864811/3864811
               0   L0 1:cdde52f1000:6000 4000L/1400P F=1 B=3864811/3864811

real	8m23.293s
user	0m8.725s
sys	0m8.642s

In contrast, using zdb to read the root directory listing is much faster:

# time zdb -ddddd pool/fs 4 | head -n 35
    Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
         4    3    16K    16K  7.84M     512  4.02M  100.00  ZFS directory
                                               168   bonus  System attributes
	dnode flags: USED_BYTES USERUSED_ACCOUNTED 
	dnode maxblkid: 256
	path	/
	uid     500
	gid     500
	atime	Fri Jul 15 20:44:17 2016
	mtime	Thu Mar 23 08:54:54 2017
	ctime	Thu Mar 23 08:54:54 2017
	crtime	Fri Jul 15 20:44:17 2016
	gen	696
	mode	40755
	size	23217
	parent	4
	links	23217
	pflags	40800000144
	Fat ZAP stats:
		Pointer table:
			1024 elements
			zt_blk: 0
			zt_numblks: 0
			zt_shift: 10
			zt_blks_copied: 0
			zt_nextblk: 0
		ZAP entries: 23215
		Leaf blocks: 256
		Total blocks: 257
		zap_block_type: 0x8000000000000001
		zap_magic: 0x2f52ab2ab
		zap_salt: 0x7a559d681
		Leafs with 2^n pointers:

real	0m18.963s
user	0m4.931s
sys	0m5.508s
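For scale, the two runs above can be compared directly (rough arithmetic only; the wall-clock times also include fixed pool-open overhead, so the scaling is not expected to be perfectly linear):

```python
# Ratio of ZAP size to zdb wall-clock time for the two objects dumped
# above (delete queue vs. root directory), using the reported figures.
delete_queue = {"blocks": 25014, "seconds": 8 * 60 + 23.293}
root_dir = {"blocks": 257, "seconds": 18.963}

block_ratio = delete_queue["blocks"] / root_dir["blocks"]
time_ratio = delete_queue["seconds"] / root_dir["seconds"]
print(f"{block_ratio:.0f}x the blocks, {time_ratio:.0f}x the wall-clock time")
```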

Describe how to reproduce the problem

I don't know how the delete queue got so big, but now that it is, I'm in the unpleasant situation that there's nothing I can do about the delay at mount time.

The large delete queue appears to travel with a send/recv, but the enormous size of the dataset and the sensitivity of its contents make it impractical to share.

Include any warning/errors/backtraces from the system logs

@loli10K
Contributor

loli10K commented Mar 23, 2017

Duplicate of #3814? How is this different?

@DeHackEd
Contributor Author

Interesting. It's probably close enough to call this a duplicate.

The main "issue" is that my delete queue is empty: there's no work to be done. But because the ZAP data structure is so huge and (probably) fragmented, it takes forever to load into memory and finish the mount. And AFAIK it never shrinks.
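The "grows but never shrinks" behavior can be illustrated with a toy hash-directory model (plain Python, not ZFS code; the leaf capacity and growth rule are invented for illustration): deletes empty the leaves but never compact the structure, so a full cursor pass still touches every allocated leaf.

```python
# Toy model of a fat ZAP that grows on insert but is never compacted
# on delete (capacities here are invented for illustration).

class ToyFatZap:
    LEAF_CAPACITY = 5  # assumed per-leaf entry limit to force growth

    def __init__(self):
        self.leaves = [set()]  # leaf blocks; deletes never free them

    def insert(self, key):
        self.leaves[hash(key) % len(self.leaves)].add(key)
        if any(len(leaf) > self.LEAF_CAPACITY for leaf in self.leaves):
            # Grow: double the leaf array and rehash (a stand-in for
            # real leaf splits and pointer-table expansion).
            keys = [k for leaf in self.leaves for k in leaf]
            self.leaves = [set() for _ in range(2 * len(self.leaves))]
            for k in keys:
                self.leaves[hash(k) % len(self.leaves)].add(k)

    def delete(self, key):
        self.leaves[hash(key) % len(self.leaves)].discard(key)

    def cursor_pass(self):
        """Full traversal: every leaf is visited, full or empty."""
        return sum(1 for _ in self.leaves)

zap = ToyFatZap()
for i in range(10000):
    zap.insert(i)
for i in range(10000):
    zap.delete(i)

# The queue is now empty, yet a traversal still walks every leaf.
print(len(zap.leaves), zap.cursor_pass())
```

This mirrors the zdb output above: "ZAP entries: 0" alongside ~25,000 allocated leaf blocks.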
