Description
System information:
Distro is customized slackware64-14.2
Kernel 4.14.49
zfs/spl 0.7.9-1
(2) E5-2690 v3 CPUs
HP P440ar raid controller (using ZFS for volume management/compression)
Also tried on (with same results):
Distro is customized slackware64-14.2
Kernel 4.9.101
zfs/spl 0.7.9-1
(1) E3-1230 CPU
LSI 2008 in IT mode with 4 SAS disks.
The issue is that I get poor write performance to a zvol, and the zvol kernel threads burn a lot of CPU, causing very high load averages on the machine. I first saw the issue in libvirt/qemu while doing a virtual machine block copy, but I reduced it down to this:
# dd if=/datastore/vm/dng-smokeping/dng-smokeping.raw of=/dev/zvol/datastore/vm/test bs=1M
51200+0 records in
51200+0 records out
53687091200 bytes (50.0GB) copied, 318.477527 seconds, 160.8MB/s
The speed isn't great, but the real issue is that the load average goes through the roof:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 15503 44.2 0.0 17660 2956 pts/1 D+ 12:12 1:22 dd if=/datastore/vm/dng-smokeping/dng-smokeping.raw of=/dev/zvol/datastore/vm/test bs=1M
root 15506 18.6 0.0 0 0 ? R< 12:12 0:33 [zvol]
root 15505 18.6 0.0 0 0 ? D< 12:12 0:33 [zvol]
root 48390 17.2 0.0 0 0 ? D< 12:00 2:42 [zvol]
root 48296 17.2 0.0 0 0 ? R< 11:59 2:43 [zvol]
root 48290 17.2 0.0 0 0 ? R< 11:59 2:43 [zvol]
root 48289 17.2 0.0 0 0 ? D< 11:59 2:43 [zvol]
root 48287 17.2 0.0 0 0 ? D< 11:59 2:43 [zvol]
root 48282 17.2 0.0 0 0 ? D< 11:59 2:43 [zvol]
root 48280 17.2 0.0 0 0 ? D< 11:59 2:43 [zvol]
root 48274 17.2 0.0 0 0 ? D< 11:59 2:43 [zvol]
root 48273 17.2 0.0 0 0 ? R< 11:59 2:43 [zvol]
root 48271 17.2 0.0 0 0 ? R< 11:59 2:43 [zvol]
root 48298 17.1 0.0 0 0 ? D< 11:59 2:43 [zvol]
root 48297 17.1 0.0 0 0 ? D< 11:59 2:42 [zvol]
root 48295 17.1 0.0 0 0 ? D< 11:59 2:42 [zvol]
root 48293 17.1 0.0 0 0 ? R< 11:59 2:42 [zvol]
root 48292 17.1 0.0 0 0 ? R< 11:59 2:43 [zvol]
root 48291 17.1 0.0 0 0 ? D< 11:59 2:43 [zvol]
root 48288 17.1 0.0 0 0 ? D< 11:59 2:43 [zvol]
root 48286 17.1 0.0 0 0 ? D< 11:59 2:43 [zvol]
root 48284 17.1 0.0 0 0 ? D< 11:59 2:42 [zvol]
root 48283 17.1 0.0 0 0 ? D< 11:59 2:43 [zvol]
root 48281 17.1 0.0 0 0 ? D< 11:59 2:42 [zvol]
root 48279 17.1 0.0 0 0 ? R< 11:59 2:43 [zvol]
root 48278 17.1 0.0 0 0 ? D< 11:59 2:42 [zvol]
root 48277 17.1 0.0 0 0 ? R< 11:59 2:42 [zvol]
root 48276 17.1 0.0 0 0 ? D< 11:59 2:42 [zvol]
root 48275 17.1 0.0 0 0 ? R< 11:59 2:42 [zvol]
root 48272 17.1 0.0 0 0 ? D< 11:59 2:43 [zvol]
root 48270 17.1 0.0 0 0 ? D< 11:59 2:43 [zvol]
root 800 13.9 0.0 0 0 ? D< 11:12 8:47 [zvol]
root 47832 12.2 0.0 0 0 ? R< 11:53 2:43 [zvol]
root 3798 0.0 0.0 16764 1200 pts/0 S+ 12:15 0:00 egrep USER|zvol
root 1432 0.0 0.0 0 0 ? S 11:13 0:00 [z_zvol]
# uptime
12:15:47 up 1:03, 2 users, load average: 44.88, 25.17, 19.15
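For what it's worth, the number of [zvol] worker threads above (32) appears to match the zvol_threads module parameter, which I believe defaults to 32 on 0.7.x. Checking the zvol tunables looks like this, assuming the usual sysfs paths:
# cat /sys/module/zfs/parameters/zvol_threads
# cat /sys/module/zfs/parameters/zvol_request_sync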
Now, if I go in the opposite direction, it's much faster and the load average isn't nearly as high:
# dd of=/datastore/vm/dng-smokeping/dng-smokeping.raw if=/dev/zvol/datastore/vm/test bs=1M
51200+0 records in
51200+0 records out
53687091200 bytes (50.0GB) copied, 94.473277 seconds, 542.0MB/s
This time only a single [zvol] kernel thread is active, and the load average is normal:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 45782 67.0 0.0 17660 2928 pts/1 R+ 12:19 1:00 dd of=/datastore/vm/dng-smokeping/dng-smokeping.raw if=/dev/zvol/datastore/vm/test bs=1M
root 800 14.6 0.0 0 0 ? S< 11:12 10:02 [zvol]
root 1432 0.0 0.0 0 0 ? S 11:13 0:00 [z_zvol]
root 1303 0.0 0.0 16764 1032 pts/0 S+ 12:21 0:00 egrep USER|zvol
# uptime
12:21:14 up 1:08, 2 users, load average: 3.57, 16.60, 18.11
What is also interesting is that both the raw image file and the zvol live under the same parent dataset:
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
datastore 299G 1.78T 15.1G /datastore
datastore/vm 284G 1.78T 18.0G /datastore/vm
datastore/vm/test 51.6G 1.83T 1007M -
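In case it's relevant, the zvol's block-size and sync settings can be checked like this (just the command; I can post the output if it would help):
# zfs get volblocksize,compression,sync datastore/vm/test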
So I'm not sure what to look at. As it stands, I can't really write to a zvol without killing the machine, so I'm using raw disk images on a mounted ZFS filesystem to avoid the double COW.
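The workaround, roughly, is just to keep each guest disk as a raw file on the mounted filesystem instead of a zvol (the guest name and size below are placeholders):
# qemu-img create -f raw /datastore/vm/newguest/newguest.raw 50G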
Thanks!