z_vol task hung on receiving side of zfs send #6330

Closed
edillmann opened this issue Jul 9, 2017 · 16 comments · Fixed by #7343
Labels
Component: ZVOL ZFS Volumes
Comments

@edillmann
Contributor

System information

Type                  Version/Name
Distribution Name     Proxmox
Distribution Version  5.0
Linux Kernel          4.10.15-15
Architecture          amd64
ZFS Version           0.7.0-rc4_96_g0ea05c64f
SPL Version           0.7.0-rc4_5_g7a35f2b

Describe the problem you're observing

I'm observing the following stack trace on the receiving side of zfs send.

Describe how to reproduce the problem

Set up a regular send to a remote system (one per hour).
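
For concreteness, a sketch of the kind of hourly pipeline meant here; the snapshot, dataset, and host names below are illustrative, not the ones actually in use:

zfs snapshot tank/data@hourly-2017070912
zfs send -i tank/data@hourly-2017070911 tank/data@hourly-2017070912 | \
    ssh backuphost zfs receive -F backup/data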

Include any warning/errors/backtraces from the system logs

[27671.358618] INFO: task z_zvol:348 blocked for more than 120 seconds.
[27671.360473]       Tainted: P           O    4.10.15-1-pve #1
[27671.362479] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27671.363379] z_zvol          D    0   348      2 0x00000000
[27671.364409] Call Trace:
[27671.365169]  __schedule+0x233/0x6f0
[27671.366083]  schedule+0x36/0x80
[27671.366877]  taskq_wait_outstanding+0x8c/0xd0 [spl]
[27671.367720]  ? wake_atomic_t_function+0x60/0x60
[27671.368698]  zvol_task_cb+0x2fd/0x490 [zfs]
[27671.369449]  ? __schedule+0x23b/0x6f0
[27671.370574]  taskq_thread+0x25e/0x460 [spl]
[27671.371725]  ? wake_up_q+0x80/0x80
[27671.372447]  kthread+0x109/0x140
[27671.373306]  ? taskq_cancel_id+0x130/0x130 [spl]
[27671.374005]  ? kthread_create_on_node+0x60/0x60
[27671.374736]  ret_from_fork+0x2c/0x40
[27913.020358] INFO: task z_zvol:348 blocked for more than 120 seconds.
[27913.021135]       Tainted: P           O    4.10.15-1-pve #1
[27913.021804] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27913.022507] z_zvol          D    0   348      2 0x00000000
[27913.023489] Call Trace:
[27913.024566]  __schedule+0x233/0x6f0
[27913.025821]  schedule+0x36/0x80
[27913.026760]  taskq_wait_outstanding+0x8c/0xd0 [spl]
[27913.027658]  ? wake_atomic_t_function+0x60/0x60
[27913.029015]  zvol_task_cb+0x2fd/0x490 [zfs]
[27913.030207]  ? __schedule+0x23b/0x6f0
[27913.031317]  taskq_thread+0x25e/0x460 [spl]
[27913.032461]  ? wake_up_q+0x80/0x80
[27913.033602]  kthread+0x109/0x140
[27913.034545]  ? taskq_cancel_id+0x130/0x130 [spl]
[27913.035439]  ? kthread_create_on_node+0x60/0x60
[27913.036467]  ret_from_fork+0x2c/0x40

@gamanakis
Contributor

Type                  Version/Name
Distribution Name     Archlinux
Distribution Version  rolling
Linux Kernel          4.9.37-1-lts
Architecture          amd64
ZFS Version           0.7.0-rc5-13-gf6837d9b5
SPL Version           0.7.0-rc5-2-gcd47801

I am observing the very same issue on Archlinux, running the 4.9.37 kernel.

@behlendorf behlendorf added the Component: ZVOL ZFS Volumes label Jul 25, 2017
@klkblake

I appear to be hitting this issue as well. The backtrace in my case is slightly different but looks essentially the same. I have been experiencing this semi-regularly, particularly with large sends. Unlike the OP, I'm seeing it on the sending side, not the receiving side. I enabled the hung task detector in the hope of obtaining logs of this freeze when it happened; the freeze itself appears to be permanent, remaining in place for over an hour before I rebooted. I've also experienced shorter freezes, lasting perhaps 20 seconds or so, which do resolve themselves; they may be unrelated. The kernel hung task detector has also failed to obtain logs for this a few times, but it is unclear whether that is because this issue somehow freezes the detector as well, or just firmware weirdness w.r.t. pstore.

<3>[ 3933.064890] INFO: task z_zvol:263 blocked for more than 120 seconds.
<3>[ 3933.064892]       Not tainted 4.12.13-gentoo #3
<3>[ 3933.064892] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<6>[ 3933.064893] z_zvol          D    0   263      2 0x00000000
<4>[ 3933.064895] Call Trace:
<4>[ 3933.064899]  ? __schedule+0x213/0x5c0
<4>[ 3933.064900]  ? schedule+0x34/0x80
<4>[ 3933.064903]  ? taskq_wait_outstanding+0x5c/0xa0
<4>[ 3933.064904]  ? wake_atomic_t_function+0x50/0x50
<4>[ 3933.064907]  ? zvol_task_cb+0x1d2/0x570
<4>[ 3933.064908]  ? taskq_thread+0x236/0x430
<4>[ 3933.064910]  ? do_task_dead+0x40/0x40
<4>[ 3933.064912]  ? kthread+0xf2/0x130
<4>[ 3933.064913]  ? taskq_thread_spawn+0x50/0x50
<4>[ 3933.064914]  ? __kthread_create_on_node+0x140/0x140
<4>[ 3933.064915]  ? __kthread_create_on_node+0x140/0x140
<4>[ 3933.064916]  ? ret_from_fork+0x22/0x30
<3>[240856.578858] INFO: task z_zvol:253 blocked for more than 120 seconds.
<3>[240856.578859]       Not tainted 4.12.13-gentoo #3
<3>[240856.578860] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<6>[240856.578861] z_zvol          D    0   253      2 0x00000000
<4>[240856.578862] Call Trace:
<4>[240856.578876]  ? __schedule+0x213/0x5c0
<4>[240856.578877]  ? schedule+0x34/0x80
<4>[240856.578880]  ? taskq_wait_outstanding+0x5c/0xa0
<4>[240856.578881]  ? wake_atomic_t_function+0x50/0x50
<4>[240856.578883]  ? zvol_task_cb+0x1d2/0x570
<4>[240856.578885]  ? taskq_thread+0x236/0x430
<4>[240856.578886]  ? do_task_dead+0x40/0x40
<4>[240856.578888]  ? kthread+0xf2/0x130
<4>[240856.578889]  ? taskq_thread_spawn+0x50/0x50
<4>[240856.578890]  ? __kthread_create_on_node+0x140/0x140
<4>[240856.578891]  ? ret_from_fork+0x22/0x30

@behlendorf
Contributor

@klkblake if you observe this again can you please try running the following command to try and unwedge things. It will cause a new thread to be spawned for any taskq which appears to be stalled. Obviously we really need all the backtraces to determine why it stalled, but this might let you work around the issue until the root cause is understood.

echo 1 >/sys/module/spl/parameters/spl_taskq_kick
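
For reference, the state of each taskq (active task, pending tasks, and which thread IDs it is waiting on) can be inspected before and after the kick via the SPL procfs node quoted later in this thread:

cat /proc/spl/taskq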

@klkblake

Unfortunately I can't run commands when this happens -- can't even move my mouse cursor, and attempts to ssh in give "no route to host". I have had no success using the magic sysrq key to reboot, even.

@klkblake

The following reliably reproduces the hardlock for me:

zfs create -o mountpoint=/mnt/other pool/test               # scratch filesystem
dd if=/dev/urandom bs=1024 count=65536 of=/mnt/other/file   # ~64 MiB of random data
zfs umount pool/test
zfs send pool/test | sleep 999999999                        # sleep never reads stdin; once the pipe fills, send blocks (an infinitely slow receiver)

It usually hardlocks the system within 5-10 minutes after running this. I guess it doesn't like having a slow receiver? When doing a backup on my system the transfer rate is 1MB/s or so, so I'd guess the send is outrunning the receive there too.

@behlendorf behlendorf added this to the 0.8.0 milestone Dec 21, 2017
@gamanakis
Contributor

gamanakis commented Jan 10, 2018

The reproducer @klkblake provided gives the following hung task on my system:

Type                  Version/Name
Distribution Name     Archlinux
Distribution Version  rolling
Linux Kernel          4.14.12-1-ARCH
Architecture          amd64
ZFS Version           0.7.5_r0_ga803eacf2
SPL Version           0.7.5_r0_ged02400
[11822.623620] ZFS: Loaded module v0.7.5-1, ZFS pool version 5000, ZFS filesystem version 5
[12164.744359] INFO: task send_traverse:1808 blocked for more than 120 seconds.
[12164.746328]       Tainted: P           O    4.14.12-1-ARCH #1
[12164.747952] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[12164.750323] send_traverse   D    0  1808      2 0x80000000
[12164.750326] Call Trace:
[12164.750335]  ? __schedule+0x290/0x890
[12164.750338]  schedule+0x2f/0x90
[12164.750341]  io_schedule+0x12/0x40
[12164.750351]  cv_wait_common+0xaa/0x130 [spl]
[12164.750355]  ? wait_woken+0x80/0x80
[12164.750395]  bqueue_enqueue+0x5d/0xd0 [zfs]
[12164.750424]  send_cb+0x144/0x180 [zfs]
[12164.750454]  traverse_visitbp+0x1b7/0x990 [zfs]
[12164.750484]  traverse_visitbp+0x337/0x990 [zfs]
[12164.750513]  traverse_visitbp+0x337/0x990 [zfs]
[12164.750543]  traverse_dnode+0xec/0x1b0 [zfs]
[12164.750573]  traverse_visitbp+0x70a/0x990 [zfs]
[12164.750602]  traverse_visitbp+0x337/0x990 [zfs]
[12164.750631]  traverse_visitbp+0x338/0x990 [zfs]
[12164.750670]  traverse_visitbp+0x337/0x990 [zfs]
[12164.750699]  traverse_visitbp+0x337/0x990 [zfs]
[12164.750728]  traverse_visitbp+0x337/0x990 [zfs]
[12164.750758]  traverse_dnode+0xec/0x1b0 [zfs]
[12164.750786]  traverse_visitbp+0x7d7/0x990 [zfs]
[12164.750815]  traverse_impl+0x1e2/0x440 [zfs]
[12164.750844]  ? dmu_send_impl+0x1310/0x1310 [zfs]
[12164.750873]  ? byteswap_record+0x290/0x290 [zfs]
[12164.750878]  ? __thread_exit+0x20/0x20 [spl]
[12164.750909]  traverse_dataset_resume+0x42/0x50 [zfs]
[12164.750955]  ? dmu_send_impl+0x1310/0x1310 [zfs]
[12164.750984]  send_traverse_thread+0x52/0xb0 [zfs]
[12164.750989]  thread_generic_wrapper+0x6d/0x80 [spl]
[12164.750993]  kthread+0x118/0x130
[12164.750995]  ? kthread_create_on_node+0x70/0x70
[12164.750998]  ret_from_fork+0x1f/0x30

Replacing cv_wait() with cv_wait_sig() in module/zfs/bqueue.c seems to resolve it, though I cannot tell whether this is a sane approach.
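
A minimal sketch of that suggestion (not the committed fix for this issue), assuming the 0.7-era bqueue layout; field names such as bq_lock and bq_add_cv are taken from the bqueue.h of that time and may differ between versions:

/*
 * Sleep interruptibly while waiting for space in the bqueue, so the
 * enqueuing thread no longer counts as uninterruptibly blocked.
 */
void
bqueue_enqueue(bqueue_t *q, void *data, uint64_t item_size)
{
	obj2node(q, data)->bqn_size = item_size;
	mutex_enter(&q->bq_lock);
	while (q->bq_size + item_size > q->bq_maxsize) {
		/* was: cv_wait(&q->bq_add_cv, &q->bq_lock); */
		cv_wait_sig(&q->bq_add_cv, &q->bq_lock);
	}
	q->bq_size += item_size;
	list_insert_tail(&q->bq_list, data);
	cv_signal(&q->bq_pop_cv);
	mutex_exit(&q->bq_lock);
}

As the next comment points out, this changes only how the wait is accounted (interruptible rather than uninterruptible), so it quiets the hung-task watchdog rather than removing the underlying stall.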

@behlendorf
Contributor

behlendorf commented Jan 10, 2018

@gamanakis were you able to reproduce the hard lockup on the system? I ask because using cv_wait_sig() should only suppress the hung task watchdog, by sleeping interruptibly rather than uninterruptibly. That is a good idea, and a change we should make, but I don't see how it would explain the hard lockup described here.

[edit] I should add that we made a similar change a while back in commit 8e70975

@gamanakis
Contributor

@behlendorf I could not reproduce a hard lockup so far.

@klkblake

Possibly relevant: I'm running with ZFS_DEBUG_MODIFY set. This computer doesn't have ECC RAM yet, and I'd heard the flag can catch some forms of memory corruption, so I've been using it as a stop-gap measure until I actually get ECC RAM (and replace practically everything, thanks to Intel's market segmentation nonsense w.r.t. ECC support).
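
For context, ZFS_DEBUG_MODIFY is one of the debug bits carried by the zfs_flags module parameter; assuming the bit value from the zfs_debug.h of this era (1 << 4, i.e. 16), enabling it at runtime looks roughly like:

echo 16 > /sys/module/zfs/parameters/zfs_flags    # ZFS_DEBUG_MODIFY; bit value assumed, check zfs_debug.h for your version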

behlendorf pushed a commit that referenced this issue Mar 30, 2018
During a receive operation zvol_create_minors_impl() can wait
needlessly for the prefetch thread because both share the same task
queue.  This results in hung tasks:

<3>INFO: task z_zvol:5541 blocked for more than 120 seconds.
<3>      Tainted: P           O  3.16.0-4-amd64
<3>"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

The first z_zvol:5541 (zvol_task_cb) is waiting for the long running
traverse_prefetch_thread:260

root@linux:~# cat /proc/spl/taskq
taskq                       act  nthr  spwn  maxt   pri  mina
spl_system_taskq/0            1     2     0    64   100     1
	active: [260]traverse_prefetch_thread [zfs](0xffff88003347ae40)
	wait: 5541
spl_delay_taskq/0             0     1     0     4   100     1
	delay: spa_deadman [zfs](0xffff880039924000)
z_zvol/1                      1     1     0     1   120     1
	active: [5541]zvol_task_cb [zfs](0xffff88001fde6400)
	pend: zvol_task_cb [zfs](0xffff88001fde6800)

This change adds a dedicated, per-pool, prefetch taskq to prevent the
traverse code from monopolizing the global (and limited) system_taskq by
inappropriately scheduling long running tasks on it.

Reviewed-by: Albert Lee <trisk@forkgnu.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #6330 
Closes #6890 
Closes #7343
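
In outline, the fix moves the prefetch work off the shared system_taskq and onto a taskq created per pool. A rough sketch of that approach, using placeholder identifiers (the field name dp_prefetch_taskq, the taskq name "z_prefetch", and the thread count/flags are illustrative, not necessarily those used by the actual commit):

/* At pool open, create a dedicated taskq for traversal prefetch: */
dp->dp_prefetch_taskq = taskq_create("z_prefetch", 1, maxclsyspri,
    1, INT_MAX, TASKQ_DYNAMIC);

/*
 * In the traverse code, dispatch the prefetch thread to that queue
 * instead of the shared system_taskq, so that callers waiting on
 * system_taskq (e.g. zvol_create_minors_impl() via
 * taskq_wait_outstanding()) no longer have to wait for a long-running
 * prefetch.  td stands in for the traverse state argument already
 * passed today.
 */
taskq_dispatch(dp->dp_prefetch_taskq, traverse_prefetch_thread,
    td, TQ_SLEEP);

/* And tear it down at pool close: */
taskq_destroy(dp->dp_prefetch_taskq);
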
tonyhutter pushed the same commit (identical message) to tonyhutter/zfs on Apr 16 and May 4, 2018, and to this repository on May 10, 2018.
@github-duran

I apparently hit the same issue when sending/receiving a filesystem between two different pools on the same machine.

Setup: fresh Ubuntu 18.04.1 install, fully updated.

Linux version 4.15.0-34-generic (buildd@lgw01-amd64-047) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #37-Ubuntu SMP Mon Aug 27 15:21:48 UTC 2018

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.1 LTS
Release:        18.04
Codename:       bionic

MemTotal:       16374904 kB
MemFree:         7322968 kB

pool: backup
 state: ONLINE
  scan: scrub repaired 0B in 2h18m with 0 errors on Sat Sep 15 01:59:21 2018
config:

        NAME                                    STATE     READ WRITE CKSUM
        backup                                  ONLINE       0     0     0
          mirror-0                              ONLINE       0     0     0
            ata-SAMSUNG_HD103UJ_S13PJDWQC05353  ONLINE       0     0     0
            ata-SAMSUNG_HD103UJ_S13PJDWQC05349  ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: ONLINE
  scan: none requested
config:

        NAME                                          STATE     READ WRITE CKSUM
        tank                                          ONLINE       0     0     0
          raidz2-0                                    ONLINE       0     0     0
            ata-WDC_WD30EFRX-68AX9N0_WD-WCC1T0418462  ONLINE       0     0     0
            ata-WDC_WD30EFRX-68AX9N0_WD-WMC1T0061334  ONLINE       0     0     0
            ata-WDC_WD30EFRX-68AX9N0_WD-WMC1T0075172  ONLINE       0     0     0
            ata-WDC_WD30EFRX-68AX9N0_WD-WMC1T0076274  ONLINE       0     0     0
            ata-WDC_WD30EFRX-68AX9N0_WD-WMC1T0078222  ONLINE       0     0     0
            ata-WDC_WD30EFRX-68AX9N0_WD-WMC1T0656559  ONLINE       0     0     0
            ata-WDC_WD30EFRX-68AX9N0_WD-WMC1T0658861  ONLINE       0     0     0
            ata-WDC_WD40EFRX-68N32N0_WD-WCC7K4DSXX8H  ONLINE       0     0     0

errors: No known data errors


NAME                                     USED  AVAIL  REFER  USEDSNAP  COPIES  CREATION               RATIO
backup                                   526G   373G    28K        0B       1  Sat Sep  8 15:30 2018  1.03x
backup/audio                            23.9G   373G  23.9G       52K       1  Sun Sep  9  3:08 2018  1.02x
backup/backup                           10.5G   373G  10.5G      565K       1  Sun Sep  9  0:39 2018  1.44x
backup/backup-system-files-server-1204   788M   373G   788M        0B       1  Mon Sep 10 20:36 2018  1.93x
backup/dev                              12.7G   373G  12.7G      426K       2  Sat Sep  8 21:00 2018  1.10x
backup/documents                        13.0G   373G  13.0G     12.4M       2  Sat Sep  8 22:13 2018  1.07x
backup/ftp                              49.5K   373G    23K     26.5K       1  Sat Sep  8 21:50 2018  1.00x
backup/pics                              461G   373G   460G      684M       2  Sun Sep  9  0:47 2018  1.01x
backup/repo                              813M   373G   811M     2.08M       2  Sat Sep  8 20:48 2018  1.05x
backup/web                              3.15G   373G  1.33G     1.82G       2  Sun Sep  9  0:34 2018  1.08x
tank                                     491G  14.5T   239K        0B       1  Mon Sep 17 20:14 2018  1.03x
tank/audio                              24.0G  14.5T  24.0G      188K       1  Mon Sep 17 22:38 2018  1.02x
tank/backup                             11.2G  14.5T  11.2G     1.41M       1  Mon Sep 17 22:30 2018  1.42x
tank/dev                                13.1G  14.5T  13.1G      998K       2  Mon Sep 17 21:35 2018  1.09x
tank/documents                          13.2G  14.5T  13.1G     21.3M       2  Mon Sep 17 22:03 2018  1.06x
tank/pics                                425G  14.5T   115G      482M       2  Mon Sep 17 22:57 2018  1.01x
tank/repo                                850M  14.5T   844M     6.29M       2  Mon Sep 17 21:50 2018  1.04x
tank/web                                3.50G  14.5T  1.68G     1.82G       2  Mon Sep 17 21:53 2018  1.07x

Action
zfs send -vR backup/pics@snap-migrate-20180908 | zfs receive tank/pics

dmesg (the same message recurring every 120 s)

[Mon Sep 17 23:38:57 2018] INFO: task z_zvol:1049 blocked for more than 300 seconds.
[Mon Sep 17 23:38:57 2018]       Tainted: P           O     4.15.0-34-generic #37-Ubuntu
[Mon Sep 17 23:38:57 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Mon Sep 17 23:38:57 2018] z_zvol          D    0  1049      2 0x80000000
[Mon Sep 17 23:38:57 2018] Call Trace:
[Mon Sep 17 23:38:57 2018]  __schedule+0x291/0x8a0
[Mon Sep 17 23:38:57 2018]  schedule+0x2c/0x80
[Mon Sep 17 23:38:57 2018]  taskq_wait_outstanding+0x8c/0xd0 [spl]
[Mon Sep 17 23:38:57 2018]  ? wait_woken+0x80/0x80
[Mon Sep 17 23:38:57 2018]  zvol_task_cb+0x200/0x5b0 [zfs]
[Mon Sep 17 23:38:57 2018]  ? __schedule+0x299/0x8a0
[Mon Sep 17 23:38:57 2018]  taskq_thread+0x2ab/0x4e0 [spl]
[Mon Sep 17 23:38:57 2018]  ? wake_up_q+0x80/0x80
[Mon Sep 17 23:38:57 2018]  kthread+0x121/0x140
[Mon Sep 17 23:38:57 2018]  ? taskq_thread_should_stop+0x70/0x70 [spl]
[Mon Sep 17 23:38:57 2018]  ? kthread_create_worker_on_cpu+0x70/0x70
[Mon Sep 17 23:38:57 2018]  ret_from_fork+0x22/0x40

Anything I can do/provide?

@behlendorf
Contributor

@github-duran the fix for this issue was included as of the zfs-0.7.9 tag, but it has not yet made it into the Ubuntu release: https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1772412
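
One way to confirm which module is actually loaded after upgrading is the kernel log line quoted earlier in this thread:

dmesg | grep "ZFS: Loaded module"

It should report v0.7.9 or newer once the fixed packages are installed and the module has been reloaded (in practice, after a reboot).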

@github-duran

Oh thank you. Can I safely ignore it until the fix is released?

@behlendorf
Contributor

You'll need to reboot at some point to stop the warnings, but aside from that there's no risk to your data.

@saurabhnanda

I'm facing this issue in production as I type this. I'm on Ubuntu 18.04.2, which ships ZFS 0.7.5.

The transfer went smoothly until about 95 GB and has now slowed to a crawl. Is there no easy fix for this?

@saurabhnanda

If I let the transfer take its own time, will it pick up speed later?

@saurabhnanda

The status of the Ubuntu bug has been changed to "Fix Released" (https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1772412?comments=all). Does anyone know where/how the bugfix has been released, and how to patch a live system?
