Kernel panic on Ubuntu - zfs 0.6.5-1~trusty #3786

Closed
Der-Jan opened this issue Sep 16, 2015 · 19 comments

@Der-Jan

Der-Jan commented Sep 16, 2015

After updating to zfs 0.6.5 on my Ubuntu server, I get a kernel panic each morning. I think it might be related to mlocate. I've disabled it for the moment, but I never had any issues before the update.
[84255.374379] CPU: 0 PID: 15716 Comm: updatedb.mlocat Tainted: P OX 3.13.0-63-generic #103-Ubuntu
[84255.401492] Hardware name: Supermicro X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 1.0a 03/06/2012
[84255.431562] 0000000000000009 ffff88107fc03d90 ffffffff81723cc0 0000000000000000
[84255.458151] ffff88107fc03dc8 ffffffff8106785d 0000000000000001 ffff88107fc13180
[84255.485714] 00000001014041ee 0000000000000000 ffff88107fc33180 ffff88107fc03dd8
[84255.514080] Call Trace:
[84255.527143] [] dump_stack+0x45/0x56
[84255.543818] [] warn_slowpath_common+0x7d/0xa0
[84255.560648] [] warn_slowpath_null+0x1a/0x20
[84255.577231] [] native_smp_send_reschedule+0x5d/0x60
[84255.594698] [] trigger_load_balance+0x16a/0x1e0
[84255.611747] [] scheduler_tick+0xa4/0xf0
[84255.628268] [] update_process_times+0x60/0x70
[84255.644967] [] tick_sched_handle.isra.17+0x25/0x60
[84255.661908] [] tick_sched_timer+0x41/0x60
[84255.678002] [] __run_hrtimer+0x77/0x1d0
[84255.693571] [] ? tick_sched_handle.isra.17+0x60/0x60
[84255.710216] [] hrtimer_interrupt+0xef/0x230
[84255.725929] [] local_apic_timer_interrupt+0x37/0x60
[84255.742135] [] smp_apic_timer_interrupt+0x3f/0x60
[84255.757983] [] apic_timer_interrupt+0x6d/0x80
[84255.773743] [] ? panic+0x193/0x1d7
[84255.788633] [] avl_add+0x4a/0x50 [zavl]
[84255.803095] [] zfsctl_snapshot_add+0x36/0x40 [zfs]
[84255.818878] [] zfsctl_snapshot_mount+0x3ad/0x440 [zfs]
[84255.834321] [] zpl_snapdir_automount+0x10/0x30 [zfs]
[84255.849493] [] follow_managed+0x13a/0x300
[84255.863640] [] ? path_lookupat+0x73/0x790
[84255.877231] [] lookup_fast+0x18b/0x2c0
[84255.891286] [] path_lookupat+0x155/0x790
[84255.904915] [] ? putname+0x29/0x40
[84255.918328] [] ? getname_flags+0x4f/0x190
[84255.931196] [] filename_lookup+0x2b/0xc0
[84255.944405] [] user_path_at_empty+0x54/0x90
[84255.956902] [] ? from_kgid_munged+0x12/0x20
[84255.969678] [] ? read_tsc+0x9/0x20
[84255.981437] [] ? __getnstimeofday+0x3a/0xc0
[84255.994077] [] user_path_at+0x11/0x20
[84256.005926] [] SyS_chdir+0x2f/0xc0
[84256.017377] [] sysenter_dispatch+0x7/0x21

@behlendorf
Contributor

@Der-Jan thanks for reporting this. The stack trace appears to be caused by updatedb traversing into the .zfs/snapshot directories. Until we get to the root cause you can prevent this from happening by setting the snapdir property to hidden. Can you check whether this is a panic or just a warning? It's not 100% clear from the stack. Is there anything else in the console log or dmesg output?

@behlendorf behlendorf added this to the 0.7.0 milestone Sep 21, 2015
@Der-Jan
Author

Der-Jan commented Sep 22, 2015

I had .zfs in the PRUNENAMES list for updatedb, but it was probably ignored. I had the very same idea of setting snapdir to hidden, since some filesystems had it visible. For some reason I couldn't figure out, there was still a kernel panic the next morning (and yes, it is a panic, not a warning; I had to reboot via reset).
In the end I went back to 0.6.4.2-1~trusty and the kernel panic is gone again.

Here's a more complete log:

[83918.617184] Kernel panic - not syncing: avl_find() succeeded inside avl_add()
[83918.624749] CPU: 0 PID: 22125 Comm: updatedb.mlocat Tainted: P           OX 3.13.0-63-generic #103-Ubuntu
[83918.634876] Hardware name: Supermicro X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 1.0a 03/06/2012
[83918.646277]  ffff88057386bc00 ffff8805c106fbb0 ffffffff81723cc0 ffffffffa00211b0
[83918.654224]  ffff8805c106fc28 ffffffff8171cb43 0000000000000008 ffff8805c106fc38
[83918.662180]  ffff8805c106fbd8 ffff8805c106fbe8 ffff88057386bcc0 0000000000000000
[83918.670145] Call Trace:
[83918.672730]  [<ffffffff81723cc0>] dump_stack+0x45/0x56
[83918.678185]  [<ffffffff8171cb43>] panic+0xc8/0x1d7
[83918.683250]  [<ffffffffa002076a>] avl_add+0x4a/0x50 [zavl]
[83918.689103]  [<ffffffffa02890f6>] zfsctl_snapshot_add+0x36/0x40 [zfs]
[83918.695962]  [<ffffffffa028a37d>] zfsctl_snapshot_mount+0x3ad/0x440 [zfs]
[83918.703205]  [<ffffffffa02ba8d0>] zpl_snapdir_automount+0x10/0x30 [zfs]
[83918.710247]  [<ffffffff811c876a>] follow_managed+0x13a/0x300
[83918.716339]  [<ffffffff811cb563>] ? path_lookupat+0x73/0x790
[83918.722409]  [<ffffffff811c8f0b>] lookup_fast+0x18b/0x2c0
[83918.728126]  [<ffffffff811cb645>] path_lookupat+0x155/0x790
[83918.734023]  [<ffffffff811ce529>] ? putname+0x29/0x40
[83918.739366]  [<ffffffff811ce39f>] ? getname_flags+0x4f/0x190
[83918.745459]  [<ffffffff811cbcab>] filename_lookup+0x2b/0xc0
[83918.751476]  [<ffffffff811cf074>] user_path_at_empty+0x54/0x90
[83918.767030]  [<ffffffff810f44d2>] ? from_kgid_munged+0x12/0x20
[83918.781805]  [<ffffffff8101b709>] ? read_tsc+0x9/0x20
[83918.795881]  [<ffffffff810cdcda>] ? __getnstimeofday+0x3a/0xc0
[83918.811108]  [<ffffffff811cf0c1>] user_path_at+0x11/0x20
[83918.824597]  [<ffffffff811bcc6f>] SyS_chdir+0x2f/0xc0
[83918.837878]  [<ffffffff8173657a>] sysenter_dispatch+0x7/0x21
[83918.863707] ------------[ cut here ]------------
[83918.875967] WARNING: CPU: 0 PID: 22125 at /build/linux-hFNI9K/linux-3.13.0/arch/x86/kernel/smp.c:124 native_smp_send_reschedule+0x5d/0x60()
[83918.903926] Modules linked in: btrfs raid6_pq xor ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs libcrc32c nfsv3 nfsv4 iptable_filter ip_tables x_tables ipmi_devintf veth vmw_vsock_vmci_transport vsock vmw_vmci bridge stp llc joydev gpio_ich intel_rapl x86_pkg_temp_thermal intel_powerclamp dm_multipath kvm_intel scsi_dh kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw serio_raw mei_me gf128mul acpi_pad mei glue_helper ablk_helper lpc_ich cryptd ioatdma sb_edac ipmi_si edac_core shpchp mac_hid nfsd parport_pc ppdev w83627ehf hwmon_vid auth_rpcgss nfs_acl coretemp nfs lockd sunrpc lp parport fscache zfs(POX) zunicode(POX) zcommon(POX) znvpair(POX) spl(OX) zavl(POX) hid_generic usbhid igb i2c_algo_bit hid mpt2sas isci dca ahci libsas raid_class ptp psmouse libahci scsi_transport_sas pps_core
[83919.070454] CPU: 0 PID: 22125 Comm: updatedb.mlocat Tainted: P           OX 3.13.0-63-generic #103-Ubuntu
[83919.098044] Hardware name: Supermicro X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 1.0a 03/06/2012
[83919.127560]  0000000000000009 ffff88107fc03d90 ffffffff81723cc0 0000000000000000
[83919.154881]  ffff88107fc03dc8 ffffffff8106785d 0000000000000001 ffff88107fc13180
[83919.182962]  00000001013ef97b 0000000000000000 ffff88107fc33180 ffff88107fc03dd8
[83919.213130] Call Trace:
[83919.226242]  <IRQ>  [<ffffffff81723cc0>] dump_stack+0x45/0x56
[83919.244621]  [<ffffffff8106785d>] warn_slowpath_common+0x7d/0xa0
[83919.261678]  [<ffffffff8106793a>] warn_slowpath_null+0x1a/0x20
[83919.278589]  [<ffffffff8104085d>] native_smp_send_reschedule+0x5d/0x60
[83919.296125]  [<ffffffff810a84fa>] trigger_load_balance+0x16a/0x1e0
[83919.313292]  [<ffffffff810997b4>] scheduler_tick+0xa4/0xf0
[83919.329563]  [<ffffffff81076500>] update_process_times+0x60/0x70
[83919.346118]  [<ffffffff810d66a5>] tick_sched_handle.isra.17+0x25/0x60
[83919.363044]  [<ffffffff810d6721>] tick_sched_timer+0x41/0x60
[83919.379004]  [<ffffffff8108e9e7>] __run_hrtimer+0x77/0x1d0
[83919.394458]  [<ffffffff810d66e0>] ? tick_sched_handle.isra.17+0x60/0x60
[83919.410981]  [<ffffffff8108f1af>] hrtimer_interrupt+0xef/0x230
[83919.426545]  [<ffffffff81043727>] local_apic_timer_interrupt+0x37/0x60
[83919.442651]  [<ffffffff81736c4f>] smp_apic_timer_interrupt+0x3f/0x60
[83919.458966]  [<ffffffff817355dd>] apic_timer_interrupt+0x6d/0x80
[83919.474278]  <EOI>  [<ffffffff8171cc0e>] ? panic+0x193/0x1d7
[83919.489180]  [<ffffffffa002076a>] avl_add+0x4a/0x50 [zavl]
[83919.504191]  [<ffffffffa02890f6>] zfsctl_snapshot_add+0x36/0x40 [zfs]
[83919.519482]  [<ffffffffa028a37d>] zfsctl_snapshot_mount+0x3ad/0x440 [zfs]
[83919.535221]  [<ffffffffa02ba8d0>] zpl_snapdir_automount+0x10/0x30 [zfs]
[83919.550839]  [<ffffffff811c876a>] follow_managed+0x13a/0x300
[83919.565027]  [<ffffffff811cb563>] ? path_lookupat+0x73/0x790
[83919.578807]  [<ffffffff811c8f0b>] lookup_fast+0x18b/0x2c0
[83919.592321]  [<ffffffff811cb645>] path_lookupat+0x155/0x790
[83919.605669]  [<ffffffff811ce529>] ? putname+0x29/0x40
[83919.618333]  [<ffffffff811ce39f>] ? getname_flags+0x4f/0x190
[83919.631287]  [<ffffffff811cbcab>] filename_lookup+0x2b/0xc0
[83919.643887]  [<ffffffff811cf074>] user_path_at_empty+0x54/0x90
[83919.656655]  [<ffffffff810f44d2>] ? from_kgid_munged+0x12/0x20
[83919.669793]  [<ffffffff8101b709>] ? read_tsc+0x9/0x20
[83919.681322]  [<ffffffff810cdcda>] ? __getnstimeofday+0x3a/0xc0
[83919.693806]  [<ffffffff811cf0c1>] user_path_at+0x11/0x20
[83919.707177]  [<ffffffff811bcc6f>] SyS_chdir+0x2f/0xc0
[83919.720702]  [<ffffffff8173657a>] sysenter_dispatch+0x7/0x21
[83919.734587] ---[ end trace cf59f1be0bb8cbe1 ]---

@behlendorf
Contributor

@Der-Jan this almost certainly was accidentally introduced by 278bee9, we'll get it sorted in a point release.

@behlendorf behlendorf modified the milestones: 0.6.5.2, 0.7.0 Sep 23, 2015
@behlendorf behlendorf modified the milestones: 0.6.5.3, 0.6.5.2 Sep 25, 2015
@dgtangman

Same problem on my Ubuntu 14.04 desktop with ZFS 0.6.5.1-1~trusty installed. Added PRUNENAMES=".zfs" to /etc/updatedb.conf and updatedb ran to completion for the first time in a while. @Der-Jan you might want to check whether the PRUNENAMES line in your updatedb.conf is commented out (i.e., has a leading "#"); it was commented out in the default configuration on my system, and that would explain why adding .zfs didn't fix your problem.

I can also confirm that setting snapdir=hidden does not fix the problem.

behlendorf added a commit to behlendorf/zfs that referenced this issue Sep 26, 2015
When multiple processes simultaneously traverse into a .zfs snapshot
directory they each may trigger an auto-mount of that snapshot.  Since
each of these mounts may succeed that will result in multiple entries
in the mounted snapshot tree triggering a panic in avl_add().

One solution to this issue is to hold the zfs_snapshot_lock over the
tree lookup and subsequent mount to serialize the process.  This has
the advantage of being simple and easily understandable.  It has the
disadvantage that it means the zpl_mount() function must never take
the zfs_snapshot_lock.  But in practice this never happens and is
easy to prevent.

A technically superior approach would be to add a dummy entry to the
snapshot during the mount.  But this would add significant complexity
because numerous places in the code would need to be updated to
understand dummy entries.  While not optimal this simpler solution
is preferable for a point release.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#3786
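
The serialization idea in the commit message above can be illustrated in user space. This is only a sketch under stated assumptions, not the ZFS code: `struct snap_table`, `snap_lookup_locked()` and `snap_automount()` are hypothetical names, a pthread mutex stands in for zfs_snapshot_lock, and a small array stands in for the mounted snapshot tree. The point it shows is that the lookup and the insert happen under one continuous lock hold, so a second caller cannot slip in between them.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_SNAPS 16

/* Hypothetical stand-in for the mounted-snapshot tree: a small array of ids. */
struct snap_table {
    pthread_mutex_t lock;          /* plays the role of zfs_snapshot_lock */
    unsigned long ids[MAX_SNAPS];
    size_t count;
    unsigned long mounts;          /* how many times the "mount" actually ran */
};

#define SNAP_TABLE_INIT { PTHREAD_MUTEX_INITIALIZER, {0}, 0, 0 }

/* Caller must hold t->lock. Returns 1 if id is already recorded. */
static int snap_lookup_locked(const struct snap_table *t, unsigned long id)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->ids[i] == id)
            return 1;
    return 0;
}

/*
 * Look up the snapshot and, if it is missing, "mount" and record it while
 * holding the lock the whole time. Returns 1 if this call did the mount,
 * 0 if the snapshot was already mounted, -1 if the table is full.
 */
static int snap_automount(struct snap_table *t, unsigned long id)
{
    int rc = 0;

    pthread_mutex_lock(&t->lock);
    assert(t->count <= MAX_SNAPS);
    if (!snap_lookup_locked(t, id)) {
        if (t->count == MAX_SNAPS) {
            rc = -1;
        } else {
            t->mounts++;           /* the real mount would happen here */
            t->ids[t->count++] = id;
            rc = 1;
        }
    }
    pthread_mutex_unlock(&t->lock);
    return rc;
}
```

Calling `snap_automount()` twice with the same id performs the mount once and reports the second call as already mounted. The trade-off named in the commit message carries over: nothing called while the lock is held may try to take the same lock again.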
@Der-Jan
Author

Der-Jan commented Sep 28, 2015

@dgtangman PRUNENAMES was not commented out, even apt-get remove didn't help. I wonder if one of the Linux Containers was causing this (didn't go through all of them).

@behlendorf
Contributor

@Der-Jan would it be possible for you to verify the fix in #3842? I haven't had any luck intentionally reproducing the issue but this patch should close the race.

@Der-Jan
Author

Der-Jan commented Sep 29, 2015

@behlendorf didn't really do the trick:

[ 2191.821290] Kernel panic - not syncing: avl_find() succeeded inside avl_add()
[ 2191.828835] CPU: 9 PID: 18904 Comm: updatedb.mlocat Tainted: P           OX 3.13.0-63-generic #103-Ubuntu
[ 2191.838948] Hardware name: Supermicro X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 1.0a 03/06/2012
[ 2191.850345]  ffff880d3ca44a20 ffff880d43871bb0 ffffffff81723cc0 ffffffffa002b1b0
[ 2191.858300]  ffff880d43871c28 ffffffff8171cb43 0000000000000008 ffff880d43871c38
[ 2191.866264]  ffff880d43871bd8 ffff880d43871be8 ffff880d38d88060 0000000000000000
[ 2191.874187] Call Trace:
[ 2191.876766]  [<ffffffff81723cc0>] dump_stack+0x45/0x56
[ 2191.882194]  [<ffffffff8171cb43>] panic+0xc8/0x1d7
[ 2191.887251]  [<ffffffffa002a76a>] avl_add+0x4a/0x50 [zavl]
[ 2191.893074]  [<ffffffffa02720f6>] zfsctl_snapshot_add+0x36/0x40 [zfs]
[ 2191.899886]  [<ffffffffa027333a>] zfsctl_snapshot_mount+0x36a/0x3d0 [zfs]
[ 2191.907068]  [<ffffffffa02a3860>] zpl_snapdir_automount+0x10/0x30 [zfs]
[ 2191.914129]  [<ffffffff811c876a>] follow_managed+0x13a/0x300
[ 2191.920101]  [<ffffffff811cb563>] ? path_lookupat+0x73/0x790
[ 2191.926188]  [<ffffffff811c8f0b>] lookup_fast+0x18b/0x2c0
[ 2191.931883]  [<ffffffff811cb645>] path_lookupat+0x155/0x790
[ 2191.937770]  [<ffffffff811ce529>] ? putname+0x29/0x40
[ 2191.943100]  [<ffffffff811ce39f>] ? getname_flags+0x4f/0x190
[ 2191.949072]  [<ffffffff811cbcab>] filename_lookup+0x2b/0xc0
[ 2191.954953]  [<ffffffff811cf074>] user_path_at_empty+0x54/0x90
[ 2191.969131]  [<ffffffff810f44d2>] ? from_kgid_munged+0x12/0x20
[ 2191.983343]  [<ffffffff8101b709>] ? read_tsc+0x9/0x20
[ 2191.996945]  [<ffffffff810cdcda>] ? __getnstimeofday+0x3a/0xc0
[ 2192.011176]  [<ffffffff811cf0c1>] user_path_at+0x11/0x20
[ 2192.024405]  [<ffffffff811bcc6f>] SyS_chdir+0x2f/0xc0
[ 2192.037708]  [<ffffffff8173657a>] sysenter_dispatch+0x7/0x21
[ 2192.063481] ------------[ cut here ]------------
[ 2192.075484] WARNING: CPU: 9 PID: 18904 at /build/linux-hFNI9K/linux-3.13.0/arch/x86/kernel/smp.c:124 native_smp_send_reschedule+0x5d/0x60()
[ 2192.103076] Modules linked in: iptable_filter ip_tables x_tables ipmi_devintf veth vmw_vsock_vmci_transport vsock vmw_vmci bridge stp llc intel_rapl gpio_ich x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm joydev crct10dif_pclmul crc32_pclmul dm_multipath ghash_clmulni_intel aesni_intel aes_x86_64 scsi_dh lrw gf128mul glue_helper ablk_helper cryptd shpchp serio_raw sb_edac ioatdma edac_core mei_me mei lpc_ich ipmi_si acpi_pad mac_hid parport_pc ppdev nfsd w83627ehf hwmon_vid coretemp auth_rpcgss nfs_acl nfs lp parport lockd sunrpc fscache zfs(POX) zunicode(POX) zcommon(POX) znvpair(POX) spl(OX) zavl(POX) hid_generic igb usbhid i2c_algo_bit hid mpt2sas isci dca libsas raid_class ahci ptp psmouse libahci scsi_transport_sas pps_core
[ 2192.254064] CPU: 9 PID: 18904 Comm: updatedb.mlocat Tainted: P           OX 3.13.0-63-generic #103-Ubuntu
[ 2192.281222] Hardware name: Supermicro X9SRE/X9SRE-3F/X9SRi/X9SRi-3F/X9SRE/X9SRE-3F/X9SRi/X9SRi-3F, BIOS 1.0a 03/06/2012
[ 2192.310391]  0000000000000009 ffff88107fd23d90 ffffffff81723cc0 0000000000000000
[ 2192.337156]  ffff88107fd23dc8 ffffffff8106785d 0000000000000000 ffff88107fd33180
[ 2192.364885]  0000000100073736 0000000000000009 ffff88107fc13180 ffff88107fd23dd8
[ 2192.393482] Call Trace:
[ 2192.406613]  <IRQ>  [<ffffffff81723cc0>] dump_stack+0x45/0x56
[ 2192.423491]  [<ffffffff8106785d>] warn_slowpath_common+0x7d/0xa0
[ 2192.440321]  [<ffffffff8106793a>] warn_slowpath_null+0x1a/0x20
[ 2192.456995]  [<ffffffff8104085d>] native_smp_send_reschedule+0x5d/0x60
[ 2192.474453]  [<ffffffff810a84fa>] trigger_load_balance+0x16a/0x1e0
[ 2192.491590]  [<ffffffff810997b4>] scheduler_tick+0xa4/0xf0
[ 2192.507837]  [<ffffffff81076500>] update_process_times+0x60/0x70
[ 2192.524544]  [<ffffffff810d66a5>] tick_sched_handle.isra.17+0x25/0x60
[ 2192.541505]  [<ffffffff810d6721>] tick_sched_timer+0x41/0x60
[ 2192.557423]  [<ffffffff8108e9e7>] __run_hrtimer+0x77/0x1d0
[ 2192.572940]  [<ffffffff810d66e0>] ? tick_sched_handle.isra.17+0x60/0x60
[ 2192.589680]  [<ffffffff8108f1af>] hrtimer_interrupt+0xef/0x230
[ 2192.605268]  [<ffffffff81043727>] local_apic_timer_interrupt+0x37/0x60
[ 2192.621385]  [<ffffffff81736c4f>] smp_apic_timer_interrupt+0x3f/0x60
[ 2192.637162]  [<ffffffff817355dd>] apic_timer_interrupt+0x6d/0x80
[ 2192.652383]  <EOI>  [<ffffffff8171cc0e>] ? panic+0x193/0x1d7
[ 2192.668266]  [<ffffffffa002a76a>] avl_add+0x4a/0x50 [zavl]
[ 2192.682737]  [<ffffffffa02720f6>] zfsctl_snapshot_add+0x36/0x40 [zfs]
[ 2192.698363]  [<ffffffffa027333a>] zfsctl_snapshot_mount+0x36a/0x3d0 [zfs]
[ 2192.713967]  [<ffffffffa02a3860>] zpl_snapdir_automount+0x10/0x30 [zfs]
[ 2192.729354]  [<ffffffff811c876a>] follow_managed+0x13a/0x300
[ 2192.743872]  [<ffffffff811cb563>] ? path_lookupat+0x73/0x790
[ 2192.757829]  [<ffffffff811c8f0b>] lookup_fast+0x18b/0x2c0
[ 2192.771138]  [<ffffffff811cb645>] path_lookupat+0x155/0x790
[ 2192.784700]  [<ffffffff811ce529>] ? putname+0x29/0x40
[ 2192.797294]  [<ffffffff811ce39f>] ? getname_flags+0x4f/0x190
[ 2192.810289]  [<ffffffff811cbcab>] filename_lookup+0x2b/0xc0
[ 2192.822948]  [<ffffffff811cf074>] user_path_at_empty+0x54/0x90
[ 2192.835362]  [<ffffffff810f44d2>] ? from_kgid_munged+0x12/0x20
[ 2192.848136]  [<ffffffff8101b709>] ? read_tsc+0x9/0x20
[ 2192.859998]  [<ffffffff810cdcda>] ? __getnstimeofday+0x3a/0xc0
[ 2192.872522]  [<ffffffff811cf0c1>] user_path_at+0x11/0x20
[ 2192.884301]  [<ffffffff811bcc6f>] SyS_chdir+0x2f/0xc0
[ 2192.895911]  [<ffffffff8173657a>] sysenter_dispatch+0x7/0x21
[ 2192.908314] ---[ end trace 28f4770bb0af8b6b ]---

@behlendorf
Contributor

@Der-Jan OK, thanks for the quick turnaround. I'll work on reproducing it locally.

@ltaulell

Confirmed on Debian Wheezy and Jessie, with up-to-date kernels and spl/zfs modules (3.2.68-1+deb7u4 to 4.1.6-1~bpo8+1, zfs 0.6.5.1-4)

Kernel Panic, avl_find() succeeded inside avl_add() - updatedb.mlocat

Workaround:
add zfs to PRUNEFS in /etc/updatedb.conf
or set snapdir/dev to hidden (but a mount can still occur while an updatedb process is running)

behlendorf added a commit to behlendorf/zfs that referenced this issue Oct 5, 2015
There exists a race where the kernel will auto-mount a snapshot in
two different namespaces.  This can result in the zfs_snapentry_t
being added to the snapshot AVL trees twice.  In order to prevent
the panic, check if it exists in the tree before inserting it.

Longer term to correctly handle multiple mounts in different
namespaces we may need to keep a list of all valid mount points
for each entry.  Otherwise we run the risk of them not automatically
unmounting.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#3786
Issue openzfs#3887
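
The "check if it exists in the tree before inserting it" approach from the commit message above can be sketched as an insert-if-absent operation. This is a minimal illustration, not the ZFS implementation: `struct id_set`, `id_set_find()` and `id_set_add_unique()` are hypothetical names, and a sorted array of ids stands in for the snapshot AVL tree. A duplicate is reported to the caller instead of being inserted, which is the case that would otherwise trip the avl_add() panic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 16

/* Hypothetical sorted set of ids, standing in for the AVL tree. */
struct id_set {
    unsigned long ids[MAX_ENTRIES];
    size_t count;
};

/* Binary search. Returns 1 if found; *pos is where id is or belongs. */
static int id_set_find(const struct id_set *s, unsigned long id, size_t *pos)
{
    size_t lo = 0, hi = s->count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (s->ids[mid] < id)
            lo = mid + 1;
        else
            hi = mid;
    }
    *pos = lo;
    return lo < s->count && s->ids[lo] == id;
}

/*
 * Insert id unless it is already present. Returns 1 if inserted, 0 if it
 * was a duplicate, and -1 if the set is full.
 */
static int id_set_add_unique(struct id_set *s, unsigned long id)
{
    size_t pos;

    assert(s->count <= MAX_ENTRIES);
    if (id_set_find(s, id, &pos))
        return 0;
    if (s->count == MAX_ENTRIES)
        return -1;
    /* Shift the tail up by one slot to keep the array sorted. */
    memmove(&s->ids[pos + 1], &s->ids[pos],
        (s->count - pos) * sizeof(s->ids[0]));
    s->ids[pos] = id;
    s->count++;
    return 1;
}
```

On its own this check is only safe if the find and the insert cannot be interleaved with another caller, so in concurrent code it would still need to run under a lock.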
@behlendorf
Contributor

@Der-Jan @ltaulell I have a possible explanation for this but it would require you to have multiple pools imported on your system. Is this the case?

@behlendorf behlendorf modified the milestones: 0.6.5.4, 0.6.5.3 Oct 15, 2015
@Der-Jan
Author

Der-Jan commented Oct 15, 2015

@behlendorf Indeed - there's rpool and tank as pools

@ltaulell

@behlendorf Yep, multiple pools (3 to 8, 1 pool by DAS brick) on 2 servers.

@behlendorf
Contributor

OK, then this makes sense. Basically the objset id tree needs to be per-spa because the objset ids aren't unique across pools. Thanks for the verification.

@tuxoko
Contributor

tuxoko commented Oct 23, 2015

@behlendorf
I just looked into this because it seems to be coming up very often.
This is my take on it. Basically, nothing prevents two processes from entering zfsctl_snapshot_mount simultaneously, so the two threads will both call avl_add with the same snapshot and cause this panic.

Edit: I can reliably reproduce this with the following program:

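#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Each child enters the snapshot directory, which triggers the automount. */
static void child_func(void)
{
    char path[64];

    if (chdir("/pp/fs0/.zfs/snapshot/a00") != 0) {
        perror("chdir");
        exit(1);
    }
    if (getcwd(path, sizeof(path)) != NULL)
        printf("%s\n", path);
    if (chdir("/") != 0)
        perror("chdir");
    printf("exit\n");
    exit(0);
}

int main(int argc, char *argv[])
{
    int i;
    pid_t pid;

    /* Fork four children so they race into zfsctl_snapshot_mount. */
    for (i = 0; i < 4; i++) {
        pid = fork();
        if (pid == 0)
            child_func();
    }
    /* Reap all children instead of sleeping for a fixed time. */
    while (wait(NULL) > 0)
        ;
    return 0;
}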

@ltaulell

Just bumping into another "set case" with same consequences:

kernel:[ 213.949943] Kernel panic - not syncing: avl_find() succeeded inside avl_add()

wheezy stock: 3.2.68-1+deb7u3 + zfs dkms 0.6.5.2-2-wheezy
single pool (only one pool on this one), many zfs(nfs) exports, few rolling snapshots (7 by zfs).
snapdir/dev already hidden, and mlocate config to ignore filesystem AND filesystem type.

Trying to upgrade the kernel, as the machine doesn't last more than 2 minutes before panicking again.

EDIT: related to nfs exports...
wheezy backports, 3.16.7-ckt11-1+deb8u4~bpo70+1, zfs dkms 0.6.5.2-2-wheezy
hangs without any message about 2 minutes after nfs-kernel-server starts

EDIT: my bad

There WAS one last zfs export with snapdir=visible set. Switching it back to snapdir=hidden "seems" to do the job for now (48h running under production conditions without any hang).

Sorry for the inconvenience.

@behlendorf
Contributor

@tuxoko thanks for digging into this. The fix you proposed in #4018 looks right to me and I think it'll address the common case people are seeing. There's still a much less likely second case which can occur, but it requires the system to have multiple pools mounted and datasets with identical objset ids to be auto-mounted concurrently. We can tackle that one in a follow-up patch.

behlendorf pushed a commit to behlendorf/zfs that referenced this issue Dec 9, 2015
objsetid is not unique across pools, so using it as the sole key would cause
a panic when automounting two snapshots on different pools with the same
objsetid. We fix this by adding the spa pointer as an additional key.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Issue openzfs#3948
Issue openzfs#3786
Issue openzfs#3887
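
The composite key described in the commit message above can be sketched as a comparison function. This is a simplified illustration, not the code from the patch: `struct snap_key` and `snap_key_compare()` are hypothetical names, and the pool is represented by an opaque pointer. The comparator orders entries first by pool and then by objsetid, so two snapshots with the same objsetid on different pools no longer compare as equal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified entry: only the two key fields are modelled. */
struct snap_key {
    const void *spa;       /* identifies the pool; opaque here */
    uint64_t objsetid;     /* unique only within one pool */
};

/* Three-way compare on (spa, objsetid); returns -1, 0 or 1. */
static int snap_key_compare(const struct snap_key *a, const struct snap_key *b)
{
    uintptr_t pa, pb;

    assert(a != NULL && b != NULL);
    pa = (uintptr_t)a->spa;
    pb = (uintptr_t)b->spa;

    if (pa != pb)
        return pa < pb ? -1 : 1;
    if (a->objsetid != b->objsetid)
        return a->objsetid < b->objsetid ? -1 : 1;
    return 0;
}
```

With objsetid as the only key, entries for two different pools with objsetid 21 would collide. With the pool included they are distinct, while two lookups for the same pool and objsetid still match.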
@behlendorf
Contributor

This should now be addressed in master and these fixes will appear in the next point release. @tuxoko provided patches for both known ways this could occur. If people impacted by this issue have time to independently verify the fix which was applied to master that would be appreciated.

24ef51f Use spa as key besides objsetid for snapentry
d287880 Fix snapshot automount behavior when concurrent or fail

@Der-Jan
Author

Der-Jan commented Dec 9, 2015

@behlendorf I've applied the patches and installed the modules. Since I made sure mlocate won't access any zfs snapshots, I cannot really test it with mlocate. However, I ran @tuxoko's parallel mount program without a kernel panic, so it seems to do the trick for me.

tuxoko pushed a commit to tuxoko/zfs that referenced this issue Dec 14, 2015
nedbass pushed a commit that referenced this issue Dec 24, 2015
@behlendorf
Contributor

This has been resolved in master and the release branch.

ryao pushed a commit to ryao/zfs that referenced this issue Jan 4, 2016
goulvenriou pushed a commit to Alyseo/zfs that referenced this issue Jan 17, 2016