access to .zfs directory via samba causes smbd NULL deref #626
Comments
The same failure can be seen with Ubuntu 11.10 kernel 3.0.0-17. [ 234.089112] BUG: unable to handle kernel NULL pointer dereference at 000000000000005f
Are you only hitting this issue when using smbd? Or are you also able to hit it simply by manually traversing into the directories from the shell?
I tried to access/traverse the .zfs directories locally from the shell, no problems at all. Access from another Linux machine through smbd works too. Only access from Windows causes smbd to fail. Maybe Windows is trying to write to .zfs or read some attributes?
I updated my openSUSE 12.1 to kernel 3.3.0-1 and Samba version 3.6.3, and rebuilt spl and zfs .56-rc8. Same problem when accessing the .zfs directory from Windows: smbd fails with a NULL deref. [ 1553.415448] BUG: unable to handle kernel NULL pointer dereference at 000000000000005f
Finally, traversing the directory tree with find shows a problem: find / -name test.txt
Thanks for following up with additional detail. It's surprising that only a Windows client hits the first issue, since smbd is just a userspace process like everything else. It must be doing something slightly different. The second problem with find is an unrelated issue we'll need to fix as well.
I'm sure you know that with Solaris ZFS and Samba, Windows can access shares and their snapshots via "recent versions" in Windows Explorer. This needs some lines of code for Samba, which were written some time ago (for Samba 3.4.3), but I can't adapt them for the current Samba version 3.6.3. Access to the .zfs directory via Samba and Windows is a useful workaround and makes life easier. I hope you guys can solve this issue. Maybe there is someone who can adapt the samba-zfs patches (see the links) for the current Samba version and integrate them into ZOL. Issue #621 is maybe a good starting point. This would be a huge advantage for ZOL on Linux servers and the huge community of Windows users. http://www.edplese.com/blog/2009/12/02/samba-shadow_copy2-enhancements/
Well, #621 doesn't touch Samba at all. It only uses the 'net' command to add/delete shares, nothing else, so I fail to see how that could solve the problem.
Dunno if it's the same problem or not, but I was getting a similar NULL deref issues during rsync backups of my zfs /export filesystem. rsync died so the backup didn't complete. I figured out how to avoid triggering the bug by excluding .zfs from rsync (on both src and dest) as described here: http://blog.taz.net.au/2012/04/01/rsync-and-zfs-snapshot-directories/ (which is what i want to do anyway, bug or no bug) system is running debian sid, kernel package linux-image-3.2.0-2-amd64 , zfs ubuntu-ppa 0.6.0.55 (recompiled for debian), linux 3.2. Apr 1 09:39:08 ganesh kernel: [593708.881748] BUG: unable to handle kernel NULL pointer dereference at 000000000000005f Apr 1 09:39:08 ganesh kernel: [593708.881763] IP: [] follow_managed+0x19a/0x1fb Apr 1 09:39:08 ganesh kernel: [593708.881779] PGD 24c2a2067 PUD 23d86f067 PMD 0 Apr 1 09:39:08 ganesh kernel: [593708.881790] Oops: 0000 [#9] SMP Apr 1 09:39:08 ganesh kernel: [593708.881797] CPU 5 Apr 1 09:39:08 ganesh kernel: [593708.881801] Modules linked in: nfnetlink_log ipt_MASQUERADE xt_CHECKSUM iptable_mangle nf_conntrack_netlink nfnetlink xt_comment xt_pkttype xt_recent ipt_REDIRECT xt_tcpudp xt_multiport xt_state ipt_REJECT ipt_LOG iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables sch_sfq cls_u32 sch_cbq pppoe pppox ppp_generic slhc ipt_ULOG powernow_k8 mperf cpufreq_stats cpufreq_userspace cpufreq_powersave cpufreq_conservative parport_pc ppdev lp parport bnep rfcomm bluetooth binfmt_misc uinput deflate ctr twofish_generic twofish_x86_64_3way twofish_x86_64 twofish_common camellia serpent blowfish_generic blowfish_x86_64 blowfish_common cast5 des_generic cbc cryptd aes_x86_64 aes_generic xcbc rmd160 sha512_generic sha256_generic sha1_generic hmac crypto_null af_key fuse nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc bridge stp ext4 crc16 jbd2 mbcache virtio_balloon virtio_pci virtio_ring virtio it87 hwmon Apr 1 09:39:08 ganesh 
kernel: _vid snd_usb_audio snd_usbmidi_lib xt_mac x_tables tun kvm_amd kvm snd_hda_codec_hdmi zfs(P) zunicode(P) zavl(P) zcommon(P) znvpair(P) spl(O) mt2060 ir_lirc_codec lirc_dev ir_mce_kbd_decoder snd_hda_codec_realtek ir_sony_decoder ir_jvc_decoder ir_rc6_decoder ir_rc5_decoder dvb_usb_dib0700 dib8000 dib7000m dib0090 dib0070 dib7000p dib3000mc dibx000_common dvb_usb ir_nec_decoder snd_hda_intel dvb_core rc_core snd_pcm_oss snd_hda_codec snd_mixer_oss snd_hwdep snd_seq_midi snd_pcm snd_seq_midi_event snd_page_alloc snd_rawmidi joydev snd_seq snd_seq_device snd_timer snd eeepc_wmi psmouse asus_wmi sp5100_tco sparse_keymap rfkill i2c_piix4 evdev serio_raw soundcore pcspkr k10temp mxm_wmi edac_mce_amd edac_core wmi button processor xfs btrfs crc32c libcrc32c zlib_deflate dm_mod thermal fan thermal_sys raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 md_mod sata_mv usbhid hid usb_storage uas sd_mod crc_t10dif nvidia(P) uhci_hcd ohci_hcd firewire_ohci firewi Apr 1 09:39:08 ganesh kernel: re_core crc_itu_t r8169 mii ahci libahci ehci_hcd mpt2sas raid_class scsi_transport_sas libata xhci_hcd i2c_core usbcore scsi_mod usb_common [last unloaded: scsi_wait_scan] Apr 1 09:39:08 ganesh kernel: [593708.882133] Apr 1 09:39:08 ganesh kernel: [593708.882139] Pid: 30254, comm: rsync Tainted: P D O 3.2.0-2-amd64 #1 To be filled by O.E.M. 
To be filled by O.E.M./SABERTOOTH 990FX Apr 1 09:39:08 ganesh kernel: [593708.882154] RIP: 0010:[] [] follow_managed+0x19a/0x1fb Apr 1 09:39:08 ganesh kernel: [593708.882166] RSP: 0018:ffff880186923ca8 EFLAGS: 00010292 Apr 1 09:39:08 ganesh kernel: [593708.882172] RAX: 0000000000000100 RBX: ffff880186923e00 RCX: 0000000000160015 Apr 1 09:39:08 ganesh kernel: [593708.882179] RDX: 0000000000000000 RSI: 0000000000000101 RDI: 000000000000005f Apr 1 09:39:08 ganesh kernel: [593708.882186] RBP: ffff880186923e00 R08: ffff88030ce312c0 R09: 0000000000000000 Apr 1 09:39:08 ganesh kernel: [593708.882193] R10: ffff8803e8eb2880 R11: ffff8803e8eb2880 R12: 0000000000000000 Apr 1 09:39:08 ganesh kernel: [593708.882200] R13: ffff88018af9d160 R14: ffff88041aee12c0 R15: 0000000000000000 Apr 1 09:39:08 ganesh kernel: [593708.882209] FS: 00007ffff7fb2700(0000) GS:ffff88042fd40000(0000) knlGS:00000000f7ca4b70 Apr 1 09:39:08 ganesh kernel: [593708.882216] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Apr 1 09:39:08 ganesh kernel: [593708.882222] CR2: 000000000000005f CR3: 000000017b072000 CR4: 00000000000006e0 Apr 1 09:39:08 ganesh kernel: [593708.882230] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Apr 1 09:39:08 ganesh kernel: [593708.882237] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Apr 1 09:39:08 ganesh kernel: [593708.882245] Process rsync (pid: 30254, threadinfo ffff880186922000, task ffff88018af9d160) Apr 1 09:39:08 ganesh kernel: [593708.882251] Stack: Apr 1 09:39:08 ganesh kernel: [593708.882255] ffff8802fd969480 ffffffff811090d3 ffff88030ce312c0 000001000000005f Apr 1 09:39:08 ganesh kernel: [593708.882268] 000000000000005f ffff880186923e68 ffff880186923e00 ffff880186923e78 Apr 1 09:39:08 ganesh kernel: [593708.882279] ffff8802fd969480 ffff88041aee12c0 000000000000005f ffffffff81101dc8 Apr 1 09:39:08 ganesh kernel: [593708.882290] Call Trace: Apr 1 09:39:08 ganesh kernel: [593708.882301] [] ? 
dput+0x27/0xee Apr 1 09:39:08 ganesh kernel: [593708.882310] [] ? walk_component+0x2d4/0x406 Apr 1 09:39:08 ganesh kernel: [593708.882320] [] ? do_last+0x108/0x58d Apr 1 09:39:08 ganesh kernel: [593708.882329] [] ? path_openat+0xce/0x32a Apr 1 09:39:08 ganesh kernel: [593708.882352] [] ? tsd_hash_search+0x78/0x146 [spl] Apr 1 09:39:08 ganesh kernel: [593708.882362] [] ? do_filp_open+0x2a/0x6e Apr 1 09:39:08 ganesh kernel: [593708.882372] [] ? _cond_resched+0x7/0x1c Apr 1 09:39:08 ganesh kernel: [593708.882382] [] ? __strncpy_from_user+0x18/0x48 Apr 1 09:39:08 ganesh kernel: [593708.882391] [] ? alloc_fd+0x64/0x109 Apr 1 09:39:08 ganesh kernel: [593708.882400] [] ? do_sys_open+0x5e/0xe5 Apr 1 09:39:08 ganesh kernel: [593708.882409] [] ? system_call_fastpath+0x16/0x1b Apr 1 09:39:08 ganesh kernel: [593708.882415] Code: f0 89 c2 74 6e 85 c0 75 1a 48 89 df e8 68 fd ff ff 48 89 2b 48 8b 7d 20 e8 c3 fa ff ff 48 89 43 08 eb 50 85 d2 78 14 48 8b 7b 08 <8b> 07 89 c5 81 e5 00 00 07 00 0f 85 8f fe ff ff 45 84 e4 74 15 Apr 1 09:39:08 ganesh kernel: [593708.882489] RIP [] follow_managed+0x19a/0x1fb Apr 1 09:39:08 ganesh kernel: [593708.882498] RSP Apr 1 09:39:08 ganesh kernel: [593708.882502] CR2: 000000000000005f Apr 1 09:39:08 ganesh kernel: [593708.882508] ---[ end trace c8e4fc1a7428f71a ]--- |
It sure does look similar. A slightly different call path but probably the same issue. As an aside, you can set the .zfs directory to be hidden so things like rsync won't try to traverse into it. It just won't appear in the directory listing, but you can still always manually change into that directory. zfs set snapdir=hidden tank/fish
thanks for the tip. i should have remembered that. i've added it to my blog page.
I'm seeing the same thing in my VM when I use bash completion of a snapshot directory from a level up.
@rlaager If you're able to consistently reproduce this, can you post the exact command? If I could reproduce it, that would help me considerably in debugging it.
@behlendorf: zfs snapshot rpool/srv@test; cd /srv/.zfs/snapshot; cd test It seems that any access to the snapshot filesystem leads to the BUG dump. But it's only the first access to the snapshot filesystem that causes this.
I am seeing the same smbd NULL dereference bug when accessing, from Windows XP, a kernel 3.5.2 machine running Samba 3.6.6 with the rc10 release of ZOL.
I just ran into the same issue today. When accessing the .zfs folder from a Windows 7 computer, this error shows up in syslog a few times. However, there is an added twist: when I reboot, the volume in question doesn't automount. (I use PPA/Daily with mountall on Ubuntu Server 12.04.1.) I tried this 3 times to be sure it's not a fluke. Access .zfs, wait for the log to fill up a bit, sudo reboot .. file system isn't automounted. I hope someone is able to reproduce this as well.. I don't see how the two things are related tbh.. Regards Phoenixxl. PS: Admit it! You devs all love it when people report weird shit like this.
Please see my comment in #947, except that I get an SPL panic on a simple bash "cd" command completion.
While running zfs/spl in debug mode, I caught a VERIFY failure today when accessing .zfs from Windows 7 via a Samba share. Maybe this helps to locate the problem. [ 106.252194] SPLError: 4520:0:(zpl_ctldir.c:423:zpl_shares_lookup()) VERIFY3(error <= 0) failed (95 <= 0)
I tested this patch and can confirm that there are no more stack traces/bugs when accessing the .zfs directory from Samba. The error when running a simple find over the snapshot directories is still there. The command: find / -name foo.txt This is a short "How to use ZFS snapshots with Samba" (> 3.6.7) in Windows 7. Assuming that the snapshots in ZFS are created with this schema: tank/shared@AutoH-2012-10-14T19:00 then we can add a few lines to the Samba config file (smb.conf): [global] [shared] With that done, Windows 7 will show the share (shared) and the snapshots under "recent versions", which can be accessed
Otherwise it will cause zpl_shares_lookup() to return an invalid pointer when an error occurs. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Yuxuan Shui <yshuiv7@gmail.com> Closes openzfs#626 openzfs#885 openzfs#947 openzfs#977
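To make the "invalid pointer" in the commit message concrete, here is a minimal userspace sketch. It assumes the positive error value 95 from the VERIFY3 message above was encoded into a pointer unchanged; note that 95 decimal is 0x5f, the same value as the faulting address in the oops reports in this thread. The helpers err_ptr(), is_err(), and failed_lookup() are hypothetical stand-ins modeled on the kernel's ERR_PTR()/IS_ERR() convention, not the actual ZFS or kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace re-creation of the Linux error-pointer convention
 * (illustration only; the real helpers live in include/linux/err.h). */
#define MAX_ERRNO 4095

/* Encode an error code as a pointer value. */
static inline void *err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Only the top MAX_ERRNO addresses are recognized as encoded errors,
 * which is why error codes must be negative. */
static inline bool is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical model of a lookup that fails with `error`. */
static inline void *failed_lookup(long error)
{
    return err_ptr(error);
}

/* Self-check: contrast a positive and a negative error code. */
static inline void self_check(void)
{
    /* Positive 95: becomes address 0x5f and is NOT seen as an error,
     * so a caller would go on to dereference it. */
    assert((uintptr_t)failed_lookup(95) == 0x5f);
    assert(!is_err(failed_lookup(95)));

    /* Negative -95: recognized as an error pointer and never dereferenced. */
    assert(is_err(failed_lookup(-95)));
}
```

Under these assumptions, a lookup that hands back a positive errno yields a small non-NULL address that passes the error check, which would be consistent with a fault at 000000000000005f on the first dereference.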
Historically the SPL cached the system hostid the first time it was accessed. This was done to speed up subsequent accesses. But in practice the system hostid is rarely accessed, and it's inconvenient that it doesn't promptly detect /etc/hostid configuration changes. Therefore, zone_get_hostid() has been updated to always refresh the system hostid reported. Reviewed-by: Olaf Faaland <faaland1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes openzfs#626
I am testing two VM instances (Ubuntu 11.04, kernel 2.6.38.13, 64-bit, and openSUSE 12.1, kernel 3.1.9-1.4, 64-bit) with the latest zfs/spl .56-rc8 and a simple zpool. I configured a ZFS directory to be shared with Samba and created a snapshot. Accessing the .zfs directory from Windows 7 causes smbd to fail with a NULL deref.
SUSE: (dmesg)
[19849.673506] BUG: unable to handle kernel NULL pointer dereference at 000000000000005f
[19849.673510] IP: [] follow_managed+0x37/0x140
[19849.673516] PGD 0
[19849.673517] Oops: 0000 [#76] SMP
[19849.673519] CPU 0
[19849.673520] Modules linked in: binfmt_misc iscsi_trgt crc32c_intel zfs(P) zcommon(P) znvpair(P) zavl(P) zunicode(P) spl zlib_deflate fuse vmhgfs vsock acpiphp mperf joydev snd_ens1371 ppdev parport_pc parport shpchp gameport snd_rawmidi e1000 pci_hotplug snd_seq_device sr_mod cdrom floppy sg snd_ac97_codec ac97_bus snd_pcm snd_timer snd mptctl i2c_piix4 pcspkr soundcore vmci vmw_balloon snd_page_alloc button container ac autofs4 usbhid uhci_hcd ehci_hcd processor usbcore thermal_sys ata_generic mptspi mptscsih mptbase scsi_transport_spi vmxnet vmw_pvscsi vmxnet3 [last unloaded: crc32c_intel]
[19849.673547]
[19849.673549] Pid: 9092, comm: smbd Tainted: P D 3.1.9-1.4-default #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
[19849.673551] RIP: 0010:[] [] follow_managed+0x37/0x140
[19849.673554] RSP: 0018:ffff8803a0977c78 EFLAGS: 00010246
[19849.673555] RAX: ffff8804239f92c0 RBX: ffff8803a0977d58 RCX: 0000000000000037
[19849.673557] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 000000000000005f
[19849.673558] RBP: ffff8804239f92c0 R08: 0000000000000000 R09: dead000000200200
[19849.673559] R10: 000000000000000b R11: ffff8802ad58300c R12: 0000000000000001
[19849.673560] R13: 0000000000000001 R14: 0000000000000000 R15: ffff8803ec35e2c0
[19849.673561] FS: 00007f302fba97c0(0000) GS:ffff88043f200000(0000) knlGS:0000000000000000
[19849.673563] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[19849.673564] CR2: 000000000000005f CR3: 0000000352380000 CR4: 00000000000406f0
[19849.673567] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[19849.673570] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[19849.673571] Process smbd (pid: 9092, threadinfo ffff8803a0976000, task ffff8802984e2300)
[19849.673572] Stack:
[19849.673573] ffff8803d4954670 00ff8803a0977e08 0000000000000001 ffff8803a0977e08
[19849.673576] ffff8803a0977d58 000000000000005f 0000000000000001 ffff8803a0977d80
[19849.673578] ffff8803ec35e2c0 ffffffff81157d05 ffff880300000000 ffff8803a0977e18
[19849.673580] Call Trace:
[19849.673587] [] do_lookup+0x155/0x310
[19849.673591] [] path_lookupat+0x114/0x740
[19849.673594] [] do_path_lookup+0x2c/0xc0
[19849.673598] [] user_path_at_empty+0x5c/0xb0
[19849.673601] [] vfs_fstatat+0x32/0x60
[19849.673603] [] sys_newstat+0x12/0x30
[19849.673607] [] system_call_fastpath+0x16/0x1b
[19849.673610] [<00007f302ca29935>] 0x7f302ca29934
[19849.673611] Code: 24 28 48 89 fb 4c 89 74 24 38 48 89 6c 24 20 41 89 f4 4c 89 6c 24 30 4c 89 7c 24 40 45 31 f6 48 8b 2f c6 44 24 0f 00 48 8b 7b 08 <8b> 07 41 89 c5 41 81 e5 00 00 07 00 75 55 80 7c 24 0f 00 74 05
[19849.673626] RIP [] follow_managed+0x37/0x140
[19849.673628] RSP
[19849.673629] CR2: 000000000000005f
[19849.673631] ---[ end trace 0df84687749fd0a0 ]---
Ubuntu 11.04: (dmesg)
[ 1856.293088] BUG: unable to handle kernel NULL pointer dereference at 000000000000005f
[ 1856.293093] IP: [] follow_managed+0x35/0x130
[ 1856.293099] PGD 0
[ 1856.293100] Oops: 0000 [#50] SMP
[ 1856.293102] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
[ 1856.293104] CPU 0
[ 1856.293105] Modules linked in: dm_crypt vesafb binfmt_misc ppdev vmw_balloon snd_ens1371 gameport snd_ac97_codec ac97_bus psmouse serio_raw snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device joydev snd soundcore snd_page_alloc i2c_piix4 shpchp parport_pc lp parport zfs(P) zcommon(P) znvpair(P) zavl(P) zunicode(P) spl zlib_deflate raid10 raid456 async_pq async_xor async_memcpy async_raid6_recov raid6_pq async_tx raid1 raid0 multipath linear dm_raid45 xor e1000 floppy mptspi mptscsih mptbase scsi_transport_spi usbhid hid
[ 1856.293136]
[ 1856.293138] Pid: 3108, comm: smbd Tainted: P D 2.6.38-13-generic #57-Ubuntu VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
[ 1856.293141] RIP: 0010:[] [] follow_managed+0x35/0x130
[ 1856.293144] RSP: 0018:ffff880123cf1c08 EFLAGS: 00010246
[ 1856.293145] RAX: 000000000000005f RBX: ffff880123cf1d28 RCX: 0000000000000000
[ 1856.293147] RDX: 000000000000005f RSI: 0000000000000001 RDI: 000000000000005f
[ 1856.293148] RBP: ffff880123cf1c58 R08: dead000000200200 R09: 0000000000000000
[ 1856.293149] R10: dead000000100100 R11: dead000000200200 R12: ffff880138ee6f00
[ 1856.293150] R13: ffff880131d3b8c0 R14: 0000000000000001 R15: 0000000000000000
[ 1856.293152] FS: 00007f795e360740(0000) GS:ffff8800bf600000(0000) knlGS:0000000000000000
[ 1856.293153] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1856.293154] CR2: 000000000000005f CR3: 0000000123c88000 CR4: 00000000000406f0
[ 1856.293158] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1856.293160] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1856.293162] Process smbd (pid: 3108, threadinfo ffff880123cf0000, task ffff8801298e0000)
[ 1856.293163] Stack:
[ 1856.293164] ffff880123cf1dc8 ffff880123cf1d40 ffff880123cf1c58 00ffffff8116f210
[ 1856.293166] ffff880123cf1c58 ffff880123cf1dc8 ffff880123cf1d28 ffff880131d3b8c0
[ 1856.293168] ffff880123cf1d40 ffff880138ee6f00 ffff880123cf1cc8 ffffffff811715c3
[ 1856.293170] Call Trace:
[ 1856.293173] [] do_lookup+0x113/0x2e0
[ 1856.293176] [] ? in_group_p+0x31/0x40
[ 1856.293178] [] link_path_walk+0x656/0xc40
[ 1856.293180] [] ? __do_fault+0x449/0x520
[ 1856.293182] [] do_path_lookup+0x5b/0x160
[ 1856.293184] [] user_path_at+0x57/0xa0
[ 1856.293187] [] ? mutex_lock+0x1e/0x50
[ 1856.293210] [] ? zpl_shares_getattr+0x10a/0x150 [zfs]
[ 1856.293213] [] ? apparmor_inode_getattr+0x54/0x60
[ 1856.293216] [] ? cp_new_stat+0xf8/0x110
[ 1856.293218] [] vfs_fstatat+0x39/0x70
[ 1856.293220] [] vfs_stat+0x1b/0x20
[ 1856.293222] [] sys_newstat+0x1a/0x40
[ 1856.293224] [] system_call_fastpath+0x16/0x1b
[ 1856.293225] Code: 5d d8 4c 89 65 e0 4c 89 6d e8 4c 89 75 f0 4c 89 7d f8 0f 1f 44 00 00 4c 8b 27 45 31 ff 48 89 fb 41 89 f6 c6 45 cf 00 48 8b 7b 08 <8b> 07 41 89 c5 41 81 e5 00 00 07 00 75 3f 80 7d cf 00 74 05 4c
[ 1856.293242] RIP [] follow_managed+0x35/0x130
[ 1856.293244] RSP
[ 1856.293245] CR2: 000000000000005f
[ 1856.293247] ---[ end trace 454df8b6fb14de9b ]---