
kernel panic on zfs send #1678

Closed
dward opened this Issue Aug 26, 2013 · 2 comments

dward commented Aug 26, 2013

I'm not sure whether this is ZFS-related. Currently the only way I can reproduce this crash is to replicate ZFS to a slave node frequently. Our fileserver replicates data on a 5-minute interval and crashes about once or twice a day with the following error:

invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu7/cache/index2/shared_cpu_map
CPU 0
Modules linked in: nfs(U) fscache(U) tun(U) nfsd(U) lockd(U) nfs_acl(U) auth_rpcgss(U) sunrpc(U) exportfs(U) ipt_MASQUERADE(U) ipt_addrtype(U) xt_tcpudp(U) xt_state(U) ipt_LOG(U) iptable_mangle(U) iptable_nat(U) nf_nat(U) nf_conntrack_ipv6(U) nf_conntrack_ipv4(U) nf_conntrack(U) nf_defrag_ipv4(U) iptable_filter(U) ip_tables(U) x_tables(U) ib_iser(U) rdma_cm(U) ib_cm(U) iw_cm(U) ib_sa(U) ib_mad(U) ib_core(U) ib_addr(U) iscsi_tcp(U) bnx2i(U) cnic(U) uio(U) ipv6(U) cxgb3i(U) cxgb3(U) mdio(U) libiscsi_tcp(U) libiscsi(U) scsi_transport_iscsi(U) bridge(U) stp(U) llc(U) zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) spl(U) zlib_deflate(U) video(U) output(U) sbs(U) sbshc(U) ipmi_si(U) ipmi_devintf(U) ipmi_msghandler(U) parport_pc(U) lp(U) parport(U) kvm_intel(U) kvm(U) joydev(U) igb(U) snd_seq_dummy(U) serio_raw(U) snd_seq_oss(U) snd_seq_midi_event(U) snd_seq(U) snd_seq_device(U) snd_pcm_oss(U) snd_mixer_oss(U) snd_pcm(U) snd_timer(U) snd(U) soundcore(U) iTCO_wdt(U) snd_page_alloc(U) i2c_i801(U) iTCO_vendor_support(U) pcspkr(U) i2c_core(U) ioatdma(U) dca(U) usb_storage(U) ahci(U) shpchp(U) 3w_sas(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U) [last unloaded: microcode]
Pid: 10162, comm: zfs Tainted: P 2.6.32-100.24.3.tn #1 X8DTU
RIP: 0010:[] [] __unlazy_fpu+0x2c/0x8f
RSP: 0018:ffff88031ce37a00 EFLAGS: 00010086
RAX: 00000000ffffffff RBX: ffff88031ceba880 RCX: ffff8801567f4858
RDX: 00000000ffffffff RSI: ffff8801567f4480 RDI: ffff88056fd78e00
RBP: ffff88031ce37a18 R08: 00000000000462aa R09: ffff8801d4a70578
R10: ffff88031ceba3f8 R11: ffff880028214ec0 R12: ffff88031ceba3c0
R13: ffff8801567f4940 R14: 0000000000000000 R15: ffff880028212a40
FS: 00007fcdce5a2b60(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fefb8fe3bd0 CR3: 0000000331489000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process zfs (pid: 10162, threadinfo ffff8801d4a70000, task ffff8801567f4480)
Stack:
0000000000000000 ffff88031ce37a18 ffff88031ceba880 ffff88031ce37a78
<0> ffffffff81010eee ffff88031ce37a38 ffff88031ceba3c0 ffff8801567f4480
<0> 000000001ce37b98 0000000000214ec0 ffff880028214ec0 ffff8806313e0800
Call Trace:
Code: 48 89 e5 53 48 83 ec 10 0f 1f 44 00 00 48 8b 47 08 48 89 fe 8b 40 14 a8 01 74 67 a8 10 48 8b bf 50 05 00 00 74 0b 83 c8 ff 89 c2 <48> 0f ae 27 eb 04 48 0f ae 07 48 8b 56 08 48 8b 8e 50 05 00 00
RIP [] __unlazy_fpu+0x2c/0x8f
RSP

 KERNEL: /usr/lib/debug/lib/modules/2.6.32-100.24.3.tn/vmlinux
DUMPFILE: vmcore  [PARTIAL DUMP]
    CPUS: 8
    DATE: Sun Aug 25 08:48:04 2013
  UPTIME: 02:04:24

LOAD AVERAGE: 1.62, 1.61, 1.69
TASKS: 441
NODENAME: master.tms.local
RELEASE: 2.6.32-100.24.3.tn
VERSION: #1 SMP Mon Nov 28 10:48:19 CST 2011
MACHINE: x86_64 (2133 Mhz)
MEMORY: 24 GB
PANIC: ""
PID: 5201
COMMAND: "smbd"
TASK: ffff88031ceba3c0 [THREAD_INFO: ffff88031ce36000]
CPU: 0
STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 5201 TASK: ffff88031ceba3c0 CPU: 0 COMMAND: "smbd"
#0 [ffff88031ce376b0] machine_kexec at ffffffff8102cc9b
#1 [ffff88031ce37730] crash_kexec at ffffffff810964d4
#2 [ffff88031ce377b8] __unlazy_fpu at ffffffff81010441
#3 [ffff88031ce37800] __die at ffffffff81439bf9
#4 [ffff88031ce37830] die at ffffffff81015639
#5 [ffff88031ce37860] do_trap at ffffffff8143954c
#6 [ffff88031ce378b0] do_invalid_op at ffffffff81013902
#7 [ffff88031ce37950] invalid_op at ffffffff81012b7b
#8 [ffff88031ce379d8] __unlazy_fpu at ffffffff81010441
#9 [ffff88031ce37a20] __switch_to at ffffffff81010eee
#10 [ffff88031ce37a80] schedule at ffffffff814375f3
#11 [ffff88031ce37b10] thread_return at ffffffff814376e5
#12 [ffff88031ce37b30] __sleep_on_page_lock at ffffffff810d5a35
#13 [ffff88031ce37b40] out_of_line_wait_on_bit_lock at ffffffff81437c14
#14 [ffff88031ce37b80] wait_on_page_bit at ffffffff810d5bd2
#15 [ffff88031ce37ba8] wake_up_bit at ffffffff81075ab9
#16 [ffff88031ce37bc0] __pagevec_release at ffffffff810ded4a
#17 [ffff88031ce37be0] wait_on_page_writeback_range at ffffffff810d611c
#18 [ffff88031ce37cc0] __filemap_fdatawrite_range at ffffffff810d62ff
#19 [ffff88031ce37cf0] vfs_fsync_range at ffffffff8113b0a1
#20 [ffff88031ce37d30] generic_write_sync at ffffffff8113b136
#21 [ffff88031ce37d40] generic_file_aio_write at ffffffff810d69c0
#22 [ffff88031ce37d80] nfs_file_write at ffffffffa05c8dec
#23 [ffff88031ce37dd0] do_sync_write at ffffffff81117caf
#24 [ffff88031ce37f00] vfs_write at ffffffff8111843a
#25 [ffff88031ce37f30] sys_pwrite64 at ffffffff811184ee
#26 [ffff88031ce37f80] system_call_fastpath at ffffffff81011db2

RIP: 00007fa14f76e613  RSP: 00007fff752a7538  RFLAGS: 00010202
RAX: 0000000000000012  RBX: ffffffff81011db2  RCX: 0000000000000000
RDX: 000000000000f000  RSI: 00007fa152747664  RDI: 0000000000000014
RBP: 0000000000280000   R8: 000000000000f000   R9: 0000000000280000
R10: 0000000000280000  R11: 0000000000000246  R12: 0000000000000014
R13: 00007fa152747664  R14: 000000000000f000  R15: 0000000000000000
ORIG_RAX: 0000000000000012  CS: 0033  SS: 002b

It's always the source server doing the zfs send; the destination never crashes.

The source is also running both Samba and NFS.
ZFS/SPL version is 0.6.1.

Any help or suggestions are appreciated. Thanks!


Member

behlendorf commented Sep 3, 2013

@dward If you can, certainly update to 0.6.2; it contains numerous bug fixes and improvements. That said, the stack you've posted doesn't contain anything ZFS-related, it's all NFS. So aside from the fact that the server is running ZFS, there's nothing really implicating it. Sorry I couldn't be more helpful.


Member

behlendorf commented Nov 2, 2013

Closing as stale.

behlendorf closed this Nov 2, 2013
