This repository has been archived by the owner on Feb 26, 2020. It is now read-only.

Illumos taskq port #54

Closed
wants to merge 4 commits into from

prakashsurya
Member

I'm currently running down a kernel crash when insmod'ing zfs with the illumos-taskq-port patched zfs and spl packages, but I thought I'd send this up for you to look over in the meantime.

BTW, here's the bt from crash on the dump. I think it's due to a NULL pointer dereference.

crash> bt
PID: 8354   TASK: ffff880116689540  CPU: 1   COMMAND: "insmod"
 #0 [ffff8801169638a0] machine_kexec at ffffffff810312bb
 #1 [ffff880116963900] crash_kexec at ffffffff810b6742
 #2 [ffff8801169639d0] oops_end at ffffffff814df070
 #3 [ffff880116963a00] die at ffffffff8100f2eb
 #4 [ffff880116963a30] do_trap at ffffffff814de964
 #5 [ffff880116963a90] do_invalid_op at ffffffff8100ceb5
 #6 [ffff880116963b30] invalid_op at ffffffff8100bf5b
    [exception RIP: kfree+666]
    RIP: ffffffff8115c4da  RSP: ffff880116963be8  RFLAGS: 00010046
    RAX: ffffea0003d400a8  RBX: ffff880118003af0  RCX: ffff880118003708
    RDX: 0040000000000000  RSI: 0000000000000038  RDI: ffff880118003af0
    RBP: ffff880116963c48   R8: ffff880116963bd0   R9: 0000000000000001
    R10: 00000000ffffffff  R11: 0000000000000000  R12: ffffffffa01c3616
    R13: 0000000000000082  R14: 0000000000000008  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffff880116963c50] kmem_free_debug at ffffffffa01c3616 [spl]
 #8 [ffff880116963c60] __taskq_destroy at ffffffffa01c4705 [spl]
 #9 [ffff880116963c90] spa_deactivate at ffffffffa064e4a0 [zfs]
#10 [ffff880116963cc0] spa_open_common at ffffffffa06562e7 [zfs]
#11 [ffff880116963d30] spa_open at ffffffffa06564b3 [zfs]
#12 [ffff880116963d40] dsl_dir_open_spa at ffffffffa063c1b3 [zfs]
#13 [ffff880116963de0] dmu_objset_find_spa at ffffffffa061d346 [zfs]
#14 [ffff880116963eb0] zvol_create_minors at ffffffffa06a8181 [zfs]
#15 [ffff880116963ed0] zvol_init at ffffffffa06a82a1 [zfs]
#16 [ffff880116963ef0] _init at ffffffffa0682b82 [zfs]
#17 [ffff880116963f10] init_module at ffffffffa0682c93 [zfs]
#18 [ffff880116963f20] do_one_initcall at ffffffff8100204c
#19 [ffff880116963f50] sys_init_module at ffffffff810ace2f
#20 [ffff880116963f80] system_call_fastpath at ffffffff8100b172
    RIP: 00000036b82e5dea  RSP: 00007fffd3566ec8  RFLAGS: 00000217
    RAX: 00000000000000af  RBX: ffffffff8100b172  RCX: 00000036b82e5f3a
    RDX: 00000000020b4010  RSI: 0000000001684835  RDI: 00007f37dcfeb010
    RBP: 00000000020b4010   R8: 0000000002001000   R9: 0000000001001000
    R10: 00000036b82d8460  R11: 0000000000000202  R12: 0000000002000000
    R13: 0000000001684835  R14: 00007fffd3568837  R15: 0000000001684835
    ORIG_RAX: 00000000000000af  CS: 0033  SS: 002b
------------[ cut here ]------------
kernel BUG at mm/slab.c:522!
invalid opcode: 0000 [#1] SMP 
last sysfs file: /sys/kernel/mm/ksm/run
CPU 1 
Modules linked in: zfs(P+)(U) zcommon(P)(U) zunicode(P)(U) znvpair(P)(U) zavl(P)(U) splat(U) spl(U) fuse autofs4 nfs lockd fscache(T) nfs_acl auth_rpcgss sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 zlib_deflate dm_mirror dm_region_hash dm_log vhost_net macvtap macvlan tun virtio_balloon snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc 8139too 8139cp mii i2c_piix4 i2c_core sg ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif virtio_pci virtio_ring virtio ata_generic pata_acpi ata_piix dm_mod [last unloaded: spl]

Modules linked in: zfs(P+)(U) zcommon(P)(U) zunicode(P)(U) znvpair(P)(U) zavl(P)(U) splat(U) spl(U) fuse autofs4 nfs lockd fscache(T) nfs_acl auth_rpcgss sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 zlib_deflate dm_mirror dm_region_hash dm_log vhost_net macvtap macvlan tun virtio_balloon snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc 8139too 8139cp mii i2c_piix4 i2c_core sg ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif virtio_pci virtio_ring virtio ata_generic pata_acpi ata_piix dm_mod [last unloaded: spl]
Pid: 8354, comm: insmod Tainted: P           ---------------- T 2.6.32-131.12.1.1chaos.ch5.x86_64 #1 Bochs
RIP: 0010:[<ffffffff8115c4da>]  [<ffffffff8115c4da>] kfree+0x29a/0x320
RSP: 0018:ffff880116963be8  EFLAGS: 00010046
RAX: ffffea0003d400a8 RBX: ffff880118003af0 RCX: ffff880118003708
RDX: 0040000000000000 RSI: 0000000000000038 RDI: ffff880118003af0
RBP: ffff880116963c48 R08: ffff880116963bd0 R09: 0000000000000001
R10: 00000000ffffffff R11: 0000000000000000 R12: ffffffffa01c3616
R13: 0000000000000082 R14: 0000000000000008 R15: 0000000000000000
FS:  00007f37e0fb4700(0000) GS:ffff880028300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f37de60000f CR3: 00000001168ff000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process insmod (pid: 8354, threadinfo ffff880116962000, task ffff880116689540)
Stack:
 ffff880116963c08 ffff880115e8aac0 ffff880112b0bef8 0000000000000000
<0> 0000000000000000 0000000000000000 ffff880116963c28 ffff8801168f9f00
<0> 0000000000000001 ffff8801168f9f50 0000000000000008 0000000000000000
Call Trace:
 [] kmem_free_debug+0x16/0x20 [spl]
 [] __taskq_destroy+0xa5/0xf0 [spl]
 [] spa_deactivate+0x60/0x190 [zfs]
 [] spa_open_common+0x1b7/0x370 [zfs]
 [] spa_open+0x13/0x20 [zfs]
 [] dsl_dir_open_spa+0x3a3/0x470 [zfs]
 [] ? spl__init+0x0/0x20 [zfs]
 [] dmu_objset_find_spa+0x56/0x460 [zfs]
 [] ? zvol_create_minors_cb+0x0/0x40 [zfs]
 [] ? kobj_map+0x1b0/0x1d0
 [] ? zvol_probe+0x0/0xa0 [zfs]
 [] ? spl__init+0x0/0x20 [zfs]
 [] zvol_create_minors+0x91/0xc0 [zfs]
 [] ? spl__init+0x0/0x20 [zfs]
 [] zvol_init+0xf1/0x130 [zfs]
 [] _init+0x22/0x120 [zfs]
 [] spl__init+0x13/0x20 [zfs]
 [] do_one_initcall+0x3c/0x1d0
 [] sys_init_module+0xdf/0x250
 [] system_call_fastpath+0x16/0x1b
Code: 4d c8 89 c2 83 c0 01 49 89 4c d4 18 41 89 04 24 c7 03 00 00 00 00 e9 91 fe ff ff 0f 0b eb fe 48 8b 40 10 48 8b 10 e9 46 fe ff ff <0f> 0b 0f 1f 40 00 eb fa 48 8b 40 10 48 8b 10 66 85 d2 0f 89 d3 
RIP  [<ffffffff8115c4da>] kfree+0x29a/0x320
 RSP <ffff880116963be8>

943 zio_interrupt ends up calling taskq_dispatch with TQ_SLEEP
Reviewed by: Albert Lee <trisk@nexenta.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Reviewed by: Alexey Zaytsev <alexey.zaytsev@nexenta.com>
Reviewed by: Jason Brian King <jason.brian.king@gmail.com>
Reviewed by: George Wilson <gwilson@zfsmail.com>
Reviewed by: Adam Leventhal <ahl@delphix.com>
Approved by: Gordon Ross <gwr@nexenta.com>

The spl_task structure was moved into the taskq.h file and was renamed
to taskq_ent. This was to align with the naming convention which the ZFS
code assumes (i.e. ZFS assumes this structure to be named taskq_ent).

Preallocated tasks should never make it onto the free list of tasks.
Thus, if an element removed from the free list is found to have its
TQENT_FLAG_PREALLOC flag set, it is reported as an ASSERT.
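
A minimal user-space sketch of the check described above, not the actual SPL code. The names sketch_ent_t and sketch_free_list_pop, the singly linked list, and the flag value 0x1 are assumptions made for illustration; only TQENT_FLAG_PREALLOC and tqent_flags come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed flag value for this sketch; the real definition is in taskq.h. */
#define TQENT_FLAG_PREALLOC 0x1

/* Hypothetical, simplified stand-in for taskq_ent. */
typedef struct sketch_ent {
	struct sketch_ent *next;	/* free-list link (illustrative only) */
	unsigned int tqent_flags;
} sketch_ent_t;

/* Pop one entry from the free list, or return NULL if the list is empty. */
static sketch_ent_t *
sketch_free_list_pop(sketch_ent_t **head)
{
	sketch_ent_t *t = *head;

	if (t == NULL)
		return (NULL);

	/* A preallocated entry is caller-owned and must never appear here. */
	assert(!(t->tqent_flags & TQENT_FLAG_PREALLOC));

	*head = t->next;
	t->next = NULL;
	return (t);
}
```

With this shape, an entry that reaches the free list with the flag set trips the assertion at the point of removal rather than being silently reused.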

Although the __taskq_dispatch function is kind enough to zero out the
tqent_flags variable when it gets called, this variable should probably
get zeroed out by the taskq_alloc function as well.
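
The idea can be sketched in user space roughly as follows. This is an illustration under assumptions, not the patch itself: sketch_task_t and sketch_task_alloc are hypothetical names, malloc stands in for the kernel allocator, and only tqent_flags is taken from the commit message.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a task entry. */
typedef struct sketch_task {
	void (*tqent_func)(void *);	/* function to run (illustrative) */
	void *tqent_arg;		/* its argument (illustrative) */
	unsigned int tqent_flags;
} sketch_task_t;

/*
 * Allocate an entry and initialize every field, including the flags,
 * so the entry is in a known state even before any dispatch call
 * touches it.
 */
static sketch_task_t *
sketch_task_alloc(void)
{
	sketch_task_t *t = malloc(sizeof (*t));

	if (t == NULL)
		return (NULL);

	t->tqent_func = NULL;
	t->tqent_arg = NULL;
	t->tqent_flags = 0;	/* do not rely on dispatch to clear this */
	return (t);
}
```

Zeroing at allocation time means code that inspects the flags before dispatch never sees stale memory.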
@prakashsurya
Member Author

Ok, I think this is ready to be reviewed and merged if everything checks out. I still need to get a regression test written for the new interface though. Not sure if you want this included when this lands, or if that can go in separately?

@behlendorf
Contributor

Thanks for breaking these all up into separate patches, but in this case I think it makes sense to squash them all into one. The test case can be done with a follow-up patch; let's just make sure we don't forget to do it. Additional comments inline.

@prakashsurya
Member Author

I incorporated your feedback and pushed it all to another branch (illumos-taskq-port2). I'm going to close this pull request and open another one for the updated branch.
