Kernel oops with v3.6.0-rc4 on Ubuntu Xenial LTS kernel #284

Closed
toreanderson opened this issue Apr 20, 2019 · 5 comments

@toreanderson
Contributor

commented Apr 20, 2019

After I hit #283 when attempting a kernel upgrade, I thought I'd simply try upgrading to the latest v3.6 RC. It built fine, but when starting it up, the kernel oopsed:

[81201.607134] jool_siit: loading out-of-tree module taints kernel.
[81201.607246] jool_siit: module verification failed: signature and/or required key missing - tainting kernel
[81201.617897] jool_siit: unknown parameter 'disabled' ignored
[81201.617989] SIIT Jool: SIIT Jool v3.5.7.203 module inserted.
[81201.621067] BUG: unable to handle kernel NULL pointer dereference at           (null)
[81201.626745] IP: [<ffffffffc04e9adb>] __handle_jool_message+0x6b/0x230 [jool_siit]
[81201.642269] PGD 800000043f549067 PUD 4437e6067 PMD 0 
[81201.645248] Oops: 0000 [#1] SMP 
[81201.661372] Modules linked in: jool_siit(OE) ipmi_devintf 8021q garp mrp stp llc mptctl bonding ipmi_ssif ast ttm drm_kms_helper drm gpio_ich intel_powerclamp fb_sys_fops syscopyarea sysfillrect coretemp joydev sysimgblt input_leds lpc_ich 8250_fintek kvm_intel i5500_temp kvm ioatdma i7core_edac mac_hid edac_core irqbypass shpchp ipmi_si ipmi_msghandler xt_mark ip6table_mangle lp nf_conntrack_ipv6 parport nf_defrag_ipv6 ip6table_filter ip6_tables xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter ip_tables x_tables btrfs xor raid6_pq hid_generic igb mptsas i2c_algo_bit mptscsih dca usbhid mptbase ahci ptp libahci hid scsi_transport_sas pps_core fjes
[81201.764705] CPU: 4 PID: 21256 Comm: jool_siit Tainted: G          IOE   4.4.0-144-generic #170~14.04.1-Ubuntu
[81201.783083] Hardware name: SUN MICROSYSTEMS SUN FIRE X4170 SERVER          /ASSY,MOTHERBOARD,X4170, BIOS 07060309 07/10/2013
[81201.802550] task: ffff88043f681980 ti: ffff88043ed1c000 task.ti: ffff88043ed1c000
[81201.821126] RIP: 0010:[<ffffffffc04e9adb>]  [<ffffffffc04e9adb>] __handle_jool_message+0x6b/0x230 [jool_siit]
[81201.840562] RSP: 0018:ffff88043ed1f8c8  EFLAGS: 00010246
[81201.843184] RAX: ffff88006d773100 RBX: ffff88043ed1fb78 RCX: ffffffffc04f0c74
[81201.861503] RDX: ffffffffc04f0c82 RSI: ffffffff81e49360 RDI: 0000000000000000
[81201.864907] RBP: ffff88043ed1fb38 R08: ffff88043ed1f8cf R09: ffff88006d773100
[81201.882610] R10: ffff880442436810 R11: 0000000000000004 R12: ffffffffc04f7d20
[81201.900796] R13: ffff880442436814 R14: ffff8802749af100 R15: ffffffff81efd700
[81201.903353] FS:  00007fa2c9018740(0000) GS:ffff88047fc00000(0000) knlGS:0000000000000000
[81201.922927] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[81201.940437] CR2: 0000000000000000 CR3: 00000004422aa000 CR4: 0000000000000670
[81201.942709] Stack:
[81201.944213]  0000000000000246 0000000202235250 0000000000000001 0000000000000001
[81201.963722]  ffff88047ffef6c0 0000000002235250 0000000000000141 ffff88043f681980
[81201.981645]  ffff88043ed1f998 ffffffff811979f0 0000000000000000 ffff88043f681980
[81201.985080] Call Trace:
[81202.000739]  [<ffffffff811979f0>] ? __alloc_pages_nodemask+0x130/0x250
[81202.003555]  [<ffffffff811ea16c>] ? ___slab_alloc+0x1cc/0x470
[81202.021399]  [<ffffffff817105be>] ? __alloc_skb+0x4e/0x260
[81202.023602]  [<ffffffff811ee18b>] ? __kmalloc_node_track_caller+0x24b/0x2b0
[81202.041976]  [<ffffffff81711b4a>] ? pskb_expand_head+0x6a/0x260
[81202.044198]  [<ffffffff8170eec1>] ? __kmalloc_reserve.isra.34+0x31/0x90
[81202.061763]  [<ffffffff8170ed13>] ? skb_queue_tail+0x43/0x50
[81202.063975]  [<ffffffff81754ba2>] ? __netlink_sendskb+0x42/0x60
[81202.082657]  [<ffffffff81757369>] ? netlink_unicast+0x1c9/0x230
[81202.100446]  [<ffffffffc04e9cc6>] handle_jool_message+0x26/0x40 [jool_siit]
[81202.102717]  [<ffffffff817587e1>] genl_family_rcv_msg+0x1d1/0x390
[81202.121370]  [<ffffffff817589a0>] ? genl_family_rcv_msg+0x390/0x390
[81202.123886]  [<ffffffff81758a20>] genl_rcv_msg+0x80/0xc0
[81202.141130]  [<ffffffff81757939>] netlink_rcv_skb+0xa9/0xc0
[81202.143905]  [<ffffffff81757ff8>] genl_rcv+0x28/0x40
[81202.161358]  [<ffffffff81757303>] netlink_unicast+0x163/0x230
[81202.164143]  [<ffffffff817576eb>] netlink_sendmsg+0x31b/0x390
[81202.182271]  [<ffffffff81707c6e>] sock_sendmsg+0x3e/0x50
[81202.186127]  [<ffffffff817085c6>] ___sys_sendmsg+0x276/0x290
[81202.202494]  [<ffffffff8119d4d7>] ? lru_cache_add_active_or_unevictable+0x27/0x90
[81202.206691]  [<ffffffff813949db>] ? aa_sock_perm+0x4b/0xe0
[81202.223923]  [<ffffffff8170719d>] ? SYSC_getsockname+0xcd/0xe0
[81202.227187]  [<ffffffff81708e92>] __sys_sendmsg+0x42/0x80
[81202.243346]  [<ffffffff81708ee2>] SyS_sendmsg+0x12/0x20
[81202.246560]  [<ffffffff8182d39b>] entry_SYSCALL_64_fastpath+0x22/0xcb
[81202.265280] Code: 00 f6 05 5a f1 00 00 04 0f 85 f9 00 00 00 48 8b 43 20 4c 8d 85 97 fd ff ff 48 c7 c1 74 0c 4f c0 48 c7 c2 82 0c 4f c0 48 8b 78 10 <0f> b7 37 48 83 c7 04 83 ee 04 48 63 f6 e8 f3 49 ff ff 85 c0 74 
[81202.304233] RIP  [<ffffffffc04e9adb>] __handle_jool_message+0x6b/0x230 [jool_siit]
[81202.308917]  RSP <ffff88043ed1f8c8>
[81202.322048] CR2: 0000000000000000
[81202.328054] ---[ end trace 0ab568a8a3bd889c ]---
[81202.331184] init: jool pre-start process (21254) terminated with status 137

I did not investigate exactly at which point in the Jool initialisation routine the oops occurred, as I did not have time to do so during the maintenance window. (I reverted to v3.5.7 with an older LTS kernel instead.) This is what the init script does, in a nutshell:

modprobe jool_siit disabled=1
jool_siit --pool6 --add <pool6>
jool_siit --eamt --add <eam4-1> <eam6-1>
jool_siit --eamt --add <eam4-2> <eam6-2>
[...]
jool_siit --eamt --add <eam4-n> <eam6-n>
jool_siit --pool6791 --add <pool6791>
jool_siit --enable 
@ydahhrk

Member

commented Apr 20, 2019

BTW: 3.6.0-rc4 is not really the last release candidate of anything; 3.6 was renamed into 4.0. That's why the sequence is 3.6.0-rc1, 3.6.0-rc2, 3.6.0-rc3, 3.6.0-rc4, 4.0.0-rc5 and 4.0.0.

All of these expect the new command line syntax:

modprobe jool_siit
jool_siit instance add --netfilter -6 <pool6>
jool_siit eamt add <eam4-1> <eam6-1>
jool_siit pool6791 add <pool6791>
@ydahhrk

Member

commented Apr 20, 2019

Whatever the error is, it might also be present in 4.0.0. __handle_jool_message() didn't change.

You sure the new userspace client was installed correctly? I can't reproduce it because none of the jool_siit commands are well-formed (as far as rc4 is concerned), so the requests are shot down long before they reach the kernel.

Check jool_siit --version, please.

@toreanderson

Contributor Author

commented Apr 21, 2019

> 3.6.0-rc4 is not really the last release candidate of anything; 3.6 was renamed into 4.0.

🤦‍♂

> You sure the new userspace client was installed correctly?

No, I'm not sure. I just used my regular install script, tried to fire it up, got that error, and reverted as I didn't have time to debug further at that point. Might be the new client didn't get installed over the old version and I just didn't notice.

I'm closing this issue as it's probably just a dumb user error. Apologies for the noise. I'll let you know if I experience it after I upgrade to 4.0 (including updating all the CLI calls).

@ydahhrk

Member

commented Apr 22, 2019

Ok, but

It won't do if all it takes to crash the kernel is to issue a command from an outdated client.

I need to look into this more.
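
For illustration only: the oops above is a NULL pointer dereference inside __handle_jool_message(), reached by a request from what may have been an outdated client. The sketch below shows the general defensive pattern a message handler can follow, assuming the crash came from reading a part of the request that the sender never supplied. It is not the contents of the eventual fix and not Jool's code. Every name in it (demo_info, demo_attr, demo_request_hdr, DEMO_ATTR_DATA, DEMO_VERSION, the "demo" magic) is hypothetical, and it is written as plain userspace C so it can be compiled without kernel headers.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names here are hypothetical stand-ins, not Jool or kernel APIs. */
#define DEMO_ATTR_DATA 1
#define DEMO_ATTR_MAX  1
#define DEMO_VERSION   0x04000000u /* made-up "current protocol" value */

/* One parsed attribute: payload length in bytes plus a pointer to it. */
struct demo_attr {
    uint16_t len;
    const void *payload;
};

/*
 * Parsed request. As with a parsed attribute table, a slot stays NULL
 * when the sender did not include that attribute.
 */
struct demo_info {
    const struct demo_attr *attrs[DEMO_ATTR_MAX + 1];
};

/* Hypothetical fixed header expected at the start of the payload. */
struct demo_request_hdr {
    uint8_t magic[4];
    uint32_t version;
};

/*
 * Validate the request before touching its payload.
 * Returns 0 if the request looks sane, a negative errno value otherwise.
 */
int demo_handle_message(const struct demo_info *info)
{
    const struct demo_attr *attr = info->attrs[DEMO_ATTR_DATA];
    struct demo_request_hdr hdr;

    /* 1. The attribute may be missing entirely: never dereference blindly. */
    if (attr == NULL || attr->payload == NULL)
        return -EINVAL;

    /* 2. The payload must be long enough to contain the header. */
    if (attr->len < sizeof(hdr))
        return -EINVAL;

    /* 3. Copy the header out instead of casting a possibly unaligned pointer. */
    memcpy(&hdr, attr->payload, sizeof(hdr));

    /* 4. Reject senders that speak a different protocol or version. */
    if (memcmp(hdr.magic, "demo", sizeof(hdr.magic)) != 0)
        return -EINVAL;
    if (hdr.version != DEMO_VERSION)
        return -EPROTO;

    return 0; /* Only now is it safe to process the rest of the payload. */
}
```

With checks in this order, a request that lacks the expected attribute is rejected with -EINVAL, and a well-formed request carrying a different version number is rejected with -EPROTO, so neither reaches code that assumes the payload is present and current.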

@ydahhrk ydahhrk reopened this Apr 22, 2019

@ydahhrk

Member

commented Apr 22, 2019

Bug confirmed. Fixing.

@ydahhrk ydahhrk closed this in 765ba25 Apr 22, 2019

@ydahhrk ydahhrk added this to the 4.0.1 milestone Apr 26, 2019
