Host crashed while running memhotplug guest_sanity tests with latest devel branch #24

sathnaga · 2017-11-06T03:36:54Z

<cde:info>Mirrored with LTC bug https://bugzilla.linux.ibm.com/show_bug.cgi?id=160986</cde:info>

Host was running guest_sanity tests.
Kernel: 4.14.0-1.rc4.dev.gitb27fc5c.el7.centos.ppc64le

    lr: d00000000b30e498: kvmppc_book3s_hv_page_fault+0xbb8/0xc40 [kvm_hv]
    sp: c0000000ae89f850
   msr: 900000010280b033
   dar: d00000002b5bb20c
 dsisr: 40000000
  current = 0xc0000001c4003080
  paca    = 0xc00000000fd8f400   softe: 0        irq_happened: 0x01
    pid   = 46914, comm = CPU 3/KVM
Linux version 4.14.0-1.rc4.dev.gitb27fc5c.el7.centos.ppc64le (mockbuild@host-os-jenkins-slave03.aus.stglabs.ibm.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-17) (GCC)) #1 SMP Fri Oct 20 22:55:44 -02 2017
[66906.130198] KVM: CPU 44 seems to be stuck
[66906.130257] KVM: CPU 46 seems to be stuck
enter ? for help
[c0000000aee2b8b0] d00000000b30e498 kvmppc_book3s_hv_page_fault+0xbb8/0xc40 [kvm_hv]
[c0000000aee2b9e0] d00000000b30a078 kvmppc_vcpu_run_hv+0xdf8/0x1300 [kvm_hv]
[c0000000aee2bb30] d00000000b1348c4 kvmppc_vcpu_run+0x34/0x50 [kvm]
[c0000000aee2bb50] d00000000b130d54 kvm_arch_vcpu_ioctl_run+0x114/0x2a0 [kvm]
[c0000000aee2bbd0] d00000000b1239d8 kvm_vcpu_ioctl+0x598/0x7a0 [kvm]
[c0000000aee2bd40] c0000000003832e0 do_vfs_ioctl+0xd0/0x8c0
[c0000000aee2bde0] c000000000383ba4 SyS_ioctl+0xd4/0x130
[c0000000aee2be30] c00000000000b8e0 system_call+0x58/0x6c
--- Exception: c00 (System Call) at 00007fff8d0b674c
SP (7fff597fde60) is in userspace
8:mon>


sathnaga · 2017-11-06T03:37:23Z

jenkins.txt

paulusmack · 2017-11-07T00:41:19Z

After some digging, it looks like one vcpu task has handled a hypervisor page fault while the resize code is in the middle of making all the HPTEs absent. The technique which the resize code uses to exclude vcpus from running (set hpte_setup_done to 0 and send an IPI to all CPUs) doesn't actually work since another vcpu task could be in the host handling a page fault or a hcall at the time the IPI is sent, in which case that vcpu task will just handle the IPI and continue to re-enter the guest.

I'm currently trying to think of a reasonable way to fix this...
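The race described above can be sketched as a small deterministic model. This is only an illustration, not the kernel code and not the eventual fix: the names `VcpuState`, `vcpus_running_during_resize` and the `recheck_flag_on_reentry` switch are hypothetical, and the model assumes (as a simplification) that a vcpu which was inside the guest when the IPI arrived is stopped by the flag check on its normal entry path.

```python
from enum import Enum, auto


class VcpuState(Enum):
    """Where a vcpu task is at the moment the resize code sends its IPI."""
    IN_GUEST = auto()         # executing guest code
    IN_HOST_HANDLER = auto()  # in the host, handling a page fault or an hcall


def vcpus_running_during_resize(vcpus, recheck_flag_on_reentry):
    """Return the indices of vcpus that get back into the guest mid-resize.

    Hypothetical model of one interleaving:
      1. The resize code sets hpte_setup_done to 0 and sends an IPI to all CPUs.
      2. A vcpu that was IN_GUEST is forced out and (assumed here) sees the
         cleared flag on its normal entry path, so it stays out.
      3. A vcpu that was IN_HOST_HANDLER just handles the IPI, finishes its
         handler and re-enters the guest. Unless it rechecks the flag on that
         path, nothing stops it.
    """
    hpte_setup_done = False  # step 1: resize clears the flag
    running = []
    for index, state in enumerate(vcpus):
        if state is VcpuState.IN_GUEST:
            # Step 2: excluded by the flag check on the normal entry path.
            continue
        # Step 3: already in the host when the IPI arrived.
        if recheck_flag_on_reentry and not hpte_setup_done:
            # Hypothetical mitigation: recheck before every re-entry.
            continue
        running.append(index)
    return running


if __name__ == "__main__":
    states = [VcpuState.IN_GUEST, VcpuState.IN_HOST_HANDLER, VcpuState.IN_GUEST]
    # Without a recheck, vcpu 1 slips back into the guest during the resize.
    print(vcpus_running_during_resize(states, recheck_flag_on_reentry=False))
    # With the hypothetical recheck, no vcpu runs during the resize.
    print(vcpus_running_during_resize(states, recheck_flag_on_reentry=True))
```

In this model, any vcpu that re-enters the guest can take another hypervisor page fault and make an HPTE present again while the resize code is marking HPTEs absent, which matches the backtrace through `kvmppc_book3s_hv_page_fault`.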

cdeadmin · 2017-11-07T06:55:41Z

------- Comment From bssrikanth@in.ibm.com 2017-11-07 01:47:28 EDT-------
Paul Mackerras seems to have a patch that will hopefully fix this issue; I saw his comments on the host-os Slack channel.

sathnaga · 2017-11-09T15:30:05Z

The issue is fixed with the latest devel branch update, 4.14.0-2.rc8.dev.gitcc4bf22.el7.centos.ppc64le.
Will wait for the release branch to pick up this fix before closing the issue.

cdeadmin · 2017-11-09T15:30:24Z

------- Comment From viparash@in.ibm.com 2017-11-07 04:37:28 EDT-------
*** Bug 160904 has been marked as a duplicate of this bug. ***

cdeadmin · 2017-11-11T01:05:20Z

Fixed with latest devel branch update, 4.14.0-2.rc8.dev.gitcc4bf22.el7.centos.ppc64le.

cdeadmin · 2017-11-11T02:05:19Z

------- Comment From lagarcia@br.ibm.com 2017-11-10 20:55:34 EDT-------
The sprint 3 hostos-release branch is closed for new commits. Targeting this one to sprint 4.

Paul,

Could you please cherry-pick this patch into hostos-release as soon as sprint 4 hostos-release branch gets opened?

sathnaga · 2017-11-28T11:49:27Z

Verified in latest hostos release branch 4.14.0-1.rel.git68b4afb.el7.centos.ppc64le

(7/9) guest_sanity.hotplug.memory.qemu.qcow2.virtio_scsi.smp2.virtio_net.HostOS.ppc64le.powerkvm-libvirt.libvirt_mem.positive_test.hot_plug: PASS (88.70 s)

Regards,
-Satheesh.


sathnaga changed the title ~~Host crashed while running guest_sanity tests with latest devel branch~~ Host crashed while running memhotplug guest_sanity tests with latest devel branch Nov 7, 2017

sathnaga mentioned this issue Nov 7, 2017

vm fails to resume(start) after memhotplug+managedsave+start sequence on latest devel branch open-power-host-os/qemu#28

Closed

sathnaga closed this as completed Nov 28, 2017
