Host crashed while running memhotplug guest_sanity tests with latest devel branch #24

Closed
sathnaga opened this issue Nov 6, 2017 · 8 comments

Comments

@sathnaga
Member

sathnaga commented Nov 6, 2017

Mirrored with LTC bug https://bugzilla.linux.ibm.com/show_bug.cgi?id=160986

Host was running guest_sanity tests.
Kernel: 4.14.0-1.rc4.dev.gitb27fc5c.el7.centos.ppc64le

    lr: d00000000b30e498: kvmppc_book3s_hv_page_fault+0xbb8/0xc40 [kvm_hv]
    sp: c0000000ae89f850
   msr: 900000010280b033
   dar: d00000002b5bb20c
 dsisr: 40000000
  current = 0xc0000001c4003080
  paca    = 0xc00000000fd8f400   softe: 0        irq_happened: 0x01
    pid   = 46914, comm = CPU 3/KVM
Linux version 4.14.0-1.rc4.dev.gitb27fc5c.el7.centos.ppc64le (mockbuild@host-os-jenkins-slave03.aus.stglabs.ibm.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-17) (GCC)) #1 SMP Fri Oct 20 22:55:44 -02 2017
[66906.130198] KVM: CPU 44 seems to be stuck
[66906.130257] KVM: CPU 46 seems to be stuck
enter ? for help
[c0000000aee2b8b0] d00000000b30e498 kvmppc_book3s_hv_page_fault+0xbb8/0xc40 [kvm_hv]
[c0000000aee2b9e0] d00000000b30a078 kvmppc_vcpu_run_hv+0xdf8/0x1300 [kvm_hv]
[c0000000aee2bb30] d00000000b1348c4 kvmppc_vcpu_run+0x34/0x50 [kvm]
[c0000000aee2bb50] d00000000b130d54 kvm_arch_vcpu_ioctl_run+0x114/0x2a0 [kvm]
[c0000000aee2bbd0] d00000000b1239d8 kvm_vcpu_ioctl+0x598/0x7a0 [kvm]
[c0000000aee2bd40] c0000000003832e0 do_vfs_ioctl+0xd0/0x8c0
[c0000000aee2bde0] c000000000383ba4 SyS_ioctl+0xd4/0x130
[c0000000aee2be30] c00000000000b8e0 system_call+0x58/0x6c
--- Exception: c00 (System Call) at 00007fff8d0b674c
SP (7fff597fde60) is in userspace
8:mon>
@sathnaga
Member Author

sathnaga commented Nov 6, 2017

jenkins.txt

@paulusmack

After some digging, it looks like one vcpu task has handled a hypervisor page fault while the resize code is in the middle of making all the HPTEs absent. The technique which the resize code uses to exclude vcpus from running (set hpte_setup_done to 0 and send an IPI to all CPUs) doesn't actually work since another vcpu task could be in the host handling a page fault or a hcall at the time the IPI is sent, in which case that vcpu task will just handle the IPI and continue to re-enter the guest.

I'm currently trying to think of a reasonable way to fix this...
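
The race described above can be illustrated with a small deterministic model. This is only a sketch of the reasoning, not the kvm_hv code: apart from the `hpte_setup_done` flag named in the comment, every name here (`vcpu_task`, `resize_task`, `run`, `hptes_absent`) is hypothetical, and the "IPI" is reduced to a log message. The point it shows is that a flag checked once on entry cannot stop a vcpu task that is already in the host handling a page fault.

```python
def vcpu_task(state, log):
    """Model one vcpu task; each yield is a point where another task may run."""
    # Entry check: performed once, before the task starts working in the host.
    if not state["hpte_setup_done"]:
        log.append("vcpu: stays out of the guest")
        return
    yield
    # The task is now in the host handling a page fault. An IPI arriving here
    # is handled, but nothing makes the task look at the flag again.
    if state["hptes_absent"]:
        log.append("vcpu: handled page fault during resize")
    else:
        log.append("vcpu: handled page fault normally")
    yield


def resize_task(state, log):
    """Model the resize path: clear the flag, 'send the IPI', then touch the HPT."""
    state["hpte_setup_done"] = False
    log.append("resize: flag cleared, IPI sent")
    yield
    state["hptes_absent"] = True
    log.append("resize: HPTEs being made absent")
    yield


def run(order):
    """Run both tasks in the given interleaving, e.g. ["vcpu", "resize", ...]."""
    state = {"hpte_setup_done": True, "hptes_absent": False}
    log = []
    tasks = {"vcpu": vcpu_task(state, log), "resize": resize_task(state, log)}
    for name in order:
        next(tasks[name], None)  # advance one step; ignore finished tasks
    return log


# The vcpu passes its entry check before the resize starts, so it is not excluded.
racy = run(["vcpu", "resize", "resize", "vcpu"])
# The resize starts first, so the entry check keeps the vcpu out.
safe = run(["resize", "vcpu", "resize"])
```

In the `racy` interleaving the log ends with the vcpu handling a page fault while the HPTEs are being made absent, which is the situation the comment identifies; in the `safe` interleaving the entry check works as intended.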

@cdeadmin

cdeadmin commented Nov 7, 2017

------- Comment From bssrikanth@in.ibm.com 2017-11-07 01:47:28 EDT-------
Paul Mackerras seems to have a patch that will hopefully fix this issue; I saw his comments on the host-os Slack channel.

@sathnaga sathnaga changed the title Host crashed while running guest_sanity tests with latest devel branch Host crashed while running memhotplug guest_sanity tests with latest devel branch Nov 7, 2017
@sathnaga
Member Author

sathnaga commented Nov 9, 2017

The issue appears to be fixed with the latest devel branch update, 4.14.0-2.rc8.dev.gitcc4bf22.el7.centos.ppc64le.
Will wait for the release branch to pick up this fix before closing the issue.

@cdeadmin

cdeadmin commented Nov 9, 2017

------- Comment From viparash@in.ibm.com 2017-11-07 04:37:28 EDT-------
*** Bug 160904 has been marked as a duplicate of this bug. ***

@cdeadmin

Fixed with latest devel branch update, 4.14.0-2.rc8.dev.gitcc4bf22.el7.centos.ppc64le.

@cdeadmin

------- Comment From lagarcia@br.ibm.com 2017-11-10 20:55:34 EDT-------
The Sprint 3 hostos-release branch is closed for new commits. Targeting this one to Sprint 4.

Paul,

Could you please cherry-pick this patch into hostos-release as soon as sprint 4 hostos-release branch gets opened?

@sathnaga
Member Author

Verified in latest hostos release branch 4.14.0-1.rel.git68b4afb.el7.centos.ppc64le

(7/9) guest_sanity.hotplug.memory.qemu.qcow2.virtio_scsi.smp2.virtio_net.HostOS.ppc64le.powerkvm-libvirt.libvirt_mem.positive_test.hot_plug: PASS (88.70 s)

Regards,
-Satheesh.

malcolmcrossley pushed a commit to malcolmcrossley/linux that referenced this issue Jan 24, 2018
[ Upstream commit dcc3b5f ]

The following warning can be triggered by hot-unplugging the CPU
on which an active SCHED_DEADLINE task is running:

 ------------[ cut here ]------------
 WARNING: CPU: 7 PID: 0 at kernel/sched/sched.h:833 replenish_dl_entity+0x71e/0xc40
 rq->clock_update_flags < RQCF_ACT_SKIP
 CPU: 7 PID: 0 Comm: swapper/7 Tainted: G    B           4.11.0-rc1+ open-power-host-os#24
 Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016
 Call Trace:
  <IRQ>
  dump_stack+0x85/0xc4
  __warn+0x172/0x1b0
  warn_slowpath_fmt+0xb4/0xf0
  ? __warn+0x1b0/0x1b0
  ? debug_check_no_locks_freed+0x2c0/0x2c0
  ? cpudl_set+0x3d/0x2b0
  replenish_dl_entity+0x71e/0xc40
  enqueue_task_dl+0x2ea/0x12e0
  ? dl_task_timer+0x777/0x990
  ? __hrtimer_run_queues+0x270/0xa50
  dl_task_timer+0x316/0x990
  ? enqueue_task_dl+0x12e0/0x12e0
  ? enqueue_task_dl+0x12e0/0x12e0
  __hrtimer_run_queues+0x270/0xa50
  ? hrtimer_cancel+0x20/0x20
  ? hrtimer_interrupt+0x119/0x600
  hrtimer_interrupt+0x19c/0x600
  ? trace_hardirqs_off+0xd/0x10
  local_apic_timer_interrupt+0x74/0xe0
  smp_apic_timer_interrupt+0x76/0xa0
  apic_timer_interrupt+0x93/0xa0

The DL task will be migrated to a suitable later deadline rq once the DL
timer fires and current rq is offline. The rq clock of the new rq should
be updated. This patch fixes it by updating the rq clock after holding
the new rq's rq lock.

Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1488865888-15894-1-git-send-email-wanpeng.li@hotmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
liyi-ibm pushed a commit to liyi-ibm/linux that referenced this issue Dec 28, 2018
commit 5cc41e0 upstream.

When registering a new binfmt_misc handler, it is possible to overflow
the offset to get a negative value, which might crash the system, or
possibly leak kernel data.

Here is a crash log when 2500000000 was used as an offset:

  BUG: unable to handle kernel paging request at ffff989cfd6edca0
  IP: load_misc_binary+0x22b/0x470 [binfmt_misc]
  PGD 1ef3e067 P4D 1ef3e067 PUD 0
  Oops: 0000 [#1] SMP NOPTI
  Modules linked in: binfmt_misc kvm_intel ppdev kvm irqbypass joydev input_leds serio_raw mac_hid parport_pc qemu_fw_cfg parpy
  CPU: 0 PID: 2499 Comm: bash Not tainted 4.15.0-22-generic open-power-host-os#24-Ubuntu
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
  RIP: 0010:load_misc_binary+0x22b/0x470 [binfmt_misc]
  Call Trace:
    search_binary_handler+0x97/0x1d0
    do_execveat_common.isra.34+0x667/0x810
    SyS_execve+0x31/0x40
    do_syscall_64+0x73/0x130
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2

Use kstrtoint instead of simple_strtoul.  It will work as the code
already set the delimiter byte to '\0' and we only do it when the field
is not empty.

Tested with offsets -1, 2500000000, UINT_MAX and INT_MAX.  Also tested
with examples documented at Documentation/admin-guide/binfmt-misc.rst
and other registrations from packages on Ubuntu.

Link: http://lkml.kernel.org/r/20180529135648.14254-1-cascardo@canonical.com
Fixes: 1da177e ("Linux-2.6.12-rc2")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
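
The offset overflow described in the binfmt_misc commit above can be worked through in a few lines. This is an illustrative sketch, not the kernel code: it assumes a 32-bit `int`, and the helper names `store_as_int32` and `parse_offset_checked` are hypothetical. The first helper shows what happens when a large unsigned parse result is stored into a signed 32-bit field; the second shows the general idea of a range-checked parse, which rejects such values instead of wrapping them.

```python
INT_MIN, INT_MAX = -(2**31), 2**31 - 1  # assumes a 32-bit int
UINT_MAX = 2**32 - 1


def store_as_int32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value > INT_MAX else value


def parse_offset_checked(text):
    """Parse a decimal offset, rejecting anything outside the int range."""
    value = int(text, 10)
    if not INT_MIN <= value <= INT_MAX:
        raise ValueError(f"offset out of range: {text}")
    return value


# 2500000000 does not fit in a signed 32-bit int and wraps to a negative value.
wrapped = store_as_int32(2500000000)   # 2500000000 - 2**32
wrapped_max = store_as_int32(UINT_MAX)

# A range-checked parse refuses the same input instead of wrapping it.
try:
    parse_offset_checked("2500000000")
    rejected = False
except ValueError:
    rejected = True
```

Here `wrapped` is negative, which matches the commit's description of an offset that overflows to a negative value, while the checked parser raises an error for the same input.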
liyi-ibm pushed a commit to liyi-ibm/linux that referenced this issue Dec 28, 2018
[ Upstream commit 9970a8e ]

GC of a set uses call_rcu() to destroy elements, so the elements can end
up being destroyed after the sets and chains have already been destroyed.
However, elements must be destroyed before the sets and chains.
To wait for the pending call_rcu() callbacks, an rcu_barrier() is added.

In order to test correctly, the patch below should be applied.
https://patchwork.ozlabs.org/patch/940883/

test scripts:
   %cat test.nft
   table ip aa {
	   map map1 {
		   type ipv4_addr : verdict; flags timeout;
		   elements = {
			   0 : jump a0,
			   1 : jump a0,
			   2 : jump a0,
			   3 : jump a0,
			   4 : jump a0,
			   5 : jump a0,
			   6 : jump a0,
			   7 : jump a0,
			   8 : jump a0,
			   9 : jump a0,
		   }
		   timeout 1s;
	   }
	   chain a0 {
	   }
   }
   flush ruleset

   [ ... ]

   table ip aa {
	   map map1 {
		   type ipv4_addr : verdict; flags timeout;
		   elements = {
			   0 : jump a0,
			   1 : jump a0,
			   2 : jump a0,
			   3 : jump a0,
			   4 : jump a0,
			   5 : jump a0,
			   6 : jump a0,
			   7 : jump a0,
			   8 : jump a0,
			   9 : jump a0,
		   }
		   timeout 1s;
	   }
	   chain a0 {
	   }
   }
   flush ruleset

Splat looks like:
[  200.795603] kernel BUG at net/netfilter/nf_tables_api.c:1363!
[  200.806944] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[  200.812253] CPU: 1 PID: 1582 Comm: nft Not tainted 4.17.0+ open-power-host-os#24
[  200.820297] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015
[  200.830309] RIP: 0010:nf_tables_chain_destroy.isra.34+0x62/0x240 [nf_tables]
[  200.838317] Code: 43 50 85 c0 74 26 48 8b 45 00 48 8b 4d 08 ba 54 05 00 00 48 c7 c6 60 6d 29 c0 48 c7 c7 c0 65 29 c0
4c 8b 40 08 e8 58 e5 fd f8 <0f> 0b 48 89 da 48 b8 00 00 00 00 00 fc ff
[  200.860366] RSP: 0000:ffff880118dbf4d0 EFLAGS: 00010282
[  200.866354] RAX: 0000000000000061 RBX: ffff88010cdeaf08 RCX: 0000000000000000
[  200.874355] RDX: 0000000000000061 RSI: 0000000000000008 RDI: ffffed00231b7e90
[  200.882361] RBP: ffff880118dbf4e8 R08: ffffed002373bcfb R09: ffffed002373bcfa
[  200.890354] R10: 0000000000000000 R11: ffffed002373bcfb R12: dead000000000200
[  200.898356] R13: dead000000000100 R14: ffffffffbb62af38 R15: dffffc0000000000
[  200.906354] FS:  00007fefc31fd700(0000) GS:ffff88011b800000(0000) knlGS:0000000000000000
[  200.915533] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  200.922355] CR2: 0000557f1c8e9128 CR3: 0000000106880000 CR4: 00000000001006e0
[  200.930353] Call Trace:
[  200.932351]  ? nf_tables_commit+0x26f6/0x2c60 [nf_tables]
[  200.939525]  ? nf_tables_setelem_notify.constprop.49+0x1a0/0x1a0 [nf_tables]
[  200.947525]  ? nf_tables_delchain+0x6e0/0x6e0 [nf_tables]
[  200.952383]  ? nft_add_set_elem+0x1700/0x1700 [nf_tables]
[  200.959532]  ? nla_parse+0xab/0x230
[  200.963529]  ? nfnetlink_rcv_batch+0xd06/0x10d0 [nfnetlink]
[  200.968384]  ? nfnetlink_net_init+0x130/0x130 [nfnetlink]
[  200.975525]  ? debug_show_all_locks+0x290/0x290
[  200.980363]  ? debug_show_all_locks+0x290/0x290
[  200.986356]  ? sched_clock_cpu+0x132/0x170
[  200.990352]  ? find_held_lock+0x39/0x1b0
[  200.994355]  ? sched_clock_local+0x10d/0x130
[  200.999531]  ? memset+0x1f/0x40

Fixes: 9d09829 ("netfilter: nft_hash: add support for timeouts")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>