This repository has been archived by the owner on Aug 29, 2023. It is now read-only.

BUG ? host crashed while using hugetlbpage with uksmd was stopped #4

Closed
colo-ft opened this issue Nov 26, 2016 · 4 comments

Comments

@colo-ft

colo-ft commented Nov 26, 2016

Hi,

It seems that uksm doesn't handle hugetlbpage very well.
We hit a host crash while using hugetlbpage through the device node
(that is, we start a VM whose memory backing is /dev/hugepages/libvirt/qemu).
The stack dump is:
[exception RIP: follow_page_mask+945]
RIP: ffffffff811984f1 RSP: ffff88115e9e3dc8 RFLAGS: 00010202
RAX: 0000000000000001 RBX: 00007f0eb8c00000 RCX: 00003ffffffff000
RDX: ffff880000000e30 RSI: 00007f0eb8c00000 RDI: 800000043aa000e7
RBP: ffff88115e9e3e10 R8: ffff88115e9e3ef0 R9: ffff8811c4288e30
R10: 0000000020000000 R11: ffff8811d146f080 R12: ffff881027548438
R13: ffff8811d146f080 R14: ffff88115e9e3e24 R15: 0000000000000004
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#8 [ffff88115e9e3dc0] follow_page_mask at ffffffff8119830f
#9 [ffff88115e9e3e18] break_ksm at ffffffff811bd075
#10 [ffff88115e9e3e58] ksm_madvise at ffffffff811c750f
#11 [ffff88115e9e3e90] sys_madvise at ffffffff81194947
#12 [ffff88115e9e3f80] system_call_fastpath at ffffffff81652189
RIP: 00007f0fbe7150f7 RSP: 00007f0bb41feb30 RFLAGS: 00000206
RAX: 000000000000001c RBX: ffffffff81652189 RCX: ffffffffffffffff
RDX: 000000000000000d RSI: 0000000020000000 RDI: 00007f0eb8c00000
RBP: 00007f0bb41fe9c0 R8: 00007f0fbe66cd38 R9: 0000000400000000
R10: 0000000000000000 R11: 0000000000000213 R12: 0000000000000000
R13: 0000000020000000 R14: 0000000000000000 R15: 0000000020000000
ORIG_RAX: 000000000000001c CS: 0033 SS: 002b

The host kernel version is 3.10.
The core dump also reports: PANIC: "kernel BUG at mm/memory.c:1576!"
The related code is:
struct page *follow_page_mask(struct vm_area_struct *vma,
			      unsigned long address, unsigned int flags,
			      unsigned int *page_mask)
{
	... ...

	pmd = pmd_offset(pud, address);
	if (pmd_none(*pmd))
		goto no_page_table;
	if (pmd_huge(*pmd) && vma->vm_flags & VM_HUGETLB) {
		BUG_ON(flags & FOLL_GET);	/* <--- the BUG is triggered here */
		page = follow_huge_pmd(mm, address, pmd, flags & FOLL_WRITE);
		goto out;
	}

Although this code has changed in newer kernel versions, I believe the problem
is still there. I think it is caused by uksm not handling hugetlbpage vmas
properly, even though uksm is not supposed to support merging hugetlbpage.
I investigated this issue and seem to have found the answer. The code path is:
SyS_mmap
--> SyS_mmap_pgoff
--> vm_mmap_pgoff
--> do_mmap_pgoff
--> mmap_region
--> vma_merge
--> vma_adjust
--> uksm_vma_add_new (We add this vma to uksm without the VM_HUGETLB flag !!!!)
--> file->f_op->mmap(file, vma) (the callback here is hugetlbfs_file_mmap, which sets vma->vm_flags |= VM_HUGETLB | VM_DONTEXPAND;)
--> uksm_vma_add_new (It is OK this time)

Any ideas about this problem?
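
To make the ordering problem in the call path above concrete, here is a small userspace toy model in C. It is only a sketch under stated assumptions: the TOY_* flag values, struct toy_vma and every toy_* function are hypothetical stand-ins, not kernel or uksm code, and the model assumes that registering a vma marks it VM_MERGEABLE. It shows that when registration runs before the mmap callback, the vma ends up both hugetlb and mergeable, while the opposite order leaves it untracked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for this toy model; not the kernel's real values. */
#define TOY_VM_MERGEABLE 0x1UL
#define TOY_VM_HUGETLB   0x2UL

struct toy_vma {
	unsigned long vm_flags;
	bool tracked;	/* true once the toy "uksm" has registered this vma */
};

/* Stand-in for uksm_vma_add_new(): register the vma unless it is hugetlb. */
static void toy_uksm_add(struct toy_vma *vma)
{
	assert(vma != NULL);
	if (vma->vm_flags & TOY_VM_HUGETLB)
		return;		/* hugetlb areas are not scanned */
	vma->vm_flags |= TOY_VM_MERGEABLE;
	vma->tracked = true;
}

/* Stand-in for hugetlbfs_file_mmap(): it only sets the hugetlb flag. */
static void toy_hugetlbfs_mmap(struct toy_vma *vma)
{
	assert(vma != NULL);
	vma->vm_flags |= TOY_VM_HUGETLB;
}

/* Order from the report: registration first, mmap callback second. */
static struct toy_vma toy_map_reported_order(void)
{
	struct toy_vma vma = { 0UL, false };

	toy_uksm_add(&vma);		/* VM_HUGETLB is not set yet */
	toy_hugetlbfs_mmap(&vma);	/* now hugetlb, but already mergeable */
	return vma;
}

/* Opposite order: the hugetlb check in toy_uksm_add() can now work. */
static struct toy_vma toy_map_callback_first(void)
{
	struct toy_vma vma = { 0UL, false };

	toy_hugetlbfs_mmap(&vma);
	toy_uksm_add(&vma);
	return vma;
}
```

In this model, toy_map_reported_order() returns a vma with both flags set, which is the state that a later madvise(MADV_UNMERGEABLE) would stumble over.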

@naixia
Collaborator

naixia commented Nov 26, 2016

  • Add uksm_flags_can_scan() check in ksm_madvise(), where areas with VM_HUGETLB don't need to be MADV_UNMERGEABLE.
  • Similarly, in try_down_read_slot_mmap_sem(), filter out this VMA just before it goes to be scanned.

Since this is specific to your workload, you can try this fix and report back here with the patch if it works.
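
The first suggestion can be illustrated with a userspace toy model in C. This is a sketch only: the TOY_* flags and toy_* functions are hypothetical, and toy_flags_can_scan() checks nothing but the hugetlb bit, so it should not be read as the real uksm_flags_can_scan(). The point is the order of the checks: the "can this area be scanned" test runs before any unmerge work, so a hugetlb area never reaches the page-walking path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits for this toy model; not the kernel's real values. */
#define TOY_VM_MERGEABLE 0x1UL
#define TOY_VM_HUGETLB   0x2UL

enum toy_result { TOY_IGNORED, TOY_UNMERGED };

/* Simplified stand-in for a "can this area be scanned" check. */
static int toy_flags_can_scan(unsigned long vm_flags)
{
	return !(vm_flags & TOY_VM_HUGETLB);
}

/* Toy version of the MADV_UNMERGEABLE branch with the extra guard. */
static enum toy_result toy_madvise_unmergeable(unsigned long *vm_flags)
{
	assert(vm_flags != NULL);

	if (!(*vm_flags & TOY_VM_MERGEABLE))
		return TOY_IGNORED;	/* just ignore the advice */

	if (!toy_flags_can_scan(*vm_flags))
		return TOY_IGNORED;	/* hugetlb area: nothing to unmerge */

	/* The real code would walk and unmerge pages here. */
	*vm_flags &= ~TOY_VM_MERGEABLE;
	return TOY_UNMERGED;
}
```

With a flags word that has both TOY_VM_MERGEABLE and TOY_VM_HUGETLB set, the function returns TOY_IGNORED and leaves the flags untouched.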

@colo-ft
Author

colo-ft commented Nov 27, 2016

Hi Mr. Xia,

Thank you for your reply.

So the main problem here is that we should not madvise(MADV_UNMERGEABLE) the vma of a hugetlb page,
and there is no problem with hugetlb when uksm is enabled? I mean, will uksm ignore the hugetlb vma?

I think your solution works.
Actually, I fixed this problem with the following modification:
--- a/mm/uksm.c
+++ b/mm/uksm.c
@@ -5547,6 +5547,10 @@ int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
 		if (!(*vm_flags & VM_MERGEABLE))
 			return 0;		/* just ignore the advice */
 
+		if (unlikely(vma->vm_flags & VM_HUGETLB)) {
+			return 0; /* ignore the hugepage area */
+		}
+
 		if (vma->anon_vma) {
 			err = unmerge_uksm_pages(vma, start, end);
 			if (err)

--
Or we can fix it like this:
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 6304e91..c6a27b1 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -111,6 +111,10 @@ static int hugetlbfs_file_mmap(struct file *file, struct vm_area_struct *vma)
 	 * and ia64).
 	 */
 	vma->vm_flags |= VM_HUGETLB | VM_DONTEXPAND;
+#ifdef CONFIG_UKSM
+	/* We can't support page merge for hugepage */
+	vma->vm_flags &= ~VM_MERGEABLE;
+#endif
 	vma->vm_ops = &hugetlb_vm_ops;
 
 	if (vma->vm_pgoff & (~huge_page_mask(h) >> PAGE_SHIFT))

--
1.8.3.1

Thanks :)

@naixia
Collaborator

naixia commented Nov 28, 2016

UKSM will BUG() when it tries to scan a hugetlb vma. If you actually observe this BUG(), apply the second fix I mentioned above.

@colo-ft
Author

colo-ft commented Nov 29, 2016

Great, we haven't hit it since then. Thank you :)
