
Commit 6e096db

RichardWeiYang authored and gregkh committed
mm/huge_memory: fix early failure try_to_migrate() when split huge pmd for shared THP
commit 9390808 upstream.

Commit 60fbb14 ("mm/huge_memory: adjust try_to_migrate_one() and
split_huge_pmd_locked()") return false unconditionally after
split_huge_pmd_locked(). This may fail try_to_migrate() early when
TTU_SPLIT_HUGE_PMD is specified.

The reason is the above commit adjusted try_to_migrate_one() to, when a
PMD-mapped THP entry is found, and TTU_SPLIT_HUGE_PMD is specified (for
example, via unmap_folio()), return false unconditionally. This breaks the
rmap walk and fail try_to_migrate() early, if this PMD-mapped THP is mapped
in multiple processes.

The user sensible impact of this bug could be:

  * On memory pressure, shrink_folio_list() may split partially mapped
    folio with split_folio_to_list(). Then free unmapped pages without IO.
    If failed, it may not be reclaimed.
  * On memory failure, memory_failure() would call try_to_split_thp_page()
    to split folio contains the bad page. If succeed, the PG_has_hwpoisoned
    bit is only set in the after-split folio contains @split_at. By doing
    so, we limit bad memory. If failed to split, the whole folios is not
    usable.

One way to reproduce:

    Create an anonymous THP range and fork 512 children, so we have a
    THP shared mapped in 513 processes. Then trigger folio split with
    /sys/kernel/debug/split_huge_pages debugfs to split the THP folio to
    order 0.

Without the above commit, we can successfully split to order 0. With the
above commit, the folio is still a large folio.

And currently there are two core users of TTU_SPLIT_HUGE_PMD:

  * try_to_unmap_one()
  * try_to_migrate_one()

try_to_unmap_one() would restart the rmap walk, so only
try_to_migrate_one() is affected.

We can't simply revert commit 60fbb14 ("mm/huge_memory: adjust
try_to_migrate_one() and split_huge_pmd_locked()"), since it removed some
duplicated check covered by page_vma_mapped_walk().

This patch fixes this by restart page_vma_mapped_walk() after
split_huge_pmd_locked(). Since we cannot simply return "true" to fix the
problem, as that would affect another case:

When invoking folio_try_share_anon_rmap_pmd() from split_huge_pmd_locked(),
the latter can fail and leave a large folio mapped through PTEs, in which
case we ought to return true from try_to_migrate_one(). This might result
in unnecessary walking of the rmap but is relatively harmless.

Link: https://lkml.kernel.org/r/20260305015006.27343-1-richard.weiyang@gmail.com
Fixes: 60fbb14 ("mm/huge_memory: adjust try_to_migrate_one() and split_huge_pmd_locked()")
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Tested-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Reviewed-by: Gavin Guo <gavinguo@igalia.com>
Acked-by: David Hildenbrand (arm) <david@kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
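
The reproduction recipe in the commit message can be sketched in user space roughly as follows. This is a hedged illustration, not the author's test program: the helper names (thp_align_up, map_anon_thp, fork_sharers, format_split_request, request_split, run_reproducer) are hypothetical, the 2 MiB PMD size is an assumption that holds for x86-64 defaults, and the "<pid>,0x<start>,0x<end>,<order>" request format reflects my understanding of the split_huge_pages debugfs interface. It needs root, a mounted debugfs, and THP enabled, and it does not verify the resulting folio order.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define THP_SIZE      (2UL << 20)  /* assumption: 2 MiB PMD-sized THP */
#define NR_CHILDREN   512          /* from the commit message */
#define SPLIT_DEBUGFS "/sys/kernel/debug/split_huge_pages"

/* Round addr up to the next THP_SIZE boundary. */
uintptr_t thp_align_up(uintptr_t addr)
{
	return (addr + THP_SIZE - 1) & ~(uintptr_t)(THP_SIZE - 1);
}

/* Map anonymous memory, ask for THP and fault it in; NULL on failure. */
char *map_anon_thp(void)
{
	char *raw = mmap(NULL, 2 * THP_SIZE, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	char *thp;

	if (raw == MAP_FAILED)
		return NULL;
	thp = (char *)thp_align_up((uintptr_t)raw);
	if (madvise(thp, THP_SIZE, MADV_HUGEPAGE) != 0)
		return NULL;
	memset(thp, 1, THP_SIZE);  /* touch the range so it is populated */
	return thp;
}

/* Fork up to n children that sleep while sharing the parent's mapping. */
int fork_sharers(pid_t *pids, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		pid_t pid = fork();

		if (pid < 0)
			break;
		if (pid == 0) {
			pause();
			_exit(0);
		}
		pids[i] = pid;
	}
	return i;  /* number of children actually created */
}

/* Build the debugfs request string (format is an assumption, see above). */
int format_split_request(char *buf, size_t len, pid_t pid,
			 uintptr_t start, uintptr_t end, int order)
{
	return snprintf(buf, len, "%d,0x%lx,0x%lx,%d", (int)pid,
			(unsigned long)start, (unsigned long)end, order);
}

/* Write the split request to debugfs; 0 on success, -1 on error. */
int request_split(pid_t pid, uintptr_t start, uintptr_t end, int order)
{
	char buf[96];
	int len = format_split_request(buf, sizeof(buf), pid, start, end, order);
	int fd = open(SPLIT_DEBUGFS, O_WRONLY);
	ssize_t written;

	if (fd < 0)
		return -1;
	written = write(fd, buf, (size_t)len);
	close(fd);
	return written < 0 ? -1 : 0;
}

/* Tie the steps together: map, fork sharers, request a split to order 0. */
int run_reproducer(void)
{
	static pid_t pids[NR_CHILDREN];
	char *thp = map_anon_thp();
	int i, n, ret;

	if (!thp)
		return -1;
	n = fork_sharers(pids, NR_CHILDREN);
	ret = request_split(getpid(), (uintptr_t)thp,
			    (uintptr_t)thp + THP_SIZE, 0);
	for (i = 0; i < n; i++) {
		kill(pids[i], SIGKILL);
		waitpid(pids[i], NULL, 0);
	}
	return ret;
}
```

Calling run_reproducer() from a small main() as root would exercise the path described above. Whether the folio actually ended up at order 0 still has to be checked separately, which this sketch leaves out.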
1 parent f3caaee commit 6e096db

1 file changed: +9 -3 lines changed

mm/rmap.c

Lines changed: 9 additions & 3 deletions
@@ -2346,11 +2346,17 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 		/* PMD-mapped THP migration entry */
 		if (!pvmw.pte) {
 			if (flags & TTU_SPLIT_HUGE_PMD) {
+				/*
+				 * split_huge_pmd_locked() might leave the
+				 * folio mapped through PTEs. Retry the walk
+				 * so we can detect this scenario and properly
+				 * abort the walk.
+				 */
 				split_huge_pmd_locked(vma, pvmw.address,
 						      pvmw.pmd, true);
-				ret = false;
-				page_vma_mapped_walk_done(&pvmw);
-				break;
+				flags &= ~TTU_SPLIT_HUGE_PMD;
+				page_vma_mapped_walk_restart(&pvmw);
+				continue;
 			}
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
 			subpage = folio_page(folio,
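
To see why the unconditional "ret = false" mattered for a THP shared by many processes, here is a small user-space model of the control flow. It is only a sketch under heavy simplification: enum mapping_state, migrate_one_old(), migrate_one_new() and rmap_walk_model() are hypothetical names, not kernel functions. Each mapping stands for one process's VMA, the split is assumed to always succeed, and the case where the folio stays mapped through PTEs is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* How one process currently maps the folio (toy model). */
enum mapping_state { MAPPED_PMD, MAPPED_PTE, UNMAPPED };

/* Pre-fix behaviour: split the PMD, then return false, ending the walk. */
bool migrate_one_old(enum mapping_state *m, bool split_flag)
{
	if (*m == MAPPED_PMD && split_flag) {
		*m = MAPPED_PTE;      /* stands in for split_huge_pmd_locked() */
		return false;         /* ret = false; ...; break; */
	}
	*m = UNMAPPED;                /* migration entries installed */
	return true;
}

/* Post-fix behaviour: split, drop the flag, and retry this mapping. */
bool migrate_one_new(enum mapping_state *m, bool split_flag)
{
	for (;;) {
		if (*m == MAPPED_PMD && split_flag) {
			*m = MAPPED_PTE;      /* split_huge_pmd_locked() */
			split_flag = false;   /* flags &= ~TTU_SPLIT_HUGE_PMD */
			continue;             /* page_vma_mapped_walk_restart() */
		}
		*m = UNMAPPED;
		return true;
	}
}

typedef bool (*migrate_one_fn)(enum mapping_state *, bool);

/*
 * Toy rmap walk: visit every mapping and stop at the first false return.
 * Returns how many mappings are still present afterwards.
 */
size_t rmap_walk_model(enum mapping_state *maps, size_t n, migrate_one_fn fn)
{
	size_t i, still_mapped = 0;

	assert(maps != NULL);
	for (i = 0; i < n; i++)
		if (!fn(&maps[i], true))
			break;
	for (i = 0; i < n; i++)
		if (maps[i] != UNMAPPED)
			still_mapped++;
	return still_mapped;
}
```

With 513 PMD mappings, the old variant stops after the first one and leaves all 513 mapped, so a later folio split cannot proceed. The new variant unmaps every mapping.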
