
Commit 14d1e55

sean-jc authored and gregkh committed
KVM: x86: Fix shadow paging use-after-free due to unexpected GFN
commit 0cb2af2 upstream.

The shadow MMU computes GFNs for direct shadow pages using sp->gfn plus
the SPTE index. This assumption breaks for shadow paging if the guest
page tables are modified between VM entries (similar to commit aad885e,
"KVM: x86/mmu: Drop/zap existing present SPTE even when creating an
MMIO SPTE", 2026-03-27). The flow is as follows:

- a PDE is installed for a 2MB mapping, and a page in that area is
  accessed. KVM creates a kvm_mmu_page consisting of 512 4KB pages; the
  kvm_mmu_page is marked by FNAME(fetch) as direct-mapped because the
  guest's mapping is a huge page (and thus contiguous).

- the PDE mapping is changed from outside the guest.

- the guest accesses another page in the same 2MB area. KVM installs a
  new leaf SPTE and rmap entry; the SPTE uses the "correct" GFN (i.e.
  based on the new mapping, as changed in the previous step) but that
  GFN is outside of the [sp->gfn, sp->gfn + 511] range; therefore the
  rmap entry cannot be found and removed when the kvm_mmu_page is
  zapped.

- the memslot that covers the first 2MB mapping is deleted, and the
  kvm_mmu_page for the now-invalid GPA is zapped. However, rmap_remove()
  only looks at the [sp->gfn, sp->gfn + 511] range established in
  step 1, and fails to find the rmap entry that was recorded by step 3.

- any operation that causes an rmap walk for the same page accessed by
  step 3 then walks a stale rmap and dereferences a freed kvm_mmu_page.
  This includes dirty logging or MMU notifier invalidations (e.g., from
  MADV_DONTNEED).

The underlying issue is that KVM's walking of shadow PTEs assumes that
if a SPTE is present when KVM wants to install a non-leaf SPTE, then the
existing kvm_mmu_page must be for the correct gfn, because the only way
for the gfn to be wrong is if KVM messed up and failed to zap a SPTE...
which shouldn't happen, but *actually* only happens in response to a
guest write.

That bug dates back literally forever, as even the first version of KVM
assumes that the GFN matches and walks into the "wrong" shadow page.
However, that was only an imprecision until 2032a93 ("KVM: MMU: Don't
allocate gfns page for direct mmu pages") came along.

Fix it by checking for a target gfn mismatch and zapping the existing
SPTE. That way the old SP and rmap entries are gone, KVM installs the
rmap in the right location, and everyone is happy.

Fixes: 2032a93 ("KVM: MMU: Don't allocate gfns page for direct mmu pages")
Fixes: 6aa8b73 ("kvm: userspace interface")
Reported-by: Alexander Bulekov <bkov@amazon.com>
Reported-by: Fred Griffoul <fgriffo@amazon.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://patch.msgid.link/20260503201029.106481-1-pbonzini@redhat.com/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
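The broken assumption ("sp->gfn plus the SPTE index") can be made concrete
with a small standalone sketch. The arithmetic loosely mirrors what KVM does
for direct shadow pages (roughly kvm_mmu_page_get_gfn() in mmu.c), but the
struct, the SPTE_LEVEL_BITS constant, and the GFN values below are simplified
illustrations for this changelog, not the kernel code:

/*
 * Hedged sketch: illustrates the direct-SP GFN assumption described in the
 * changelog.  Names loosely mirror KVM's, but this is a standalone
 * approximation, not the kernel implementation.
 */
#include <stdint.h>
#include <stdio.h>

typedef uint64_t gfn_t;

#define SPTE_LEVEL_BITS 9       /* 512 entries per shadow page table */

struct sp_sketch {
        gfn_t gfn;      /* base GFN the direct shadow page was created for */
        int level;      /* level of the SPTEs contained in this shadow page */
};

/*
 * For direct shadow pages, the GFN of entry @index is derived from sp->gfn
 * alone, i.e. the guest mapping backing the page is assumed contiguous.
 */
static gfn_t direct_sp_gfn(const struct sp_sketch *sp, int index)
{
        return sp->gfn + ((gfn_t)index << ((sp->level - 1) * SPTE_LEVEL_BITS));
}

int main(void)
{
        /*
         * Step 1 of the flow: a direct shadow page of 4KB SPTEs (level 1)
         * created for a guest 2MB mapping at GFN 0x1000 is assumed to cover
         * [0x1000, 0x1000 + 511].
         */
        struct sp_sketch sp = { .gfn = 0x1000, .level = 1 };

        /*
         * Step 3: after the guest PDE is remapped from outside the guest,
         * the new leaf SPTE's real GFN (hypothetically 0x8000 + 5) falls
         * outside that range, so a later rmap_remove() that searches only
         * [sp->gfn, sp->gfn + 511] never finds the entry.
         */
        gfn_t assumed = direct_sp_gfn(&sp, 5);  /* 0x1005 */
        gfn_t actual = 0x8000 + 5;              /* hypothetical remapped GFN */

        printf("assumed GFN %#llx vs actual GFN %#llx -> stale rmap entry\n",
               (unsigned long long)assumed, (unsigned long long)actual);
        return 0;
}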
1 parent 9a4a4c9 commit 14d1e55

1 file changed

Lines changed: 14 additions & 21 deletions


arch/x86/kvm/mmu/mmu.c

@@ -182,6 +182,8 @@ static struct kmem_cache *pte_list_desc_cache;
 struct kmem_cache *mmu_page_header_cache;
 
 static void mmu_spte_set(u64 *sptep, u64 spte);
+static int mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp,
+                            u64 *spte, struct list_head *invalid_list);
 
 struct kvm_mmu_role_regs {
         const unsigned long cr0;
@@ -1287,19 +1289,6 @@ static void drop_spte(struct kvm *kvm, u64 *sptep)
                 rmap_remove(kvm, sptep);
 }
 
-static void drop_large_spte(struct kvm *kvm, u64 *sptep, bool flush)
-{
-        struct kvm_mmu_page *sp;
-
-        sp = sptep_to_sp(sptep);
-        WARN_ON_ONCE(sp->role.level == PG_LEVEL_4K);
-
-        drop_spte(kvm, sptep);
-
-        if (flush)
-                kvm_flush_remote_tlbs_sptep(kvm, sptep);
-}
-
 /*
  * Write-protect on the specified @sptep, @pt_protect indicates whether
  * spte write-protection is caused by protecting shadow page table.
@@ -2466,7 +2455,8 @@ static struct kvm_mmu_page *kvm_mmu_get_child_sp(struct kvm_vcpu *vcpu,
 {
         union kvm_mmu_page_role role;
 
-        if (is_shadow_present_pte(*sptep) && !is_large_pte(*sptep))
+        if (is_shadow_present_pte(*sptep) && !is_large_pte(*sptep) &&
+            spte_to_child_sp(*sptep) && spte_to_child_sp(*sptep)->gfn == gfn)
                 return ERR_PTR(-EEXIST);
 
         role = kvm_mmu_child_role(sptep, direct, access);
@@ -2544,13 +2534,16 @@ static void __link_shadow_page(struct kvm *kvm,
 
         BUILD_BUG_ON(VMX_EPT_WRITABLE_MASK != PT_WRITABLE_MASK);
 
-        /*
-         * If an SPTE is present already, it must be a leaf and therefore
-         * a large one.  Drop it, and flush the TLB if needed, before
-         * installing sp.
-         */
-        if (is_shadow_present_pte(*sptep))
-                drop_large_spte(kvm, sptep, flush);
+        if (is_shadow_present_pte(*sptep)) {
+                struct kvm_mmu_page *parent_sp;
+                LIST_HEAD(invalid_list);
+
+                parent_sp = sptep_to_sp(sptep);
+                WARN_ON_ONCE(parent_sp->role.level == PG_LEVEL_4K);
+
+                mmu_page_zap_pte(kvm, parent_sp, sptep, &invalid_list);
+                kvm_mmu_remote_flush_or_zap(kvm, &invalid_list, true);
+        }
 
         spte = make_nonleaf_spte(sp->spt, sp_ad_disabled(sp));
 