Commit 878940b

KVM: VMX: Retry APIC-access page reload if invalidation is in-progress
Re-request an APIC-access page reload if there is a relevant mmu_notifier invalidation in-progress when KVM retrieves the backing pfn, i.e. stall vCPUs until the backing pfn for the APIC-access page is "officially" stable. Relying on the primary MMU to not make changes after invoking ->invalidate_range() works, e.g. any additional changes to a PRESENT PTE would also trigger an ->invalidate_range(), but using ->invalidate_range() to fudge around KVM not honoring past and in-progress invalidations is a bit hacky.

Honoring invalidations will allow using KVM's standard mmu_notifier hooks to detect APIC-access page reloads, which will in turn allow removing KVM's implementation of ->invalidate_range() (the APIC-access page case is a true one-off).

Opportunistically add a comment to explain why doing nothing if a memslot isn't found is functionally correct.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/r/20230602011518.787006-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
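To make "a relevant invalidation in-progress" concrete, here is a minimal userspace sketch of the bookkeeping that a retry check of this kind relies on. It is not KVM's code: `toy_mmu`, `toy_invalidate_begin()`, `toy_invalidate_end()`, `toy_snapshot_seq()` and `toy_retry_hva()` are hypothetical names, the model is single-threaded with no locking, and the C11 fences only stand in for the `smp_wmb()`/`smp_rmb()` pairing that the patch's comments mention.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the invalidation state; not the kernel's struct kvm. */
struct toy_mmu {
    unsigned long seq;          /* bumped each time an invalidation completes */
    long in_progress;           /* number of invalidations currently running */
    unsigned long range_start;  /* union of the hva ranges being invalidated */
    unsigned long range_end;    /* exclusive upper bound of that union */
};

/* An invalidation of [start, end) begins: count it and widen the range. */
static void toy_invalidate_begin(struct toy_mmu *m, unsigned long start,
                                 unsigned long end)
{
    assert(start < end);
    if (m->in_progress++ == 0) {
        m->range_start = start;
        m->range_end = end;
    } else {
        if (start < m->range_start)
            m->range_start = start;
        if (end > m->range_end)
            m->range_end = end;
    }
}

/* An invalidation ends: bump the sequence before dropping the count. */
static void toy_invalidate_end(struct toy_mmu *m)
{
    assert(m->in_progress > 0);
    m->seq++;
    atomic_thread_fence(memory_order_release);  /* stand-in for smp_wmb() */
    m->in_progress--;
}

/* Snapshot the sequence; callers do this before looking up the pfn. */
static unsigned long toy_snapshot_seq(const struct toy_mmu *m)
{
    unsigned long seq = m->seq;

    atomic_thread_fence(memory_order_acquire);  /* stand-in for smp_rmb() */
    return seq;
}

/*
 * Retry if an invalidation covering hva is still running, or if any
 * invalidation completed since the snapshot was taken.
 */
static bool toy_retry_hva(const struct toy_mmu *m, unsigned long seq,
                          unsigned long hva)
{
    if (m->in_progress > 0 && hva >= m->range_start && hva < m->range_end)
        return true;
    return m->seq != seq;
}
```

In this model, a caller snapshots the sequence, looks up the page, and only trusts the result if `toy_retry_hva()` returns false. A completed invalidation forces a retry for every address because the sequence count is global, while an in-flight one only affects addresses inside the recorded range.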
1 parent c3a1e11 commit 878940b

1 file changed, +45 -5 lines changed
arch/x86/kvm/vmx/vmx.c

Lines changed: 45 additions & 5 deletions
@@ -6714,7 +6714,12 @@ void vmx_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
 
 static void vmx_set_apic_access_page_addr(struct kvm_vcpu *vcpu)
 {
-	struct page *page;
+	const gfn_t gfn = APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT;
+	struct kvm *kvm = vcpu->kvm;
+	struct kvm_memslots *slots = kvm_memslots(kvm);
+	struct kvm_memory_slot *slot;
+	unsigned long mmu_seq;
+	kvm_pfn_t pfn;
 
 	/* Defer reload until vmcs01 is the current VMCS. */
 	if (is_guest_mode(vcpu)) {
@@ -6726,18 +6731,53 @@ static void vmx_set_apic_access_page_addr(struct kvm_vcpu *vcpu)
 	    SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES))
 		return;
 
-	page = gfn_to_page(vcpu->kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
-	if (is_error_page(page))
+	/*
+	 * Grab the memslot so that the hva lookup for the mmu_notifier retry
+	 * is guaranteed to use the same memslot as the pfn lookup, i.e. rely
+	 * on the pfn lookup's validation of the memslot to ensure a valid hva
+	 * is used for the retry check.
+	 */
+	slot = id_to_memslot(slots, APIC_ACCESS_PAGE_PRIVATE_MEMSLOT);
+	if (!slot || slot->flags & KVM_MEMSLOT_INVALID)
+		return;
+
+	/*
+	 * Ensure that the mmu_notifier sequence count is read before KVM
+	 * retrieves the pfn from the primary MMU.  Note, the memslot is
+	 * protected by SRCU, not the mmu_notifier.  Pairs with the smp_wmb()
+	 * in kvm_mmu_invalidate_end().
+	 */
+	mmu_seq = kvm->mmu_invalidate_seq;
+	smp_rmb();
+
+	/*
+	 * No need to retry if the memslot does not exist or is invalid.  KVM
+	 * controls the APIC-access page memslot, and only deletes the memslot
+	 * if APICv is permanently inhibited, i.e. the memslot won't reappear.
+	 */
+	pfn = gfn_to_pfn_memslot(slot, gfn);
+	if (is_error_noslot_pfn(pfn))
 		return;
 
-	vmcs_write64(APIC_ACCESS_ADDR, page_to_phys(page));
+	read_lock(&vcpu->kvm->mmu_lock);
+	if (mmu_invalidate_retry_hva(kvm, mmu_seq,
+				     gfn_to_hva_memslot(slot, gfn))) {
+		kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu);
+		read_unlock(&vcpu->kvm->mmu_lock);
+		goto out;
+	}
+
+	vmcs_write64(APIC_ACCESS_ADDR, pfn_to_hpa(pfn));
+	read_unlock(&vcpu->kvm->mmu_lock);
+
 	vmx_flush_tlb_current(vcpu);
 
+out:
 	/*
 	 * Do not pin apic access page in memory, the MMU notifier
 	 * will call us again if it is migrated or swapped out.
 	 */
-	put_page(page);
+	kvm_release_pfn_clean(pfn);
 }
 
 static void vmx_hwapic_isr_update(int max_isr)
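
The control flow of the reworked function can be followed in a reduced userspace model. This is only a sketch under stated assumptions: `toy_vm`, `toy_vcpu`, `toy_lookup_pfn()`, `toy_migrate_page()`, the `race` hook and `TOY_ERROR_PFN` are all hypothetical, locking and memory barriers are omitted, and only the sequence comparison is modeled (the in-progress range check is left out). It shows the ordering the patch establishes: snapshot the sequence, look up the pfn, re-check before programming the address, and release the pfn reference on both the success and the retry path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ERROR_PFN  UINT64_MAX  /* hypothetical "lookup failed" marker */
#define TOY_PAGE_SHIFT 12

/* Hypothetical, heavily reduced stand-ins for the kernel objects. */
struct toy_vm {
    bool slot_valid;            /* models the APIC-access memslot check */
    unsigned long seq;          /* models kvm->mmu_invalidate_seq */
    uint64_t backing_pfn;       /* what the "primary MMU" currently maps */
    int pfn_refs;               /* references held by in-flight lookups */
};

struct toy_vcpu {
    struct toy_vm *vm;
    bool reload_requested;      /* models KVM_REQ_APIC_PAGE_RELOAD */
    uint64_t apic_access_addr;  /* models the APIC_ACCESS_ADDR VMCS field */
};

/* Optional hook so a caller can simulate an invalidation racing the lookup. */
typedef void (*toy_race_fn)(struct toy_vm *vm);

/* Look up the backing pfn and take a reference on it. */
static uint64_t toy_lookup_pfn(struct toy_vm *vm)
{
    if (vm->backing_pfn == TOY_ERROR_PFN)
        return TOY_ERROR_PFN;
    vm->pfn_refs++;
    return vm->backing_pfn;
}

static void toy_set_apic_access_page_addr(struct toy_vcpu *vcpu,
                                          toy_race_fn race)
{
    struct toy_vm *vm = vcpu->vm;
    unsigned long seq;
    uint64_t pfn;

    if (!vm->slot_valid)
        return;                 /* memslot is gone: nothing to reload */

    seq = vm->seq;              /* snapshot taken before the lookup */
    pfn = toy_lookup_pfn(vm);
    if (pfn == TOY_ERROR_PFN)
        return;                 /* no reference was taken */

    if (race)
        race(vm);               /* an invalidation lands after the lookup */

    /* In the patch this check runs with mmu_lock held for read. */
    if (vm->seq != seq)
        vcpu->reload_requested = true;  /* pfn may be stale: retry later */
    else
        vcpu->apic_access_addr = pfn << TOY_PAGE_SHIFT;

    vm->pfn_refs--;             /* drop the reference on both paths */
}

/* Example race: the page migrates and the invalidation completes. */
static void toy_migrate_page(struct toy_vm *vm)
{
    vm->backing_pfn++;
    vm->seq++;
}
```

Calling `toy_set_apic_access_page_addr(&vcpu, toy_migrate_page)` leaves the address untouched and sets `reload_requested`. A second call with a `NULL` hook then installs the address of the migrated page, which mirrors how a re-requested reload eventually converges once invalidations stop.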
