Commits on Apr 13, 2021

  1. KVM: x86: Defer tick-based accounting 'til after IRQ handling

    When using tick-based accounting, defer the call to account guest
    time until after servicing any IRQ(s) that happened in the guest (or
    immediately after VM-Exit).  With tick-based accounting, time is
    accounted to the guest when PF_VCPU is set while the tick IRQ
    handler runs.  The current approach of unconditionally accounting
    time in kvm_guest_exit_irqoff() prevents IRQs that occur in the
    guest from ever being processed with PF_VCPU set, since PF_VCPU ends
    up being set only during the relatively short VM-Enter sequence,
    which runs entirely with IRQs disabled.
    
    Fixes: 87fa7f3 ("x86/kvm: Move context tracking where it belongs")
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Michael Tokarev <mjt@tls.msk.ru>
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    sean-jc authored and intel-lab-lkp committed Apr 13, 2021
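A minimal userspace sketch of the ordering this patch fixes (the `pf_vcpu`, `run_exit_*`, and `tick_irq` names are invented for illustration; this is not kernel code). The tick handler credits time to the guest only while PF_VCPU is set, so clearing the flag before IRQs are serviced means the tick never observes guest context:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: with tick-based accounting, the tick IRQ handler credits
 * the current slice to the guest only while PF_VCPU is set. */
static bool pf_vcpu;
static int guest_ticks;

static void tick_irq(void)
{
    if (pf_vcpu)
        guest_ticks++;          /* tick accounted to the guest */
}

/* Old flow: PF_VCPU is cleared in kvm_guest_exit_irqoff() while IRQs
 * are still disabled, so the tick never sees guest context. */
static int run_exit_old(void)
{
    guest_ticks = 0;
    pf_vcpu = true;             /* VM-Enter */
    pf_vcpu = false;            /* accounting at exit, IRQs still off */
    tick_irq();                 /* IRQs enabled: tick sees PF_VCPU=0 */
    return guest_ticks;
}

/* New flow: defer clearing PF_VCPU until after IRQ handling. */
static int run_exit_new(void)
{
    guest_ticks = 0;
    pf_vcpu = true;             /* VM-Enter */
    tick_irq();                 /* IRQs enabled: tick sees PF_VCPU=1 */
    pf_vcpu = false;            /* deferred tick-based accounting */
    return guest_ticks;
}
```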
  2. KVM: x86: Consolidate guest enter/exit logic to common helpers

    Move the enter/exit logic in {svm,vmx}_vcpu_enter_exit() to common
    helpers.  In addition to deduplicating code, this will allow tweaking the
    vtime accounting in the VM-Exit path without splitting logic across x86,
    VMX, and SVM.
    
    No functional change intended.
    
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    sean-jc authored and intel-lab-lkp committed Apr 13, 2021
  3. KVM: Move vtime accounting of guest exit to separate helper

    Provide a standalone helper for guest exit vtime accounting so that x86
    can defer tick-based accounting until the appropriate time, while still
    updating context tracking immediately after VM-Exit.
    
    No functional change intended.
    
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    sean-jc authored and intel-lab-lkp committed Apr 13, 2021
  4. context_tracking: KVM: Move guest enter/exit wrappers to KVM's domain

    Move the guest enter/exit wrappers to kvm_host.h so that KVM can manage
    its context tracking vs. vtime accounting without bleeding too many KVM
    details into the context tracking code.
    
    No functional change intended.
    
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    sean-jc authored and intel-lab-lkp committed Apr 13, 2021
  5. context_tracking: Consolidate guest enter/exit wrappers

    Consolidate the guest enter/exit wrappers by providing stubs for the
    context tracking helpers as necessary.  This will allow moving the
    wrappers under KVM without having to bleed too many #ifdefs into the
    soon-to-be KVM code.
    
    No functional change intended.
    
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    sean-jc authored and intel-lab-lkp committed Apr 13, 2021
  6. context_tracking: Move guest enter/exit logic to standalone helpers

    Move guest enter/exit context tracking to standalone helpers, so that the
    existing wrappers can be moved under KVM.
    
    No functional change intended.
    
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    sean-jc authored and intel-lab-lkp committed Apr 13, 2021
  7. sched/vtime: Move guest enter/exit vtime accounting to separate helpers

    Provide separate helpers for guest enter/exit vtime accounting instead of
    open coding the logic within the context tracking code.  This will allow
    KVM x86 to handle vtime accounting slightly differently when using tick-
    based accounting.
    
    Opportunistically delete the vtime_account_kernel() stub now that all
    callers are wrapped with CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y.
    
    No functional change intended.
    
    Suggested-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Christian Borntraeger <borntraeger@de.ibm.com>
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    sean-jc authored and intel-lab-lkp committed Apr 13, 2021

Commits on Apr 2, 2021

  1. KVM: x86: Support KVM VMs sharing SEV context

    Add a capability for userspace to mirror SEV encryption context from
    one VM to another. On our side, this is intended to support a
    Migration Helper vCPU, but it can also be used generically to support
    other in-guest workloads scheduled by the host. The intention is for
    the primary guest and the mirror to have nearly identical memslots.
    
    The primary benefits of this are that:
    1) The VMs do not share KVM contexts (think APIC/MSRs/etc), so they
    can't accidentally clobber each other.
    2) The VMs can have different memory views, which is necessary for
    post-copy migration (the migration vCPUs on the target need to read
    and write pages where the primary guest would VMEXIT).
    
    This does not change the threat model for AMD SEV. Any memory involved
    is still owned by the primary guest and its initial state is still
    attested to through the normal SEV_LAUNCH_* flows. If userspace
    wanted to circumvent SEV, it could achieve the same effect by simply
    attaching a vCPU to the primary VM.
    
    This patch deliberately leaves userspace in charge of the memslots
    for the mirror, as it already has the power to mess with them in the
    primary guest.
    
    This patch does not support SEV-ES (much less SNP), as it does not
    handle handing off attested VMSAs to the mirror.
    
    For additional context, we need a Migration Helper because SEV PSP migration
    is far too slow for our live migration on its own. Using an in-guest
    migrator lets us speed this up significantly.
    
    Signed-off-by: Nathan Tempelman <natet@google.com>
    Message-Id: <20210316014027.3116119-1-natet@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Nathan Tempelman authored and bonzini committed Apr 2, 2021
  2. KVM: x86: pending exceptions must not be blocked by an injected event

    Injected interrupts/NMIs should not block a pending exception;
    rather, they should either be lost if the nested hypervisor doesn't
    intercept the pending exception (as on stock x86), or be delivered
    in the exitintinfo/IDT_VECTORING_INFO field as part of the VM-exit
    that corresponds to the pending exception.
    
    The only reason for an exception to be blocked is when a nested run
    is pending (that can't really happen currently, but it is still
    worth checking for).
    
    Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
    Message-Id: <20210401143817.1030695-2-mlevitsk@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Maxim Levitsky authored and bonzini committed Apr 2, 2021
  3. KVM: selftests: remove redundant semi-colon

    Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    Message-Id: <20210401142514.1688199-1-yangyingliang@huawei.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Yang Yingliang authored and bonzini committed Apr 2, 2021
  4. KVM: x86: introduce kvm_register_clear_available

    Small refactoring that will be used in the next patch.
    
    Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
    Message-Id: <20210401141814.1029036-4-mlevitsk@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Maxim Levitsky authored and bonzini committed Apr 2, 2021
  5. KVM: nSVM: call nested_svm_load_cr3 on nested state load

    While KVM's MMU should be fully reset by the loading of nested
    CR0/CR3/CR4 via KVM_SET_SREGS, we are not in nested mode yet when we
    do it, and therefore only root_mmu is reset.
    
    On regular nested entries we call nested_svm_load_cr3, which updates
    the guest's CR3 in the MMU when needed, and also reinitializes the
    MMU, which initializes walk_mmu as well when nested paging is
    enabled in both host and guest.
    
    Since we don't call nested_svm_load_cr3 on nested state load,
    walk_mmu can be left uninitialized.  This can lead to a NULL pointer
    dereference if we happen to get a nested page fault right after
    entering the nested guest for the first time after migration and
    decide to emulate it: the emulator then tries to access
    walk_mmu->gva_to_gpa, which is NULL.
    
    Therefore we should call this function on nested state load as well.
    
    Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
    Message-Id: <20210401141814.1029036-3-mlevitsk@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Maxim Levitsky authored and bonzini committed Apr 2, 2021
  6. KVM: nVMX: delay loading of PDPTRs to KVM_REQ_GET_NESTED_STATE_PAGES

    Similar to the rest of guest page accesses after migration,
    this should be delayed to KVM_REQ_GET_NESTED_STATE_PAGES
    request.
    
    Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
    Message-Id: <20210401141814.1029036-2-mlevitsk@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Maxim Levitsky authored and bonzini committed Apr 2, 2021
  7. KVM: x86: dump_vmcs should include the autoload/autostore MSR lists

    When dumping the current VMCS state, include the MSRs that are being
    automatically loaded/stored during VM entry/exit.
    
    Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: David Edmondson <david.edmondson@oracle.com>
    Message-Id: <20210318120841.133123-6-david.edmondson@oracle.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    dme authored and bonzini committed Apr 2, 2021
  8. KVM: x86: dump_vmcs should show the effective EFER

    If EFER is not being loaded from the VMCS, show the effective value by
    reference to the MSR autoload list or calculation.
    
    Suggested-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: David Edmondson <david.edmondson@oracle.com>
    Message-Id: <20210318120841.133123-5-david.edmondson@oracle.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    dme authored and bonzini committed Apr 2, 2021
  9. KVM: x86: dump_vmcs should consider only the load controls of EFER/PAT

    When deciding whether to dump the GUEST_IA32_EFER and GUEST_IA32_PAT
    fields of the VMCS, examine only the VM entry load controls, as saving
    on VM exit has no effect on whether VM entry succeeds or fails.
    
    Suggested-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: David Edmondson <david.edmondson@oracle.com>
    Message-Id: <20210318120841.133123-4-david.edmondson@oracle.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    dme authored and bonzini committed Apr 2, 2021
  10. KVM: x86: dump_vmcs should not conflate EFER and PAT presence in VMCS

    Show EFER and PAT based on their individual entry/exit controls.
    
    Signed-off-by: David Edmondson <david.edmondson@oracle.com>
    Message-Id: <20210318120841.133123-3-david.edmondson@oracle.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    dme authored and bonzini committed Apr 2, 2021
  11. KVM: x86: dump_vmcs should not assume GUEST_IA32_EFER is valid

    If the VM entry/exit controls for loading/saving MSR_EFER are either
    not available (an older processor or explicitly disabled) or not
    used (host and guest values are the same), reading GUEST_IA32_EFER
    from the VMCS returns an inaccurate value.
    
    Because of this, in dump_vmcs() don't use GUEST_IA32_EFER to decide
    whether to print the PDPTRs - always do so if the fields exist.
    
    Fixes: 4eb64dc ("KVM: x86: dump VMCS on invalid entry")
    Signed-off-by: David Edmondson <david.edmondson@oracle.com>
    Message-Id: <20210318120841.133123-2-david.edmondson@oracle.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    dme authored and bonzini committed Apr 2, 2021
  12. KVM: nSVM: improve SYSENTER emulation on AMD

    Currently, to support Intel->AMD migration, if the CPU vendor is
    GenuineIntel we emulate the full 64-bit value of the
    MSR_IA32_SYSENTER_{EIP|ESP} MSRs, and we also emulate the
    sysenter/sysexit instructions in long mode.
    
    (The emulator still refuses to emulate sysenter in 64-bit mode, on
    the grounds that the code for that wasn't tested and likely has no
    users.)
    
    However, when virtual vmload/vmsave is enabled, the vmload
    instruction will update these 32-bit MSRs without triggering their
    MSR intercept, leaving stale values in KVM's shadow copy of these
    MSRs, which relies on the intercept to stay up to date.
    
    Fix/optimize this by doing the following:
    
    1. Enable the MSR intercepts for the SYSENTER MSRs iff
       vendor=GenuineIntel.  (This is both a tiny optimization and also
       ensures that if the guest CPU vendor is AMD, the MSRs will be
       32 bits wide, as AMD defines them.)
    
    2. Store only the high 32-bit part of these MSRs on interception,
       and combine it with the hardware MSR value on intercepted
       reads/writes iff vendor=GenuineIntel.
    
    3. Disable vmload/vmsave virtualization if vendor=GenuineIntel.
       (It is somewhat insane to set vendor=GenuineIntel and still
       enable SVM for the guest, but so be it.)  Then zero the high
       32-bit parts when KVM intercepts and emulates vmload.
    
    Thanks a lot to Paolo Bonzini for helping me with fixing this in the
    most correct way.
    
    This patch fixes nested migration of 32-bit nested guests, which was
    broken because incorrect cached values of the SYSENTER MSRs were
    stored in the migration stream if L1 changed these MSRs with vmload
    prior to L2 entry.
    
    Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
    Message-Id: <20210401111928.996871-3-mlevitsk@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Maxim Levitsky authored and bonzini committed Apr 2, 2021
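A sketch of the split-storage scheme in point 2 above (variable and function names are hypothetical; real KVM keeps this state in svm/vcpu structures). The hardware MSR holds only the low 32 bits, so KVM shadows just the high half and stitches the halves together on intercepted accesses:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins: on AMD the SYSENTER MSRs are 32 bits wide,
 * so only the low half lives in hardware; KVM shadows the high half. */
static uint32_t sysenter_eip_hi;    /* KVM's shadow of the high half  */
static uint32_t hw_sysenter_eip;    /* what the 32-bit hardware holds */

/* Intercepted read: combine the shadowed high half with hardware. */
static uint64_t sysenter_eip_read(void)
{
    return ((uint64_t)sysenter_eip_hi << 32) | hw_sysenter_eip;
}

/* Intercepted write: keep only the high half in the shadow copy. */
static void sysenter_eip_write(uint64_t data)
{
    sysenter_eip_hi = (uint32_t)(data >> 32);
    hw_sysenter_eip = (uint32_t)data;
}
```

Because vmload only touches the 32-bit hardware half, a guest-initiated vmload cannot invalidate the shadowed high half in this scheme.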
  13. KVM: x86: add guest_cpuid_is_intel

    This is similar to the existing guest_cpuid_is_amd_or_hygon().
    
    Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
    Message-Id: <20210401111928.996871-2-mlevitsk@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Maxim Levitsky authored and bonzini committed Apr 2, 2021
  14. KVM: x86: Account a variety of miscellaneous allocations

    Switch to GFP_KERNEL_ACCOUNT for a handful of allocations that are
    clearly associated with a single task/VM.
    
    Note, there are several SEV allocations that aren't accounted, but
    those can (hopefully) be fixed by using the local stack for memory.
    
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20210331023025.2485960-3-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    sean-jc authored and bonzini committed Apr 2, 2021
  15. KVM: SVM: Do not allow SEV/SEV-ES initialization after vCPUs are created

    Reject KVM_SEV_INIT and KVM_SEV_ES_INIT if they are attempted after one
    or more vCPUs have been created.  KVM assumes a VM is tagged SEV/SEV-ES
    prior to vCPU creation, e.g. init_vmcb() needs to mark the VMCB as SEV
    enabled, and svm_create_vcpu() needs to allocate the VMSA.  At best,
    creating vCPUs before SEV/SEV-ES init will lead to unexpected errors
    and/or behavior, and at worst it will crash the host, e.g.
    sev_launch_update_vmsa() will dereference a null svm->vmsa pointer.
    
    Fixes: 1654efc ("KVM: SVM: Add KVM_SEV_INIT command")
    Fixes: ad73109 ("KVM: SVM: Provide support to launch and run an SEV-ES guest")
    Cc: stable@vger.kernel.org
    Cc: Brijesh Singh <brijesh.singh@amd.com>
    Cc: Tom Lendacky <thomas.lendacky@amd.com>
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20210331031936.2495277-4-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    sean-jc authored and bonzini committed Apr 2, 2021
  16. KVM: SVM: Do not set sev->es_active until KVM_SEV_ES_INIT completes

    Set sev->es_active only after the guts of KVM_SEV_ES_INIT succeeds.  If
    the command fails, e.g. because SEV is already active or there are no
    available ASIDs, then es_active will be left set even though the VM is
    not fully SEV-ES capable.
    
    Refactor the code so that "es_active" is passed on the stack instead of
    being prematurely shoved into sev_info, both to avoid having to unwind
    sev_info and so that it's more obvious what actually consumes es_active
    in sev_guest_init() and its helpers.
    
    Fixes: ad73109 ("KVM: SVM: Provide support to launch and run an SEV-ES guest")
    Cc: stable@vger.kernel.org
    Cc: Brijesh Singh <brijesh.singh@amd.com>
    Cc: Tom Lendacky <thomas.lendacky@amd.com>
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20210331031936.2495277-3-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    sean-jc authored and bonzini committed Apr 2, 2021
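The refactor described above follows a common pattern: compute into a local and commit to long-lived state only once every failure path is behind you. A toy version (the struct and parameters are illustrative, not KVM's actual types):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in for KVM's per-VM SEV state. */
struct sev_info {
    bool active;
    bool es_active;
};

/* es_active stays on the stack (as a parameter) until success, so a
 * failed init leaves sev_info untouched and nothing needs unwinding. */
static int sev_guest_init(struct sev_info *sev, bool es_active,
                          bool have_asid)
{
    if (sev->active)
        return -1;              /* SEV already active: sev untouched */
    if (!have_asid)
        return -1;              /* no ASID available: sev untouched */

    sev->active = true;         /* commit only on success */
    sev->es_active = es_active;
    return 0;
}
```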
  17. KVM: SVM: Use online_vcpus, not created_vcpus, to iterate over vCPUs

    Use the kvm_for_each_vcpu() helper to iterate over vCPUs when encrypting
    VMSAs for SEV, which effectively switches to use online_vcpus instead of
    created_vcpus.  This fixes a possible null-pointer dereference as
    created_vcpus does not guarantee a vCPU exists, since it is updated at
    the very beginning of KVM_CREATE_VCPU.  created_vcpus exists to allow
    the bulk of vCPU creation to run in parallel, while still correctly
    restricting the maximum number of vCPUs.
    
    Fixes: ad73109 ("KVM: SVM: Provide support to launch and run an SEV-ES guest")
    Cc: stable@vger.kernel.org
    Cc: Brijesh Singh <brijesh.singh@amd.com>
    Cc: Tom Lendacky <thomas.lendacky@amd.com>
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20210331031936.2495277-2-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    sean-jc authored and bonzini committed Apr 2, 2021
  18. KVM: x86/mmu: Simplify code for aging SPTEs in TDP MMU

    Use a basic NOT+AND sequence to clear the Accessed bit in TDP MMU SPTEs,
    as opposed to the fancy ffs()+clear_bit() logic that was copied from the
    legacy MMU.  The legacy MMU uses clear_bit() because it is operating on
    the SPTE itself, i.e. clearing needs to be atomic.  The TDP MMU operates
    on a local variable that it later writes to the SPTE, and so doesn't need
    to be atomic or even resident in memory.
    
    Opportunistically drop unnecessary initialization of new_spte, it's
    guaranteed to be written before being accessed.
    
    Using NOT+AND instead of ffs()+clear_bit() reduces the sequence from:
    
       0x0000000000058be6 <+134>:	test   %rax,%rax
       0x0000000000058be9 <+137>:	je     0x58bf4 <age_gfn_range+148>
       0x0000000000058beb <+139>:	test   %rax,%rdi
       0x0000000000058bee <+142>:	je     0x58cdc <age_gfn_range+380>
       0x0000000000058bf4 <+148>:	mov    %rdi,0x8(%rsp)
       0x0000000000058bf9 <+153>:	mov    $0xffffffff,%edx
       0x0000000000058bfe <+158>:	bsf    %eax,%edx
       0x0000000000058c01 <+161>:	movslq %edx,%rdx
       0x0000000000058c04 <+164>:	lock btr %rdx,0x8(%rsp)
       0x0000000000058c0b <+171>:	mov    0x8(%rsp),%r15
    
    to:
    
       0x0000000000058bdd <+125>:	test   %rax,%rax
       0x0000000000058be0 <+128>:	je     0x58beb <age_gfn_range+139>
       0x0000000000058be2 <+130>:	test   %rax,%r8
       0x0000000000058be5 <+133>:	je     0x58cc0 <age_gfn_range+352>
       0x0000000000058beb <+139>:	not    %rax
       0x0000000000058bee <+142>:	and    %r8,%rax
       0x0000000000058bf1 <+145>:	mov    %rax,%r15
    
    thus eliminating several memory accesses, including a locked access.
    
    Cc: Ben Gardon <bgardon@google.com>
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20210331004942.2444916-3-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    sean-jc authored and bonzini committed Apr 2, 2021
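The core of the simplification, in isolation (the mask value here is illustrative; the real bit position depends on the paging mode): since the TDP MMU mutates a local copy of the SPTE rather than the in-memory SPTE, a plain NOT+AND suffices and no atomic `clear_bit()` is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative Accessed-bit mask; the real shadow_accessed_mask is
 * computed by KVM based on the paging mode. */
#define SHADOW_ACCESSED_MASK (1ULL << 5)

/* Clear the Accessed bit in a local SPTE copy: NOT+AND, no atomics. */
static uint64_t clear_accessed(uint64_t spte)
{
    return spte & ~SHADOW_ACCESSED_MASK;
}
```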
  19. KVM: x86/mmu: Remove spurious clearing of dirty bit from TDP MMU SPTE

    Don't clear the dirty bit when aging a TDP MMU SPTE (in response to
    an MMU notifier event).  Prematurely clearing the dirty bit could
    cause spurious PML updates if aging a page happened to coincide with
    dirty logging.
    
    Note, tdp_mmu_set_spte_no_acc_track() flows into __handle_changed_spte(),
    so the host PFN will be marked dirty, i.e. there is no potential for data
    corruption.
    
    Fixes: a6a0b05 ("kvm: x86/mmu: Support dirty logging for the TDP MMU")
    Cc: Ben Gardon <bgardon@google.com>
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20210331004942.2444916-2-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    sean-jc authored and bonzini committed Apr 2, 2021
  20. KVM: x86/mmu: Drop trace_kvm_age_page() tracepoint

    Remove x86's trace_kvm_age_page() tracepoint.  It's mostly redundant with
    the common trace_kvm_age_hva() tracepoint, and if there is a need for the
    extra details, e.g. gfn, referenced, etc... those details should be added
    to the common tracepoint so that all architectures and MMUs benefit from
    the info.
    
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20210326021957.1424875-19-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    sean-jc authored and bonzini committed Apr 2, 2021
  21. KVM: Move arm64's MMU notifier trace events to generic code

    Move arm64's MMU notifier trace events into common code in preparation
    for doing the hva->gfn lookup in common code.  The alternative would be
    to trace the gfn instead of hva, but that's not obviously better and
    could also be done in common code.  Tracing the notifiers is also quite
    handy for debug regardless of architecture.
    
    Remove a completely redundant tracepoint from PPC e500.
    
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20210326021957.1424875-10-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    sean-jc authored and bonzini committed Apr 2, 2021
  22. KVM: Move prototypes for MMU notifier callbacks to generic code

    Move the prototypes for the MMU notifier callbacks out of arch code and
    into common code.  There is no benefit to having each arch replicate the
    prototypes since any deviation from the invocation in common code will
    explode.
    
    No functional change intended.
    
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20210326021957.1424875-9-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    sean-jc authored and bonzini committed Apr 2, 2021
  23. KVM: x86/mmu: Use leaf-only loop for walking TDP SPTEs when changing SPTE
    
    Use the leaf-only TDP iterator when changing the SPTE in reaction to
    an MMU notifier.  Practically speaking, this is a nop since the guts
    of the loop explicitly look for 4k SPTEs, which are always leaf
    SPTEs.  Switch
    the iterator to match age_gfn_range() and test_age_gfn() so that a future
    patch can consolidate the core iterating logic.
    
    No real functional change intended.
    
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20210326021957.1424875-8-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    sean-jc authored and bonzini committed Apr 2, 2021
  24. KVM: x86/mmu: Pass address space ID to TDP MMU root walkers

    Move the address space ID check that is performed when iterating over
    roots into the macro helpers to consolidate code.
    
    No functional change intended.
    
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20210326021957.1424875-7-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    sean-jc authored and bonzini committed Apr 2, 2021
  25. KVM: x86/mmu: Pass address space ID to __kvm_tdp_mmu_zap_gfn_range()

    Pass the address space ID to TDP MMU's primary "zap gfn range" helper to
    allow the MMU notifier paths to iterate over memslots exactly once.
    Currently, both the legacy MMU and TDP MMU iterate over memslots when
    looking for an overlapping hva range, which can be quite costly if there
    are a large number of memslots.
    
    Add a "flush" parameter so that iterating over multiple address spaces
    in the caller will continue to do the right thing when yielding while a
    flush is pending from a previous address space.
    
    Note, this also has a functional change in the form of coalescing TLB
    flushes across multiple address spaces in kvm_zap_gfn_range(), and also
    optimizes the TDP MMU to utilize range-based flushing when running as L1
    with Hyper-V enlightenments.
    
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20210326021957.1424875-6-seanjc@google.com>
    [Keep separate for loops to prepare for other incoming patches. - Paolo]
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    sean-jc authored and bonzini committed Apr 2, 2021
  26. KVM: x86/mmu: Coalesce TLB flushes across address spaces for gfn range zap
    
    Gather pending TLB flushes across both address spaces when zapping a
    given gfn range.  This requires feeding "flush" back into subsequent
    calls, but on the plus side sets the stage for further batching
    between the legacy MMU and TDP MMU.  It also allows refactoring the
    address space iteration to cover the legacy and TDP MMUs without
    introducing truly ugly code.
    
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20210326021957.1424875-5-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    sean-jc authored and bonzini committed Apr 2, 2021
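The "feed flush back into subsequent calls" pattern used by this and the previous patch can be sketched as follows (function names are stand-ins, not KVM's): each pass accumulates whether a flush is still pending, and a single flush happens at the end.

```c
#include <assert.h>
#include <stdbool.h>

static int flush_count;         /* counts actual remote TLB flushes */

/* Each zap pass returns whether a flush is pending, folding in the
 * pending state handed down from earlier passes. */
static bool zap_range(bool zapped_something, bool flush)
{
    return flush || zapped_something;   /* accumulate, don't flush */
}

static void kvm_flush_remote_tlbs(void)
{
    flush_count++;
}

/* Thread "flush" through both address spaces; flush at most once. */
static int zap_both_address_spaces(bool zap0, bool zap1)
{
    bool flush = false;

    flush = zap_range(zap0, flush);     /* address space 0 */
    flush = zap_range(zap1, flush);     /* address space 1 */
    if (flush)
        kvm_flush_remote_tlbs();        /* one coalesced flush */
    return flush_count;
}
```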
  27. KVM: x86/mmu: Coalesce TLB flushes when zapping collapsible SPTEs

    Gather pending TLB flushes across both the legacy and TDP MMUs when
    zapping collapsible SPTEs to avoid multiple flushes if both the legacy
    MMU (for nested guests) and TDP MMU have mappings for the memslot.
    
    Note, this also optimizes the TDP MMU to flush only the relevant range
    when running as L1 with Hyper-V enlightenments.
    
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20210326021957.1424875-4-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    sean-jc authored and bonzini committed Apr 2, 2021
  28. KVM: x86/mmu: Move flushing for "slot" handlers to caller for legacy MMU

    Place the onus on the caller of slot_handle_*() to flush the TLB, rather
    than handling the flush in the helper, and rename parameters accordingly.
    This will allow future patches to coalesce flushes between address spaces
    and between the legacy and TDP MMUs.
    
    No functional change intended.
    
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20210326021957.1424875-3-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    sean-jc authored and bonzini committed Apr 2, 2021