Mingwei-Zhang/…
Commits on Aug 16, 2021
-
KVM: SVM: move sev_unbind_asid and DF_FLUSH logic into psp
In KVM SEV code, sev_unbind_asid and sev_guest_df_flush needs to be serialized because DEACTIVATE command in PSP may clear the WBINVD indicator and cause DF_FLUSH to fail. This is a PSP level detail that is not necessary to expose to KVM. So put both functions as well as the RWSEM into the sev-dev.c. Cc: Alper Gun <alpergun@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: David Rienjes <rientjes@google.com> Cc: Marc Orr <marcorr@google.com> Cc: John Allen <john.allen@amd.com> Cc: Peter Gonda <pgonda@google.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Vipin Sharma <vipinsh@google.com> Signed-off-by: Mingwei Zhang <mizhang@google.com>
-
KVM: SVM: move sev_bind_asid to psp
ccp/sev-dev.c is the software layer in psp that allows KVM to manage SEV/ES/SNP enabled VMs. Since psp API provides only primitive sev command invocation, KVM has to do extra processing that are specific only to psp with KVM level wrapper function. sev_bind_asid is such a KVM function that literally wraps around sev_guest_activate in psp with extra steps like psp data structure creation and error processing: invoking sev_guest_decommission on activation failure. Adding sev_guest_decommission is essentially required on all sev_bin_asid call sites. This is error prone and in fact the upstream code in KVM still have an issue on sev_receive_start where sev_guest_decommission is missing. Since sev_bind_asid code logic is purely psp specific, putting it into psp layer should make it more robust, since KVM code does not have to worry about error handling of asid binding failure. So replace the KVM pointer in sev_bind_asid with primitive arguments: asid and handle; slightly change the name to sev_guest_bind_asid make it consistent with other psp APIs; add the error handling code inside sev_guest_bind_asid and; put it into the sev-dev.c. Cc: Alper Gun <alpergun@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: David Rienjes <rientjes@google.com> Cc: Marc Orr <marcorr@google.com> Cc: John Allen <john.allen@amd.com> Cc: Peter Gonda <pgonda@google.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Vipin Sharma <vipinsh@google.com> Fixes: af43cbb ("KVM: SVM: Add support for KVM_SEV_RECEIVE_START command") Signed-off-by: Mingwei Zhang <mizhang@google.com>
-
KVM: SVM: move sev_decommission to psp driver
ccp/sev-dev.c is part of the software layer in psp that allows KVM to manage SEV/ES/SNP enabled VMs. Among the APIs exposed in sev-dev, many of them requires caller (KVM) to understand psp specific data structures. This often ends up with the fact that KVM has to create its own 'wrapper' API to make it easy to use. The following is the pattern: kvm_func(unsigned int handle) { psp_data_structure data; data.handle = handle; psp_func(&data, NULL); } psp_func(psp_data_structure *data, int *error) { sev_do_cmd(data, error); } struct psp_data_structure { u32 handle; }; sev_decommission is one example following the above pattern. Since KVM is the only user for this API and 'handle' is the only data that is meaningful to KVM, simplify the interface by putting the code from kvm function sev_decommission into the psp function sev_guest_decomssion. Cc: Alper Gun <alpergun@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: David Rienjes <rientjes@google.com> Cc: Marc Orr <marcorr@google.com> Cc: John Allen <john.allen@amd.com> Cc: Peter Gonda <pgonda@google.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Vipin Sharma <vipinsh@google.com> Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Mingwei Zhang <mizhang@google.com>
Commits on Aug 11, 2021
-
KVM: MMU: change tracepoints arguments to kvm_page_fault
Pass struct kvm_page_fault to tracepoints instead of extracting the arguments from the struct. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: MMU: change disallowed_hugepage_adjust() arguments to kvm_page_f…
…ault Pass struct kvm_page_fault to disallowed_hugepage_adjust() instead of extracting the arguments from the struct. Tweak a bit the conditions to avoid long lines. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: MMU: change kvm_mmu_hugepage_adjust() arguments to kvm_page_fault
Pass struct kvm_page_fault to kvm_mmu_hugepage_adjust() instead of extracting the arguments from the struct; the results are also stored in the struct, so the callers are adjusted consequently. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: MMU: change fast_page_fault() arguments to kvm_page_fault
Pass struct kvm_page_fault to fast_page_fault() instead of extracting the arguments from the struct. Suggested-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: MMU: change tdp_mmu_map_handle_target_level() arguments to kvm_p…
…age_fault Pass struct kvm_page_fault to tdp_mmu_map_handle_target_level() instead of extracting the arguments from the struct. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: MMU: change kvm_tdp_mmu_map() arguments to kvm_page_fault
Pass struct kvm_page_fault to kvm_tdp_mmu_map() instead of extracting the arguments from the struct. Suggested-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: MMU: change FNAME(fetch)() arguments to kvm_page_fault
Pass struct kvm_page_fault to FNAME(fetch)() instead of extracting the arguments from the struct. Suggested-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: MMU: change __direct_map() arguments to kvm_page_fault
Pass struct kvm_page_fault to __direct_map() instead of extracting the arguments from the struct. Suggested-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: MMU: change handle_abnormal_pfn() arguments to kvm_page_fault
Pass struct kvm_page_fault to handle_abnormal_pfn() instead of extracting the arguments from the struct. Suggested-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: MMU: change try_async_pf() arguments to kvm_page_fault
Add fields to struct kvm_page_fault corresponding to outputs of try_async_pf(). For now they have to be extracted again from struct kvm_page_fault in the subsequent steps, but this is temporary until other functions in the chain are switched over as well. Suggested-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: MMU: change page_fault_handle_page_track() arguments to kvm_page…
…_fault Add fields to struct kvm_page_fault corresponding to the arguments of page_fault_handle_page_track(). The fields are initialized in the callers, and page_fault_handle_page_track() receives a struct kvm_page_fault instead of having to extract the arguments out of it. Suggested-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: MMU: change direct_page_fault() arguments to kvm_page_fault
Add fields to struct kvm_page_fault corresponding to the arguments of direct_page_fault(). The fields are initialized in the callers, and direct_page_fault() receives a struct kvm_page_fault instead of having to extract the arguments out of it. Also adjust FNAME(page_fault) to store the max_level in struct kvm_page_fault, to keep it similar to the direct map path. Suggested-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: MMU: change mmu->page_fault() arguments to kvm_page_fault
Pass struct kvm_page_fault to mmu->page_fault() instead of extracting the arguments from the struct. FNAME(page_fault) can use the precomputed bools from the error code. Suggested-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: MMU: Introduce struct kvm_page_fault
Create a single structure for arguments that are passed from kvm_mmu_do_page_fault to the page fault handlers. Later the structure will grow to include various output parameters that are passed back to the next steps in the page fault handling. Suggested-by: Isaku Yamahata <isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: x86: clamp host mapping level to max_level in kvm_mmu_max_mappin…
…g_level This patch started as a way to make kvm_mmu_hugepage_adjust a bit simpler, in preparation for switching it to struct kvm_page_fault, but it does fix a microscopic bug in zapping collapsible PTEs. If a large page size is disallowed but not all of them, kvm_mmu_max_mapping_level will return the host mapping level and the small PTEs will be zapped up to that level. However, if e.g. 1GB are prohibited, we can still zap 4KB mapping and preserve the 2MB ones. This can happen for example when NX huge pages are in use. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: MMU: pass unadulterated gpa to direct_page_fault
Do not bother removing the low bits of the gpa. This masking dates back to the very first commit of KVM but it is unnecessary---or even problematic, because the gpa is later used to fill in the MMIO page cache. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: x86/mmu: Drop 'shared' param from tdp_mmu_link_page()
Drop @shared from tdp_mmu_link_page() and hardcode it to work for mmu_lock being held for read. The helper has exactly one caller and in all likelihood will only ever have exactly one caller. Even if KVM adds a path to install translations without an initiating page fault, odds are very, very good that the path will just be a wrapper to the "page fault" handler (both SNP and TDX RFCs propose patches to do exactly that). No functional change intended. Cc: Ben Gardon <bgardon@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210810224554.2978735-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: x86/mmu: Add detailed page size stats
Existing KVM code tracks the number of large pages regardless of their sizes. Therefore, when large page of 1GB (or larger) is adopted, the information becomes less useful because lpages counts a mix of 1G and 2M pages. So remove the lpages since it is easy for user space to aggregate the info. Instead, provide a comprehensive page stats of all sizes from 4K to 512G. Suggested-by: Ben Gardon <bgardon@google.com> Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: Mingwei Zhang <mizhang@google.com> Cc: Jing Zhang <jingzhangos@google.com> Cc: David Matlack <dmatlack@google.com> Cc: Sean Christopherson <seanjc@google.com> Message-Id: <20210803044607.599629-4-mizhang@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: x86/mmu: Avoid collision with !PRESENT SPTEs in TDP MMU lpage stats
Factor in whether or not the old/new SPTEs are shadow-present when adjusting the large page stats in the TDP MMU. A modified MMIO SPTE can toggle the page size bit, as bit 7 is used to store the MMIO generation, i.e. is_large_pte() can get a false positive when called on a MMIO SPTE. Ditto for nuking SPTEs with REMOVED_SPTE, which sets bit 7 in its magic value. Opportunistically move the logic below the check to verify at least one of the old/new SPTEs is shadow present. Use is/was_leaf even though is/was_present would suffice. The code generation is roughly equivalent since all flags need to be computed prior to the code in question, and using the *_leaf flags will minimize the diff in a future enhancement to account all pages, i.e. will change the check to "is_leaf != was_leaf". Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Fixes: 1699f65 ("kvm/x86: Fix 'lpages' kvm stat for TDM MMU") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Mingwei Zhang <mizhang@google.com> Message-Id: <20210803044607.599629-3-mizhang@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: x86/mmu: Remove redundant spte present check in mmu_set_spte
Drop an unnecessary is_shadow_present_pte() check when updating the rmaps after installing a non-MMIO SPTE. set_spte() is used only to create shadow-present SPTEs, e.g. MMIO SPTEs are handled early on, mmu_set_spte() runs with mmu_lock held for write, i.e. the SPTE can't be zapped between writing the SPTE and updating the rmaps. Opportunistically combine the "new SPTE" logic for large pages and rmaps. No functional change intended. Suggested-by: Ben Gardon <bgardon@google.com> Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Mingwei Zhang <mizhang@google.com> Message-Id: <20210803044607.599629-2-mizhang@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: stats: Add halt polling related histogram stats
Add three log histogram stats to record the distribution of time spent on successful polling, failed polling and VCPU wait. halt_poll_success_hist: Distribution of spent time for a successful poll. halt_poll_fail_hist: Distribution of spent time for a failed poll. halt_wait_hist: Distribution of time a VCPU has spent on waiting. Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210802165633.1866976-6-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: stats: Add halt_wait_ns stats for all architectures
Add simple stats halt_wait_ns to record the time a VCPU has spent on waiting for all architectures (not just powerpc). Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210802165633.1866976-5-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: selftests: Add checks for histogram stats bucket_size field
The bucket_size field should be non-zero for linear histogram stats and should be zero for other stats types. Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210802165633.1866976-4-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: stats: Update doc for histogram statistics
Add documentations for linear and logarithmic histogram statistics. Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210802165633.1866976-3-jingzhangos@google.com> [Small changes to the phrasing. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: stats: Support linear and logarithmic histogram statistics
Add new types of KVM stats, linear and logarithmic histogram. Histogram are very useful for observing the value distribution of time or size related stats. Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210802165633.1866976-2-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
kvm: x86: abstract locking around pvclock_update_vm_gtod_copy
Updates to the kvmclock parameters needs to do a complicated dance of KVM_REQ_MCLOCK_INPROGRESS and KVM_REQ_CLOCK_UPDATE in addition to taking pvclock_gtod_sync_lock. Place that in two functions that can be called on all of master clock update, KVM_SET_CLOCK, and Hyper-V reenlightenment. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: KVM-on-hyperv: shorten no-entry section on reenlightenment
During re-enlightenment, update kvmclock a VM at a time instead of raising KVM_REQ_MASTERCLOCK_UPDATE for all VMs. Because the guests can now run after TSC emulation has been disabled, invalidate their TSC page so that they refer to the reference time counter MSR while the update is in progress. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
KVM: SVM: AVIC: drop unsupported AVIC base relocation code
APIC base relocation is not supported anyway and won't work correctly so just drop the code that handles it and keep AVIC MMIO bar at the default APIC base. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210810205251.424103-17-mlevitsk@redhat.com>
-
KVM: SVM: call avic_vcpu_load/avic_vcpu_put when enabling/disabling AVIC
Currently it is possible to have the following scenario: 1. AVIC is disabled by svm_refresh_apicv_exec_ctrl 2. svm_vcpu_blocking calls avic_vcpu_put which does nothing 3. svm_vcpu_unblocking enables the AVIC (due to KVM_REQ_APICV_UPDATE) and then calls avic_vcpu_load 4. warning is triggered in avic_vcpu_load since AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK was never cleared While it is possible to just remove the warning, it seems to be more robust to fully disable/enable AVIC in svm_refresh_apicv_exec_ctrl by calling the avic_vcpu_load/avic_vcpu_put Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210810205251.424103-16-mlevitsk@redhat.com>
-
KVM: SVM: move check for kvm_vcpu_apicv_active outside of avic_vcpu_{…
…put|load} No functional change intended. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210810205251.424103-15-mlevitsk@redhat.com>
-
KVM: SVM: avoid refreshing avic if its state didn't change
Since AVIC can be inhibited and uninhibited rapidly it is possible that we have nothing to do by the time the svm_refresh_apicv_exec_ctrl is called. Detect and avoid this, which will be useful when we will start calling avic_vcpu_load/avic_vcpu_put when the avic inhibition state changes. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210810205251.424103-14-mlevitsk@redhat.com>
-
KVM: SVM: remove svm_toggle_avic_for_irq_window
Now that kvm_request_apicv_update doesn't need to drop the kvm->srcu lock, we can call kvm_request_apicv_update directly. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210810205251.424103-13-mlevitsk@redhat.com>