Alexey-Kardash…
Commits on Dec 15, 2021
-
KVM: PPC: Merge powerpc's debugfs entry content into generic entry
At the moment KVM on PPC creates 3 types of entries under the kvm debugfs: 1) "%pid-%fd" per a KVM instance (for all platforms); 2) "vm%pid" (for PPC Book3s HV KVM); 3) "vm%u_vcpu%u_timing" (for PPC Book3e KVM). The problem with this is that multiple VMs per process is not allowed for 2) and 3) which makes it possible for userspace to trigger errors when creating duplicated debugfs entries. This merges all these into 1). This defines kvm_arch_create_kvm_debugfs() similar to kvm_arch_create_vcpu_debugfs(). This defines 2 hooks in kvmppc_ops that allow specific KVM implementations add necessary entries, this adds the _e500 suffix to kvmppc_create_vcpu_debugfs_e500() to make it clear what platform it is for. This makes use of already existing kvm_arch_create_vcpu_debugfs() on PPC. This removes no more used debugfs_dir pointers from PPC kvm_arch structs. This stops removing vcpu entries as once created vcpus stay around for the entire life of a VM and removed when the KVM instance is closed, see commit d56f513 ("KVM: let kvm_destroy_vm_debugfs clean up vCPU debugfs directories"). Suggested-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Commits on Dec 2, 2021
-
KVM: PPC: Book3S: Suppress failed alloc warning in H_COPY_TOFROM_GUEST
H_COPY_TOFROM_GUEST is an hcall for an upper level VM to access its nested VMs memory. The userspace can trigger WARN_ON_ONCE(!(gfp & __GFP_NOWARN)) in __alloc_pages() by constructing a tiny VM which only does H_COPY_TOFROM_GUEST with a too big GPR9 (number of bytes to copy). This silences the warning by adding __GFP_NOWARN. Spotted by syzkaller. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210901084550.1658699-1-aik@ozlabs.ru
-
KVM: PPC: Book3S: Suppress warnings when allocating too big memory slots
The userspace can trigger "vmalloc size %lu allocation failure: exceeds total pages" via the KVM_SET_USER_MEMORY_REGION ioctl. This silences the warning by checking the limit before calling vzalloc() and returns ENOMEM if failed. This does not call underlying valloc helpers as __vmalloc_node() is only exported when CONFIG_TEST_VMALLOC_MODULE and __vmalloc_node_range() is not exported at all. Spotted by syzkaller. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [mpe: Use 'size' for the variable rather than 'cb'] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210901084512.1658628-1-aik@ozlabs.ru
Commits on Dec 1, 2021
-
KVM: PPC: Book3S HV P9: Remove unused ri_set local variable
ri_set is set and never used. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201052112.2137167-1-npiggin@gmail.com
Commits on Nov 24, 2021
-
KVM: PPC: Book3S HV P9: Remove subcore HMI handling
On POWER9 and newer, rather than the complex HMI synchronisation and subcore state, have each thread un-apply the guest TB offset before calling into the early HMI handler. This allows the subcore state to be avoided, including subcore enter / exit guest, which includes an expensive divide that shows up slightly in profiles. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-54-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Stop using vc->dpdes
The P9 path uses vc->dpdes only for msgsndp / SMT emulation. This adds an ordering requirement between vcpu->doorbell_request and vc->dpdes for no real benefit. Use vcpu->doorbell_request directly. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-53-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Tidy kvmppc_create_dtl_entry
This goes further to removing vcores from the P9 path. Also avoid the memset in favour of explicitly initialising all fields. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-52-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Remove most of the vcore logic
The P9 path always uses one vcpu per vcore, so none of the vcore, locks, stolen time, blocking logic, shared waitq, etc., is required. Remove most of it. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-51-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Avoid cpu_in_guest atomics on entry and exit
cpu_in_guest is set to determine if a CPU needs to be IPI'ed to exit the guest and notice the need_tlb_flush bit. This can be implemented as a global per-CPU pointer to the currently running guest instead of per-guest cpumasks, saving 2 atomics per entry/exit. P7/8 doesn't require cpu_in_guest, nor does a nested HV (only the L0 does), so move it to the P9 HV path. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-50-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Add unlikely annotation for !mmu_ready
The mmu will almost always be ready. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-49-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Avoid changing MSR[RI] in entry and exit
kvm_hstate.in_guest provides the equivalent of MSR[RI]=0 protection, and it covers the existing MSR[RI]=0 section in late entry and early exit, so clearing and setting MSR[RI] in those cases does not actually do anything useful. Remove the RI manipulation and replace it with comments. Make the in_guest memory accesses a bit closer to a proper critical section pattern. This speeds up guest entry/exit performance. This also removes the MSR[RI] warnings which aren't very interesting and would cause crashes if they hit due to causing an interrupt in non-recoverable code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-48-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Optimise hash guest SLB saving
slbmfee/slbmfev instructions are very expensive, moreso than a regular mfspr instruction, so minimising them significantly improves hash guest exit performance. The slbmfev is only required if slbmfee found a valid SLB entry. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-47-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Improve mfmsr performance on entry
Rearrange the MSR saving on entry so it does not follow the mtmsrd to disable interrupts, avoiding a possible RAW scoreboard stall. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-46-npiggin@gmail.com
-
KVM: PPC: Book3S HV Nested: Avoid extra mftb() in nested entry
mftb() is expensive and one can be avoided on nested guest dispatch. If the time checking code distinguishes between the L0 timer and the nested HV timer, then both can be tested in the same place with the same mftb() value. This also nicely illustrates the relationship between the L0 and nested HV timers. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-45-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Avoid tlbsync sequence on radix guest exit
Use the existing TLB flushing logic to IPI the previous CPU and run the necessary barriers before running a guest vCPU on a new physical CPU, to do the necessary radix GTSE barriers for handling the case of an interrupted guest tlbie sequence. This requires the vCPU TLB flush sequence that is currently just done on one thread, to be expanded to ensure the other threads execute a ptesync, because causing them to exit the guest will no longer cause a ptesync by itself. This results in more IPIs than the TLB flush logic requires, but it's a significant win for common case scheduling when the vCPU remains on the same physical CPU. This saves about 520 cycles (nearly 10%) on a guest entry+exit micro benchmark on a POWER9. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-44-npiggin@gmail.com
-
KVM: PPC: Book3S HV: Split P8 from P9 path guest vCPU TLB flushing
This creates separate functions for old and new paths for vCPU TLB flushing, which will reduce complexity of the next change. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-43-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Don't restore PSSCR if not needed
This also moves the PSSCR update in nested entry to avoid a SPR scoreboard stall. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-42-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Test dawr_enabled() before saving host DAWR SPRs
Some of the DAWR SPR access is already predicated on dawr_enabled(), apply this to the remainder of the accesses. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-41-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Comment and fix MMU context switching code
Tighten up partition switching code synchronisation and comments. In particular, hwsync ; isync is required after the last access that is performed in the context of a partition, before the partition is switched away from. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-40-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Use Linux SPR save/restore to manage some hos…
…t SPRs Linux implements SPR save/restore including storage space for registers in the task struct for process context switching. Make use of this similarly to the way we make use of the context switching fp/vec save restore. This improves code reuse, allows some stack space to be saved, and helps with avoiding VRSAVE updates if they are not required. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-39-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Demand fault TM facility registers
Use HFSCR facility disabling to implement demand faulting for TM, with a hysteresis counter similar to the load_fp etc counters in context switching that implement the equivalent demand faulting for userspace facilities. This speeds up guest entry/exit by avoiding the register save/restore when a guest is not frequently using them. When a guest does use them often, there will be some additional demand fault overhead, but these are not commonly used facilities. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-38-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Demand fault EBB facility registers
Use HFSCR facility disabling to implement demand faulting for EBB, with a hysteresis counter similar to the load_fp etc counters in context switching that implement the equivalent demand faulting for userspace facilities. This speeds up guest entry/exit by avoiding the register save/restore when a guest is not frequently using them. When a guest does use them often, there will be some additional demand fault overhead, but these are not commonly used facilities. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-37-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: More SPR speed improvements
This avoids more scoreboard stalls and reduces mtSPRs. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-36-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Restrict DSISR canary workaround to processor…
…s that require it Use CPU_FTR_P9_RADIX_PREFETCH_BUG to apply the workaround, to test for DD2.1 and below processors. This saves a mtSPR in guest entry. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-35-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Switch PMU to guest as late as possible
This moves PMU switch to guest as late as possible in entry, and switch back to host as early as possible at exit. This helps the host get the most perf coverage of KVM entry/exit code as possible. This is slightly suboptimal for SPR scheduling point of view when the PMU is enabled, but when perf is disabled there is no real difference. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-34-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Implement TM fastpath for guest entry/exit
If TM is not active, only TM register state needs to be saved and restored, avoiding several mfmsr/mtmsrd instructions and improving performance. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-33-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Move remaining SPR and MSR access into low le…
…vel entry Move register saving and loading from kvmhv_p9_guest_entry() into the HV and nested entry handlers. Accesses are scheduled to reduce mtSPR / mfSPR interleaving which reduces SPR scoreboard stalls. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-32-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Move nested guest entry into its own function
Move the part of the guest entry which is specific to nested HV into its own function. This is just refactoring. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-31-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Move host OS save/restore functions to built-in
Move the P9 guest/host register switching functions to the built-in P9 entry code, and export it for nested to use as well. This allows more flexibility in scheduling these supervisor privileged SPR accesses with the HV privileged and PR SPR accesses in the low level entry code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-30-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Move vcpu register save/restore into functions
This should be no functional difference but makes the caller easier to read. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-29-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Juggle SPR switching around
This juggles SPR switching on the entry and exit sides to be more symmetric, which makes the next refactoring patch possible with no functional change. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-28-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Only execute mtSPR if the value changed
Keep better track of the current SPR value in places where they are to be loaded with a new context, to reduce expensive mtSPR operations. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-27-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Avoid SPR scoreboard stalls
Avoid interleaving mfSPR and mtSPR to reduce SPR scoreboard stalls. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-26-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Optimise timebase reads
Reduce the number of mfTB executed by passing the current timebase around entry and exit code rather than read it multiple times. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-25-npiggin@gmail.com
-
KVM: PPC: Book3S HV P9: Move TB updates
Move the TB updates between saving and loading guest and host SPRs, to improve scheduling by keeping issue-NTC operations together as much as possible. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-24-npiggin@gmail.com