New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
riscv: Add Native/Paravirt qspinlock support #420
Conversation
Upstream branch: f352a28 |
The arch_spinlock_t of qspinlock has contained the atomic_t val, which satisfies the ticket-lock requirement. Thus, unify the arch_spinlock_t into qspinlock_types.h. This is the preparation for the next combo spinlock. Reviewed-by: Leonardo Bras <leobras@redhat.com> Suggested-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/linux-riscv/CAK8P3a2rnz9mQqhN6-e0CGUUv9rntRELFdxt_weiD7FxH7fkfQ@mail.gmail.com/ Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Add a separate ticket-lock.h to include multiple spinlock versions and select one at compile time or runtime. Reviewed-by: Leonardo Bras <leobras@redhat.com> Suggested-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/linux-riscv/CAK8P3a2rnz9mQqhN6-e0CGUUv9rntRELFdxt_weiD7FxH7fkfQ@mail.gmail.com/ Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Move errata vendor func-id definitions from errata_list into vendorid_list.h. Unifying these definitions is also for following rwonce errata implementation. Suggested-by: Leonardo Bras <leobras@redhat.com> Link: https://lore.kernel.org/linux-riscv/ZQLFJ1cmQ8PAoMHm@redhat.com/ Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
The early version of T-Head C9xx cores has a store merge buffer delay problem. The store merge buffer could improve the store queue performance by merging multi-store requests, but when there are not continued store requests, the prior single store request would be waiting in the store queue for a long time. That would cause significant problems for communication between multi-cores. This problem was found on sg2042 & th1520 platforms with the qspinlock lock torture test. So appending a fence w.o could immediately flush the store merge buffer and let other cores see the write result. This will apply the WRITE_ONCE errata to handle the non-standard behavior via appending a fence w.o instruction for WRITE_ONCE(). Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
The requirements of qspinlock have been documented by commit: a8ad07e ("asm-generic: qspinlock: Indicate the use of mixed-size atomics"). Although RISC-V ISA gives out a weaker forward guarantee LR/SC, which doesn't satisfy the requirements of qspinlock above, it won't prevent some riscv vendors from implementing a strong fwd guarantee LR/SC in microarchitecture to match xchg_tail requirement. T-HEAD C9xx processor is the one. We've tested the patch on SOPHGO sg2042 & th1520 and passed the stress test on Fedora & Ubuntu & OpenEuler ... Here is the performance comparison between qspinlock and ticket_lock on sg2042 (64 cores): sysbench test=threads threads=32 yields=100 lock=8 (+13.8%): queued_spinlock 0.5109/0.00 ticket_spinlock 0.5814/0.00 perf futex/hash (+6.7%): queued_spinlock 1444393 operations/sec (+- 0.09%) ticket_spinlock 1353215 operations/sec (+- 0.15%) perf futex/wake-parallel (+8.6%): queued_spinlock (waking 1/64 threads) in 0.0253 ms (+-2.90%) ticket_spinlock (waking 1/64 threads) in 0.0275 ms (+-3.12%) perf futex/requeue (+4.2%): queued_spinlock Requeued 64 of 64 threads in 0.0785 ms (+-0.55%) ticket_spinlock Requeued 64 of 64 threads in 0.0818 ms (+-4.12%) System Benchmarks (+6.4%) queued_spinlock: System Benchmarks Index Values BASELINE RESULT INDEX Dhrystone 2 using register variables 116700.0 628613745.4 53865.8 Double-Precision Whetstone 55.0 182422.8 33167.8 Execl Throughput 43.0 13116.6 3050.4 File Copy 1024 bufsize 2000 maxblocks 3960.0 7762306.2 19601.8 File Copy 256 bufsize 500 maxblocks 1655.0 3417556.8 20649.9 File Copy 4096 bufsize 8000 maxblocks 5800.0 7427995.7 12806.9 Pipe Throughput 12440.0 23058600.5 18535.9 Pipe-based Context Switching 4000.0 2835617.7 7089.0 Process Creation 126.0 12537.3 995.0 Shell Scripts (1 concurrent) 42.4 57057.4 13456.9 Shell Scripts (8 concurrent) 6.0 7367.1 12278.5 System Call Overhead 15000.0 33308301.3 22205.5 ======== System Benchmarks Index Score 12426.1 ticket_spinlock: System Benchmarks Index Values BASELINE RESULT INDEX Dhrystone 2 using register variables 116700.0 626541701.9 53688.2 Double-Precision Whetstone 55.0 181921.0 33076.5 Execl Throughput 43.0 12625.1 2936.1 File Copy 1024 bufsize 2000 maxblocks 3960.0 6553792.9 16550.0 File Copy 256 bufsize 500 maxblocks 1655.0 3189231.6 19270.3 File Copy 4096 bufsize 8000 maxblocks 5800.0 7221277.0 12450.5 Pipe Throughput 12440.0 20594018.7 16554.7 Pipe-based Context Switching 4000.0 2571117.7 6427.8 Process Creation 126.0 10798.4 857.0 Shell Scripts (1 concurrent) 42.4 57227.5 13497.1 Shell Scripts (8 concurrent) 6.0 7329.2 12215.3 System Call Overhead 15000.0 30766778.4 20511.2 ======== System Benchmarks Index Score 11670.7 The qspinlock has a significant improvement on SOPHGO SG2042 64 cores platform than the ticket_lock. Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Combo spinlock could support queued and ticket in one Linux Image and select them during boot time via command line. Here is the func size (Bytes) comparison table below: TYPE : COMBO | TICKET | QUEUED arch_spin_lock : 106 | 60 | 50 arch_spin_unlock : 54 | 36 | 26 arch_spin_trylock : 110 | 72 | 54 arch_spin_is_locked : 48 | 34 | 20 arch_spin_is_contended : 56 | 40 | 24 rch_spin_value_unlocked : 48 | 34 | 24 One example of disassemble combo arch_spin_unlock: <+14>: nop # detour slot <+18>: fence rw,w --+-> queued_spin_unlock <+22>: sb zero,0(a4) --+ (2 instructions) <+26>: ld s0,8(sp) <+28>: addi sp,sp,16 <+30>: ret <+32>: lw a5,0(a4) --+-> ticket_spin_unlock <+34>: sext.w a5,a5 | (7 instructions) <+36>: fence rw,w | <+40>: addiw a5,a5,1 | <+42>: slli a5,a5,0x30 | <+44>: srli a5,a5,0x30 | <+46>: sh a5,0(a4) --+ <+50>: ld s0,8(sp) <+52>: addi sp,sp,16 <+54>: ret The qspinlock is smaller and faster than ticket-lock when all are in a fast path. The combo spinlock could provide a compatible Linux Image for different micro-arch designs that have/haven't forward progress guarantee. Use command line options to select between qspinlock and ticket-lock, and the default is ticket-lock. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Add a static key controlling whether virt_spin_lock() should be called or not. When running on bare metal set the new key to false. The VM guests should fall back to a Test-and-Set spinlock, because fair locks have horrible lock 'holder' preemption issues. The virt_spin_lock_key would shortcut for the queued_spin_lock_- slowpath() function that allow virt_spin_lock to hijack it. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Force to enable virt_spin_lock when KVM guest, because fair locks have horrible lock 'holder' preemption issues. Suggested-by: Leonardo Bras <leobras@redhat.com> Link: https://lkml.kernel.org/kvm/ZQK9-tn2MepXlY1u@redhat.com/ Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Add the files functions needed to support the SBI PVLOCK (paravirt qspinlock kick_cpu) extension. Implement kvm_sbi_ext_pvlock_kick_- cpu(), and we only need to call the kvm_vcpu_kick() and bring target_vcpu from the halt state. No irq raised, no other request, just a pure vcpu_kick. Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Using static_call to switch between: native_queued_spin_lock_slowpath() __pv_queued_spin_lock_slowpath() native_queued_spin_unlock() __pv_queued_spin_unlock() Finish the pv_wait implementation, but pv_kick needs the SBI definition of the next patches. Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Implement pv_kick with SBI guest implementation, and add SBI_EXT_PVLOCK extension detection. The backend part is in the KVM pvqspinlock patch. Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Disables the qspinlock slow path using PV optimizations which allow the hypervisor to 'idle' the guest on lock contention. Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Add kconfig entry for paravirt_spinlock, an unfair qspinlock virtualization-friendly backend, by halting the virtual CPU rather than spinning. Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Add trace point for pv_kick&wait, here is the output: ls /sys/kernel/debug/tracing/events/paravirt/ enable filter pv_kick pv_wait cat /sys/kernel/debug/tracing/trace entries-in-buffer/entries-written: 33927/33927 #P:12 _-----=> irqs-off/BH-disabled / _----=> need-resched | / _---=> hardirq/softirq || / _--=> preempt-depth ||| / _-=> migrate-disable |||| / delay TASK-PID CPU# ||||| TIMESTAMP FUNCTION | | | ||||| | | sh-100 [001] d..2. 28.312294: pv_wait: cpu 1 out of wfi <idle>-0 [000] d.h4. 28.322030: pv_kick: cpu 0 kick target cpu 1 sh-100 [001] d..2. 30.982631: pv_wait: cpu 1 out of wfi <idle>-0 [000] d.h4. 30.993289: pv_kick: cpu 0 kick target cpu 1 sh-100 [002] d..2. 44.987573: pv_wait: cpu 2 out of wfi <idle>-0 [000] d.h4. 44.989000: pv_kick: cpu 0 kick target cpu 2 <idle>-0 [003] d.s3. 51.593978: pv_kick: cpu 3 kick target cpu 4 rcu_sched-15 [004] d..2. 51.595192: pv_wait: cpu 4 out of wfi lock_torture_wr-115 [004] ...2. 52.656482: pv_kick: cpu 4 kick target cpu 2 lock_torture_wr-113 [002] d..2. 52.659146: pv_wait: cpu 2 out of wfi lock_torture_wr-114 [008] d..2. 52.659507: pv_wait: cpu 8 out of wfi lock_torture_wr-114 [008] d..2. 52.663503: pv_wait: cpu 8 out of wfi lock_torture_wr-113 [002] ...2. 52.666128: pv_kick: cpu 2 kick target cpu 8 lock_torture_wr-114 [008] d..2. 52.667261: pv_wait: cpu 8 out of wfi lock_torture_wr-114 [009] .n.2. 53.141515: pv_kick: cpu 9 kick target cpu 11 lock_torture_wr-113 [002] d..2. 53.143339: pv_wait: cpu 2 out of wfi lock_torture_wr-116 [007] d..2. 53.143412: pv_wait: cpu 7 out of wfi lock_torture_wr-118 [000] d..2. 53.143457: pv_wait: cpu 0 out of wfi lock_torture_wr-115 [008] d..2. 53.143481: pv_wait: cpu 8 out of wfi lock_torture_wr-117 [011] d..2. 53.143522: pv_wait: cpu 11 out of wfi lock_torture_wr-117 [011] ...2. 53.143987: pv_kick: cpu 11 kick target cpu 8 lock_torture_wr-115 [008] ...2. 53.144269: pv_kick: cpu 8 kick target cpu 7 Reviewed-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Upstream branch: f352a28 |
1134fc1
to
233e42e
Compare
Upstream branch: 5a2cf77 Pull request is NOT updated. Failed to apply https://patchwork.kernel.org/project/linux-riscv/list/?series=812817
conflict:
|
At least one diff in series https://patchwork.kernel.org/project/linux-riscv/list/?series=812817 expired. Closing PR. |
Pull request for series with
subject: riscv: Add Native/Paravirt qspinlock support
version: 12
url: https://patchwork.kernel.org/project/linux-riscv/list/?series=812817