riscv: Add Native/Paravirt qspinlock support #420

bjoto · 2023-12-25T13:17:57Z

Pull request for series with
subject: riscv: Add Native/Paravirt qspinlock support
version: 12
url: https://patchwork.kernel.org/project/linux-riscv/list/?series=812817

bjoto · 2023-12-25T13:17:58Z

Upstream branch: f352a28
series: https://patchwork.kernel.org/project/linux-riscv/list/?series=812817
version: 12

The arch_spinlock_t of qspinlock already contains the atomic_t val, which satisfies the ticket-lock requirement. Thus, unify the arch_spinlock_t into qspinlock_types.h. This is the preparation for the next combo spinlock.

Reviewed-by: Leonardo Bras <leobras@redhat.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/linux-riscv/CAK8P3a2rnz9mQqhN6-e0CGUUv9rntRELFdxt_weiD7FxH7fkfQ@mail.gmail.com/
Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
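To illustrate why a single 32-bit atomic word is enough for a ticket lock, here is a minimal userspace sketch using C11 atomics. It is not the kernel code: the type and function names (`sketch_spinlock_t`, `sketch_ticket_lock`, and so on) are hypothetical, and the field layout (next ticket in the upper 16 bits, owner in the lower 16 bits) is an assumption made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for arch_spinlock_t: one 32-bit atomic word.
 * Assumed layout: bits 31..16 = next ticket, bits 15..0 = current owner. */
typedef struct {
    _Atomic uint32_t val;
} sketch_spinlock_t;

static inline void sketch_ticket_lock(sketch_spinlock_t *lock)
{
    /* Take a ticket by bumping the upper half; the old value holds our ticket. */
    uint32_t old = atomic_fetch_add_explicit(&lock->val, UINT32_C(1) << 16,
                                             memory_order_acquire);
    uint16_t ticket = (uint16_t)(old >> 16);

    /* Spin until the owner half reaches our ticket. */
    while ((uint16_t)atomic_load_explicit(&lock->val, memory_order_acquire) != ticket)
        ;
}

static inline void sketch_ticket_unlock(sketch_spinlock_t *lock)
{
    uint32_t old = atomic_load_explicit(&lock->val, memory_order_relaxed);
    uint32_t new_val;

    /* Advance the owner half with 16-bit wraparound, leaving the next-ticket
     * half untouched. A CAS loop is used so that concurrent ticket grabs
     * on the upper half are not lost. */
    do {
        new_val = (old & UINT32_C(0xffff0000)) | ((old + 1u) & UINT32_C(0xffff));
    } while (!atomic_compare_exchange_weak_explicit(&lock->val, &old, new_val,
                                                    memory_order_release,
                                                    memory_order_relaxed));
}

static inline bool sketch_ticket_is_locked(sketch_spinlock_t *lock)
{
    uint32_t v = atomic_load_explicit(&lock->val, memory_order_relaxed);
    return (uint16_t)(v >> 16) != (uint16_t)v;
}

/* Single-threaded walk-through: two lock/unlock cycles. */
static inline uint32_t sketch_ticket_demo(void)
{
    sketch_spinlock_t lock;
    atomic_init(&lock.val, 0);

    for (int i = 0; i < 2; i++) {
        sketch_ticket_lock(&lock);
        assert(sketch_ticket_is_locked(&lock));
        sketch_ticket_unlock(&lock);
    }
    return atomic_load(&lock.val);
}
```

Calling `sketch_ticket_demo()` runs two uncontended lock/unlock cycles, after which both halves of the word hold 2. The sketch advances the owner with a CAS loop for portability; an implementation that can store just the owner halfword avoids that loop.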

Add a separate ticket-lock.h to include multiple spinlock versions and select one at compile time or runtime.

Reviewed-by: Leonardo Bras <leobras@redhat.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/linux-riscv/CAK8P3a2rnz9mQqhN6-e0CGUUv9rntRELFdxt_weiD7FxH7fkfQ@mail.gmail.com/
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>

Move errata vendor func-id definitions from errata_list into vendorid_list.h. Unifying these definitions also prepares for the following rwonce errata implementation.

Suggested-by: Leonardo Bras <leobras@redhat.com>
Link: https://lore.kernel.org/linux-riscv/ZQLFJ1cmQ8PAoMHm@redhat.com/
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Leonardo Bras <leobras@redhat.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>

The early version of T-Head C9xx cores has a store merge buffer delay problem. The store merge buffer can improve store queue performance by merging multiple store requests, but when no further store requests follow, the prior single store request can wait in the store queue for a long time. That causes significant problems for communication between cores. This problem was found on sg2042 & th1520 platforms with the qspinlock lock torture test.

Appending a fence w.o immediately flushes the store merge buffer and lets other cores see the write result. This patch applies the WRITE_ONCE errata to handle the non-standard behavior by appending a fence w.o instruction to WRITE_ONCE().

Reviewed-by: Leonardo Bras <leobras@redhat.com>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
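The idea of "store, then fence" can be sketched in userspace C as follows. This is an illustration only, not the patch itself: the macro and helper names are hypothetical, the code relies on GCC/Clang extensions (`__typeof__`, inline asm), and it reads the commit's "fence w.o" as the RISC-V instruction `fence w, o`, which is an assumption. On non-RISC-V builds the helper degrades to a compiler barrier so the example still compiles.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: push a pending store out of the merge buffer.
 * Assumption: "fence w.o" in the commit message means `fence w, o`. */
static inline void sketch_write_once_fence(void)
{
#if defined(__riscv)
    __asm__ __volatile__("fence w, o" ::: "memory");
#else
    /* Other architectures: compiler barrier only, for illustration. */
    __asm__ __volatile__("" ::: "memory");
#endif
}

/* Hypothetical WRITE_ONCE-like macro: a volatile store followed by the fence. */
#define SKETCH_WRITE_ONCE(var, value)                      \
    do {                                                   \
        *(volatile __typeof__(var) *)&(var) = (value);     \
        sketch_write_once_fence();                         \
    } while (0)

/* Single-threaded walk-through: the store is performed and visible locally. */
static inline uint32_t sketch_write_once_demo(void)
{
    uint32_t flag = 0;

    SKETCH_WRITE_ONCE(flag, 1);
    assert(flag == 1);
    return flag;
}
```

The sketch applies the fence unconditionally. The commit message describes this as an errata fixup for the affected T-Head cores, so a real implementation would limit the extra instruction to that hardware.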

The requirements of qspinlock have been documented by commit: a8ad07e ("asm-generic: qspinlock: Indicate the use of mixed-size atomics").

Although RISC-V ISA gives out a weaker forward guarantee LR/SC, which doesn't satisfy the requirements of qspinlock above, it won't prevent some riscv vendors from implementing a strong fwd guarantee LR/SC in microarchitecture to match xchg_tail requirement. T-HEAD C9xx processor is the one.

We've tested the patch on SOPHGO sg2042 & th1520 and passed the stress test on Fedora & Ubuntu & OpenEuler ...

Here is the performance comparison between qspinlock and ticket_lock on sg2042 (64 cores):

sysbench test=threads threads=32 yields=100 lock=8 (+13.8%):
  queued_spinlock 0.5109/0.00
  ticket_spinlock 0.5814/0.00

perf futex/hash (+6.7%):
  queued_spinlock 1444393 operations/sec (+- 0.09%)
  ticket_spinlock 1353215 operations/sec (+- 0.15%)

perf futex/wake-parallel (+8.6%):
  queued_spinlock (waking 1/64 threads) in 0.0253 ms (+-2.90%)
  ticket_spinlock (waking 1/64 threads) in 0.0275 ms (+-3.12%)

perf futex/requeue (+4.2%):
  queued_spinlock Requeued 64 of 64 threads in 0.0785 ms (+-0.55%)
  ticket_spinlock Requeued 64 of 64 threads in 0.0818 ms (+-4.12%)

System Benchmarks (+6.4%)

  queued_spinlock:
    System Benchmarks Index Values               BASELINE       RESULT    INDEX
    Dhrystone 2 using register variables         116700.0  628613745.4  53865.8
    Double-Precision Whetstone                       55.0     182422.8  33167.8
    Execl Throughput                                 43.0      13116.6   3050.4
    File Copy 1024 bufsize 2000 maxblocks          3960.0    7762306.2  19601.8
    File Copy 256 bufsize 500 maxblocks            1655.0    3417556.8  20649.9
    File Copy 4096 bufsize 8000 maxblocks          5800.0    7427995.7  12806.9
    Pipe Throughput                               12440.0   23058600.5  18535.9
    Pipe-based Context Switching                   4000.0    2835617.7   7089.0
    Process Creation                                126.0      12537.3    995.0
    Shell Scripts (1 concurrent)                     42.4      57057.4  13456.9
    Shell Scripts (8 concurrent)                      6.0       7367.1  12278.5
    System Call Overhead                          15000.0   33308301.3  22205.5
                                                                       ========
    System Benchmarks Index Score                                       12426.1

  ticket_spinlock:
    System Benchmarks Index Values               BASELINE       RESULT    INDEX
    Dhrystone 2 using register variables         116700.0  626541701.9  53688.2
    Double-Precision Whetstone                       55.0     181921.0  33076.5
    Execl Throughput                                 43.0      12625.1   2936.1
    File Copy 1024 bufsize 2000 maxblocks          3960.0    6553792.9  16550.0
    File Copy 256 bufsize 500 maxblocks            1655.0    3189231.6  19270.3
    File Copy 4096 bufsize 8000 maxblocks          5800.0    7221277.0  12450.5
    Pipe Throughput                               12440.0   20594018.7  16554.7
    Pipe-based Context Switching                   4000.0    2571117.7   6427.8
    Process Creation                                126.0      10798.4    857.0
    Shell Scripts (1 concurrent)                     42.4      57227.5  13497.1
    Shell Scripts (8 concurrent)                      6.0       7329.2  12215.3
    System Call Overhead                          15000.0   30766778.4  20511.2
                                                                       ========
    System Benchmarks Index Score                                       11670.7

The qspinlock shows a significant improvement over the ticket_lock on the SOPHGO SG2042 64-core platform.

Reviewed-by: Leonardo Bras <leobras@redhat.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>

Combo spinlock could support queued and ticket in one Linux Image and select them during boot time via command line. Here is the func size (Bytes) comparison table below:

  TYPE                     : COMBO | TICKET | QUEUED
  arch_spin_lock           :  106  |   60   |   50
  arch_spin_unlock         :   54  |   36   |   26
  arch_spin_trylock        :  110  |   72   |   54
  arch_spin_is_locked      :   48  |   34   |   20
  arch_spin_is_contended   :   56  |   40   |   24
  arch_spin_value_unlocked :   48  |   34   |   24

One example of disassemble combo arch_spin_unlock:

  <+14>: nop                 # detour slot
  <+18>: fence rw,w          --+-> queued_spin_unlock
  <+22>: sb    zero,0(a4)    --+   (2 instructions)
  <+26>: ld    s0,8(sp)
  <+28>: addi  sp,sp,16
  <+30>: ret
  <+32>: lw    a5,0(a4)      --+-> ticket_spin_unlock
  <+34>: sext.w a5,a5          |   (7 instructions)
  <+36>: fence rw,w            |
  <+40>: addiw a5,a5,1         |
  <+42>: slli  a5,a5,0x30      |
  <+44>: srli  a5,a5,0x30      |
  <+46>: sh    a5,0(a4)      --+
  <+50>: ld    s0,8(sp)
  <+52>: addi  sp,sp,16
  <+54>: ret

The qspinlock is smaller and faster than ticket-lock when all are in a fast path.

The combo spinlock could provide a compatible Linux Image for different micro-arch designs that have/haven't forward progress guarantee. Use command line options to select between qspinlock and ticket-lock, and the default is ticket-lock.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
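The two unlock paths in the disassembly can be modeled on plain 32-bit values to show what each store does to the lock word. This is a value-level sketch only (no atomics, no runtime code patching). All names are hypothetical, and the boolean flag merely stands in for the boot-time selection described in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical boot-time choice; per the commit message the default is ticket-lock. */
static bool sketch_use_qspinlock = false;

/* Queued unlock: clear the low byte of the word (the `sb zero,0(a4)` store). */
static inline uint32_t sketch_queued_unlock_word(uint32_t val)
{
    return val & ~UINT32_C(0xff);
}

/* Ticket unlock: increment the low 16-bit half with wraparound and leave the
 * upper half alone (the lw / addiw / slli / srli / sh sequence). */
static inline uint32_t sketch_ticket_unlock_word(uint32_t val)
{
    return (val & UINT32_C(0xffff0000)) | ((val + 1u) & UINT32_C(0xffff));
}

/* Combo unlock: dispatch on the selection made at "boot". */
static inline uint32_t sketch_combo_unlock_word(uint32_t val)
{
    return sketch_use_qspinlock ? sketch_queued_unlock_word(val)
                                : sketch_ticket_unlock_word(val);
}

/* Walk-through helper: pick a lock flavor, then unlock the given word. */
static inline uint32_t sketch_combo_demo(bool use_qspinlock, uint32_t val)
{
    sketch_use_qspinlock = use_qspinlock;
    uint32_t out = sketch_combo_unlock_word(val);

    /* Either path must leave the upper 16 bits untouched. */
    assert((out >> 16) == (val >> 16));
    return out;
}
```

For example, `sketch_combo_demo(false, 0x0001ffff)` wraps the low half to zero and keeps the upper half, while `sketch_combo_demo(true, 0x00040101)` clears only the low byte. The sketch pays for a branch on every call, whereas the `nop` "detour slot" in the disassembly suggests the kernel patches the instruction stream instead.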

Add a static key controlling whether virt_spin_lock() should be called or not. When running on bare metal set the new key to false.

The VM guests should fall back to a Test-and-Set spinlock, because fair locks have horrible lock 'holder' preemption issues. The virt_spin_lock_key provides a shortcut in the queued_spin_lock_slowpath() function that allows virt_spin_lock to hijack it.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Leonardo Bras <leobras@redhat.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
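A test-and-set fallback of the kind described above can be sketched in userspace with C11 atomics. The names below are hypothetical, a plain boolean stands in for the static key, and the locked value of 1 is an assumption for the example. It is meant as an illustration of the technique, not as the patch's code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_Q_LOCKED_VAL UINT32_C(1) /* assumed "locked" value */

/* Stand-in for the static key: true inside a VM guest, false on bare metal. */
static bool sketch_virt_spin_lock_key = true;

typedef struct {
    _Atomic uint32_t val;
} sketch_qspinlock_t;

/* Returns true if the lock was taken via test-and-set, false if the caller
 * should continue with the fair (queued) slow path instead. */
static inline bool sketch_virt_spin_lock(sketch_qspinlock_t *lock)
{
    uint32_t expected;

    if (!sketch_virt_spin_lock_key)
        return false;

    do {
        /* Wait with plain loads until the word looks free. */
        while (atomic_load_explicit(&lock->val, memory_order_relaxed) != 0)
            ;
        expected = 0;
        /* Then try to claim it; retry if another waiter won the race. */
    } while (!atomic_compare_exchange_strong_explicit(&lock->val, &expected,
                                                      SKETCH_Q_LOCKED_VAL,
                                                      memory_order_acquire,
                                                      memory_order_relaxed));
    return true;
}

/* Single-threaded walk-through of both key settings. */
static inline int sketch_virt_demo(void)
{
    sketch_qspinlock_t lock;
    atomic_init(&lock.val, 0);

    sketch_virt_spin_lock_key = true;
    bool took = sketch_virt_spin_lock(&lock);
    assert(atomic_load(&lock.val) == SKETCH_Q_LOCKED_VAL);

    sketch_virt_spin_lock_key = false;
    bool hijacked = sketch_virt_spin_lock(&lock); /* key off: no attempt made */

    return (took ? 1 : 0) | (hijacked ? 2 : 0);
}
```

The unfairness is the point here: whichever waiter is actually running grabs the lock, so a preempted vCPU never holds a place in a queue that others must wait behind.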

Force-enable virt_spin_lock when running as a KVM guest, because fair locks have horrible lock 'holder' preemption issues.

Suggested-by: Leonardo Bras <leobras@redhat.com>
Link: https://lkml.kernel.org/kvm/ZQK9-tn2MepXlY1u@redhat.com/
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Leonardo Bras <leobras@redhat.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>

Add the files and functions needed to support the SBI PVLOCK (paravirt qspinlock kick_cpu) extension. Implement kvm_sbi_ext_pvlock_kick_cpu(); it only needs to call kvm_vcpu_kick() and bring target_vcpu out of the halt state. No irq raised, no other request, just a pure vcpu_kick.

Reviewed-by: Leonardo Bras <leobras@redhat.com>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>

Use static_call to switch between:

  native_queued_spin_lock_slowpath()    __pv_queued_spin_lock_slowpath()
  native_queued_spin_unlock()           __pv_queued_spin_unlock()

Finish the pv_wait implementation, but pv_kick needs the SBI definition of the next patches.

Reviewed-by: Leonardo Bras <leobras@redhat.com>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
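The native/paravirt switch can be approximated in userspace with a function pointer chosen once at start-up. This is a loose analogy rather than the kernel mechanism: static_call patches call sites, whereas the sketch below makes an ordinary indirect call. All names are hypothetical, and the `has_pvlock_ext` flag merely stands in for the SBI PVLOCK extension detection mentioned in the series.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_path { SKETCH_NATIVE = 0, SKETCH_PV = 1 };

/* Stub slow paths that only report which flavor ran. */
static enum sketch_path sketch_native_slowpath(void) { return SKETCH_NATIVE; }
static enum sketch_path sketch_pv_slowpath(void)     { return SKETCH_PV; }

/* Dispatch target; defaults to the native implementation. */
static enum sketch_path (*sketch_slowpath)(void) = sketch_native_slowpath;

/* Hypothetical init hook: switch to the paravirt flavor only when the
 * hypervisor-side kick support was detected. */
static inline void sketch_pv_qspinlock_init(bool has_pvlock_ext)
{
    sketch_slowpath = has_pvlock_ext ? sketch_pv_slowpath
                                     : sketch_native_slowpath;
}

/* Walk-through helper: run init, then call through the dispatch target. */
static inline enum sketch_path sketch_pick_slowpath(bool has_pvlock_ext)
{
    sketch_pv_qspinlock_init(has_pvlock_ext);
    assert(sketch_slowpath != 0);
    return sketch_slowpath();
}
```

The selection happens once, and every later lock operation goes through the chosen target, so the common case carries no per-call feature check.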

Implement pv_kick with SBI guest implementation, and add SBI_EXT_PVLOCK extension detection. The backend part is in the KVM pvqspinlock patch.

Reviewed-by: Leonardo Bras <leobras@redhat.com>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>

Disables the qspinlock slow path using PV optimizations which allow the hypervisor to 'idle' the guest on lock contention.

Reviewed-by: Leonardo Bras <leobras@redhat.com>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>

Add kconfig entry for paravirt_spinlock, an unfair qspinlock virtualization-friendly backend, by halting the virtual CPU rather than spinning.

Reviewed-by: Leonardo Bras <leobras@redhat.com>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>

Add trace point for pv_kick&wait, here is the output:

  ls /sys/kernel/debug/tracing/events/paravirt/
  enable  filter  pv_kick  pv_wait

  cat /sys/kernel/debug/tracing/trace
  entries-in-buffer/entries-written: 33927/33927   #P:12

                                  _-----=> irqs-off/BH-disabled
                                 / _----=> need-resched
                                | / _---=> hardirq/softirq
                                || / _--=> preempt-depth
                                ||| / _-=> migrate-disable
                                |||| /     delay
             TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
                | |         |   |||||     |         |
               sh-100     [001] d..2.    28.312294: pv_wait: cpu 1 out of wfi
           <idle>-0       [000] d.h4.    28.322030: pv_kick: cpu 0 kick target cpu 1
               sh-100     [001] d..2.    30.982631: pv_wait: cpu 1 out of wfi
           <idle>-0       [000] d.h4.    30.993289: pv_kick: cpu 0 kick target cpu 1
               sh-100     [002] d..2.    44.987573: pv_wait: cpu 2 out of wfi
           <idle>-0       [000] d.h4.    44.989000: pv_kick: cpu 0 kick target cpu 2
           <idle>-0       [003] d.s3.    51.593978: pv_kick: cpu 3 kick target cpu 4
        rcu_sched-15      [004] d..2.    51.595192: pv_wait: cpu 4 out of wfi
  lock_torture_wr-115     [004] ...2.    52.656482: pv_kick: cpu 4 kick target cpu 2
  lock_torture_wr-113     [002] d..2.    52.659146: pv_wait: cpu 2 out of wfi
  lock_torture_wr-114     [008] d..2.    52.659507: pv_wait: cpu 8 out of wfi
  lock_torture_wr-114     [008] d..2.    52.663503: pv_wait: cpu 8 out of wfi
  lock_torture_wr-113     [002] ...2.    52.666128: pv_kick: cpu 2 kick target cpu 8
  lock_torture_wr-114     [008] d..2.    52.667261: pv_wait: cpu 8 out of wfi
  lock_torture_wr-114     [009] .n.2.    53.141515: pv_kick: cpu 9 kick target cpu 11
  lock_torture_wr-113     [002] d..2.    53.143339: pv_wait: cpu 2 out of wfi
  lock_torture_wr-116     [007] d..2.    53.143412: pv_wait: cpu 7 out of wfi
  lock_torture_wr-118     [000] d..2.    53.143457: pv_wait: cpu 0 out of wfi
  lock_torture_wr-115     [008] d..2.    53.143481: pv_wait: cpu 8 out of wfi
  lock_torture_wr-117     [011] d..2.    53.143522: pv_wait: cpu 11 out of wfi
  lock_torture_wr-117     [011] ...2.    53.143987: pv_kick: cpu 11 kick target cpu 8
  lock_torture_wr-115     [008] ...2.    53.144269: pv_kick: cpu 8 kick target cpu 7

Reviewed-by: Leonardo Bras <leobras@redhat.com>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>

bjoto · 2024-01-05T13:38:50Z

Upstream branch: f352a28
series: https://patchwork.kernel.org/project/linux-riscv/list/?series=812817
version: 12

bjoto · 2024-01-05T21:57:14Z

Upstream branch: 5a2cf77
series: https://patchwork.kernel.org/project/linux-riscv/list/?series=812817
version: 12

Pull request is NOT updated. Failed to apply https://patchwork.kernel.org/project/linux-riscv/list/?series=812817
error message:

Cmd('git') failed due to: exit code(128)
  cmdline: git am -s --3way
  stdout: 'Applying: asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock
Applying: asm-generic: ticket-lock: Add separate ticket-lock.h
Applying: riscv: errata: Move errata vendor func-id into vendorid_list.h
Applying: riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup
Applying: riscv: qspinlock: Add basic queued_spinlock support
Applying: riscv: qspinlock: Introduce combo spinlock
Applying: riscv: qspinlock: Add virt_spin_lock() support for VM guest
Using index info to reconstruct a base tree...
M	arch/riscv/kernel/setup.c
Falling back to patching base and 3-way merge...
Auto-merging arch/riscv/kernel/setup.c
CONFLICT (content): Merge conflict in arch/riscv/kernel/setup.c
Patch failed at 0007 riscv: qspinlock: Add virt_spin_lock() support for VM guest
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".'
  stderr: 'error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch'

conflict:

diff --cc arch/riscv/kernel/setup.c
index 729e4361f13c,0bafb9fd6ea3..000000000000
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@@ -26,6 -26,8 +26,11 @@@
  #include <asm/alternative.h>
  #include <asm/cacheflush.h>
  #include <asm/cpufeature.h>
++<<<<<<< HEAD
++=======
+ #include <asm/cpu_ops.h>
+ #include <asm/cpufeature.h>
++>>>>>>> riscv: qspinlock: Add virt_spin_lock() support for VM guest
  #include <asm/early_ioremap.h>
  #include <asm/pgtable.h>
  #include <asm/setup.h>

bjoto · 2024-01-08T09:39:44Z

At least one diff in series https://patchwork.kernel.org/project/linux-riscv/list/?series=812817 expired. Closing PR.

bjoto added new for-next V12 labels Dec 25, 2023

adding ci files

10947c1

bjoto force-pushed the for-next_base branch from fb37e30 to 10947c1 Compare January 5, 2024 13:32

guoren83 added 14 commits January 5, 2024 13:38

bjoto force-pushed the series/782738=>for-next branch from 1134fc1 to 233e42e Compare January 5, 2024 13:38

bjoto force-pushed the for-next_base branch from 10947c1 to 67e0de6 Compare January 5, 2024 21:51

bjoto added the merge-conflict label Jan 5, 2024

bjoto added changes-requested and removed merge-conflict new labels Jan 8, 2024

bjoto closed this Jan 8, 2024

bjoto deleted the series/782738=>for-next branch January 10, 2024 09:43
