runtime: implement procyield as ISB instruction on arm64 #69232
Labels: arch-arm64, compiler/runtime, NeedsInvestigation, Performance
When looking into #68578, I found that the implementation of `runtime.procyield` on GOARCH=arm64 uses the `YIELD` instruction, and that the `YIELD` instruction is in effect a (fast) `NOP`.

The current code is at `go/src/runtime/asm_arm64.s`, lines 917 to 923 in 9a4fe7e. A benchmark, `runtime.BenchmarkProcYield`, is at https://go.dev/cl/601396.

The difference in delay between amd64 (slow, using `PAUSE`) and arm64 (fast, using `YIELD`) makes it hard to be confident in the tuning of the `runtime.lock2` spin loop. Note that it's easy to tune the spin loop for the specific duration of a microbenchmark's critical section, which might not be the best tuning for Go overall.

It looks like Rust uses `ISB SY` (https://github.com/rust-lang/rust/blob/d6c8169c186ab16a3404cd0d0866674018e8a19e/library/core/src/hint.rs#L291-L295), changed in rust-lang/rust@c064b65. I've confirmed that using `ISB` results in a longer delay on the hardware most easily available to me (M1 MacBook Air), which I'd expect to be a benefit to `runtime.lock2`, both reducing the likelihood of acquiring the lock without a sleep and controlling the electrical energy used to do so.

What I'd most prefer is for Go 1.24 to include a fix for #68578, and for that to be the only change to `runtime.lock2` in the Go 1.24 cycle (so it's clear whether that change is to blame for any changes in mutex performance). But I'm opening this now so we at least don't lose track of it.

CC @golang/runtime @golang/arm