cmd/compile: arm64: emit double-register loads and stores in lowering #19715
ARMv8 has "double-register" loads and stores (
Since these instructions can accept any general-purpose register (unlike the corresponding ARMv7 instructions
In particular, I'd expect this change to shorten the instruction sequences preceding and following a call op (with the existing ABI) by up to a factor of 2. We can also use
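To make the pairing idea concrete, here is a minimal sketch in Go. It is not the cmd/compile implementation: the `Instr` type and the `pairable` and `fusePairs` helpers are hypothetical names invented for this illustration, and the sketch works on a flat instruction list rather than on SSA values. It only considers 64-bit accesses and assumes the usual LDP/STP signed-offset form, where the offset is a multiple of 8 in the range -512 to 504.

```go
package main

// Instr is a hypothetical, simplified instruction used only for this sketch.
// It is not the compiler's real IR.
type Instr struct {
	Op   string   // "LDR", "STR", "LDP", "STP", or any other opcode
	Regs []string // data registers: one for LDR/STR, two for LDP/STP
	Base string   // base address register
	Off  int64    // byte offset from Base
}

// pairable reports whether a and b are same-kind 64-bit accesses to adjacent
// 8-byte slots off the same base register, so they could become one LDP/STP.
func pairable(a, b Instr) bool {
	if a.Op != b.Op || (a.Op != "LDR" && a.Op != "STR") {
		return false
	}
	if a.Base != b.Base || b.Off != a.Off+8 {
		return false
	}
	// Assumed encoding limit for the 64-bit signed-offset form.
	if a.Off%8 != 0 || a.Off < -512 || a.Off > 504 {
		return false
	}
	if a.Op == "LDR" {
		// If the first load overwrites the base, the second load reads from a
		// different address. A pair load also needs two distinct destinations.
		if a.Regs[0] == a.Base || a.Regs[0] == b.Regs[0] {
			return false
		}
	}
	return true
}

// fusePairs makes one left-to-right pass and replaces each pairable
// neighbouring LDR/LDR or STR/STR with a single LDP or STP.
func fusePairs(in []Instr) []Instr {
	out := make([]Instr, 0, len(in))
	for i := 0; i < len(in); i++ {
		if i+1 < len(in) && pairable(in[i], in[i+1]) {
			op := "LDP"
			if in[i].Op == "STR" {
				op = "STP"
			}
			out = append(out, Instr{
				Op:   op,
				Regs: []string{in[i].Regs[0], in[i+1].Regs[0]},
				Base: in[i].Base,
				Off:  in[i].Off,
			})
			i++ // the second instruction of the pair has been consumed
			continue
		}
		out = append(out, in[i])
	}
	return out
}
```

With two argument stores before a call and two result loads after it, this pass would turn four memory instructions into two, which is the kind of reduction described above. A real implementation would also have to account for instruction scheduling and for non-adjacent candidates, which this sketch ignores.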
CC: @williamweixiao I'm happy to take this on, but please let me know if you have similar work planned.
I don't plan similar work to optimize arm64 based on "ldp" and "stp". Please go ahead with it!