cmd/compile: shorter slicing instruction sequence on x86/amd64 #47969
Compiling on tip:
and cutting out the function prologue on amd64 we get:
We should be able to slim that down to (not tested):
by pulling the AND and OpSlicemask operation in the ssa generation phase into a single new OpSlicedelta operation:
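To illustrate the idea, here is a minimal Go model of the pointer-delta computation, not the compiler's actual SSA code. It assumes the usual meaning of OpSlicemask (0 when its argument is 0, all ones when it is positive), which keeps the result pointer of an empty slice from moving past the end of the allocation. The names `slicemask`, `deltaMasked` and `deltaSelect` are hypothetical, and `deltaSelect` is only a guess at what a fused OpSlicedelta could compute with a conditional move.

```go
package main

import "fmt"

// slicemask models OpSlicemask: 0 when x == 0, all ones (-1) when x > 0.
// The result is not meaningful for x < 0.
func slicemask(x int64) int64 {
	return (-x) >> 63 // arithmetic shift copies the sign bit into every bit
}

// deltaMasked models the current lowering: scale the index, then AND it
// with the mask so the delta becomes 0 when the new capacity is 0.
func deltaMasked(idx, elemSize, newCap int64) int64 {
	return (idx * elemSize) & slicemask(newCap)
}

// deltaSelect models a hypothetical fused op: pick 0 or the scaled index
// directly, which on amd64 could map to a test plus a conditional move.
func deltaSelect(idx, elemSize, newCap int64) int64 {
	if newCap == 0 {
		return 0
	}
	return idx * elemSize
}

func main() {
	// Both forms should agree for every non-negative new capacity.
	for _, newCap := range []int64{0, 1, 5} {
		fmt.Println(newCap, deltaMasked(3, 8, newCap), deltaSelect(3, 8, newCap))
	}
}
```

Both functions produce the same delta; the difference is only in how many instructions the lowering would need.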
By either making the compiler's SSA optimizations smarter or pulling even more operations into a special SSA op, we could save the TESTQ and get to:
However, without benchmarking it is unclear whether this will be any faster (or worth the complexity) when the scaling of the index for the delta happens after the CMOV.
A further reduction in instructions is possible by moving the panic jumps to be dependent on the SUB instructions:
That would then need extra handling to recover the original slice len/cap in the panic path.
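As a rough sketch of that trade-off, the following Go model uses `math/bits.Sub64` to stand in for a SUB whose borrow flag doubles as the bounds check. The helper names `subCheck` and `recoverCap` are hypothetical and only illustrate the idea; this is not how the compiler is implemented.

```go
package main

import (
	"fmt"
	"math/bits"
)

// subCheck models a SUB whose borrow flag replaces a separate compare:
// it returns capacity-idx and reports whether idx was out of range.
func subCheck(capacity, idx uint64) (uint64, bool) {
	diff, borrow := bits.Sub64(capacity, idx, 0)
	return diff, borrow != 0
}

// recoverCap models the extra work on the panic path: the subtraction
// has overwritten the original capacity, so it is rebuilt by adding idx back.
func recoverCap(diff, idx uint64) uint64 {
	return diff + idx // wraps around, undoing the wrapped subtraction
}

func main() {
	diff, outOfRange := subCheck(4, 9)
	if outOfRange {
		fmt.Println("index out of range, original cap:", recoverCap(diff, 9))
	}
}
```

The fast path saves the compare, while the panic path pays for one extra addition per value it has to report.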
Finally, for this specific case, the SHL and ADD can be folded into a LEA:
/cc @randall77 @josharian @mdempsky