package p

import "encoding/binary"

func f(b []byte, x *[8]byte) {
	_ = b[8] // bounds check hint
	t := binary.BigEndian.Uint64(x[:])
	binary.BigEndian.PutUint64(b, t)
}
This should compile down to two MOVQs on amd64, one to load from x and one to write to b.
Instead, it contains two successive BSWAP instructions. We should eliminate those. As a bonus, we should eliminate any BSWAPs separated only by bitwise operations. For example:
package p

import "encoding/binary"

func f(b []byte, x *[8]byte) {
	_ = b[8] // bounds check hint
	t := binary.BigEndian.Uint64(x[:])
	t = ^t // bitwise op between the two byte swaps
	binary.BigEndian.PutUint64(b, t)
}
(In addition to unary ^, there's binary &, |, and ^. Maybe some others?)
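The reason these operations qualify is that a byte swap only permutes bytes, while ^, &, and | act on each bit independently, so the swap can be moved across them and the two swaps then cancel. Here is a minimal sketch that checks this on sample values, using math/bits.ReverseBytes64 as a stand-in for BSWAP. The helper names (swapNot, swapAnd, swapOr, swapXor) are hypothetical and only illustrate the identity; they are not part of the compiler. Addition would not qualify, since carries cross byte boundaries.

```go
package main

import (
	"fmt"
	"math/bits"
)

// swapNot mirrors the second example: swap, complement, swap back.
func swapNot(x uint64) uint64 {
	return bits.ReverseBytes64(^bits.ReverseBytes64(x))
}

// swapAnd swaps both operands, ANDs them, and swaps the result back.
func swapAnd(x, y uint64) uint64 {
	return bits.ReverseBytes64(bits.ReverseBytes64(x) & bits.ReverseBytes64(y))
}

// swapOr is the same shape as swapAnd, with OR.
func swapOr(x, y uint64) uint64 {
	return bits.ReverseBytes64(bits.ReverseBytes64(x) | bits.ReverseBytes64(y))
}

// swapXor is the same shape as swapAnd, with XOR.
func swapXor(x, y uint64) uint64 {
	return bits.ReverseBytes64(bits.ReverseBytes64(x) ^ bits.ReverseBytes64(y))
}

func main() {
	x, y := uint64(0x0102030405060708), uint64(0xf0e0d0c0b0a09080)

	// Each comparison checks that the swapped form equals the plain operation.
	fmt.Println(swapNot(x) == ^x)
	fmt.Println(swapAnd(x, y) == x&y)
	fmt.Println(swapOr(x, y) == x|y)
	fmt.Println(swapXor(x, y) == x^y)
}
```

If the identity holds for an operation, the compiler could in principle rewrite the swapped form to the plain one and drop both BSWAPs.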
One advantage of this kind of optimization is that it reduces the importance of whether the endianness selected in the code matches the endianness of the architecture the code runs on.
Related: #41663
cc @agarciamontoro