When shift == 0, shlVU and shrVU reduce to a memory copy. When z.ptr == x.ptr as well, they further reduce to a no-op. The pure Go implementation has had these optimizations since https://go-review.googlesource.com/c/go/+/164967. The arm64 implementation has one of them (see #31084 (comment)). We should add both to the amd64 implementation.
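For concreteness, here is a minimal pure Go sketch of the two fast paths for the left shift. It is an illustration, not the math/big source: `Word` and `wordBits` are stand-ins defined locally, and comparing `&z[0]` with `&x[0]` is an assumed way of expressing the `z.ptr == x.ptr` check from Go code.

```go
package main

import "math/bits"

// Word is a local stand-in for math/big's Word (assumption: one machine word).
type Word uint

// wordBits is a hypothetical name for the word size in bits.
const wordBits = bits.UintSize

// shlVU sets z = x << s and returns the bits shifted out of the top word.
// It assumes len(z) == len(x) and 0 <= s < wordBits.
func shlVU(z, x []Word, s uint) (c Word) {
	if len(z) == 0 {
		return 0
	}
	if s == 0 {
		// Fast path 1: same backing storage, nothing to do.
		if &z[0] == &x[0] {
			return 0
		}
		// Fast path 2: a zero shift is just a copy.
		copy(z, x)
		return 0
	}
	s &= wordBits - 1
	sc := (wordBits - s) & (wordBits - 1) // complementary shift count
	c = x[len(z)-1] >> sc
	// Walk from the top down so that z and x may share storage.
	for i := len(z) - 1; i > 0; i-- {
		z[i] = x[i]<<s | x[i-1]>>sc
	}
	z[0] = x[0] << s
	return c
}
```

The amd64 change would express the same two checks in assembly before entering the shift loop.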
DO NOT MAIL
TODO: shrVU too
TODO: benchmarks
TODO: fuzz for confidence
TODO: better commit message
When shift == 0, shlVU and shrVU reduce to a memcopy. When z.ptr == x.ptr, it further reduces to a no-op. The pure Go implementation has these optimizations, as of https://go-review.googlesource.com/c/go/+/164967. The arm64 implementation has one of them (see golang#31084 (comment)). We should add both to the amd64 implementation.
cc @griesemer

Fixes golang#31097
Change-Id: I3979d7c82a63e1840c8191636a8947e8f440af3b
Can this be done in the wrappers/callers instead so the per-arch assembler as well as the generic can just assume that this optimization has been applied?
This also allows SSA to see where these conditions might be constant either now or in the future.
Good question. As of this moment there aren't any pure Go wrappers for these functions; they all go straight to the assembly implementations. Now that we have mid-stack inlining, it might make sense to change that and do optimizations like this in the wrappers, so they can skip the call entirely. Want to experiment and send a CL for 1.16 if appropriate?
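A rough sketch of what such a wrapper might look like for the right shift. Everything here is hypothetical: `shrVUasm` stands in for the per-architecture assembly routine (written in Go so the sketch runs), `asmCalls` exists only to make the skipped call observable, and `Word` and `wordBits` are local stand-ins rather than math/big's definitions.

```go
package main

import "math/bits"

// Word is a local stand-in for math/big's Word (assumption: one machine word).
type Word uint

// wordBits is a hypothetical name for the word size in bits.
const wordBits = bits.UintSize

// asmCalls counts calls into the stand-in routine (instrumentation for this sketch only).
var asmCalls int

// shrVUasm is a hypothetical stand-in for the assembly implementation.
// It may assume 0 < s < wordBits because the wrapper filters out s == 0.
func shrVUasm(z, x []Word, s uint) (c Word) {
	asmCalls++
	if len(z) == 0 {
		return 0
	}
	s &= wordBits - 1
	sc := (wordBits - s) & (wordBits - 1) // complementary shift count
	c = x[0] << sc
	// Walk from the bottom up so that z and x may share storage.
	for i := 0; i < len(z)-1; i++ {
		z[i] = x[i]>>s | x[i+1]<<sc
	}
	z[len(z)-1] = x[len(z)-1] >> s
	return c
}

// shrVU is a small wrapper, intended to be inlinable, that handles the
// zero-shift cases itself and only calls the heavy routine for real shifts.
func shrVU(z, x []Word, s uint) (c Word) {
	if s == 0 {
		if len(z) > 0 && &z[0] != &x[0] {
			copy(z, x)
		}
		return 0
	}
	return shrVUasm(z, x, s)
}
```

With this shape, a call site where the shift is a constant zero could in principle have the whole call folded away once the wrapper is inlined, and the assembly no longer needs its own zero-shift checks. Whether the wrapper actually fits the inlining budget would need to be measured.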