cmd/compile: eliminate redundant zeroing after lower pass #47107
Comments
The zeroing could still be detected as unnecessary in the current dead store pass before the writes are folded into one |
I don't think dse can elide this. |
I tried this rule, which elided the redundant zero store
|
Yes, that's the current problem. It's not combining the ranges of the individual byte writes, so when it gets to the 8-byte Zero it is bigger than the shadow size. We need a more robust computation of what is shadowed. We need to combine byte writes into larger regions.
This should shadow 2 bytes at v10/v18. Currently dse treats those
I'd really rather fix dse. Your rule works only for 8-byte constant stores, and only for amd64. We'd need a lot more rules to fix it for all cases. |
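To make the idea of combining byte writes into larger regions concrete, here is a minimal sketch in plain Go. It is not the compiler's dse implementation: `byteRange`, `mergeRanges`, and `shadowed` are hypothetical names, and offsets are assumed to be constant byte offsets from a common base pointer. The sketch merges the ranges written by later stores and then checks whether an earlier store is fully covered.

```go
package main

import (
	"fmt"
	"sort"
)

// byteRange is a half-open interval [lo, hi) of byte offsets from a base pointer.
// Hypothetical type for illustration; the real dse pass uses its own representation.
type byteRange struct{ lo, hi int64 }

// mergeRanges returns the ranges sorted by start, with overlapping or
// adjacent ranges coalesced into one.
func mergeRanges(rs []byteRange) []byteRange {
	if len(rs) == 0 {
		return nil
	}
	sorted := append([]byteRange(nil), rs...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].lo < sorted[j].lo })
	out := []byteRange{sorted[0]}
	for _, r := range sorted[1:] {
		last := &out[len(out)-1]
		if r.lo <= last.hi { // overlaps or touches the previous range
			if r.hi > last.hi {
				last.hi = r.hi
			}
			continue
		}
		out = append(out, r)
	}
	return out
}

// shadowed reports whether an earlier store to [lo, hi) is completely
// overwritten by the union of the later stores.
func shadowed(earlier byteRange, later []byteRange) bool {
	for _, r := range mergeRanges(later) {
		if r.lo <= earlier.lo && earlier.hi <= r.hi {
			return true
		}
	}
	return false
}

func main() {
	// Eight one-byte stores at offsets 0..7, as in the byte-wise example.
	var later []byteRange
	for off := int64(0); off < 8; off++ {
		later = append(later, byteRange{off, off + 1})
	}
	fmt.Println(shadowed(byteRange{0, 8}, later))     // true: the 8-byte Zero is dead
	fmt.Println(shadowed(byteRange{0, 8}, later[:7])) // false: byte 7 is still needed
}
```

Checking each one-byte store on its own against the 8-byte Zero never succeeds, which is the failure described above; checking against the merged region does.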
For me, on master, the example from the original comment compiles ideally:
However, the following still has unnecessary zeroing:
Interestingly, there's no zeroing when storing through a pointer:
but I haven't checked to see if the root cause for that is the same. |
@dominikh There is still redundant zeroing. It seems
TEXT command-line-arguments.foo(SB), NOSPLIT|ABIInternal, $0-16
...
MOVQ $0, command-line-arguments.b+8(SP) // <- zero the result value
MOVQ AX, command-line-arguments.b+8(SP) // then overwrite it using the first argument
RET |
You're right. I must've misread. |
Change https://go.dev/cl/629016 mentions this issue: |
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
https://go.godbolt.org/z/64qsjGTcd
What did you expect to see?
Compiler should generate
What did you see instead?
Compiler generates
According to the SSA dump and the rules, the one-byte shift-and-move sequences are folded into a single MOVQ during the lower pass, so the zeroing cannot be eliminated during the opt passes.
go/src/cmd/compile/internal/ssa/gen/AMD64.rules
Lines 1965 to 1982 in 04cd717
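For readers without access to the linked snippet, the following is a sketch of the kind of function that produces this pattern. It is an assumption, not the code behind the godbolt link: `putUint64LE` is a hypothetical name, chosen to match the shape implied by the assembly quoted in the comments (one 8-byte argument, one 8-byte result). Each statement is a one-byte shift-and-store into the named result, which the compiler zeroes on entry.

```go
package main

import "fmt"

// putUint64LE writes v into the result array one byte at a time,
// least significant byte first. Hypothetical reproduction, not the
// original example from the issue.
//
//go:noinline
func putUint64LE(v uint64) (b [8]byte) {
	b[0] = byte(v)
	b[1] = byte(v >> 8)
	b[2] = byte(v >> 16)
	b[3] = byte(v >> 24)
	b[4] = byte(v >> 32)
	b[5] = byte(v >> 40)
	b[6] = byte(v >> 48)
	b[7] = byte(v >> 56)
	return b
}

func main() {
	fmt.Println(putUint64LE(0x0807060504030201)) // [1 2 3 4 5 6 7 8]
}
```

Building with `go build -gcflags=-S` prints the generated assembly, and setting `GOSSAFUNC=putUint64LE` during the build writes an SSA dump, so the zeroing store and the combined store can be inspected per pass. Whether the redundant zeroing appears depends on the Go version and target architecture.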