-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
s390x: update some regalloc metadata to remove use of reg_mod
.
#4856
Conversation
cc @uweigand to review? I also spent the past day trying to go further in this branch, and managed to clean up a lot more, but there are some panics from undefined regs there so I don't have it quite right. @uweigand if you have time to poke at this more I'd very much appreciate it (but it's not urgent at all!). My long-term goal is to push the regalloc input toward fully-SSA (no mods, no multiple defs) and fully constraint-based (no pinned vregs) input, which allows for more flexibility in managing copies and spills, and makes for a more efficient solver generally. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks generally good to me, with the exception of the assembler output issue (see inline comment). Thanks!
let inst = Inst::AluRR { | ||
alu_op, | ||
rd, | ||
ri: rd.to_reg(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a minor nit, but it would feel slightly cleaner to me to pass rn
instead of rd.to_reg()
. (Of course those two expression have the same value in this branch of the if
.) In fact, I'm even wondering whether it wouldn't be cleanest to rename all those ri
to rn
-- that should make it more obious that a AluRR { op, rd, rn, rm }
has identical semantic to a AluRRR { op, rd, rn, rm }
if rd
is tied to rn
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to use rn
's value, thanks. I actually lean slightly toward keeping these fields named ri
, to make it clear that they are artificial instruction fields, the "input" side of the dest reg, rather than a real rn
field (this also made it easier to grep for things when updating tests just now!). But I'm happy to alter the field name as well if you feel strongly about this.
The main problem here seems to be the tricks I had been playing with But fortunately, with the new model we can instead just load up the two halves into independent vregs and just construct a regpair from those two vregs. That fixes the "udiv" case. For the "sdiv" case, the instruction actually does not read the high half of the input regpair, so it actually should be uninitialized. But here we can simply change the Overall, this change simplifies the logic around regpairs anyway, so I like it. I've attached a patch to implement those changes. In addition, I noticed that you've consistently swapped register numbers: the high half of the pair goes into Now, I'm running into a new error:
This looks like a regalloc problem (at first glance, it occurs when using multiple wide multiplications in a row, so maybe regalloc runs into conflicts since they're all forced into the same physical register pair?) ... could you have a look here? |
Thanks @uweigand! I've updated based on feedback (and, importantly, reverted the 2-to-3-arg change in assembly printing). Thanks for looking further at the followup patch as well; I will pick that up and try to finish it next week, most likely. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM now. Note the inline comment about one minor regalloc regression, but that doesn't block this PR.
; lgr %r3, %r2 | ||
; llihf %r2, 2863311530 | ||
; iilf %r2, 2863311530 | ||
; lgr %r5, %r3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two lgr
look new - this is why the new code is two instructions longer than the old code. Not sure if this is something that could still be improved in regalloc, or if this is just one of those random changes ... In any case, not a big deal, I just wanted to point it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I briefly looked but it was nothing really obvious. It's possible that my changes in bytecodealliance/regalloc2#74 might help a bit, but I'm not sure; let's see if it reverts back once I update tests there with this merged :-)
This is a step toward ultimately removing modify-operands, which along with removal of pinned vregs, lets us move to a completely constraint-based and fully-SSA regalloc input and get some nice advantages eventually. There are still a few uses of `mod` operands and pinned vregs remaining, especially around the "regpair" abstraction. Those proved to be a bit trickier to update though, so will have to be done separately.
8728bcd
to
d7441f4
Compare
Ah, I think this needs an r+ from someone with write access to the repo -- anyone want to give a rubberstamp on top of Ulrich's review above? |
This is a step toward ultimately removing modify-operands, which along
with removal of pinned vregs, lets us move to a completely
constraint-based and fully-SSA regalloc input and get some nice
advantages eventually.
There are still a few uses of
mod
operands and pinned vregs remaining,especially around the "regpair" abstraction. Those proved to be a bit
trickier to update though, so will have to be done separately.