s390x: update some regalloc metadata to remove use of `reg_mod`. #4856

cfallin · 2022-09-02T19:12:50Z

This is a step toward ultimately removing modify-operands. Together with
the removal of pinned vregs, that lets us move to a completely
constraint-based and fully-SSA regalloc input, with some nice
advantages eventually.

There are still a few uses of `mod` operands and pinned vregs remaining,
especially around the "regpair" abstraction. Those proved to be a bit
trickier to update, so they will have to be done separately.

cfallin · 2022-09-02T19:15:22Z

cc @uweigand to review?

I also spent the past day trying to go further in this branch, and managed to clean up a lot more, but there are some panics from undefined regs there so I don't have it quite right. @uweigand if you have time to poke at this more I'd very much appreciate it (but it's not urgent at all!). My long-term goal is to push the regalloc input toward fully-SSA (no mods, no multiple defs) and fully constraint-based (no pinned vregs) input, which allows for more flexibility in managing copies and spills, and makes for a more efficient solver generally.
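
For readers less familiar with the distinction, here is a minimal sketch of what "no mods, no multiple defs" means. It is a toy model only: `VReg`, `Operand`, `add_with_mod`, `add_ssa`, and `is_ssa` are hypothetical names invented for illustration and are not the regalloc2 or Cranelift API. A modify operand reads and writes the same vreg, whereas the SSA form reads one vreg and defines a fresh one that is constrained to reuse the input's register.

```rust
use std::collections::HashSet;

/// Toy virtual register (hypothetical; not regalloc2's `VReg`).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct VReg(pub u32);

/// Toy operand kinds (hypothetical; for illustration only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Operand {
    /// Plain read of a vreg.
    Use(VReg),
    /// Plain write of a vreg.
    Def(VReg),
    /// Read-modify-write of a single vreg (the form being phased out).
    Mod(VReg),
    /// Write of a fresh vreg that must share a register with the
    /// input operand at the given index.
    ReuseDef(VReg, usize),
}

/// `rd = rd + rm`, described with a modify operand.
pub fn add_with_mod(rd: VReg, rm: VReg) -> Vec<Operand> {
    vec![Operand::Mod(rd), Operand::Use(rm)]
}

/// `rd = ri + rm`, where `rd` is a new vreg tied to input 0 (`ri`).
pub fn add_ssa(rd: VReg, ri: VReg, rm: VReg) -> Vec<Operand> {
    vec![Operand::Use(ri), Operand::Use(rm), Operand::ReuseDef(rd, 0)]
}

/// Returns true if no instruction uses `Mod` and every vreg is
/// written at most once across the whole sequence.
pub fn is_ssa(insts: &[Vec<Operand>]) -> bool {
    let mut defined = HashSet::new();
    for ops in insts {
        for op in ops {
            match *op {
                Operand::Mod(_) => return false,
                Operand::Def(v) | Operand::ReuseDef(v, _) => {
                    // `insert` returns false if `v` was already defined.
                    if !defined.insert(v) {
                        return false;
                    }
                }
                Operand::Use(_) => {}
            }
        }
    }
    true
}
```

In this model, `is_ssa(&[add_with_mod(VReg(0), VReg(1))])` is false, while `is_ssa(&[add_ssa(VReg(2), VReg(0), VReg(1))])` is true, because the result lives in its own vreg and the tie to the input is expressed as a constraint rather than as a second write.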

uweigand

Looks generally good to me, with the exception of the assembler output issue (see inline comment). Thanks!

uweigand · 2022-09-05T13:48:14Z

cranelift/codegen/src/isa/s390x/inst/emit.rs

+                    let inst = Inst::AluRR {
+                        alu_op,
+                        rd,
+                        ri: rd.to_reg(),


This is just a minor nit, but it would feel slightly cleaner to me to pass rn instead of rd.to_reg(). (Of course those two expressions have the same value in this branch of the if.) In fact, I'm even wondering whether it wouldn't be cleanest to rename all those ri to rn -- that should make it more obvious that an AluRR { op, rd, rn, rm } has identical semantics to an AluRRR { op, rd, rn, rm } if rd is tied to rn.

cfallin · 2022-09-08

Updated to use rn's value, thanks. I actually lean slightly toward keeping these fields named ri, to make it clear that they are artificial instruction fields, the "input" side of the dest reg, rather than a real rn field (this also made it easier to grep for things when updating tests just now!). But I'm happy to alter the field name as well if you feel strongly about this.
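
The relationship between the two instruction forms discussed above can be sketched as follows. This is a simplified toy, not the code in emit.rs: the `Inst` enum here uses plain register numbers, and `select_add` and `exec_add` are hypothetical helpers. It only illustrates that the two-operand form with an explicit `ri` input behaves like the three-operand form whenever `rd` is tied to the first input.

```rust
/// Toy instruction forms (hypothetical; registers are plain numbers).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Inst {
    /// Three-operand form: rd = rn + rm.
    AluRRR { rd: u8, rn: u8, rm: u8 },
    /// Two-operand form: rd = ri + rm, where `ri` is the "input side"
    /// of the destination and must be the same register as `rd`.
    AluRR { rd: u8, ri: u8, rm: u8 },
}

/// Picks the two-operand form when the destination already equals the
/// first input, passing `rn` as `ri` as suggested in the review.
pub fn select_add(rd: u8, rn: u8, rm: u8) -> Inst {
    if rd == rn {
        Inst::AluRR { rd, ri: rn, rm }
    } else {
        Inst::AluRRR { rd, rn, rm }
    }
}

/// Executes either form on a toy register file with wrapping addition.
pub fn exec_add(inst: Inst, regs: &mut [u64; 16]) {
    match inst {
        Inst::AluRRR { rd, rn, rm } => {
            regs[rd as usize] = regs[rn as usize].wrapping_add(regs[rm as usize]);
        }
        Inst::AluRR { rd, ri, rm } => {
            // The tie between rd and ri is an invariant of this form.
            debug_assert_eq!(rd, ri);
            regs[rd as usize] = regs[ri as usize].wrapping_add(regs[rm as usize]);
        }
    }
}
```

For example, `select_add(2, 2, 3)` yields the `AluRR` form with `ri: 2`, while `select_add(4, 2, 3)` keeps the `AluRRR` form; executing either adds the same two source registers.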

cranelift/filetests/filetests/isa/s390x/arithmetic.clif

cranelift/codegen/src/isa/s390x/inst.isle

uweigand · 2022-09-05T15:52:16Z

> I also spent the past day trying to go further in this branch, and managed to clean up a lot more, but there are some panics from undefined regs there so I don't have it quite right. @uweigand if you have time to poke at this more I'd very much appreciate it (but it's not urgent at all!). My long-term goal is to push the regalloc input toward fully-SSA (no mods, no multiple defs) and fully constraint-based (no pinned vregs) input, which allows for more flexibility in managing copies and spills, and makes for a more efficient solver generally.

The main problem here seems to be the tricks I had been playing with `uninitialized_regpair`. This was intended to solve the problem of how to initialize a register pair for those instructions that use one as input (basically, divides). My model has been that I need to allocate a register pair (uninitialized at this point), and then load up the low and high parts of it. That used to work with the old regpair method, but the new method exposes those uninitialized registers to regalloc, which it doesn't like.

Fortunately, with the new model we can instead load the two halves into independent vregs and construct a regpair from those two vregs. That fixes the "udiv" case. For the "sdiv" case, the instruction does not read the high half of the input regpair, so that half really should be uninitialized. Here we can simply change the sdivmod pattern to take a Reg instead of a RegPair as input, which is closer to the true semantics anyway.

Overall, this change simplifies the logic around regpairs anyway, so I like it. I've attached a patch to implement those changes.
regpair-patch.txt

In addition, I noticed that you've consistently swapped register numbers: the high half of the pair goes into %r0, and the low half goes into %r1 (we're big-endian, after all ...). Also, the two inputs to umul_wide were swapped (I guess the operation is commutative, but it still was a surprise). I've added those changes to the patch as well.
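
As a rough illustration of the "two independent vregs" approach and the register order described above, here is a small sketch. It is not the attached patch: `VReg`, `RegPair`, `pair_constraints`, and `udivmod` are hypothetical names in a toy model. The reference semantics in `udivmod` (remainder returned in the high half, quotient in the low half, failure on a zero divisor or an oversized quotient) are an assumption about the unsigned divide being modeled, stated here only to make the example concrete.

```rust
/// Toy virtual register (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct VReg(pub u32);

/// A pair built from two independently defined vregs, so neither half
/// is ever visible to the allocator in an uninitialized state.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct RegPair {
    pub hi: VReg,
    pub lo: VReg,
}

/// Fixed-register constraints for the pair as (vreg, physical register
/// number): the high half goes to %r0 and the low half to %r1.
pub fn pair_constraints(pair: RegPair) -> [(VReg, u8); 2] {
    [(pair.hi, 0), (pair.lo, 1)]
}

/// Assumed reference semantics of an unsigned divide whose dividend is
/// the 128-bit value hi:lo. Returns (new_hi, new_lo) as
/// (remainder, quotient), or None if the divisor is zero or the
/// quotient does not fit in 64 bits.
pub fn udivmod(hi: u64, lo: u64, divisor: u64) -> Option<(u64, u64)> {
    if divisor == 0 {
        return None;
    }
    let dividend = ((hi as u128) << 64) | (lo as u128);
    let quotient = dividend / (divisor as u128);
    let remainder = dividend % (divisor as u128);
    if quotient > (u64::MAX as u128) {
        return None;
    }
    Some((remainder as u64, quotient as u64))
}
```

With a zeroed high half, `udivmod(0, 17, 5)` returns `Some((2, 3))`, that is, remainder 2 and quotient 3. The signed case differs in that the high half is not an input at all, which is why a plain `Reg` input fits it better than a `RegPair`.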

Now, I'm running into a new error:

FAIL filetests/filetests/isa/s390x/vec-arithmetic.clif: panicked in worker #10: Could not allocate minimal bundle, but the allocation problem should be possible to solve

This looks like a regalloc problem (at first glance, it occurs when using multiple wide multiplications in a row, so maybe regalloc runs into conflicts since they're all forced into the same physical register pair?) ... could you have a look here?

cfallin · 2022-09-08T23:49:24Z

Thanks @uweigand! I've updated based on feedback (and, importantly, reverted the 2-to-3-arg change in assembly printing). Thanks for looking further at the followup patch as well; I will pick that up and try to finish it next week, most likely.

uweigand

LGTM now. Note the inline comment about one minor regalloc regression, but that doesn't block this PR.

uweigand · 2022-09-09T09:53:39Z

cranelift/filetests/filetests/isa/s390x/bitops.clif

+;   lgr %r3, %r2
+;   llihf %r2, 2863311530
+;   iilf %r2, 2863311530
+;   lgr %r5, %r3


These two lgr look new - this is why the new code is two instructions longer than the old code. Not sure if this is something that could still be improved in regalloc, or if this is just one of those random changes ... In any case, not a big deal, I just wanted to point it out.

cfallin · 2022-09-09

I looked briefly, but nothing obvious stood out. It's possible that my changes in bytecodealliance/regalloc2#74 might help a bit, but I'm not sure; let's see if it reverts back once I update tests there with this merged :-)

cfallin · 2022-09-09T22:43:45Z

Ah, I think this needs an r+ from someone with write access to the repo -- anyone want to give a rubberstamp on top of Ulrich's review above?

github-actions bot added the cranelift label (Issues related to the Cranelift code generator) Sep 2, 2022

uweigand reviewed Sep 5, 2022

uweigand approved these changes Sep 9, 2022

cfallin added 3 commits September 9, 2022 15:08

Review feedback: restore two-arg pretty-print form.

68d0ff6

Review feedback.

d7441f4

cfallin force-pushed the s390x-ra2-semantics branch from 8728bcd to d7441f4 September 9, 2022 22:12

cfallin enabled auto-merge (squash) September 9, 2022 22:13

cfallin requested review from alexcrichton, elliottt and fitzgen September 9, 2022 22:44

alexcrichton approved these changes Sep 9, 2022

cfallin merged commit 96bfd4e into bytecodealliance:main Sep 9, 2022

cfallin deleted the s390x-ra2-semantics branch September 10, 2022 00:05

elliottt mentioned this pull request Oct 19, 2022

Remove uses of reg_mod from s390x #5073

Merged
