Skip to content

Conversation

@elliottt
Copy link
Member

@elliottt elliottt commented Jun 23, 2023

As reported in #145, under-constrained register uses when an instruction clobbers all registers can lead to the TooManyLiveRegs error being raised, despite the problem being resolvable by spilling.

This PR addresses the problem with a bit of a hack: rewriting the OperandConstraint::Reg uses in this situation to fixed register constraints, where the physical register chosen is known to not conflict with other uses in the instruction. This allows us to reuse the existing logic present when resolving clobbered fixed register constraints, though the solution feels bad.

I think that a better solution would be to emit virtual stack uses when this situation is detected instead, to force the value to live on the stack for the duration of the instruction. However, this exercises an unhandled case in fixup_multi_fixed_vregs: values living in registers and the stack at the same time. I think this could be overcome by reviving parts of the original branch to support overlaps by generating overlapping liveranges for the values to live on the stack. However it's unclear to me if this will resolve the original problem as the problematic liveranges will still be split on instruction boundaries, and will still conflict with the preg liveranges inserted for the clobbers.

@elliottt elliottt changed the title trevor/everything clobbered 2 Handle under-constrained uses when all registers are clobbered Jun 23, 2023
@elliottt elliottt marked this pull request as ready for review June 23, 2023 19:25
@elliottt elliottt requested a review from cfallin June 23, 2023 20:07
@elliottt
Copy link
Member Author

I've updated the PR to also handle defs. The two bugs that I was working on are present in cranelift prior to bytecodealliance/wasmtime#6632. The first is a non-tail call of a function that uses the tail calling convention:

test compile
set opt_level=speed_and_size
target x86_64

function u1:6() system_v {
    sig0 = () tail
    fn0 = u1:7 sig0

block0:
    v5 = func_addr.i64 fn0
    call_indirect sig0, v5()
    call_indirect sig0, v5()
    return
}

Prior to this PR, the program will fail with TooManyLiveRegs as there will be no possible assignment for v5. The bundle is split twice, and the resulting liverange covers the entire call instruction, which then conflicts with the preg allocations for the clobbers (the tail convention clobbers everything).

The second case is where a function with the tail calling convention returns a value:

test compile
set opt_level=speed_and_size
target x86_64

function u1:6() system_v {
    sig0 = () -> i8 tail
    fn0 = u1:7 sig0

block0:
    v5 = func_addr.i64 fn0
    v0 = call_indirect sig0, v5()
    v1 = call_indirect sig0, v5()
    return
}

This fails in a similar way, but we can avoid it by checking to see if the combination of clobbers and fixed register defs consumes the whole set of available registers for a register class.

@cfallin
Copy link
Member

cfallin commented Jun 27, 2023

I talked with Nick a bit about this today, and would love to talk with you more when you're back, @elliottt. Overall, I agree this is quite an annoying issue and the current limitations are unfortunate. However, I think I'm coming around to a "don't hold it that way" conclusion (i.e., it is an error to present regalloc2 with a situation where there aren't enough non-clobbered regs to hold reg-constrained values that must remain live after the inst) because just detecting "all regs in a class are clobbered" and doing something special isn't enough.

To see why: consider the case where all but one register in a class is clobbered, but we have two args to that instruction with Reg constraints and both are downward-live. The logic in this PR won't fire, but nevertheless the constraints are impossible. So then we need a notion of register pressure, and we need to relax the constraints more generally ahead of time, using some (conservative but no-false-negatives) model of what will "fit". We need to make sure that model covers any limits implied by the invariants later in the pipeline.

Or, we need to somehow detect at allocation time that we've probed all possible registers, none fit, but we have one free for the use-point, and turn a split at the middle of the instruction into overlapping liveranges. In full generality I actually like this option the best -- relax the "always split at inst boundaries" logic, and instead just start the second half of the split one half-inst early, so it can do its move beforehand. Then we'd just naturally split after the uses and before the clobbers on the problematic instruction. But it's a larger refactor, or at least, I can't convince myself on the spot that it won't break something else.

Absent that, I don't want to add special cases that detect register pressure situations, because this requires careful thinking about which cases exactly we can handle (or seen another way: what is the precise set of invariants on the constraint problem at each stage of lowering) and more tightly couples things, and we can miss cases (as above). So I think the best short-term solution is probably to adjust the calling convention in Cranelift as was one in bytecodealliance/wasmtime#6632. And we declare the invariant: there must be enough registers excluded from the clobber-set on any inst to allow downward-live any-reg args to fit somewhere. What do you think?

@elliottt
Copy link
Member Author

elliottt commented Jun 27, 2023

I like the approach that handles this situation by splitting in the middle of an inst the best. I agree that it's going to be a bit of a larger refactor, but at least we have support for overlapping liveranges now! I'd be happy to brainstorm a bit with you on that once I'm back, and then tackle it during my free time.

Probably a good first step is to sanity-check instruction operands to ensure that they are solvable, in a similar way that's done on this pr, but with an assert instead of trying to build up the available register iterator.

@elliottt elliottt closed this Jul 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants