[Wasm RyuJIT] Fixes for two codegen asserts #126514
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Pull request overview
Fixes two WebAssembly RyuJIT codegen assertion failures by handling edge cases around struct locals whose effective “register type” is not directly representable as TYP_STRUCT in wasm codegen.
Changes:
- Ignore leftover `GT_LCL_VAR` nodes whose computed register type is `TYP_UNDEF` when they are marked as `IsUnusedValue()` (e.g., after block-store pruning in LIR).
- Ensure `GT_STORE_LCL_VAR` for enregistered struct locals uses the struct layout’s register type when determining the `local.set` operand size.
@dotnet/jit-contrib PTAL
src/coreclr/jit/liveness.cpp
Outdated
    }
    }
    LclVarDsc& varDsc = m_compiler->lvaTable[lclAddr->AsLclVarCommon()->GetLclNum()];
    isDeadStore = varDsc.lvTracked && !VarSetOps::IsMember(m_compiler, life, varDsc.lvVarIndex);
@jakobbotsch Do you have thoughts on how to do this correctly? Ideally we would call ComputeLifeLocal here to get the exact same true/false value as the old analysis, but doing that will introduce liveness tracking bugs because ComputeLifeLocal will get called again later on the same node. Should I rework ComputeLifeLocal so it has a read-only mode that can be used here?
For more context the problem is that on wasm we get the LIR order LCL_ADDR, LCL_VAR, STORE_BLK while other targets have LCL_VAR, LCL_ADDR, STORE_BLK. This means that the existing approach of pruning the block store from inside of the lcl_addr doesn't work anymore - it leaves a stray lcl_var behind.
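The "read-only mode" idea above can be sketched with a toy model. This is a minimal illustration, not the JIT's `ComputeLifeLocal`: `LiveSet`, `MarkUse`, `UpdateForDef`, and `IsDeadStore` are hypothetical names, and a `std::set<int>` of local numbers stands in for the real variable set. It shows why asking "is this store dead?" through a mutating update is unsafe when the same node is processed again later, and how a const query avoids that.

```cpp
#include <cassert>
#include <set>

// Hypothetical toy model of backward liveness over local variable numbers.
// None of these names come from the JIT sources.
class LiveSet
{
public:
    // A use makes the local live at earlier points of the backward walk.
    void MarkUse(int lclNum)
    {
        assert(lclNum >= 0);
        m_live.insert(lclNum);
    }

    // Mutating update for a definition: reports whether the store is dead,
    // then kills the local. Calling this twice for the same node changes
    // the answer, which is the hazard described in the comment above.
    bool UpdateForDef(int lclNum)
    {
        bool isDead = !IsLive(lclNum);
        m_live.erase(lclNum);
        return isDead;
    }

    // Read-only query: same deadness test, but the live set is untouched,
    // so a later UpdateForDef on the same node still sees the original state.
    bool IsDeadStore(int lclNum, bool tracked) const
    {
        return tracked && !IsLive(lclNum);
    }

    bool IsLive(int lclNum) const
    {
        return m_live.count(lclNum) != 0;
    }

private:
    std::set<int> m_live;
};
```

In this model a second call to `UpdateForDef` on the same local reports a dead store even when the first call did not, whereas `IsDeadStore` can be called any number of times without affecting later answers.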
> it leaves a stray lcl_var behind.
Why is that a problem? As a principle the backends should handle unused values whether or not they get removed by this liveness pass.
the lcl_var is a struct, so we have no way to push it onto the evaluation stack. I think Andy said that other backends don't like this either (stray lcl_vars that don't fit into a register)
It sounds like WASM is not marking the LCL_VAR as contained then. I think other backends will do that in lowering for STORE_BLK and I would expect that to make their respective codegens handle the stray LCL_VAR fine.
> It sounds like WASM is not marking the LCL_VAR as contained then. I think other backends will do that in lowering for STORE_BLK and I would expect that to make their respective codegens handle the stray LCL_VAR fine.
We don't have special handling for stray unused TYP_STRUCT locals in codegen (see e.g. runtime/src/coreclr/jit/codegenxarch.cpp, line 5081 in 1c60926).
I don't think it would be especially useful to make that (TYP_STRUCT local legal if unused) a new contract, since it won't really solve the problem with WASM where we can't get away with treating LCL_ADDR as the location of the def.
For me the question here is what do we need to do to support async's use of liveness, since for LIR purposes it would be sufficient to only handle assorted 'blessed' IR shapes for indirect stores to tracked (or promoted) locals.
Thanks for diving into this, you two. I think what I'll do is make sure we handle unused nodes properly in codegen + teach liveness to remove the stray LCL_VAR when it removes the STORE_BLK.
I can't remove the stray lcl_var trivially without breaking liveness, will keep digging into it but I'd kind of prefer to just move on and fix that part later, if that's okay with you two?
I think I found a way to do this that doesn't break anything, updated the PR.
It seems like an "ad-hoc" kind of limitation if we phrase it the way we've been looking at it here. I think it's justifiable if we consider it to just be a subset of the "no `TYP_STRUCT` values allowed in IR post-lower" rule (modulo the multi-reg special case), which has deeper reasons for existing than just codegen not having some `if`s.
My general opinion is that it should be possible to remove IR with BlockRange().Remove(node, true), without having to depend on some optional pass (like liveness) to come in and clean up after you. Otherwise we are making it too hard to manipulate our own IR.
> My general opinion is that it should be possible to remove IR with BlockRange().Remove(node, true), without having to depend on some optional pass (like liveness) to come in and clean up after you. Otherwise we are making it too hard to manipulate our own IR.
It is a tradeoff, isn't it. Without first-class struct values after lowering, you either get this restriction we have with struct values today, or you need to contort codegen to support these struct values, with the assumption they're unused. The latter seems like a muddier contract to me.
There is a somewhat separate question of whether it is a good restriction to have before lowering. I can see it not being the case and lowering being responsible for transforming such nodes (as it already is partially today).
src/coreclr/jit/liveness.cpp
Outdated
    {
        Lowering::TransformUnusedIndirection(data->AsIndir(), m_compiler, block);
    }
    else if (data == mostRecentLocalVarOrField)
You can use LIR::LastNode here and remove mostRecentLocalVarOrField to make this work robustly.
But I am surprised you need this at all. What was the problem with just removing stray local nodes as the data unconditionally?
It would explode when visiting the removed node a second time and trying to remove it a second time. I worked around that by preventing it from removing it twice, but then got random crashes in the x64 emitter.
That sounds odd. How did we end up visiting the removed node? We shouldn't see it since it has been removed.
That didn't make sense to me either. I checked and its gtPrev and gtNext were both nullptr since it had been removed, so I think removing it was leaving a dangling reference elsewhere somehow?
> You can use `LIR::LastNode` here and remove `mostRecentLocalVarOrField` to make this work robustly.
I've been looking at this and it's not clear how I would substitute LIR::LastNode for my variable here. Is it guaranteed that the range will only contain the store and its dependencies?
> That didn't make sense to me either. I checked and its gtPrev and gtNext were both nullptr since it had been removed, so I think removing it was leaving a dangling reference elsewhere somehow?
I looked over the code again and realized it's because ComputeLifeLIR uses its own custom iteration, so it was iterating over dead nodes. This seems like a defect in how it's designed and could potentially happen for any node we remove inside of this method. I found a workaround, so let me know if you think what I did is acceptable. I'm willing to rework the method to not be vulnerable to this.
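The hazard described here, a hand-rolled backward walk that saves its next cursor before a removal and then lands on an unlinked node, can be reproduced in a small model. The sketch below is an assumption-laden toy, not `ComputeLifeLIR`: `Node`, `Link`, `Unlink`, and `WalkBackward` are hypothetical names, and a plain doubly-linked list stands in for a LIR range. It shows the cursor being moved past an operand that is removed together with its dead store.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for a LIR node; the fields are hypothetical and do not
// match the JIT's real GenTree layout.
struct Node
{
    int   id      = 0;
    bool  dead    = false;   // this node should be removed during the walk
    Node* operand = nullptr; // earlier node that must be removed with it
    Node* prev    = nullptr;
    Node* next    = nullptr;
};

// Connect a -> b in execution order.
void Link(Node* a, Node* b)
{
    a->next = b;
    b->prev = a;
}

// Remove a node from the list and clear its links, so a removed node ends
// up with both links null (the symptom noted in the thread).
void Unlink(Node* node)
{
    assert(node != nullptr);
    if (node->prev != nullptr) node->prev->next = node->next;
    if (node->next != nullptr) node->next->prev = node->prev;
    node->prev = nullptr;
    node->next = nullptr;
}

// Walk from the last node to the first and return the ids of nodes kept.
std::vector<int> WalkBackward(Node* last)
{
    std::vector<int> kept;
    Node* next = nullptr;
    for (Node* node = last; node != nullptr; node = next)
    {
        next = node->prev; // cursor is saved before any removal happens
        if (node->dead)
        {
            Node* operand = node->operand;
            if (operand != nullptr)
            {
                // Without this adjustment the walk would continue at the
                // operand after it was unlinked and stop early.
                if (next == operand)
                {
                    next = operand->prev;
                }
                Unlink(operand);
            }
            Unlink(node);
            continue;
        }
        kept.push_back(node->id);
    }
    return kept;
}
```

For a list 1, 2, 3, 4 where node 4 is a dead store whose operand is node 3, the walk keeps nodes 2 and 1. If the `next == operand` adjustment is dropped, the walk resumes at node 3, whose links are already null, and never reaches the earlier nodes.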
    next = data->gtPrev;
    if (end == data)
        end = data->gtNext;
The end == data check looks unreachable here: end is initialized to firstNode->gtPrev (a node outside the block’s range), while data is a node inside the range. Removing this branch (and adding braces for the two if statements for consistency with surrounding code) would reduce confusion and make the iterator-adjustment logic clearer.
    - next = data->gtPrev;
    - if (end == data)
    -     end = data->gtNext;
    + {
    +     next = data->gtPrev;
    + }
Fixes BitConverter.ToBFloat16 and Vector`1.op_Multiply