Conversation
CarolEidt
commented
Aug 25, 2017
- Fix a couple of issues with TreeNodeInfoInit for struct arguments
- Remove duplicate calls to LowerBlockStore from TreeNodeInfoInit for arm and arm64
- Eliminate duplicative isMultiReg method from GenTree and fix condition for OperIsMultiRegOp()
@@ -3517,6 +3517,15 @@ static int ComputeOperandDstCount(GenTree* operand) | |||
// pointers to argument setup stores. | |||
return 0; | |||
} | |||
#ifdef _TARGET_ARMARCH_ | |||
else if (operand->OperIsPutArgStk()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it better to find why srcCount of PUTARG_STK is 0 instead of adding this case? I think setting of srcCount for PUTARG_STK has some problem on ARMARCH.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this method, we are only interested in operands that either produce a register, or are contained. In the case of PUTARG_STK, it may have a srcCount of zero if its child is a GT_OBJ
with a GT_LCL_VAR_ADDR
child (see https://github.com/dotnet/coreclr/blob/master/src/jit/lower.cpp#L868)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it correct on lsra operation & codegen when we set srcCount for PUTARG_STK as 0 if its child is a GT_OBJ
with a GT_LCL_VAR_ADDR
child? In x64/x86, srcCount is set by using GetOperandSourceCount()
(https://github.com/dotnet/coreclr/blob/master/src/jit/lsraxarch.cpp#L1635), but In arm32/arm64, srcCount is set by using only child's source (https://github.com/dotnet/coreclr/blob/master/src/jit/lsraarmarch.cpp#L670)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CarolEidt This code can address assertion #13565. However I also have a same question @hseok-oh had.
Here is an exmaple from #13565.
Before #13198, GT_PUTARG_STK
had one srcCount and consume one register during LSRA.
// Before
N4533 (???,???) [043750] ----GO------- * PUTARG_STK [+0x08] stru|tmp550 [+4] $cbe
ct REG NA
+<TreeNodeInfo @ 4533 0=1 1i 0f src=[r0-r9 r12 lr] int=[r0-r9 r12 lr] dst=[r0-r9 r12 lr] I>
..(suppressed)..
N4533. PUTARG_STK [+0x08]
consume=1 produce=0
...
N4555. CALL
consume=4 produce=0
...
/--* t22477 struct
Generating: N4533 (???,???) [043750] ----GO------- * PUTARG_STK [+0x08] struct REG NA
Byref regs: 0008 {r3} => 0000 {}
IN0a2a: ldr r2, [r3]
IN0a2b: str r2, [sp+0x08] // [V1427 OutArgs+0x08]
IN0a2c: ldr r2, [r3+4]
IN0a2d: str r2, [sp+0x0c] // [V1427 OutArgs+0x0c]
IN0a2e: ldr r2, [r3+8]
IN0a2f: str r2, [sp+0x10] // [V1427 OutArgs+0x10]
IN0a30: ldr r2, [r3+12]
IN0a31: str r2, [sp+0x14] // [V1427 OutArgs+0x14]
Now, GT_PUTARG_STK
has zero srcCount and raises assertion during LSRA.
// Now
N4533 (???,???) [043748] ----GO------- * PUTARG_STK [+0x08] struct REG NA
+<TreeNodeInfo @ 4533 0=0 1i 0f src=[r0-r9 r12 lr] int=[r0-r9 r12 lr] dst=[r0-r9 r12 lr] I>`
...
N4533. PUTARG_STK [+0x08]
consume=0 produce=0
// Is it expected behavior ?
// There is no consume for PUTARG_STK which is actually
// use one register (r3 in above example) as source
// and use one internal register (r2 in above example).
...
N4555. CALL
consume=4 produce=0
// An assertion raised while processing this CALL related to above PUTARG_STK
// And above fix can remove this assertion
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the dump you provided, I can't tell what the child of the PUTARG_STK
node is. Can you tell me which test this is from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the child is a lclVar, there should be no source register; each word of the struct would be loaded from its stack location into the temp register (note the "1i" in the TreeNodeInfo which indicates that 1 internal register is used). Then it will be stored to the outgoing argument stack location.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm somewhat late to this comment.
It seems that PUTARG_STK
in above example has GT_ADD
as a child.
N003 ( 1, 1) [022473] ------------- t22473 = LCL_VAR ref V01 loc1 u:3 $2c0
N004 ( 1, 1) [025101] ------------- t25101 = CNS_INT int 56 field offset Fseq[m_cl_op2] $f50
/--* t22473 ref
+--* t25101 int
N005 ( 3, 3) [025102] ------------- t25102 = * ADD byref $1044
/--* t25102 byref
N006 ( 9, 7) [022477] ----GO------- t22477 = * OBJ(16) struct <l:$896, c:$d04>
/--* t22477 struct
[043750] ----GO------- * PUTARG_STK [+0x08] struct
What do you think of this case ?
@pgavlin @BruceForstall PTAL |
src/jit/gentree.h
Outdated
@@ -1333,7 +1333,11 @@ struct GenTree | |||
bool OperIsMultiRegOp() const | |||
{ | |||
#if !defined(LEGACY_BACKEND) && defined(_TARGET_ARM_) | |||
if (gtOper == GT_MUL_LONG || gtOper == GT_PUTARG_REG || gtOper == GT_COPY) | |||
if (gtOper == GT_MUL_LONG || | |||
#ifdef SOFT_FP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be #ifdef ARM_SOFTFP
, looks like a typo.
And I think GT_COPY
is MultiRegOp on armel only as well as GT_PUTARG_REG
since these two nodes come together for double register argument passing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching the typo! I believe that GT_COPY
should be enabled for both. At present, the register allocator doesn't insert copies of tree temps, but it may in future (e.g. to support conflicting register requirements, or to avoid a spill), in which case it will be needed if a GT_MUL_LONG
is copied.
@@ -7030,7 +7030,7 @@ GenTreePtr Compiler::gtNewPutArgReg(var_types type, GenTreePtr arg, regNumber ar | |||
assert(arg != nullptr); | |||
|
|||
GenTreePtr node = nullptr; | |||
#if !defined(LEGACY_BACKEND) && defined(_TARGET_ARM_) | |||
#if !defined(LEGACY_BACKEND) && defined(ARM_SOFTFP) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we change this way, I think we can remove the if statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, sounds good.
Here are the test results from my pi: 19 total failures 11 Runtime Failures:
2 Assertion failed 'info->dstCount == 0' (lsraarm.cpp Line: 473)
4 Assertion failed 'genActualType(op1->TypeGet()) == genActualType(op2->TypeGet()) || varTypeIsI(op1->TypeGet()) && varTypeIsI(op2->TypeGet()) || varTypeIsFloating(op1->gtType) && varTypeIsFloating(op2->gtType) : Possibly bad IL with CEE_ceq at offset 00FAh (op1=long op2=int stkDepth=0)' (importer.cpp Line: 11629):
2 Assertion failed '(op2 == idx) || (op2->gtEffectiveVal() == idx)' (gentree.cpp Line: 3846)
|
- Fix a couple of issues with TreeNodeInfoInit for struct arguments - Remove duplicate calls to LowerBlockStore from TreeNodeInfoInit for arm and arm64 - Eliminate duplicative isMultiReg method from GenTree and fix condition for OperIsMultiRegOp()
54b23a0
to
388660b
Compare
@CarolEidt AFAIR the assertion from |
@wateret thanks! All the JIT.Methodical.ELEMENT_TYPE_IU tests pass with the x86 test build - I should have known that I would need to download tests with the same bitness. I'm now re-running the other failing tests. |
There are (unverifiable) tests that depend on the size of native int. I haven't verified that these tests fall into that category, but it would explain that assertion, which is complaining about type mismatches. |
src/jit/lsraarmarch.cpp
Outdated
assert(curArgTabEntry); | ||
|
||
assert(curArgTabEntry->regNum == REG_STK); | ||
INDEBUG(assert(curArgTabEntry->regNum == REG_STK)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do these asserts need to be surrounded by INDEBUG
? assert
is always DEBUG only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doh! Thanks, I'll remove the INDEBUG
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with one nit
Here's an update after using the Windows/x86 test build instead of the x64 one (though I only re-ran the tests that had failed before, assuming there are no new failures): 12 total failures 8 Runtime Failures:
2 Assertion failed 'info->dstCount == 0' (lsraarm.cpp Line: 473)
2 Assertion failed '(op2 == idx) || (op2->gtEffectiveVal() == idx)' (gentree.cpp Line: 3846)
|
I don't see any issue for the |
@dotnet-bot test Tizen armel Cross Debug Build |
#13603
I opened #13604 |