Add repro case for jit stack overflow failure #23346

AndyAyersMS · 2019-03-19T18:22:29Z

Repro case for #18582 and #23309.

AndyAyersMS · 2019-03-19T18:22:59Z

@sandreenko PTAL
cc @dotnet/jit-contrib

Will propose merging this to 2.2 along with the fix, to address #23309.

sandreenko

LGTM

sandreenko · 2019-03-19T18:37:56Z

tests/src/JIT/Regression/JitBlue/GitHub_18582/GitHub_18582.cs

+    {
+        int z = s_x;
+        Consume(
+            q(), q(), q(), q(), q(), q(), q(), q(), q(), q(),


Are these q() calls treated as late arguments? Is it because they are calls?

The first 839 calls to q() are spilled to temps by the importer. But that last call to q() is not spilled to a temp, and this causes all the previous args to become late args.

    [005082] ------------ * STMT void (IL ???... ???)
    [004231] --C-G------- \--* CALL void P.Consume
    [000010] ------------ arg0 +--* LCL_VAR int V02 tmp1
    [000015] ------------ arg1 +--* LCL_VAR int V03 tmp2
    [000020] ------------ arg2 +--* LCL_VAR int V04 tmp3
    ...
    [004200] ------------ arg838 +--* LCL_VAR int V840 tmp839
    [004196] --C-G------- arg839 +--* CALL int P.q
    [004202] ------------ | /--* CNS_INT int 1
    [004203] ------------ arg840 +--* ADD int
    [004201] ------------ | \--* LCL_VAR int V00 loc0
    [004205] ------------ | /--* CNS_INT int 1
    ...
    Upping fgPtrArgCntMax from 835 to 846
    argSlots=850, preallocatedArgCount=850, nextSlotNum=850, outgoingArgSpaceSize=6800

    Sorting the arguments:
    Argument with 'side effect'...
    [004196] --CXG+------ * CALL int P.q
    lvaGrabTemp returning 841 (V841 tmp840) called for argument with side effect.
    Evaluate to a temp:
    [004196] --CXG+------ /--* CALL int P.q
    [005086] -ACXG-----L- * ASG int
    [005085] D------N---- \--* LCL_VAR int V841 tmp840
    ...
    Shuffled argument table: rdx r8 r9 rcx
    fgArgTabEntry[arg 839 5087.LCL_VAR, numSlots=1, slotNum=839, align=1, lateArgInx=0, tmpNum=V841, isTmp, processed]
    fgArgTabEntry[arg 841 4206.ADD, numSlots=1, slotNum=841, align=1, processed]
    fgArgTabEntry[arg 842 4209.ADD, numSlots=1, slotNum=842, align=1, processed]
    fgArgTabEntry[arg 843 4212.ADD, numSlots=1, slotNum=843, align=1, processed]
    fgArgTabEntry[arg 844 4215.ADD, numSlots=1, slotNum=844, align=1, processed]
    fgArgTabEntry[arg 845 4218.ADD, numSlots=1, slotNum=845, align=1, processed]
    fgArgTabEntry[arg 846 4221.ADD, numSlots=1, slotNum=846, align=1, processed]
    fgArgTabEntry[arg 847 4224.ADD, numSlots=1, slotNum=847, align=1, processed]
    fgArgTabEntry[arg 848 4227.ADD, numSlots=1, slotNum=848, align=1, processed]
    fgArgTabEntry[arg 849 4230.ADD, numSlots=1, slotNum=849, align=1, processed]
    fgArgTabEntry[arg 840 4203.ADD, numSlots=1, slotNum=840, align=1, processed]
    fgArgTabEntry[arg 1 15.LCL_VAR, rdx, regs=1, align=1, lateArgInx=1, processed]
    fgArgTabEntry[arg 2 20.LCL_VAR, r8, regs=1, align=1, lateArgInx=2, processed]
    fgArgTabEntry[arg 3 25.LCL_VAR, r9, regs=1, align=1, lateArgInx=3, processed]
    fgArgTabEntry[arg 4 30.LCL_VAR, numSlots=1, slotNum=4, align=1, lateArgInx=4, needPlace, processed]
    fgArgTabEntry[arg 5 35.LCL_VAR, numSlots=1, slotNum=5, align=1, lateArgInx=5, needPlace, processed]
    fgArgTabEntry[arg 6 40.LCL_VAR, numSlots=1, slotNum=6, align=1, lateArgInx=6, needPlace, processed]
    ...
    fgArgTabEntry[arg 837 4195.LCL_VAR, numSlots=1, slotNum=837, align=1, lateArgInx=837, needPlace, processed]
    fgArgTabEntry[arg 838 4200.LCL_VAR, numSlots=1, slotNum=838, align=1, lateArgInx=838, needPlace, processed]
    fgArgTabEntry[arg 0 10.LCL_VAR, rcx, regs=1, align=1, lateArgInx=839, processed]

> But that last call to q() is not spilled to a temp

Why isn't it spilled? That is what I could not achieve when I tried to create a repro test for the original PR.

When we process a call we spill the stack unless the call returns void or is one of a few special methods. So we spill before the last q() but not before Consume(...).

Got it, thank you.

mikedn · 2019-03-19T18:57:09Z

I checked what loop hoisting does with this example. It does indeed go into deep recursion, but there's no stack overflow because the frame size is lower (~200 bytes in a checked build). Anyway, I happen to have a PR with a loop hoisting fix, and that also takes care of the recursion.

sandreenko · 2019-03-21T23:00:08Z

This test failed on OSX with exit code 134. The log should be available here, but I do not trust Azure links.

If the link doesn't work you can see the results from scheduled stress runs and then click "Test for PollingCounter (#23257)" -> "Test Pri1 OSX x64 checked Job" -> Result will be available ...

OSX probably has a smaller stack size and hits the overflow in loop hoisting that @mikedn described.
@AndyAyersMS could you please take a look?

AndyAyersMS · 2019-03-21T23:15:00Z

Hmm, the stack size is specified by the test. Let me dig in.

sandreenko · 2019-03-21T23:36:20Z

> Hmm, the stack size is specified by the test. Let me dig in.

I mean that the default stack size limit on OSX is 512 KB, whereas on Ubuntu it is 2 MB and on Windows 10 it is 1 MB, if I remember correctly.

AndyAyersMS · 2019-03-21T23:40:17Z

        Thread t = new Thread(Test, 512 * 1024);

should be asking for a 512KB stack segment, regardless of OS defaults.

sandreenko · 2019-03-21T23:49:50Z

I see, I have missed that. Thanks.

mikedn · 2019-03-22T03:20:31Z

Can't be loop hoisting because the actual test doesn't have a loop. I actually modified the test code to check what happens if loop hoisting hits such a tree.

AndyAyersMS · 2019-03-25T19:25:54Z

The issue on OSX is that `Test` gets selected for cloning stress, and `gtCloneExpr` does the same sort of recursive traversal for `GT_LIST`.

Will mark this test as jit optimization sensitive.
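
To illustrate why a recursive traversal of a long argument list is sensitive to stack size, here is a minimal C++ sketch. It is not the JIT's code: `ListNode`, `Arena`, `MakeList`, `CloneRecursive`, `CloneIterative`, and `SameValues` are all hypothetical names invented for this example, and the node type only models the "value plus rest-of-list" shape of a `GT_LIST` chain. The recursive clone needs one native frame per node, while the iterative clone runs at constant stack depth.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for one argument-list node. This is NOT the JIT's
// real GT_LIST type, only a minimal model of its shape.
struct ListNode {
    int value;
    ListNode* rest;  // next node, or nullptr at the end of the list
};

// std::deque keeps element addresses stable across push_back,
// so it can serve as a simple node arena.
using Arena = std::deque<ListNode>;

// Builds a list holding 0, 1, ..., count-1 without recursion.
ListNode* MakeList(Arena& arena, std::size_t count) {
    ListNode* head = nullptr;
    for (std::size_t i = count; i > 0; --i) {
        arena.push_back(ListNode{static_cast<int>(i - 1), head});
        head = &arena.back();
    }
    return head;
}

// Recursive clone: uses one native stack frame per list node and records
// the deepest call level reached in maxDepth. Pass depth = 1 at the root.
ListNode* CloneRecursive(Arena& arena, const ListNode* node,
                         std::size_t depth, std::size_t& maxDepth) {
    assert(depth >= 1);
    if (node == nullptr) return nullptr;
    if (depth > maxDepth) maxDepth = depth;
    ListNode* restCopy = CloneRecursive(arena, node->rest, depth + 1, maxDepth);
    arena.push_back(ListNode{node->value, restCopy});
    return &arena.back();
}

// Iterative clone: walks the list in a loop, so stack depth stays constant
// no matter how long the list is.
ListNode* CloneIterative(Arena& arena, const ListNode* node) {
    ListNode* head = nullptr;
    ListNode** link = &head;  // where to store the next copied node
    for (; node != nullptr; node = node->rest) {
        arena.push_back(ListNode{node->value, nullptr});
        *link = &arena.back();
        link = &arena.back().rest;
    }
    return head;
}

// Returns true when both lists have the same length and values.
bool SameValues(const ListNode* a, const ListNode* b) {
    while (a != nullptr && b != nullptr) {
        if (a->value != b->value) return false;
        a = a->rest;
        b = b->rest;
    }
    return a == nullptr && b == nullptr;
}
```

Calling `CloneRecursive(arena, MakeList(arena, 850), 1, maxDepth)` leaves `maxDepth` at 850, one level per node; the figure 850 is borrowed from `argSlots=850` in the dump above purely for illustration. Whether that depth overflows depends on the per-frame size and the thread's stack limit, which is why the same test can pass in one configuration and fail under a stress mode that adds another recursive walk.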

Jitstress on OSX will turn on cloning stress in `Test` and cause stack overflows cloning the large `GT_LIST` subtree of the call. See issue dotnet/coreclr#23346. Commit migrated from dotnet/coreclr@ca2eec2

Add repro case for jit stack overflow failure

ad47757

Repro case for dotnet#18582 and #23309.

sandreenko approved these changes Mar 19, 2019

sandreenko reviewed Mar 19, 2019

return -1 on failure

71b2ae5

AndyAyersMS merged commit 69ec646 into dotnet:master Mar 20, 2019

AndyAyersMS deleted the ReproFor18582 branch March 20, 2019 16:17

sandreenko pushed a commit to sandreenko/coreclr that referenced this pull request Mar 20, 2019

Add repro case for jit stack overflow failure (dotnet#23346)

916f553

Repro case for dotnet#18582 and #23309.

sandreenko mentioned this pull request Mar 20, 2019

JIT: optimize fgMorphTree for GenTreeArgList #23368

Merged

AndyAyersMS mentioned this pull request Mar 25, 2019

Mark test GitHub_18582 as optimization sensitive #23434

Merged

AndyAyersMS added a commit that referenced this pull request Mar 25, 2019

Mark test GitHub_18582 as optimization sensitive (#23434)

ca2eec2

Jitstress on OSX will turn on cloning stress in `Test` and cause stack overflows cloning the large `GT_LIST` subtree of the call. See issue #23346.

picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022

Add repro case for jit stack overflow failure (dotnet/coreclr#23346)

fc45cba

Repro case for dotnet/coreclr#18582 and dotnet/coreclr#23309. Commit migrated from dotnet/coreclr@69ec646
