JIT throughput: noway_assert #7709

BruceForstall · 2017-03-23T15:56:06Z

The JIT has about 3300 noway_asserts. These are executed in non-DEBUG (i.e., RELEASE) builds. Some might be frequently executed and thus costly. Instead of auditing all of them for relevance (i.e., whether they sit in an optimization phase that can be backed out of) or apparent importance, we could (conditionally) change the noway_assert macro to collect a count of which ones are frequently executed, using a hash table from the preprocessor's __FILE__ and __LINE__ to execution count, dumped at the end of compilation. Then we could convert the worst ones to simple asserts.
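A minimal sketch of what such a counting macro could look like, assuming a standalone C++ program rather than the real JIT. The names COUNTED_NOWAY_ASSERT, recordAssertHit, assertCounts, nowayAssertFailed, and dumpAssertCounts are hypothetical and are not the JIT's actual macro or helpers; a real implementation would also need to consider thread safety and the JIT's own allocators.

```cpp
#include <cstdint>
#include <cstdio>
#include <stdexcept>
#include <string>
#include <unordered_map>

// One entry per assertion site (hypothetical layout).
struct AssertSite
{
    std::string file;
    int         line = 0;
    std::string text;     // stringified condition
    uint64_t    hits = 0; // number of times the check was executed
};

// Hash table keyed by "file:line"; not thread-safe in this sketch.
inline std::unordered_map<std::string, AssertSite>& assertCounts()
{
    static std::unordered_map<std::string, AssertSite> table;
    return table;
}

inline void recordAssertHit(const char* file, int line, const char* text)
{
    AssertSite& site = assertCounts()[std::string(file) + ":" + std::to_string(line)];
    site.file = file;
    site.line = line;
    site.text = text;
    site.hits++;
}

// Stand-in for the real failure path; here it simply throws.
inline void nowayAssertFailed(const char* file, int line)
{
    throw std::runtime_error(std::string("noway_assert failed at ") + file + ":" + std::to_string(line));
}

// Counts every execution of the check, then performs the check itself.
#define COUNTED_NOWAY_ASSERT(cond)                  \
    do                                              \
    {                                               \
        recordAssertHit(__FILE__, __LINE__, #cond); \
        if (!(cond))                                \
        {                                           \
            nowayAssertFailed(__FILE__, __LINE__);  \
        }                                           \
    } while (0)

// Dumps one "count, filename, line number, assertion text" row per site.
inline void dumpAssertCounts(FILE* out)
{
    for (const auto& entry : assertCounts())
    {
        const AssertSite& site = entry.second;
        fprintf(out, "%llu, %s, %d, %s\n", (unsigned long long)site.hits, site.file.c_str(), site.line,
                site.text.c_str());
    }
}
```

Calling dumpAssertCounts(stdout) at the end of a run would print the sites in unspecified order; sorting by count is left to a post-processing step.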

category:throughput
theme:throughput
skill-level:expert
cost:medium

BruceForstall · 2017-04-18T23:04:15Z

I implemented this. I ran "a lot" of the test tree, processed the data, and saw these as the top 75 dynamic occurrences of noway_asserts (some might actually get optimized away in release builds, as there are a few static constant asserts here). The columns are: total count, filename, line number, assertion text.

181470679, e:\gh\coreclr2\src\jit\morph.cpp, 14648, tree
180436108, e:\gh\coreclr2\src\jit\morph.cpp, 14649, tree->gtOper != GT_STMT
83096677, e:\gh\coreclr2\src\jit\assertionprop.cpp, 808, assertIndex <= optAssertionCount
83096677, e:\gh\coreclr2\src\jit\assertionprop.cpp, 807, assertIndex != NO_ASSERTION_INDEX
72024047, e:\gh\coreclr2\src\jit\liveness.cpp, 1632, lclNum < lvaCount
43294958, e:\gh\coreclr2\src\jit\lsra.cpp, 3003, count < MaxInternalRegisters
40851620, e:\gh\coreclr2\src\jit\lclvars.cpp, 2510, varNum < lvaCount
35418942, e:\gh\coreclr2\src\jit\morph.cpp, 10632, tree->OperKind() & GTK_SMPOP
33982229, e:\gh\coreclr2\src\jit\flowgraph.cpp, 5228, opcode < CEE_COUNT
32541990, e:\gh\coreclr2\src\jit\morph.cpp, 11463, tree->gtOper != GT_CALL
32533197, e:\gh\coreclr2\src\jit\morph.cpp, 13347, oper == tree->gtOper
30050847, e:\gh\coreclr2\src\jit\flowgraph.cpp, 16328, !(block->bbFlags & BBF_REMOVED)
27085069, e:\gh\coreclr2\src\jit\morph.cpp, 15542, stmt->gtOper == GT_STMT
23724178, e:\gh\coreclr2\src\jit\flowgraph.cpp, 18490, tree->gtOper == GT_STMT
23362852, e:\gh\coreclr2\src\jit\liveness.cpp, 1642, varIndex < lvaTrackedCount
23102961, e:\gh\coreclr2\src\jit\emit.h, 1604, (UNATIVE_OFFSET)distance == distance
22587177, e:\gh\coreclr2\src\jit\morph.cpp, 8401, tree->OperKind() & GTK_LEAF
21688207, e:\gh\coreclr2\src\jit\morph.cpp, 5984, tree->gtOper == GT_LCL_VAR
19590233, e:\gh\coreclr2\src\jit\lclvars.cpp, 3599, (tree->gtOper == GT_LCL_VAR) || (tree->gtOper == GT_LCL_FLD)
19478128, e:\gh\coreclr2\src\jit\lclvars.cpp, 3680, tiVerificationNeeded || varDsc->lvType == TYP_UNDEF || tree->gtType == TYP_UNKNOWN || allowStructs || genActualType(varDsc->TypeGet()) == genActualType(tree->gtType) || (tree->gtType == TYP_BYREF && varDsc->TypeGet() == TYP_I_IMPL) || (tree->gtType == TYP_I_IMPL && varDsc->TypeGet() == TYP_BYREF) || (tree->gtFlags & GTF_VAR_CAST) || varTypeIsFloating(varDsc->TypeGet()) && varTypeIsFloating(tree->gtType)
18224434, e:\gh\coreclr2\src\jit\morph.cpp, 6019, !(tree->gtFlags & GTF_VAR_DEF) || varAddr
15639847, e:\gh\coreclr2\src\jit\gentree.cpp, 7095, argInfo != nullptr
14993689, e:\gh\coreclr2\src\jit\morph.cpp, 8344, tree->OperKind() & GTK_CONST
13800099, e:\gh\coreclr2\src\jit\flowgraph.cpp, 18555, list.gtNext->gtPrev == &list
13209197, e:\gh\coreclr2\src\jit\morph.cpp, 14731, tree != nullptr
12710725, e:\gh\coreclr2\src\jit\emitxarch.cpp, 546, prefix >= 0x40 && prefix <= 0x4F
11966466, e:\gh\coreclr2\src\jit\morph.cpp, 10733, op1
11256849, e:\gh\coreclr2\src\jit\emitxarch.cpp, 1695, (int)offs < 0
10750793, e:\gh\coreclr2\src\jit\importer.cpp, 563, impTreeLast != nullptr
10503327, e:\gh\coreclr2\src\jit\flowgraph.cpp, 889, block
10444252, e:\gh\coreclr2\src\jit\codegenxarch.cpp, 1516, targetType != TYP_STRUCT
10283544, e:\gh\coreclr2\src\jit\flowgraph.cpp, 9626, block->bbNext == bNext
9808620, e:\gh\coreclr2\src\jit\flowgraph.cpp, 16280, !(block->bbFlags & BBF_TRY_BEG)
9808617, e:\gh\coreclr2\src\jit\flowgraph.cpp, 16279, !block->bbCatchTyp
9807880, e:\gh\coreclr2\src\jit\morph.cpp, 15677, fgPtrArgCntCur == 0
9490871, e:\gh\coreclr2\src\jit\valuenum.cpp, 747, attribs == CEA_None
9297809, e:\gh\coreclr2\src\jit\emitxarch.cpp, 3328, emitVerifyEncodable(ins, size, reg)
9049871, e:\gh\coreclr2\src\jit\liveness.cpp, 1879, VarSetOps::IsSubset(this, keepAliveVars, life)
9026196, e:\gh\coreclr2\src\jit\flowgraph.cpp, 890, blockPred
8640521, e:\gh\coreclr2\src\jit\emitxarch.cpp, 4861, emitVerifyEncodable(ins, size, ireg)
8540880, e:\gh\coreclr2\src\jit\codegencommon.cpp, 2423, rv1 || mul != 1
8540880, e:\gh\coreclr2\src\jit\codegencommon.cpp, 2425, FitsIn<INT32>(cns)
7375190, e:\gh\coreclr2\src\jit\morph.cpp, 15754, fgExpandInline == false
7352623, e:\gh\coreclr2\src\jit\emitxarch.cpp, 3735, emitVerifyEncodable(ins, size, reg1, reg2)
7322925, e:\gh\coreclr2\src\jit\morph.cpp, 10675, op1 == tree->gtOp.gtOp1
7282409, e:\gh\coreclr2\src\jit\morph.cpp, 8054, call->gtOper == GT_CALL
7028316, e:\gh\coreclr2\src\jit\lclvars.cpp, 3363, varDsc->lvRefCnt > 0
6854055, e:\gh\coreclr2\src\jit\flowgraph.cpp, 11017, (block->bbFlags & BBF_REMOVED) == 0
6777828, e:\gh\coreclr2\src\jit\inlinepolicy.cpp, 497, smOpcode < SM_COUNT
6777827, e:\gh\coreclr2\src\jit\inlinepolicy.cpp, 498, smOpcode != SM_PREFIX_N
6480615, e:\gh\coreclr2\src\jit\flowgraph.cpp, 17102, blk != nullptr
6264602, e:\gh\coreclr2\src\jit\morph.cpp, 13703, GT_SUB == GT_ADD + 1
6264602, e:\gh\coreclr2\src\jit\morph.cpp, 13704, GT_MUL == GT_ADD + 2
6264602, e:\gh\coreclr2\src\jit\morph.cpp, 13705, GT_DIV == GT_ADD + 3
6264602, e:\gh\coreclr2\src\jit\morph.cpp, 13706, GT_MOD == GT_ADD + 4
6264602, e:\gh\coreclr2\src\jit\morph.cpp, 13716, GT_RSZ == GT_ADD + 12
6264602, e:\gh\coreclr2\src\jit\morph.cpp, 13715, GT_RSH == GT_ADD + 11
6264602, e:\gh\coreclr2\src\jit\morph.cpp, 13714, GT_LSH == GT_ADD + 10
6264602, e:\gh\coreclr2\src\jit\morph.cpp, 13712, GT_AND == GT_ADD + 9
6264602, e:\gh\coreclr2\src\jit\morph.cpp, 13711, GT_XOR == GT_ADD + 8
6264602, e:\gh\coreclr2\src\jit\morph.cpp, 13707, GT_UDIV == GT_ADD + 5
6264602, e:\gh\coreclr2\src\jit\morph.cpp, 13708, GT_UMOD == GT_ADD + 6
6264602, e:\gh\coreclr2\src\jit\morph.cpp, 13710, GT_OR == GT_ADD + 7
6262994, e:\gh\coreclr2\src\jit\liveness.cpp, 3003, compCurBB == block
6062460, e:\gh\coreclr2\src\jit\flowgraph.cpp, 7343, tree->OperGet() == GT_ASG
6028687, e:\gh\coreclr2\src\jit\codegencommon.cpp, 11901, jitGetILoffs(offsx) <= compiler->info.compILCodeSize
5280889, e:\gh\coreclr2\src\jit\regalloc.cpp, 6747, varDsc->lvIsInReg() || varDsc->lvOnFrame || varDsc->lvRefCnt == 0
5280889, e:\gh\coreclr2\src\jit\regalloc.cpp, 6751, !varDsc->lvRegister || !varDsc->lvOnFrame
5280889, e:\gh\coreclr2\src\jit\lclvars.cpp, 4567, !varDsc->lvFramePointerBased || codeGen->doubleAlignOrFramePointerUsed()
5223565, e:\gh\coreclr2\src\jit\flowgraph.cpp, 1713, fgDomsComputed
4989227, e:\gh\coreclr2\src\jit\liveness.cpp, 1833, endNode || (startNode == compCurStmt->gtStmt.gtStmtExpr)
4989227, e:\gh\coreclr2\src\jit\liveness.cpp, 2945, nextStmt->gtOper == GT_STMT
4989227, e:\gh\coreclr2\src\jit\liveness.cpp, 2944, nextStmt
4989227, e:\gh\coreclr2\src\jit\liveness.cpp, 1832, compCurStmt->gtOper == GT_STMT
4798987, e:\gh\coreclr2\src\jit\assertionprop.cpp, 3931, !optLocalAssertionProp

BruceForstall · 2017-06-07T16:15:46Z

I measured the total impact of noway_assert using instruction counts over SuperPMI collections of the dotnet/coreclr testbed, by removing noway_assert from the release build. I saw a 1.03% overhead from noway_assert using normal optimization, and a 0.74% overhead from noway_assert using MinOpts.

BruceForstall · 2017-06-07T16:22:55Z

The top 11 noway_asserts here account for over 50% of the dynamic count, so I would expect converting these few to simple asserts to yield up to a 0.5% throughput improvement (roughly half of the 1.03% overhead measured above).
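For illustration, a "top N sites cover X% of the dynamic count" figure could be derived from the dumped counts roughly as follows. This is a sketch only: SiteCount and sitesNeededForShare are hypothetical names, and it assumes the input holds every counted site, not just the top 75 listed above.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One row of the dump: execution count plus a site label such as "morph.cpp:14648".
struct SiteCount
{
    uint64_t    count;
    std::string site;
};

// Returns how many of the hottest sites are needed to reach at least
// `fraction` (0.0 to 1.0) of the total dynamic count.
inline size_t sitesNeededForShare(std::vector<SiteCount> sites, double fraction)
{
    // Hottest sites first.
    std::sort(sites.begin(), sites.end(),
              [](const SiteCount& a, const SiteCount& b) { return a.count > b.count; });

    uint64_t total = 0;
    for (const SiteCount& s : sites)
    {
        total += s.count;
    }

    uint64_t running = 0;
    size_t   needed  = 0;
    for (const SiteCount& s : sites)
    {
        if ((double)running >= fraction * (double)total)
        {
            break; // target share already reached
        }
        running += s.count;
        needed++;
    }
    return needed;
}
```

Multiplying the covered share by the measured overhead then gives an upper bound on the expected saving, since a converted assert can at best remove its own cost.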

mikedn · 2017-06-07T16:35:03Z

What's the story of this noway_assert thing anyway? How was it decided where to use assert and where to use noway_assert?

BruceForstall · 2017-06-07T16:47:12Z

As I understand it, assert was converted to noway_assert automatically or semi-automatically. I actually don't know how they determined which were to be converted. Theoretically, we should not have any noway_assert unless a re-compilation with MinOpts would avoid repeating the condition (since that's the main benefit of noway_assert). But that's hard to tell sometimes. So we have to make a call about which ones are worth it, and which are too expensive (now, after the automated conversion was done).
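To make the trade-off concrete, here is a rough sketch of the behavior described above: a check that stays live in release builds and, on failure, triggers a retry with MinOpts. It is a simplified model, not the JIT's code; RecoverableJitError, SKETCH_NOWAY_ASSERT, compileMethod, and compileWithFallback are all hypothetical names, and the "optimizer bug" flag exists only to drive the example.

```cpp
#include <stdexcept>

// Hypothetical error type signalling that the compile can be retried.
struct RecoverableJitError : std::runtime_error
{
    using std::runtime_error::runtime_error;
};

// Unlike a plain assert, this check is still evaluated in release builds,
// which is where its throughput cost comes from.
#define SKETCH_NOWAY_ASSERT(cond)                                 \
    do                                                            \
    {                                                             \
        if (!(cond))                                              \
        {                                                         \
            throw RecoverableJitError("noway_assert: " #cond);    \
        }                                                         \
    } while (0)

enum class OptLevel { Full, MinOpts };
enum class CodeKind { Optimized, Unoptimized };

// Hypothetical compile step: the invariant can only break inside the
// optimization phase, which MinOpts skips entirely.
inline CodeKind compileMethod(OptLevel level, bool optimizerBugPresent)
{
    if (level == OptLevel::Full)
    {
        bool invariantHolds = !optimizerBugPresent;
        SKETCH_NOWAY_ASSERT(invariantHolds);
        return CodeKind::Optimized;
    }
    return CodeKind::Unoptimized;
}

// Try full optimization first; on a recoverable failure, retry with MinOpts.
inline CodeKind compileWithFallback(bool optimizerBugPresent)
{
    try
    {
        return compileMethod(OptLevel::Full, optimizerBugPresent);
    }
    catch (const RecoverableJitError&)
    {
        return compileMethod(OptLevel::MinOpts, optimizerBugPresent);
    }
}
```

In this model, a check placed in code that MinOpts also executes would simply fail again on the retry, which is why such a check gains nothing from being a noway_assert.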

With these few changes, I measured a JIT instruction count reduction of 0.37% of SuperPMI over the tests, and 0.17% for MinOpts. Related to #10421

TIHan · 2023-11-01T21:52:10Z

Considering this is for throughput, it would be interesting to investigate this more.

BruceForstall · 2025-02-25T23:18:55Z

Going to close this, as I don't expect we'll work on this anytime soon.

BruceForstall referenced this issue in BruceForstall/coreclr Jun 7, 2017

Convert some very common noway_assert to simple assert

dbf724c

msftgits transferred this issue from dotnet/coreclr Jan 31, 2020

msftgits added this to the Future milestone Jan 31, 2020

BruceForstall added the JitUntriaged label Oct 28, 2020

TIHan modified the milestones: Future, 9.0.0 Nov 1, 2023

TIHan removed the JitUntriaged label Nov 1, 2023

JulieLeeMSFT assigned BruceForstall Apr 9, 2024

JulieLeeMSFT modified the milestones: 9.0.0, 10.0.0 Jul 25, 2024

BruceForstall closed this as completed Feb 25, 2025
