Skip to content

Commit

Permalink
Fix versioning test against final index value for BNDCHK
Browse files Browse the repository at this point in the history
Previously this versioning test would do either a strict or non-strict
comparison depending on a few factors:
- isLoopDrivingInductionVariable,
- isIndexChildMultiplied, and
- the opcode of _loopTestTree (without consulting reverseBranch).

To deal with all of this variation there were multiple ad hoc
adjustments of +1 or -1 at different points in the determination of
maxValue (which is really supposed to be the last value).

This state of affairs led to bugs, notably when using a non-strict loop
test together with an index expression that contains a multiplication,
but also in other scenarios that I have not precisely characterized.

Now this versioning test is always (ifiucmpge maxValue arrayLength) for
increasing index expressions and (ificmplt maxValue 0) for decreasing
ones, with no regard to isLoopDrivingInductionVariable, etc., and the
corresponding ad hoc adjustments are eliminated from maxValue. Rather
than always take the ceiling and try to compensate for non-strict loop
tests after the fact, we now use the floor in the case of non-strict
loop tests and (ceiling - 1) otherwise, as described in the comment in
the extremum versioning path in buildConditionalTree(). (Note though
that instead of subtracting 1 before multiplying by incrJNode, this
change subtracts additiveConstantInStore afterward, which has the same
effect.)

Additionally, if the index expression was based on a derived IV and
involved a substitution via isDependentOnInductionVariable(), then this
versioning test would load the temporary that's defined within the loop
instead of the initial value of the IV. The temp might be uninitialized,
resulting in a versioning test that can non-deterministically pass even
if the bound check should fail. It's not much better either if the temp
happens to be initialized, since its value won't have come from the
definition within the loop and is most likely entirely unrelated to the
initial value of the IV. This commit changes the versioning test to load
the derived IV instead of the temporary.

Finally, it was also possible for a later versioning test to load an
uninitialized temporary when trying to rule out overflow in a
multiplication that appears in the index expression. To prevent this,
versioning is now disallowed if it would be necessary to substitute
beneath a multiplication.

With these changes the versioning test is still not precise, which I
believe is due to a failure to account properly for different possible
orderings of operations in the loop and possibly also to account for the
difference between the initial value of a derived IV and the value of a
temp based on it in the first iteration (e.g. tmp=iv+1).

However, it appears to be conservative. I have checked a large number of
generated test cases with different combinations of features:
- Do-while loops vs while loops (reverseBranch)
- Strict vs. non-strict loop conditions
- Increasing primary IV (with <, <= comparisons) vs. decreasing (>, >=)
- Accessing every element vs. every other element
- Getting every other element by stepping by 2 vs. multiplying by 2
- Even vs. odd array length (when accessing every other element)
- Accessing even vs. odd indices (when accessing every other element)
- Accessing elements in increasing vs. decreasing order of index
- Indexing based on primary vs. derived IV
- Stepping the primary IV by -3, -1, +1, or +3 (if using a derived IV)
- Adding an offset in the index expression vs. not adding one
- Adding an offset before multiplying vs. afterward (if multiplying)
- Using a temp (isDependentOnInductionVariable()) vs. no temp
- Setting the temp to the IV or to the index (if using a temp)
- Different initial values for the IV(s), as consistent with the above
- Every usable permutation of the statements in the loop body

In every case I had a script determine the widest loop bound that avoids
failing a bound check. I ran each test case three times:
- once with the precise bounds, which should not fail a bound check;
- once with the initial value modified to cause a bound check to fail on
  the first iteration; and
- once with the loop bound modified to do an extra iteration at the end
  and fail a bound check in the last iteration.

With this commit, all such test executions behaved as expected. In some
cases the bound check was not versioned or execution went unnecessarily
to the cold loop, but in those cases one of the following always held
before this commit:
- we already did the same thing, or
- the test passed all bound checks when it should have failed one, or
- there was a load of an uninitialized variable in the versioning test.

Most loops in real programs compute the index based on the primary IV,
which appears directly in the index expression, i.e. array[...i...]. For
these loops as they occur in the test cases described above, the only
effect of this change is to correctly run bound checks when one should
fail and to sometimes avoid going unnecessarily to the cold loop.
  • Loading branch information
jdmpapin committed Nov 29, 2023
1 parent edcb5c6 commit 2c2cc01
Showing 1 changed file with 36 additions and 123 deletions.
159 changes: 36 additions & 123 deletions compiler/optimizer/LoopVersioner.cpp
Expand Up @@ -2159,6 +2159,7 @@ bool TR_LoopVersioner::detectInvariantBoundChecks(List<TR::TreeTop> *boundCheckT
_loopConditionInvariant &&
(node->getOpCodeValue() == TR::BNDCHK || node->getOpCodeValue() == TR::BNDCHKwithSpineCHK))
{
bool isIndexChildMultiplied = false;
if (indexChild->getOpCode().hasSymbolReference())
indexSymRef = indexChild->getSymbolReference();
else
Expand All @@ -2167,6 +2168,9 @@ bool TR_LoopVersioner::detectInvariantBoundChecks(List<TR::TreeTop> *boundCheckT
while (indexChild->getOpCode().isAdd() || indexChild->getOpCode().isSub() || indexChild->getOpCode().isAnd() ||
indexChild->getOpCode().isMul())
{
if (indexChild->getOpCode().isMul())
isIndexChildMultiplied = true;

if (indexChild->getSecondChild()->getOpCode().isLoadConst())
indexChild = indexChild->getFirstChild();
else
Expand Down Expand Up @@ -2247,7 +2251,16 @@ bool TR_LoopVersioner::detectInvariantBoundChecks(List<TR::TreeTop> *boundCheckT

if (!isInductionVariable)
{
bool isIndexChildMultiplied=false;
if (isIndexChildMultiplied)
{
// Don't look through temps beneath a multiplication. The
// versioning test to rule out overflow in multiplication
// would generate a load of the temporary, which before
// entering the loop has a value unrelated to its value in
// the index expression and might even be uninitialized.
break;
}

TR::Node *mulNode=NULL;
TR::Node *strideNode=NULL;
bool indVarOccursAsSecondChildOfSub=false;
Expand Down Expand Up @@ -7460,81 +7473,21 @@ void TR_LoopVersioner::buildBoundCheckComparisonsTree(
TR::Node *incrJNode = TR::Node::create(boundCheckNode, TR::iconst, 0, incrementJ);
TR::Node *zeroNode = TR::Node::create(boundCheckNode, TR::iconst, 0, 0);
numIterations = TR::Node::create(TR::idiv, 2, range, incrNode);
TR::Node *remNode = TR::Node::create(TR::irem, 2, range, incrNode);
TR::Node *ceilingNode = TR::Node::create(TR::icmpne, 2, remNode, zeroNode);
numIterations = TR::Node::create(TR::iadd, 2, numIterations, ceilingNode);

incrementJ = additiveConstantInStore;
int32_t condValue = 0;
bool adjustForDerivedIV = false;
switch (_loopTestTree->getNode()->getOpCodeValue())
{
case TR::ificmpge:
if (reverseBranch) // actually <
{
if (isLoopDrivingInductionVariable)
incrementJ = incrementJ - 1;
else //isDerivedInductionVariable
adjustForDerivedIV = true;
}
else
condValue = 1;
break;
case TR::ificmple:
if (reverseBranch) // actually >
{
if (isLoopDrivingInductionVariable)
incrementJ = incrementJ + 1;
else //isDerivedInductionVariable
adjustForDerivedIV = true;
}
else
condValue = 1;
break;
case TR::ificmplt:
if (reverseBranch) // actually >=
condValue = 1;
else
{
if (isLoopDrivingInductionVariable)
incrementJ = incrementJ - 1;
else //isDerivedInductionVariable
adjustForDerivedIV = true;
}
break;
case TR::ificmpgt:
if (reverseBranch) // actually <=
condValue = 1;
else
{
if (isLoopDrivingInductionVariable)
incrementJ = incrementJ + 1;
else //isDerivedInductionVariable
adjustForDerivedIV = true;
}
break;
default:
break;
}

if (adjustForDerivedIV)
TR::ILOpCode stayInLoopOp = _loopTestTree->getNode()->getOpCode();
if (reverseBranch)
stayInLoopOp = stayInLoopOp.getOpCodeForReverseBranch();

bool strict = !stayInLoopOp.isCompareTrueIfEqual();
if (strict)
{
if (isAddition)
incrementJ = incrementJ - 1;
else
incrementJ = incrementJ + 1;
TR::Node *remNode = TR::Node::create(TR::irem, 2, range, incrNode);
TR::Node *ceilingNode = TR::Node::create(TR::icmpne, 2, remNode, zeroNode);
numIterations = TR::Node::create(TR::iadd, 2, numIterations, ceilingNode);
}

TR::Node *extraIter = TR::Node::create(TR::icmpeq, 2, remNode, zeroNode);
extraIter = TR::Node::create(TR::iand, 2, extraIter,
TR::Node::create(boundCheckNode, TR::iconst, 0, condValue));
numIterations = TR::Node::create(TR::iadd, 2, numIterations, extraIter);
numIterations = TR::Node::create(TR::imul, 2, numIterations, incrJNode);
TR::Node *adjustMaxValue;
if(isIndexChildMultiplied)
adjustMaxValue = TR::Node::create(boundCheckNode, TR::iconst, 0, incrementJ+1);
else
adjustMaxValue = TR::Node::create(boundCheckNode, TR::iconst, 0, incrementJ);

TR::Node *maxValue = NULL;
TR::ILOpCodes addOp = TR::isub;
if (isLoopDrivingInductionVariable)
Expand All @@ -7551,11 +7504,16 @@ void TR_LoopVersioner::buildBoundCheckComparisonsTree(
if (isLoopDrivingInductionVariable)
entryNode = TR::Node::createLoad(boundCheckNode, loopDrivingSymRef);
else
entryNode = TR::Node::createLoad(boundCheckNode, origIndexSymRef);
entryNode = TR::Node::createLoad(boundCheckNode, indexSymRef);

maxValue = TR::Node::create(addOp, 2, entryNode, numIterations);

maxValue = TR::Node::create(TR::isub, 2, maxValue, adjustMaxValue);
if (strict)
{
TR::Node *adjustMaxValue = TR::Node::create(boundCheckNode, TR::iconst, 0, additiveConstantInStore);
maxValue = TR::Node::create(TR::isub, 2, maxValue, adjustMaxValue);
}

loopLimit = maxValue;
dumpOptDetails(comp(), "loopLimit has been adjusted to %p\n", loopLimit);
}
Expand Down Expand Up @@ -7605,34 +7563,12 @@ void TR_LoopVersioner::buildBoundCheckComparisonsTree(
{
if (!indVarOccursAsSecondChildOfSub)
{
if (isLoopDrivingInductionVariable)
{
if ((_loopTestTree->getNode()->getOpCodeValue() == TR::ificmple) ||
(_loopTestTree->getNode()->getOpCodeValue() == TR::ificmpgt))
nextComparisonNode = TR::Node::createif(TR::ifiucmpge, correctCheckNode, arrayLengthNode->duplicateTree(), _exitGotoTarget);
}

if (!nextComparisonNode)
{
if(isIndexChildMultiplied)
nextComparisonNode = TR::Node::createif(TR::ifiucmpge, correctCheckNode, arrayLengthNode->duplicateTree(), _exitGotoTarget);
else
nextComparisonNode = TR::Node::createif(TR::ifiucmpgt, correctCheckNode, arrayLengthNode->duplicateTree(), _exitGotoTarget);
}
nextComparisonNode->setIsVersionableIfWithMaxExpr(comp());
nextComparisonNode = TR::Node::createif(TR::ifiucmpge, correctCheckNode, arrayLengthNode->duplicateTree(), _exitGotoTarget);
nextComparisonNode->setIsVersionableIfWithMaxExpr(comp());
}
else
{
if (isLoopDrivingInductionVariable)
{
if ((_loopTestTree->getNode()->getOpCodeValue() == TR::ificmplt) ||
(_loopTestTree->getNode()->getOpCodeValue() == TR::ificmpge))
nextComparisonNode = TR::Node::createif(TR::ificmple, correctCheckNode, TR::Node::create(boundCheckNode, TR::iconst, 0, -1), _exitGotoTarget);
}

if (!nextComparisonNode)
nextComparisonNode = TR::Node::createif(TR::ificmplt, correctCheckNode, TR::Node::create(boundCheckNode, TR::iconst, 0, -1), _exitGotoTarget);

nextComparisonNode = TR::Node::createif(TR::ificmplt, correctCheckNode, TR::Node::create(boundCheckNode, TR::iconst, 0, 0), _exitGotoTarget);
nextComparisonNode->setIsVersionableIfWithMinExpr(comp());
}

Expand All @@ -7643,35 +7579,12 @@ void TR_LoopVersioner::buildBoundCheckComparisonsTree(
{
if (!indVarOccursAsSecondChildOfSub)
{
if (isLoopDrivingInductionVariable)
{
if ((_loopTestTree->getNode()->getOpCodeValue() == TR::ificmplt) ||
(_loopTestTree->getNode()->getOpCodeValue() == TR::ificmpge))
nextComparisonNode = TR::Node::createif(TR::ificmple, correctCheckNode, TR::Node::create(boundCheckNode, TR::iconst, 0, -1), _exitGotoTarget);
}

if (!nextComparisonNode)
nextComparisonNode = TR::Node::createif(TR::ificmplt, correctCheckNode, TR::Node::create(boundCheckNode, TR::iconst, 0, -1), _exitGotoTarget);

nextComparisonNode = TR::Node::createif(TR::ificmplt, correctCheckNode, TR::Node::create(boundCheckNode, TR::iconst, 0, 0), _exitGotoTarget);
nextComparisonNode->setIsVersionableIfWithMinExpr(comp());
}
else
{
if (isLoopDrivingInductionVariable)
{
if ((_loopTestTree->getNode()->getOpCodeValue() == TR::ificmple) ||
(_loopTestTree->getNode()->getOpCodeValue() == TR::ificmpgt))
nextComparisonNode = TR::Node::createif(TR::ifiucmpge, correctCheckNode, arrayLengthNode->duplicateTree(), _exitGotoTarget);
}

if (!nextComparisonNode)
{
if(isIndexChildMultiplied)
nextComparisonNode = TR::Node::createif(TR::ifiucmpge, correctCheckNode, arrayLengthNode->duplicateTree(), _exitGotoTarget);
else
nextComparisonNode = TR::Node::createif(TR::ifiucmpgt, correctCheckNode, arrayLengthNode->duplicateTree(), _exitGotoTarget);
}

nextComparisonNode = TR::Node::createif(TR::ifiucmpge, correctCheckNode, arrayLengthNode->duplicateTree(), _exitGotoTarget);
nextComparisonNode->setIsVersionableIfWithMaxExpr(comp());
}

Expand Down

0 comments on commit 2c2cc01

Please sign in to comment.