C++: pragma[noinline] parameter index predicates #4637
A performance regression in `definitionByReferenceNodeFromArgument#ff` was ultimately caused by a join on parameter indexes in `DefinitionByReferenceNode.getArgument`. Joining on numbers in QL is always fragile, and somehow the changes in github#4432 had caused the join order here to break.

Instead of tweaking the join order in the slow predicate itself, I added `pragma[noinline]` to one of the predicates involved in the join on parameter indexes. This should prevent us from getting similar performance problems in the future when we write code that joins on parameter numbers. Joining on indexes is always risky, but it's even more risky when one of the predicates in the join is inlined by the compiler and expands to further joins.

I tested performance by running `CgiXss.ql` on a ChakraCore snapshot. Tuple counts before (I interrupted execution after five minutes or so):

    (626s) Tuple counts for DataFlowUtil::definitionByReferenceNodeFromArgument#ff:
    58162      ~0% {3} r1 = SCAN DataFlowUtil::DefinitionByReferenceNode#class#ff AS I OUTPUT I.<1>, -1, I.<0>
    26934      ~0% {2} r2 = JOIN r1 WITH Instruction::IndexedInstruction#ff AS R ON FIRST 2 OUTPUT r1.<0>, r1.<2>
    26934      ~1% {2} r3 = JOIN r2 WITH Instruction::SideEffectInstruction::getPrimaryInstruction_dispred#3#ff AS R ON FIRST 1 OUTPUT R.<1>, r2.<1>
    26850      ~1% {2} r4 = JOIN r3 WITH Instruction::CallInstruction::getThisArgumentOperand_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r3.<1>
    26850      ~0% {2} r5 = JOIN r4 WITH Operand::Operand::getDef_dispred#3#ff AS R ON FIRST 1 OUTPUT R.<1>, r4.<1>
    26850      ~1% {2} r6 = JOIN r5 WITH Instruction::Instruction::getUnconvertedResultExpression_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r5.<1>
    58162      ~0% {2} r7 = SCAN DataFlowUtil::DefinitionByReferenceNode#class#ff AS I OUTPUT I.<1>, I.<0>
    58162      ~4% {3} r8 = JOIN r7 WITH Instruction::IndexedInstruction#ff AS R ON FIRST 1 OUTPUT R.<1>, r7.<1>, r7.<0>
    4026581120 ~0% {4} r9 = JOIN r8 WITH Instruction::CallInstruction::getPositionalArgumentOperand_dispred#fff_102#join_rhs AS R ON FIRST 1 OUTPUT r8.<2>, R.<1>, r8.<1>, R.<2>
    31154      ~4% {2} r10 = JOIN r9 WITH Instruction::SideEffectInstruction::getPrimaryInstruction_dispred#3#ff AS R ON FIRST 2 OUTPUT r9.<3>, r9.<2>
    31154      ~8% {2} r11 = JOIN r10 WITH Operand::Operand::getDef_dispred#3#ff AS R ON FIRST 1 OUTPUT R.<1>, r10.<1>
    31154      ~0% {2} r12 = JOIN r11 WITH Instruction::Instruction::getUnconvertedResultExpression_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r11.<1>
    58004      ~0% {2} r13 = r6 \/ r12
                       return r13

Tuple counts after:

    (0s) Tuple counts for DataFlowUtil::definitionByReferenceNodeFromArgument#ff:
    385785 ~6% {2} r1 = SCAN DataFlowUtil::DefinitionByReferenceNode#class#ff AS I OUTPUT I.<1>, I.<0>
    385785 ~0% {3} r2 = JOIN r1 WITH Instruction::IndexedInstruction#ff AS R ON FIRST 1 OUTPUT r1.<0>, r1.<1>, R.<1>
    385785 ~1% {3} r3 = JOIN r2 WITH Instruction::SideEffectInstruction::getPrimaryInstruction_dispred#3#ff AS R ON FIRST 1 OUTPUT R.<1>, r2.<2>, r2.<1>
    198736 ~4% {2} r4 = JOIN r3 WITH Instruction::CallInstruction::getPositionalArgument#fff AS R ON FIRST 2 OUTPUT R.<2>, r3.<2>
    198736 ~0% {2} r5 = JOIN r4 WITH Instruction::Instruction::getUnconvertedResultExpression_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r4.<1>
    385785 ~1% {3} r6 = SCAN DataFlowUtil::DefinitionByReferenceNode#class#ff AS I OUTPUT I.<1>, -1, I.<0>
    186891 ~1% {2} r7 = JOIN r6 WITH Instruction::IndexedInstruction#ff AS R ON FIRST 2 OUTPUT r6.<0>, r6.<2>
    186891 ~2% {2} r8 = JOIN r7 WITH Instruction::SideEffectInstruction::getPrimaryInstruction_dispred#3#ff AS R ON FIRST 1 OUTPUT R.<1>, r7.<1>
    183201 ~3% {2} r9 = JOIN r8 WITH Instruction::CallInstruction::getThisArgumentOperand_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r8.<1>
    183201 ~0% {2} r10 = JOIN r9 WITH Operand::Operand::getDef_dispred#3#ff AS R ON FIRST 1 OUTPUT R.<1>, r9.<1>
    175449 ~8% {2} r11 = JOIN r10 WITH Instruction::Instruction::getUnconvertedResultExpression_dispred#ff AS R ON FIRST 1 OUTPUT R.<1>, r10.<1>
    374185 ~3% {2} r12 = r5 \/ r11
                   return r12
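To make the difference between the two plans concrete, here is a small Python sketch of the effect. It is only an analogy, not the QL evaluator: the relations, their column layout, and the function names (`make_data`, `join_on_index_first`, `join_on_call_and_index`) are all hypothetical. It assumes that the slow plan joined on the index column alone and only filtered on the call afterwards (as the `ON FIRST 1` step `r9` followed by the `ON FIRST 2` step `r10` suggests), while the fast plan looks up on call and index together (as `r4` in the second listing appears to do).

```python
from collections import defaultdict


def make_data(n_calls, n_args):
    """Build two toy relations (hypothetical stand-ins for the real ones).

    side_effects: (node, call, index) tuples
    arguments:    (call, index, operand) tuples
    """
    side_effects = [(f"n{c}_{i}", c, i) for c in range(n_calls) for i in range(n_args)]
    arguments = [(c, i, f"arg{c}_{i}") for c in range(n_calls) for i in range(n_args)]
    return side_effects, arguments


def join_on_index_first(side_effects, arguments):
    """Join on the index alone, then filter on the call afterwards."""
    by_index = defaultdict(list)
    for call, index, operand in arguments:
        by_index[index].append((call, operand))
    intermediate = 0
    result = set()
    for node, call, index in side_effects:
        # Every argument of every call with the same index is visited here.
        for arg_call, operand in by_index[index]:
            intermediate += 1
            if arg_call == call:
                result.add((node, operand))
    return result, intermediate


def join_on_call_and_index(side_effects, arguments):
    """Look up on (call, index) together, so only matching tuples are visited."""
    by_key = defaultdict(list)
    for call, index, operand in arguments:
        by_key[(call, index)].append(operand)
    intermediate = 0
    result = set()
    for node, call, index in side_effects:
        for operand in by_key[(call, index)]:
            intermediate += 1
            result.add((node, operand))
    return result, intermediate


if __name__ == "__main__":
    side_effects, arguments = make_data(n_calls=200, n_args=3)
    slow_result, slow_tuples = join_on_index_first(side_effects, arguments)
    fast_result, fast_tuples = join_on_call_and_index(side_effects, arguments)
    print(slow_tuples, fast_tuples)  # 120000 600
```

Both functions return the same result set; only the number of intermediate tuples differs. Because few distinct index values are shared by many calls, the index-only join grows with the product of the two relation sizes, which is the shape of the blow-up visible at `r9` in the first listing.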
LGTM!
Merging now to avoid having a bad |
I've started https://jenkins.internal.semmle.com/job/Changes/job/CPP-Differences/1545/ to check if we're back to baseline performance. This job compares the last good CPP-Changes reference (5f1c91f04/92494441a) with the latest internal/external |
@@ -1585,6 +1586,7 @@ class CallInstruction extends Instruction {
  /**
   * Gets the argument at the specified index.
   */
  pragma[noinline]
Is it necessary to add the `pragma[noinline]` to this predicate? Is annotating the other predicate not enough to avoid the problem if this predicate is inlined?
I feel more confident claiming that the other predicate, `getPositionalArgumentOperand`, can do without `pragma[noinline]`. The slow version of `definitionByReferenceNodeFromArgument` has `getPositionalArgumentOperand` in its tuple counts, so that predicate seems to not have been inlined.
In any case, I don't think it was strictly necessary to put the pragma on both predicates. I did it anyway because I worried we'd otherwise have a similar performance problem the next time we try and do a slightly different thing with these predicates. Joining on parameter indexes is always risky, and I wanted to reduce that risk.