Skip to content

C++: Fix localStepsToExpr performance #12477

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

MathiasVP
Copy link
Contributor

localStepFromNonExpr included a bunch of unnecessary nodes which led to a blowup when taking its transitive closure in localStepsToExpr:

Evaluated non-recursive predicate DataFlowUtil#47741e1f::ExprFlowCached::localStepsToExpr#3#fff@ac2ee4up in 12957ms (evaluation was halted due to CancellationException).
Evaluated relational algebra for predicate DataFlowUtil#47741e1f::ExprFlowCached::localStepsToExpr#3#fff@ac2ee4up with tuple counts:
 56000   ~4%    {3} r1 = SCAN DataFlowUtil#47741e1f::Node::asExpr#0#dispred#ff OUTPUT In.0, In.0, In.1
            
60965500   ~4%    {3} r2 = JOIN boundedFastTC:DataFlowUtil#47741e1f::ExprFlowCached::localStepFromNonExpr#2#ff_10#higher_order_body:project#DataFlowUtil#47741e1f::Node::asExpr#0#dispred#ff WITH DataFlowUtil#47741e1f::Node::asExpr#0#dispred#ff ON FIRST 1 OUTPUT Lhs.1, Lhs.0, Rhs.1
            
61021500   ~4%    {3} r3 = r1 UNION r2
61021000   ~4%    {3} r4 = r3 AND NOT _DataFlowUtil#47741e1f::Node::toString#0#dispred#ff#antijoin_rhs(Lhs.0)
                return r4

(I stopped evaluation midway)

After this PR we get:

Evaluated non-recursive predicate DataFlowUtil#47741e1f::ExprFlowCached::localStepsToExpr#3#fff@3b3f70t2 in 353ms (size: 2923152).
Evaluated relational algebra for predicate DataFlowUtil#47741e1f::ExprFlowCached::localStepsToExpr#3#fff@3b3f70t2 with tuple counts:
  1131827   ~0%    {3} r1 = SCAN DataFlowUtil#47741e1f::Node::asExpr#0#dispred#ff OUTPUT In.0, In.0, In.1
               
  1791325   ~0%    {3} r2 = JOIN boundedFastTC:DataFlowUtil#47741e1f::ExprFlowCached::localStepFromNonExpr#2#ff_10#higher_order_body:project#DataFlowUtil#47741e1f::Node::asExpr#0#dispred#ff WITH DataFlowUtil#47741e1f::Node::asExpr#0#dispred#ff ON FIRST 1 OUTPUT Lhs.1, Lhs.0, Rhs.1
               
  2923152   ~0%    {3} r3 = r1 UNION r2
                   return r3

@MathiasVP MathiasVP added C++ no-change-note-required This PR does not need a change note labels Mar 10, 2023
@MathiasVP MathiasVP requested a review from a team as a code owner March 10, 2023 10:57
@MathiasVP
Copy link
Contributor Author

CI check is failing due to the autoformatter changes. I suggest we ignore those and fix all of those in a follow-up PR once #12230 has been merged into main.

Comment on lines 1622 to 1625
// We should never recursive through any of these nodes while
// looking for the next expression.
not n1 instanceof SsaPhiNode and
not n1 instanceof InitialGlobalValue and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment doesn't quite give me sufficient information here as to why this is ok.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, yes I can see what you mean. I'll expand the explanation. Taking a step back, the point of localStepsToExpr(n1, n2, e2) is that n2 is the first node after n1 in the dataflow step relation that has a result for asExpr(). So localStepsToExpr does a transitive closure over localFlowStep, but where localFlowStep is restricted to only take a step as long as there's no result for asExpr() (this is the localStepFromNonExpr predicate). Once we find a node for which asExpr() exists, the transitive closure stops.

On the particular project I was looking at, there apparently were some nasty SsaPhiNodes that made the transitive closure of localStepFromNonExpr blow up even with the restriction mentioned above. SsaPhiNode are trivially includes in localStepFromNonExpr because both SsaPhiNode because they clearly satisfy not exists(n1.asExpr()) (as these nodes are not supposed to map to expressions) and they can have many many incoming nodes. This resulted in the tuple explosion.

But when we're looking for the next node after n1 in the dataflow step relation that has a result for asExpr(), we should never step over an SsaPhiNode since this node represents incoming data from many different places.

I'll try to write a more carefully worded description of the above and put it in a comment if it makes more sense now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

37d417b adds a version of ^ into the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense now.

@MathiasVP
Copy link
Contributor Author

Oops. Looks like I broke some query results 😢. I'll investigate!

Comment on lines +1638 to +1644
not n1 instanceof InitialGlobalValue and
// And dually to the above case for `InitialGlobalValue`: if we're at a
// `FinalGlobalValue` node it means we've reached the final value of a global
// variable as we're exiting a function. That global variable will have a final
// mention somewhere in the body of the function, and that mention will have a
// result for `asExpr()`.
not n2 instanceof FinalGlobalValue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So when I originally commented. I thought that the comment above not n1 instanceof SsaPhiNode applied to both that and not n1 instanceof InitialGlobalValue, which it clearly does not. A combined comment for InitialGlobalValue and FinalGlobalValue might make slightly more sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's a good point. I'll write a combined comment once I've figured out why this PR is breaking some query tests.

@MathiasVP
Copy link
Contributor Author

Turns out this PR wasn't the right way to fix this. It gave some genuine new lost results, and I now realize that weeding out Phi nodes in the way it's done in this PR was totally not the right way to fix the performance. I've opened #12507 with the right fix 🤞.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C++ no-change-note-required This PR does not need a change note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants