-
Notifications
You must be signed in to change notification settings - Fork 1.8k
C++: Fix join-order problem in cpp/command-line-injection
#9870
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Before on Abseil Linux: ``` Evaluated relational algebra for predicate ExecTainted::ExecState#class#91000ffb#fff@41084cm7 with tuple counts: 40879811 ~0% {2} r1 = SCAN DataFlowUtil::Node::getLocation#dispred#f0820431#ff OUTPUT In.1, In.0 40879811 ~0% {2} r2 = JOIN r1 WITH Location::Location::toString#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1, Rhs.1 7527 ~3% {3} r3 = JOIN r2 WITH ExecTainted::interestingConcatenation#91000ffb#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.0 7527 ~0% {4} r4 = JOIN r3 WITH DataFlowUtil::Node::toString#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.2, Lhs.1, Lhs.0, Rhs.1 7527 ~0% {5} r5 = JOIN r4 WITH DataFlowUtil::Node::toString#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.2, Lhs.1, Lhs.0, Lhs.3, Rhs.1 7527 ~0% {6} r6 = JOIN r5 WITH DataFlowUtil::Node::getLocation#dispred#f0820431#ff ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.2, Lhs.0, Lhs.3, Lhs.4 7527 ~0% {3} r7 = JOIN r6 WITH Location::Location::toString#dispred#f0820431#ff ON FIRST 1 OUTPUT ((((((("ExecState (" ++ Rhs.1) ++ " | ") ++ Lhs.4) ++ ", ") ++ Lhs.1) ++ " | ") ++ Lhs.5 ++ ")"), Lhs.3, Lhs.2 return r7 ``` After: ``` Evaluated relational algebra for predicate ExecTainted::ExecState#class#91000ffb#fff@1ffe61ps with tuple counts: 7527 ~0% {3} r1 = JOIN ExecTainted::interestingConcatenation#91000ffb#ff WITH DataFlowUtil::Node::toString#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1, Lhs.0, Rhs.1 7527 ~0% {4} r2 = JOIN r1 WITH DataFlowUtil::Node::toString#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.0, Lhs.1, Lhs.2, Rhs.1 7527 ~1% {5} r3 = JOIN r2 WITH DataFlowUtil::Node::getLocation#dispred#f0820431#ff ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.0, Lhs.2, Lhs.3 7527 ~0% {5} r4 = JOIN r3 WITH Location::Location::toString#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1, Lhs.2, Lhs.3, Lhs.4, Rhs.1 7527 ~4% {6} r5 = JOIN r4 WITH DataFlowUtil::Node::getLocation#dispred#f0820431#ff ON FIRST 1 OUTPUT Rhs.1, Lhs.0, Lhs.1, Lhs.2, Lhs.3, Lhs.4 7527 ~0% {3} r6 = JOIN r5 WITH Location::Location::toString#dispred#f0820431#ff ON FIRST 1 OUTPUT ((((((("ExecState (" ++ Rhs.1) ++ " | ") ++ Lhs.3) ++ ", ") ++ Lhs.5) ++ " | ") ++ Lhs.4 ++ ")"), Lhs.1, Lhs.2 return r6 ```
LGTM pending DCA results. |
DCA looks good to me. |
DCA - overall analysis time is 52s faster and query run times per source and query shows |
I wouldn't put too much faith in the overall analysis time, as ChakraCore was incredibly wobbly, but indeed all seems to be within the margin of error, and overall tuple counts seem to have improved slightly.
I saw this locally too when I prepared the PR. As far as I can tell some of the predicates got renamed once I changed the join order. These names do not seem to be very stable to me. |
That makes sense. Annoyingly hard to verify though, as both the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with this. 👍
Before on Abseil Linux:
After: