Push filter down HashInnerJoin rule #4956

Merged: 15 commits merged into vesoft-inc:master from push-filter-down-innerjoin on Dec 6, 2022

@yixinglu yixinglu commented Nov 29, 2022

What type of PR is this?

  • bug
  • feature
  • enhancement

What problem(s) does this PR solve?

Issue(s) number:

fix #4979

Description:

How do you solve it?

Special notes for your reviewer, ex. impact of this fix, design document, etc:

Checklist:

Tests:

  • Unit test(positive and negative cases)
  • Function test
  • Performance test
  • N/A

Affects:

  • Documentation affected (Please add the label if documentation needs to be modified.)
  • Incompatibility (If it breaks the compatibility, please describe it and add the label.)
  • If it's needed to cherry-pick (If cherry-pick to some branches is required, please label the destination version(s).)
  • Performance impacted: Consumes more CPU/Memory

Release notes:

Please confirm whether to be reflected in release notes and how to describe:

ex. Fixed the bug .....

@yixinglu yixinglu changed the title Push filter down BiInnerJoin rule Push filter down HashInnerJoin rule Dec 1, 2022
@yixinglu yixinglu force-pushed the push-filter-down-innerjoin branch 13 times, most recently from 724293e to 20a0593 Compare December 2, 2022 16:10
@yixinglu yixinglu force-pushed the push-filter-down-innerjoin branch 2 times, most recently from 332b6e0 to b829d79 Compare December 3, 2022 07:55
@yixinglu yixinglu added the ready-for-testing PR: ready for the CI test label Dec 3, 2022
jievince previously approved these changes Dec 3, 2022
Contributor

@jievince jievince left a comment

Well done!

yixinglu commented Dec 3, 2022

Crashed LDBC query (use ldbc_v0_3_3):

MATCH (country:Country) 
WHERE id(country) == "Spain" 
MATCH (a:Person)-[:IS_LOCATED_IN]->(:City)-[:IS_PART_OF]->(country) 
MATCH (b:Person)-[:IS_LOCATED_IN]->(:City)-[:IS_PART_OF]->(country) 
MATCH (c:Person)-[:IS_LOCATED_IN]->(:City)-[:IS_PART_OF]->(country) 
MATCH (a)-[:KNOWS]-(b), (b)-[:KNOWS]-(c), (c)-[:KNOWS]-(a) 
WHERE a.Person.id < b.Person.id AND b.Person.id < c.Person.id 
RETURN count(*) AS count

jievince previously approved these changes Dec 6, 2022
@jievince jievince mentioned this pull request Dec 6, 2022
Comment on lines +54 to +65
if (!depGroup) {
  return Status::Error("Could not find the dependent group in pattern leaves");
}
if (depGroup->groupNodes_.size() != 1U || depGroup->groupNodesReferenced_.size() != 1U) {
  return Status::Error(
      "Invalid sub-plan generated when applying the rule: %s, "
      "planNode: %s, numGroupNodes: %lu, numGroupNodesRef: %lu",
      rule->toString().c_str(),
      PlanNode::toString(gn->node()->kind()),
      depGroup->groupNodes_.size(),
      depGroup->groupNodesReferenced_.size());
}
Contributor

@czpmango czpmango Dec 6, 2022

It is better not to pass optimizer errors to the user.

It could be changed like this:

DCHECK(depGroup != nullptr) << "Could not find the dependent group in pattern leaves";

or

// The `depGroup` should not be null because ...
DCHECK_NOTNULL(depGroup);

Contributor Author

I have converted the error message in QueryInstance. This return is only there so that the release build does not crash.

Contributor

Ok.
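
The trade-off discussed in this thread (return a Status so a release build does not crash, but do not show optimizer internals to the user) can be sketched roughly as follows. This is only an illustration under assumptions: `Status` here is a hypothetical minimal stand-in for the project's own class, and `checkDependentGroup` and `toUserStatus` are made-up names, not functions from this PR or from QueryInstance.

```cpp
#include <iostream>
#include <string>
#include <utility>

// Hypothetical minimal status type (assumption); the project has its own Status class.
struct Status {
  bool ok;
  std::string msg;
  static Status OK() { return {true, ""}; }
  static Status Error(std::string m) { return {false, std::move(m)}; }
};

// Hypothetical optimizer step: reports a broken invariant through the return
// value instead of dereferencing a null pointer and crashing.
Status checkDependentGroup(const int* depGroup) {
  if (depGroup == nullptr) {
    return Status::Error("Could not find the dependent group in pattern leaves");
  }
  return Status::OK();
}

// Hypothetical boundary between the optimizer and the client: the detailed
// message goes to the log, and the caller only sees a generic error.
Status toUserStatus(const Status& internal) {
  if (internal.ok) {
    return internal;
  }
  std::cerr << "optimizer error: " << internal.msg << '\n';
  return Status::Error("Internal error while optimizing the query");
}
```

A caller would wrap the optimizer result once at the boundary, for example `toUserStatus(checkDependentGroup(ptr))`, so individual rules can keep their detailed messages for debugging.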

@@ -41,9 +41,81 @@ OptGroup::OptGroup(OptContext *ctx) noexcept : ctx_(ctx) {
  DCHECK(ctx != nullptr);
}

Status OptGroup::validateSubPlan(const OptGroupNode *gn,
Contributor

I can't find the call to this function. Did I miss something?

Contributor Author

Good catch! Actually, I don't use it in this PR except to debug some crashes. I want to use it when refactoring the go planner; for now I would need to consider loop/select plan nodes, which is a little tedious!

Contributor

ACK.

if (!filterPicked) return nullptr;

auto* newChildPlanNode = childPlanNode->clone();
DCHECK_NE(childPlanNode->outputVar(), newChildPlanNode->outputVar());
Contributor

This is just the behavior of PlanNode::PlanNode(); do we need a DCHECK here?

Contributor Author

Yeah, this line could be removed!

auto* newInnerJoinNode = static_cast<graph::HashInnerJoin*>(oldInnerJoinNode->clone());
auto newJoinGroup = rightFilterUnpicked ? OptGroup::create(octx) : filterGroupNode->group();
// TODO(yee): it's too tricky
auto newGroupNode = rightFilterUnpicked
Contributor

Maybe renaming rightFilterUnpicked to filterRemained would be more readable.

# This source code is licensed under Apache 2.0 License.
Feature: Push Filter down HashInnerJoin rule

Background:
Contributor

Add test case:

match (a:player)-[:like]->(b) where b.player.age>30 
match (b)-[:serve]->(c) where c.team.name>"A" and b.player.age+a.player.age>40 and a.player.age<45 
return a,b,c

and

match (a:player)-[:like]->(b) where b.player.age>30 or b.player.age>45
match (b)-[:serve]->(c) where c.team.name>"A" or b.player.age+a.player.age>40 and a.player.age<45 
return a,b,c

Contributor

Test cases related to OR filter and non-query-part plan.

Contributor Author

ACK
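
To make the AND versus OR distinction in the requested test cases concrete, here is a generic sketch of how a filter above an inner join is usually split for pushdown. It is not the rule implemented in this PR: `Conjunct`, `SplitResult`, and `splitFilter` are hypothetical names, and the sketch assumes the filter has already been broken into its top-level AND terms, each annotated with the variables it reads. A term is pushed to a join input only if that input produces every variable the term uses; everything else stays above the join.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for one top-level AND term of a filter: its text plus
// the variables it reads.
struct Conjunct {
  std::string text;
  std::set<std::string> vars;
};

struct SplitResult {
  std::vector<std::string> left;      // pushed below the join, to the left input
  std::vector<std::string> right;     // pushed below the join, to the right input
  std::vector<std::string> remained;  // must stay above the join
};

// True if every variable in `vars` is produced by `cols`.
// std::set iterates in sorted order, which std::includes requires.
static bool coveredBy(const std::set<std::string>& vars,
                      const std::set<std::string>& cols) {
  return std::includes(cols.begin(), cols.end(), vars.begin(), vars.end());
}

SplitResult splitFilter(const std::vector<Conjunct>& conjuncts,
                        const std::set<std::string>& leftCols,
                        const std::set<std::string>& rightCols) {
  SplitResult result;
  for (const auto& c : conjuncts) {
    if (coveredBy(c.vars, leftCols)) {
      result.left.push_back(c.text);
    } else if (coveredBy(c.vars, rightCols)) {
      result.right.push_back(c.text);
    } else {
      result.remained.push_back(c.text);
    }
  }
  return result;
}
```

Under these assumptions, with the left input producing `a`, `b` and the right input producing `b`, `c`, the three AND terms of the first test case split cleanly between the two inputs. A filter whose top-level operator is OR forms a single term that reads `a`, `b`, and `c`, so it cannot be pushed to either side and remains above the join.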

@yixinglu yixinglu merged commit 7d27f32 into vesoft-inc:master Dec 6, 2022
@yixinglu yixinglu deleted the push-filter-down-innerjoin branch December 6, 2022 07:54

yixinglu commented Dec 6, 2022

More cases will be added in the next PR.

Labels
ready-for-testing PR: ready for the CI test
Development

Successfully merging this pull request may close these issues.

allShortestPaths fetch the wrong results
4 participants