New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eliminate unnecessary cross joins #6395

Closed
wants to merge 12 commits into
base: master
from

Conversation

Projects
None yet
7 participants
@pnowojski
Member

pnowojski commented Oct 19, 2016

This pull request adds simple join reordering algorithm. With reorder_joins set to true (default false) this rewrite will find all consecutive join sequences and if there is a cross join it will try to reorder joins to eliminate them. For example:

select * from A, B, C where A.a=C.a and B.b=C.b

There are equality join conditions betweem A<->C and B<->C but join order forces to first join A and B, which results in cross join. This code will rewrite such query to:

select * from A, C, B where A.a=C.a and B.b=C.b

It will also try to preserve original join order as much as possible.

This solves issues with:

  • automatically generated queries
  • user queries for which user either didn't think about join order or unintentionally introduced cross join
  • some tpch/tpcds queries have cross joins that can be removed this way (dramatically improving performance)
@kbajda

This comment has been minimized.

Show comment
Hide comment
@kbajda

kbajda Oct 19, 2016

Member

@pnowojski : How about making the property name more specific (e.g., reorder_cross_joins) ?
I suppose once we enable table/column stats much broader join reordering could be possible.

Also, could you please give specific TPC-H/DS query #s and observed speedup after introducing your change.

Member

kbajda commented Oct 19, 2016

@pnowojski : How about making the property name more specific (e.g., reorder_cross_joins) ?
I suppose once we enable table/column stats much broader join reordering could be possible.

Also, could you please give specific TPC-H/DS query #s and observed speedup after introducing your change.

@pnowojski

This comment has been minimized.

Show comment
Hide comment
@pnowojski

pnowojski Oct 20, 2016

Member

@KBP-TDC TPC-DS query 64, instead running indefinitely (12+ hours) finishes in couple of minutes. Query like in the description using TPC-H schema:

SELECT count(*) FROM part p, orders o, lineitem l WHERE p.partkey = l.partkey AND l.orderkey = o.orderkey;

On tpch.sf1 schema on my laptop without reorder_joins it runs around 1 hour and with reorder_joins it finishes in couple of minutes.

Regarding renaming it to reorder_cross_joins. I'm against having so much switches with such high granularity. I named it reorder_joins with specifically having in mind future stats based reordering. I imagine there should be one switch that turns on/off all joins reordering algorithms. What's more, future joins reordering might not be separate optimizer, but it might end up expanding/replacing this one.

Member

pnowojski commented Oct 20, 2016

@KBP-TDC TPC-DS query 64, instead running indefinitely (12+ hours) finishes in couple of minutes. Query like in the description using TPC-H schema:

SELECT count(*) FROM part p, orders o, lineitem l WHERE p.partkey = l.partkey AND l.orderkey = o.orderkey;

On tpch.sf1 schema on my laptop without reorder_joins it runs around 1 hour and with reorder_joins it finishes in couple of minutes.

Regarding renaming it to reorder_cross_joins. I'm against having so much switches with such high granularity. I named it reorder_joins with specifically having in mind future stats based reordering. I imagine there should be one switch that turns on/off all joins reordering algorithms. What's more, future joins reordering might not be separate optimizer, but it might end up expanding/replacing this one.

@kokosing

nice work!

Anyway I think you need to reorganize commits in some logical way instead of way you implemented that.

Show outdated Hide outdated ...o-main/src/main/java/com/facebook/presto/sql/planner/PlanOptimizers.java
Show outdated Hide outdated .../java/com/facebook/presto/sql/planner/optimizations/joins/JoinGraph.java
@@ -0,0 +1,230 @@
/*

This comment has been minimized.

@kokosing

kokosing Oct 20, 2016

Member

can you please add some tests?

@kokosing

kokosing Oct 20, 2016

Member

can you please add some tests?

This comment has been minimized.

@pnowojski

pnowojski Oct 25, 2016

Member

it's tested in TestCrossJoinElimination

@pnowojski

pnowojski Oct 25, 2016

Member

it's tested in TestCrossJoinElimination

Show outdated Hide outdated .../java/com/facebook/presto/sql/planner/optimizations/joins/JoinGraph.java
Show outdated Hide outdated .../java/com/facebook/presto/sql/planner/optimizations/joins/JoinGraph.java
Show outdated Hide outdated ...com/facebook/presto/sql/planner/optimizations/CrossJoinsElimination.java
Show outdated Hide outdated .../java/com/facebook/presto/sql/planner/optimizations/joins/JoinGraph.java
Show outdated Hide outdated ...java/com/facebook/presto/sql/planner/optimizations/TestReorderJoins.java
Show outdated Hide outdated ...java/com/facebook/presto/sql/planner/optimizations/TestReorderJoins.java
Show outdated Hide outdated ...com/facebook/presto/sql/planner/optimizations/CrossJoinsElimination.java
Show outdated Hide outdated presto-main/src/main/java/com/facebook/presto/SystemSessionProperties.java
Show outdated Hide outdated ...-main/src/main/java/com/facebook/presto/sql/analyzer/FeaturesConfig.java
Show outdated Hide outdated ...-main/src/main/java/com/facebook/presto/sql/analyzer/FeaturesConfig.java
Show outdated Hide outdated .../java/com/facebook/presto/sql/planner/optimizations/joins/JoinGraph.java
Show outdated Hide outdated .../java/com/facebook/presto/sql/planner/optimizations/joins/JoinGraph.java
Show outdated Hide outdated ...com/facebook/presto/sql/planner/optimizations/CrossJoinsElimination.java
Show outdated Hide outdated ...com/facebook/presto/sql/planner/optimizations/CrossJoinsElimination.java
Show outdated Hide outdated ...java/com/facebook/presto/sql/planner/optimizations/TestReorderJoins.java
Show outdated Hide outdated ...o-main/src/main/java/com/facebook/presto/sql/planner/PlanOptimizers.java
Show outdated Hide outdated .../java/com/facebook/presto/sql/planner/optimizations/joins/JoinGraph.java
Show outdated Hide outdated ...com/facebook/presto/sql/planner/optimizations/CrossJoinsElimination.java
Show outdated Hide outdated .../java/com/facebook/presto/sql/planner/optimizations/joins/JoinGraph.java
Show outdated Hide outdated .../java/com/facebook/presto/sql/planner/optimizations/joins/JoinGraph.java
Show outdated Hide outdated .../java/com/facebook/presto/sql/planner/optimizations/joins/JoinGraph.java
Show outdated Hide outdated .../java/com/facebook/presto/sql/planner/optimizations/joins/JoinGraph.java
Show outdated Hide outdated .../java/com/facebook/presto/sql/planner/optimizations/joins/JoinGraph.java
Show outdated Hide outdated ...com/facebook/presto/sql/planner/optimizations/CrossJoinsElimination.java
Show outdated Hide outdated ...com/facebook/presto/sql/planner/optimizations/CrossJoinsElimination.java
@kbajda

This comment has been minimized.

Show comment
Hide comment
@kbajda

kbajda Oct 26, 2016

Member

@pnowojski : this PR should be named: "Eliminate unnecessary cross joins" as @rschlussel noted above.

Member

kbajda commented Oct 26, 2016

@pnowojski : this PR should be named: "Eliminate unnecessary cross joins" as @rschlussel noted above.

@pnowojski pnowojski changed the title from Eliminate cross joins to Eliminate unnecessary cross joins Oct 27, 2016

@pnowojski

This comment has been minimized.

Show comment
Hide comment
@pnowojski

pnowojski Nov 14, 2016

Member

@martint I think that you are the correct person to review this code. If not, feel free to unassign yourself.

Member

pnowojski commented Nov 14, 2016

@martint I think that you are the correct person to review this code. If not, feel free to unassign yourself.

@martint

Some minor comments, but looks good overall.

Show outdated Hide outdated .../java/com/facebook/presto/sql/planner/optimizations/joins/JoinGraph.java
Show outdated Hide outdated .../facebook/presto/sql/planner/optimizations/PruneIdentityProjections.java
Show outdated Hide outdated .../java/com/facebook/presto/sql/planner/optimizations/joins/JoinGraph.java
Show outdated Hide outdated .../java/com/facebook/presto/sql/planner/optimizations/joins/JoinGraph.java
Show outdated Hide outdated .../java/com/facebook/presto/sql/planner/optimizations/joins/JoinGraph.java
Show outdated Hide outdated .../java/com/facebook/presto/sql/planner/optimizations/joins/JoinGraph.java
Show outdated Hide outdated ...a/com/facebook/presto/sql/planner/optimizations/EliminateCrossJoins.java
Show outdated Hide outdated ...a/com/facebook/presto/sql/planner/optimizations/EliminateCrossJoins.java
Show outdated Hide outdated ...a/com/facebook/presto/sql/planner/optimizations/EliminateCrossJoins.java
Show outdated Hide outdated ...a/com/facebook/presto/sql/planner/optimizations/EliminateCrossJoins.java

@martint martint assigned pnowojski and unassigned martint Nov 15, 2016

@pnowojski pnowojski assigned martint and unassigned pnowojski Nov 16, 2016

@pnowojski

This comment has been minimized.

Show comment
Hide comment
@pnowojski

pnowojski Nov 16, 2016

Member

Applied changes and rebased to latest master.

Member

pnowojski commented Nov 16, 2016

Applied changes and rebased to latest master.

@martint

This comment has been minimized.

Show comment
Hide comment
@martint

martint Dec 9, 2016

Contributor

I'm ready to merge this, but I'm running into some conflicts due to recent changes to the plan matching framework. Can you address those? @ebd2 can help you migrate to the new shape.

Contributor

martint commented Dec 9, 2016

I'm ready to merge this, but I'm running into some conflicts due to recent changes to the plan matching framework. Can you address those? @ebd2 can help you migrate to the new shape.

pnowojski added some commits Oct 17, 2016

Eliminate cross joins with multiple separate joins
Support plans where there are separate JoinNodes sequences.
For example: two join nodes, followed by aggregation followd by three join nodes.

Each of such sequence will be treated separately
Eliminate cross joins with non equality conditions
Ignore all Filter nodes and join filter functions between INNER
JoinNodes sequence during cross join elimination. After
reordering reintroduce Filter functions at the top of JoinNodes.
Eliminate cross joins with identity ProjectNodes
Support situation where there are are ProjectNodes between JoinNodes.
In such case remove all of that ProjectNode and add one ProjectNode at top
of the JoinNodes sequence.
Add test case with filter after joins
This test case requires support of both projections and filters
in CrossJoinElimination
@pnowojski

This comment has been minimized.

Show comment
Hide comment
@pnowojski

pnowojski Dec 13, 2016

Member

@martint : rebased and fixed conflicts

Member

pnowojski commented Dec 13, 2016

@martint : rebased and fixed conflicts

@pnowojski pnowojski assigned martint and unassigned pnowojski and martint Dec 13, 2016

@martint

This comment has been minimized.

Show comment
Hide comment
@martint

martint Dec 15, 2016

Contributor

Merged, thanks!

Contributor

martint commented Dec 15, 2016

Merged, thanks!

@martint martint closed this Dec 15, 2016

@pnowojski

This comment has been minimized.

Show comment
Hide comment
@pnowojski

pnowojski Dec 16, 2016

Member

Thanks

Member

pnowojski commented Dec 16, 2016

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment