reimplement push_down_projection and prune_column.
#4465
Merged
2 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,12 +1,17 @@ | ||
| Sort: revenue DESC NULLS FIRST | ||
| Projection: customer.c_custkey, customer.c_name, SUM(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount) AS revenue, customer.c_acctbal, nation.n_name, customer.c_address, customer.c_phone, customer.c_comment | ||
| Aggregate: groupBy=[[customer.c_custkey, customer.c_name, customer.c_acctbal, customer.c_phone, nation.n_name, customer.c_address, customer.c_comment]], aggr=[[SUM(CAST(lineitem.l_extendedprice AS Decimal128(38, 4)) * CAST(Decimal128(Some(100),23,2) - CAST(lineitem.l_discount AS Decimal128(23, 2)) AS Decimal128(38, 4))) AS SUM(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)]] | ||
| Inner Join: customer.c_nationkey = nation.n_nationkey | ||
| Inner Join: orders.o_orderkey = lineitem.l_orderkey | ||
| Inner Join: customer.c_custkey = orders.o_custkey | ||
| TableScan: customer projection=[c_custkey, c_name, c_address, c_nationkey, c_phone, c_acctbal, c_comment] | ||
| Filter: orders.o_orderdate >= Date32("8674") AND orders.o_orderdate < Date32("8766") | ||
| TableScan: orders projection=[o_orderkey, o_custkey, o_orderdate] | ||
| Filter: lineitem.l_returnflag = Utf8("R") | ||
| TableScan: lineitem projection=[l_orderkey, l_extendedprice, l_discount, l_returnflag] | ||
| TableScan: nation projection=[n_nationkey, n_name] | ||
| Projection: customer.c_custkey, customer.c_name, customer.c_address, customer.c_phone, customer.c_acctbal, customer.c_comment, lineitem.l_extendedprice, lineitem.l_discount, nation.n_name | ||
| Inner Join: customer.c_nationkey = nation.n_nationkey | ||
| Projection: customer.c_custkey, customer.c_name, customer.c_address, customer.c_nationkey, customer.c_phone, customer.c_acctbal, customer.c_comment, lineitem.l_extendedprice, lineitem.l_discount | ||
| Inner Join: orders.o_orderkey = lineitem.l_orderkey | ||
| Projection: customer.c_custkey, customer.c_name, customer.c_address, customer.c_nationkey, customer.c_phone, customer.c_acctbal, customer.c_comment, orders.o_orderkey | ||
| Inner Join: customer.c_custkey = orders.o_custkey | ||
| TableScan: customer projection=[c_custkey, c_name, c_address, c_nationkey, c_phone, c_acctbal, c_comment] | ||
| Projection: orders.o_orderkey, orders.o_custkey | ||
| Filter: orders.o_orderdate >= Date32("8674") AND orders.o_orderdate < Date32("8766") | ||
| TableScan: orders projection=[o_orderkey, o_custkey, o_orderdate] | ||
| Projection: lineitem.l_orderkey, lineitem.l_extendedprice, lineitem.l_discount | ||
| Filter: lineitem.l_returnflag = Utf8("R") | ||
| TableScan: lineitem projection=[l_orderkey, l_extendedprice, l_discount, l_returnflag] | ||
| TableScan: nation projection=[n_nationkey, n_name] | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -3,18 +3,24 @@ Sort: value DESC NULLS FIRST | ||
| Filter: CAST(SUM(partsupp.ps_supplycost * partsupp.ps_availqty) AS Decimal128(38, 15)) > CAST(__scalar_sq_1.__value AS Decimal128(38, 15)) | ||
| CrossJoin: | ||
| Aggregate: groupBy=[[partsupp.ps_partkey]], aggr=[[SUM(CAST(partsupp.ps_supplycost AS Decimal128(26, 2)) * CAST(partsupp.ps_availqty AS Decimal128(26, 2)))]] | ||
| Inner Join: supplier.s_nationkey = nation.n_nationkey | ||
| Inner Join: partsupp.ps_suppkey = supplier.s_suppkey | ||
| TableScan: partsupp projection=[ps_partkey, ps_suppkey, ps_availqty, ps_supplycost] | ||
| TableScan: supplier projection=[s_suppkey, s_nationkey] | ||
| Filter: nation.n_name = Utf8("GERMANY") | ||
| TableScan: nation projection=[n_nationkey, n_name] | ||
| SubqueryAlias: __scalar_sq_1 | ||
| Projection: CAST(SUM(partsupp.ps_supplycost * partsupp.ps_availqty) AS Float64) * Float64(0.0001) AS __value | ||
| Aggregate: groupBy=[[]], aggr=[[SUM(CAST(partsupp.ps_supplycost AS Decimal128(26, 2)) * CAST(partsupp.ps_availqty AS Decimal128(26, 2)))]] | ||
| Inner Join: supplier.s_nationkey = nation.n_nationkey | ||
| Projection: partsupp.ps_partkey, partsupp.ps_availqty, partsupp.ps_supplycost | ||
| Inner Join: supplier.s_nationkey = nation.n_nationkey | ||
| Projection: partsupp.ps_partkey, partsupp.ps_availqty, partsupp.ps_supplycost, supplier.s_nationkey | ||
| Inner Join: partsupp.ps_suppkey = supplier.s_suppkey | ||
| TableScan: partsupp projection=[ps_suppkey, ps_availqty, ps_supplycost] | ||
| TableScan: partsupp projection=[ps_partkey, ps_suppkey, ps_availqty, ps_supplycost] | ||
| TableScan: supplier projection=[s_suppkey, s_nationkey] | ||
| Projection: nation.n_nationkey | ||
| Review comment (author): Here the `Filter` outputs two columns, but we only need one. | ||
| Filter: nation.n_name = Utf8("GERMANY") | ||
| TableScan: nation projection=[n_nationkey, n_name] | ||
| TableScan: nation projection=[n_nationkey, n_name] | ||
| SubqueryAlias: __scalar_sq_1 | ||
| Projection: CAST(SUM(partsupp.ps_supplycost * partsupp.ps_availqty) AS Float64) * Float64(0.0001) AS __value | ||
| Aggregate: groupBy=[[]], aggr=[[SUM(CAST(partsupp.ps_supplycost AS Decimal128(26, 2)) * CAST(partsupp.ps_availqty AS Decimal128(26, 2)))]] | ||
| Projection: partsupp.ps_availqty, partsupp.ps_supplycost | ||
| Inner Join: supplier.s_nationkey = nation.n_nationkey | ||
| Projection: partsupp.ps_availqty, partsupp.ps_supplycost, supplier.s_nationkey | ||
| Inner Join: partsupp.ps_suppkey = supplier.s_suppkey | ||
| TableScan: partsupp projection=[ps_suppkey, ps_availqty, ps_supplycost] | ||
| TableScan: supplier projection=[s_suppkey, s_nationkey] | ||
| Projection: nation.n_nationkey | ||
| Filter: nation.n_name = Utf8("GERMANY") | ||
| TableScan: nation projection=[n_nationkey, n_name] | ||
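The `Projection: nation.n_nationkey` node above the `Filter` on `nation` follows a simple rule: the scan has to read every column that either the parent or the predicate needs, and a projection after the filter drops the columns that only the predicate used. The sketch below is a minimal model of that rule in Python, not DataFusion's implementation. The function name `prune_after_filter` is hypothetical, and the full `nation` column list is assumed from the TPC-H schema.

```python
def prune_after_filter(table_columns, predicate_columns, required_above):
    """Return (scan_projection, post_filter_projection).

    scan_projection: columns the scan must read, i.e. everything needed
    by the parent plus everything the filter predicate references.
    post_filter_projection: columns to keep after the filter, or None
    when the filter output already matches what the parent needs.
    """
    needed_above = set(required_above)
    needed_at_scan = needed_above | set(predicate_columns)
    # Preserve the table's column order, as the plans in the diff do.
    scan_projection = [c for c in table_columns if c in needed_at_scan]
    post_filter = [c for c in scan_projection if c in needed_above]
    if post_filter == scan_projection:
        return scan_projection, None
    return scan_projection, post_filter


# Assumed TPC-H `nation` columns; the predicate reads n_name and the
# join above only needs n_nationkey.
scan, project = prune_after_filter(
    table_columns=["n_nationkey", "n_name", "n_regionkey", "n_comment"],
    predicate_columns=["n_name"],
    required_above=["n_nationkey"],
)
print(scan)     # columns read by TableScan
print(project)  # columns kept by the Projection above the Filter
```

With these inputs the scan keeps `n_nationkey` and `n_name`, and the projection keeps only `n_nationkey`, which mirrors the `Projection` / `Filter` / `TableScan` triple in the new plan.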
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,7 +1,9 @@ | ||
| Sort: lineitem.l_shipmode ASC NULLS LAST | ||
| Projection: lineitem.l_shipmode, SUM(CASE WHEN orders.o_orderpriority = Utf8("1-URGENT") OR orders.o_orderpriority = Utf8("2-HIGH") THEN Int64(1) ELSE Int64(0) END) AS high_line_count, SUM(CASE WHEN orders.o_orderpriority != Utf8("1-URGENT") AND orders.o_orderpriority != Utf8("2-HIGH") THEN Int64(1) ELSE Int64(0) END) AS low_line_count | ||
| Aggregate: groupBy=[[lineitem.l_shipmode]], aggr=[[SUM(CASE WHEN orders.o_orderpriority = Utf8("1-URGENT") OR orders.o_orderpriority = Utf8("2-HIGH") THEN Int64(1) ELSE Int64(0) END), SUM(CASE WHEN orders.o_orderpriority != Utf8("1-URGENT") AND orders.o_orderpriority != Utf8("2-HIGH") THEN Int64(1) ELSE Int64(0) END)]] | ||
| Inner Join: lineitem.l_orderkey = orders.o_orderkey | ||
| Filter: (lineitem.l_shipmode = Utf8("SHIP") OR lineitem.l_shipmode = Utf8("MAIL")) AND lineitem.l_commitdate < lineitem.l_receiptdate AND lineitem.l_shipdate < lineitem.l_commitdate AND lineitem.l_receiptdate >= Date32("8766") AND lineitem.l_receiptdate < Date32("9131") | ||
| TableScan: lineitem projection=[l_orderkey, l_shipdate, l_commitdate, l_receiptdate, l_shipmode] | ||
| TableScan: orders projection=[o_orderkey, o_orderpriority] | ||
| Projection: lineitem.l_shipmode, orders.o_orderpriority | ||
| Inner Join: lineitem.l_orderkey = orders.o_orderkey | ||
| Projection: lineitem.l_orderkey, lineitem.l_shipmode | ||
| Filter: (lineitem.l_shipmode = Utf8("SHIP") OR lineitem.l_shipmode = Utf8("MAIL")) AND lineitem.l_commitdate < lineitem.l_receiptdate AND lineitem.l_shipdate < lineitem.l_commitdate AND lineitem.l_receiptdate >= Date32("8766") AND lineitem.l_receiptdate < Date32("9131") | ||
| TableScan: lineitem projection=[l_orderkey, l_shipdate, l_commitdate, l_receiptdate, l_shipmode] | ||
| TableScan: orders projection=[o_orderkey, o_orderpriority] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,12 +1,16 @@ | ||
| Projection: CAST(SUM(lineitem.l_extendedprice) AS Float64) / Float64(7) AS avg_yearly | ||
| Aggregate: groupBy=[[]], aggr=[[SUM(lineitem.l_extendedprice)]] | ||
| Filter: CAST(lineitem.l_quantity AS Decimal128(30, 15)) < CAST(__scalar_sq_1.__value AS Decimal128(30, 15)) | ||
| Inner Join: part.p_partkey = __scalar_sq_1.l_partkey, lineitem.l_partkey = __scalar_sq_1.l_partkey | ||
| Inner Join: lineitem.l_partkey = part.p_partkey | ||
| TableScan: lineitem projection=[l_partkey, l_quantity, l_extendedprice] | ||
| Filter: part.p_brand = Utf8("Brand#23") AND part.p_container = Utf8("MED BOX") | ||
| TableScan: part projection=[p_partkey, p_brand, p_container] | ||
| SubqueryAlias: __scalar_sq_1 | ||
| Projection: lineitem.l_partkey, Float64(0.2) * CAST(AVG(lineitem.l_quantity) AS Float64) AS __value | ||
| Aggregate: groupBy=[[lineitem.l_partkey]], aggr=[[AVG(lineitem.l_quantity)]] | ||
| TableScan: lineitem projection=[l_partkey, l_quantity] | ||
| Projection: lineitem.l_extendedprice | ||
| Filter: CAST(lineitem.l_quantity AS Decimal128(30, 15)) < CAST(__scalar_sq_1.__value AS Decimal128(30, 15)) AND __scalar_sq_1.l_partkey = lineitem.l_partkey | ||
| Projection: lineitem.l_partkey, lineitem.l_quantity, lineitem.l_extendedprice, __scalar_sq_1.l_partkey, __scalar_sq_1.__value | ||
| Inner Join: part.p_partkey = __scalar_sq_1.l_partkey | ||
| Filter: part.p_partkey = lineitem.l_partkey AND lineitem.l_partkey = part.p_partkey | ||
| Inner Join: lineitem.l_partkey = part.p_partkey | ||
| TableScan: lineitem projection=[l_partkey, l_quantity, l_extendedprice] | ||
| Projection: part.p_partkey | ||
| Filter: part.p_brand = Utf8("Brand#23") AND part.p_container = Utf8("MED BOX") | ||
| TableScan: part projection=[p_partkey, p_brand, p_container] | ||
| SubqueryAlias: __scalar_sq_1 | ||
| Projection: lineitem.l_partkey, Float64(0.2) * CAST(AVG(lineitem.l_quantity) AS Float64) AS __value | ||
| Aggregate: groupBy=[[lineitem.l_partkey]], aggr=[[AVG(lineitem.l_quantity)]] | ||
| TableScan: lineitem projection=[l_partkey, l_quantity] |
Some new `Projection` nodes now appear above joins, because I prune columns for `Join`. For example:
So the idea is that the `Projection` nodes above the join are added to make it clear which columns coming out of the join are actually needed above it?
Yes, the `Projection` nodes above the `join` are used to keep only the columns that we need.
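As a rough illustration of this exchange, the sketch below models a join whose output is the left columns followed by the right columns, and places a projection on top whenever the parent needs only a subset. It is a simplified Python model under that assumption, not the optimizer rule from this PR, and `project_above_join` is a hypothetical name. The column names are taken from the `lineitem` / `orders` plan in the diff.

```python
def project_above_join(left_columns, right_columns, required_above):
    """Return (join_output, projection).

    join_output: all columns the join emits (left then right).
    projection: the subset the parent needs, in join output order,
    or None when the parent needs every join output column.
    """
    join_output = list(left_columns) + list(right_columns)
    needed = set(required_above)
    kept = [c for c in join_output if c in needed]
    return join_output, (kept if kept != join_output else None)


# The join keys are needed by the join itself but not by the
# Aggregate above it, so the projection drops them.
join_output, projection = project_above_join(
    left_columns=["lineitem.l_orderkey", "lineitem.l_shipmode"],
    right_columns=["orders.o_orderkey", "orders.o_orderpriority"],
    required_above=["lineitem.l_shipmode", "orders.o_orderpriority"],
)
print(join_output)
print(projection)
```

Here the projection keeps `lineitem.l_shipmode` and `orders.o_orderpriority`, matching the `Projection` node added above the join in that plan.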
Maybe it is related to `build_join_schema`. Some optimizer rules call this function. I think calling it before projection pushdown is fine, but I suspect it is not correct after projection pushdown.

For the query:

we call it after projection pushdown:

schema(a): `a.id`
schema(b): `b.id`

`build_join_schema` will merge left and right, so the result is `a.id` + `b.id`, but the expected result should be only `a.id`.

Maybe we can fix that first, and then we will not need the projection any more.
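The mismatch described in this comment can be shown with a toy model. The Python function below is a stand-in that only concatenates the two input schemas, which is how the comment describes the merge. It is not DataFusion's `build_join_schema`, and the helper `columns_not_needed` is hypothetical. The case where the parent needs only `a.id` is taken from the comment's expected result.

```python
def build_join_schema(left_schema, right_schema):
    """Toy stand-in: merge the left and right schemas by concatenation."""
    return list(left_schema) + list(right_schema)


def columns_not_needed(join_schema, required_above):
    """Columns the merged schema exposes that the parent never reads."""
    needed = set(required_above)
    return [c for c in join_schema if c not in needed]


# After projection pushdown, each input exposes only its join key.
schema_a = ["a.id"]
schema_b = ["b.id"]

merged = build_join_schema(schema_a, schema_b)
extra = columns_not_needed(merged, required_above=["a.id"])
print(merged)  # schema recomputed by merging the children
print(extra)   # columns that make it wider than expected
```

In this model the recomputed schema carries `b.id` in addition to `a.id`, which is the gap that either a projection above the join or a change to the schema computation would have to close.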
Sorry, this comment was easy to misunderstand. I have corrected it.