[fix](mtmv) Fix getting related partition table wrongly when multi base partition table exists #34781
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website.
run buildall
TPC-H: Total hot run time: 41800 ms
TPC-DS: Total hot run time: 187483 ms
a4ad559 to 12d0109
run buildall
12d0109 to 9c1f063
run buildall
TPC-H: Total hot run time: 40779 ms
TPC-DS: Total hot run time: 170445 ms
ClickBench: Total hot run time: 30.74 s
run buildall
run compile
run buildall
…se partition table exists
a2688f8 to a280d9e
run buildall
TPC-H: Total hot run time: 41141 ms
TPC-DS: Total hot run time: 169590 ms
ClickBench: Total hot run time: 30.42 s
...-core/src/main/java/org/apache/doris/nereids/rules/exploration/mv/MaterializedViewUtils.java
List<Object> catalogRelationObjs = materializedViewPlan.collectToList(
        planTreeNode -> planTreeNode instanceof CatalogRelation);
ImmutableMultimap.Builder<TableIdentifier, CatalogRelation> tableCatalogRelationMultimapBuilder =
        ImmutableMultimap.builder();
use expectedSize builder
ImmutableMultimap.Builder doesn't seem to have an expectedSize builder.
ImmutableMap.builderWithExpectedSize()
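The point of the suggestion above is to presize a collection when the number of elements is known up front. A minimal sketch of that idea, using only `java.util` instead of Guava so it runs without extra dependencies: `ScanNode`, `RelationGrouping`, and the string table identifiers are hypothetical stand-ins and are not the Doris `CatalogRelation` or `TableIdentifier` types.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a plan node that scans a catalog table. */
final class ScanNode {
    final String tableId;  // e.g. "db.test1" (assumed identifier format)
    final int relationId;  // distinguishes two scans of the same table

    ScanNode(String tableId, int relationId) {
        this.tableId = tableId;
        this.relationId = relationId;
    }
}

final class RelationGrouping {
    /** Groups scans by table identifier in a map presized from the input size. */
    static Map<String, List<ScanNode>> groupByTable(List<ScanNode> scans) {
        // HashMap grows once size exceeds capacity * 0.75, so divide by the
        // default load factor to leave room for every possible key.
        int capacity = (int) (scans.size() / 0.75f) + 1;
        Map<String, List<ScanNode>> byTable = new HashMap<>(capacity);
        for (ScanNode scan : scans) {
            byTable.computeIfAbsent(scan.tableId, k -> new ArrayList<>()).add(scan);
        }
        return byTable;
    }

    public static void main(String[] args) {
        List<ScanNode> scans = List.of(
                new ScanNode("db.test1", 0),
                new ScanNode("db.test2", 1),
                new ScanNode("db.test1", 2));
        System.out.println(groupByTable(scans).keySet());
    }
}
```

A table that maps to more than one scan, as `db.test1` does here, is the case a multimap keyed by table identifier makes easy to detect.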
run buildall
TPC-H: Total hot run time: 41188 ms
TPC-DS: Total hot run time: 167213 ms
ClickBench: Total hot run time: 30.56 s
LGTM
PR approved by anyone and no changes requested.
PR approved by at least one committer and no changes requested.
LGTM
…se partition table exists (#34781)
Fix getting the wrong related partition table when multiple partitioned base tables exist.
For example, given the following base table definitions:

CREATE TABLE `test1` (
  `pre_batch_no` VARCHAR(100) NULL COMMENT 'pre_batch_no',
  `batch_no` VARCHAR(100) NULL COMMENT 'batch_no',
  `vin_type1` VARCHAR(50) NULL COMMENT 'vin',
  `upgrade_day` date COMMENT 'upgrade_day'
) ENGINE=OLAP
unique KEY(`pre_batch_no`,`batch_no`, `vin_type1`, `upgrade_day`)
COMMENT 'OLAP'
PARTITION BY RANGE(`upgrade_day`) (
  FROM ("2024-03-20") TO ("2024-03-31") INTERVAL 1 DAY
)
DISTRIBUTED BY HASH(`vin_type1`) BUCKETS 10
PROPERTIES (
  "replication_num" = "1"
);

CREATE TABLE `test2` (
  `batch_no` VARCHAR(100) NULL COMMENT 'batch_no',
  `vin_type2` VARCHAR(50) NULL COMMENT 'vin',
  `status` VARCHAR(50) COMMENT 'status',
  `upgrade_day` date not null COMMENT 'upgrade_day'
) ENGINE=OLAP
Duplicate KEY(`batch_no`,`vin_type2`)
COMMENT 'OLAP'
PARTITION BY RANGE(`upgrade_day`) (
  FROM ("2024-01-01") TO ("2024-01-10") INTERVAL 1 DAY
)
DISTRIBUTED BY HASH(`vin_type2`) BUCKETS 10
PROPERTIES (
  "replication_num" = "1"
);

Creating a partitioned MV that is partitioned by `t1.upgrade_day`, with the following query, will succeed:

select t1.upgrade_day, t1.batch_no, t1.vin_type1
from (
  SELECT batch_no, vin_type1, upgrade_day
  FROM test1
  where batch_no like 'c%'
  group by batch_no, vin_type1, upgrade_day
) t1
left join (
  select batch_no, vin_type2, status
  from test2
  group by batch_no, vin_type2, status
) t2 on t1.vin_type1 = t2.vin_type2;
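Both `test1` and `test2` are range-partitioned on a column named `upgrade_day`, so the column name alone does not say which base table the MV partition column follows. The sketch below shows one way to resolve the qualified column `t1.upgrade_day` to its base table. It is an illustration under assumptions, not the logic in `MaterializedViewUtils`: the class `PartitionLineage`, its methods, and the alias-to-table map are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class PartitionLineage {
    /** Query alias -> base table name, e.g. t1 -> test1 (hypothetical lineage info). */
    private final Map<String, String> aliasToTable = new HashMap<>();
    /** Base table name -> its partition column. */
    private final Map<String, String> partitionColumnOf = new HashMap<>();

    void addRelation(String alias, String table, String partitionColumn) {
        aliasToTable.put(alias, table);
        partitionColumnOf.put(table, partitionColumn);
    }

    /**
     * Resolves a qualified column such as "t1.upgrade_day" to its base table,
     * but only if that column is the table's partition column.
     */
    Optional<String> relatedPartitionTable(String qualifiedColumn) {
        int dot = qualifiedColumn.indexOf('.');
        if (dot < 0) {
            return Optional.empty(); // unqualified names are ambiguous here
        }
        String table = aliasToTable.get(qualifiedColumn.substring(0, dot));
        String column = qualifiedColumn.substring(dot + 1);
        if (table != null && column.equals(partitionColumnOf.get(table))) {
            return Optional.of(table);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        PartitionLineage lineage = new PartitionLineage();
        lineage.addRelation("t1", "test1", "upgrade_day");
        lineage.addRelation("t2", "test2", "upgrade_day");
        System.out.println(lineage.relatedPartitionTable("t1.upgrade_day"));
    }
}
```

Keying the lookup on the alias rather than the bare column name is what keeps `t1.upgrade_day` from being matched against `test2`.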
… optimize the fail reason (#35562)
This depends on #34781.
1. Materialized view partition tracking supports date_trunc, and the failure reason is optimized.
2. It supports creating a partitioned MV as follows. This MV will be updated partition by partition, by day:

CREATE MATERIALIZED VIEW mv_6
BUILD IMMEDIATE REFRESH AUTO ON MANUAL
partition by(date_trunc(date_alias, 'day'))
DISTRIBUTED BY RANDOM BUCKETS 2
PROPERTIES ('replication_num' = '1')
AS
SELECT
  date_trunc(t1.L_SHIPDATE, 'hour') as date_alias,
  t2.O_ORDERDATE,
  t1.L_QUANTITY,
  t2.O_ORDERSTATUS,
  count(distinct case when t1.L_SUPPKEY > 0 then t2.O_ORDERSTATUS else null end) as cnt_1
from (select * from lineitem where L_SHIPDATE in ('2017-01-30')) t1
left join (select * from orders where O_ORDERDATE in ('2017-01-30')) t2
  on t1.L_ORDERKEY = t2.O_ORDERKEY
group by
  t1.L_SHIPDATE,
  t2.O_ORDERDATE,
  t1.L_QUANTITY,
  t2.O_ORDERSTATUS;
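In the `mv_6` definition, the select list truncates `L_SHIPDATE` to the hour and the MV is then partitioned by truncating that alias to the day. One reason this can still map to day-level partitions is that truncating to a finer unit first does not change which day a timestamp falls in. A small sketch of that property, using `java.time` as a stand-in and assuming it matches Doris `date_trunc` for the 'hour' and 'day' units; the class and method names are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

final class DateTruncRollup {
    /** Mirrors date_trunc(date_trunc(ts, 'hour'), 'day'). */
    static LocalDateTime hourThenDay(LocalDateTime ts) {
        return ts.truncatedTo(ChronoUnit.HOURS).truncatedTo(ChronoUnit.DAYS);
    }

    /** Mirrors date_trunc(ts, 'day'). */
    static LocalDateTime dayOnly(LocalDateTime ts) {
        return ts.truncatedTo(ChronoUnit.DAYS);
    }

    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.of(2017, 1, 30, 13, 45, 10);
        // Both paths land on the same day boundary.
        System.out.println(hourThenDay(ts).equals(dayOnly(ts)));
    }
}
```

The reverse order would not work: a value already truncated to the day cannot be used to recover hour-level partitions.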
Proposed changes
Fix getting the wrong related partition table when multiple partitioned base tables exist.
For example, with the base table definitions of test1 and test2 given in the commit message, creating a partitioned MV that is partitioned by t1.upgrade_day will succeed.
Further comments
If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...