Having clause causes a DecodeNode to be wedged into two-stage Agg mistakenly #19901

Closed
satanson opened this issue Mar 21, 2023 · 0 comments
Labels: type/bug Something isn't working

satanson commented Mar 21, 2023

For the query select count(distinct c1), count(distinct c2) from t0 having count(1) > 0, when CTE optimization is disabled, low-cardinality dict optimization is enabled, and two-stage aggregation is adopted, a wrong plan is generated: a DecodeNode is wedged between the two stages of an aggregation whose aggregate function is multi_distinct_count. The 1st agg (below the DecodeNode) aggregates the dict-encoded input data into a Set, serializes it, and sends it to the 2nd agg; the 2nd agg (above the DecodeNode) deserializes the received data and treats it as a Set, which causes the BE to crash.

*** SIGSEGV (@0x7f5691bda000) received by PID 22950 (TID 0x7f59983f0700) from PID 18446744071859707904; stack trace: ***
    @          0x5764502 google::(anonymous namespace)::FailureSignalHandler()
    @     0x7f5ab9ec48e0 (unknown)
    @          0x35fac28 starrocks::vectorized::DistinctAggregateState<>::deserialize_and_merge()
    @          0x35faf16 starrocks::vectorized::TDistinctAggregateFunction<>::merge()
    @          0x35453ad starrocks::vectorized::NullableAggregateFunctionUnary<>::merge_batch_single_state()
    @          0x3006b66 starrocks::Aggregator::compute_single_agg_state()
    @          0x2f624c8 starrocks::pipeline::AggregateBlockingSinkOperator::push_chunk()

The root cause is that the 1st agg is rewritten before the 2nd agg when the dict optimization is applied; however, the 2nd agg fails to be rewritten because it has a having-clause that references an aggregate function, so the dict optimization cannot propagate upwards and a DecodeNode is interpolated between the two aggs.
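
Until the rewrite handles the having-clause case, either session setting below should keep the plan from taking this shape. This is a hedged workaround sketch: both variable names come from the reproduction below, and the assumption that 0 lets the planner pick the number of agg stages itself is mine, not confirmed here.

set cbo_enable_low_cardinality_optimize = false;  -- no dict rewrite, hence no DecodeNode in the plan
-- or
set new_planner_agg_stage = 0;                    -- assumed default: planner chooses the agg stages instead of being forced to 2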

Steps to reproduce the behavior (Required)

prepare data

DROP TABLE if exists t0;

CREATE TABLE if not exists t0
(
c0 DATE NOT NULL,
c1 VARCHAR(10)  NULL,
c2 VARCHAR(10)  NULL,
c3 VARCHAR(10)  NULL,
c4 VARCHAR(10)  NULL,
c5 VARCHAR(10)  NULL,
c6 VARCHAR(10)  NULL,
c7 VARCHAR(10)  NULL,
c8 VARCHAR(10)  NULL,
c9 VARCHAR(10)  NULL,
c10 VARCHAR(10)  NULL,
v1 DOUBLE NOT NULL,
v2 DOUBLE NOT NULL,
v3 DOUBLE NOT NULL
) ENGINE=OLAP
DUPLICATE KEY(`c0`, `c1`, `c2`, `c3`, `c4`)
COMMENT "OLAP"
PARTITION BY RANGE(c0) (
  START ("2022-01-01") END ("2022-01-31") EVERY (INTERVAL 1 day)
)
DISTRIBUTED BY HASH(`c5`, `c6`, `c7`) BUCKETS 6
PROPERTIES(
"replication_num" = "1",
"in_memory" = "false",
"storage_format" = "default"
);


INSERT INTO t0
  (c0, c1, c2, c3, c4, c5, c6, c7, c8, c9, c10, v1, v2, v3)
VALUES
  ('2022-01-04', 'xQ', 'b59UmFY', '', 'SYPa2uH7', 'vrm', 'CXWq', 'rU2', 'BlD', 'MIT', 'HYJ5jk', '2.714786950031216E8', '296.4384538888556', '-58.726291995630135'),
  ('2022-01-27', '0euCjBs', '', 'rGJHXd', 'IrydncZ', '5E91JECP', 'lPwKYAgm', '3R', 'EG66', 'Ez1PVy', 'MsPbSTm', '2.0005746107412485E11', '-9.1359063932587984E16', '3.326848242432247E12'),
  ('2022-01-16', 'MfvoM8Z', 'CU2B9S', 'zwni8', '6VoOV3H0', '0WnlYUI', 'jNFUs', 'cQJ7PV', 'AQMqFE2c', '', 'iDWrVbrZQ', '-14065.958506190851', '3.6146139731040496E16', '110570.45966442804'),
  ('2022-01-28', '5f3zd7', 'Zh2lzTX', '8ZGqDEjo', '', '', 'aORS1D', 'd', 'be3Z8FkOf', 'VYMf7VHl', 'fCIy', '7.122101484983402', '-153.20429643562775', '-2.7360878018214388E16'),
  ('2022-01-21', '5f3zd7', 'BWMlGD2I', '8DB15', '', '7KV', 'y', 'aR9zduCRP', 'zxCFBLm84', 'JNNx', 'aWn39', '-1.4624335108345786E12', '849.9895878900785', '-1.668452935736979E16'),
  ('2022-01-09', 'rpAfuwKF', 'b', 't57', 'ykag', 'Bt8', 'Ipsimss2', 'ql6eFnDp', 'FXvGWEcsL', 'EV4udP', 'QqLRQ', '-98.94609187992373', '-3.0821299053430746E17', '3.0035746980288867E12'),
  ('2022-01-05', 'F3s7zB', 'BWMlGD2I', 'S', 'ykag', 'Z0xUP', 'VcOBdx2E', '', 'VqD', 'K2', 's8xy', '-2.831111855557189E13', '8.93410356595643E7', '7.412004583979832E16'),
  ('2022-01-15', 'xQ', 'AVsg', 't57', '', '7U5R43h', 'Kq4', '', '2z6o', '8q', 'G', '550.1914650161069', '-2.794147625355034E12', '1.274014677154364');
  
analyze table t0;

query

set cbo_cte_reuse = false;
set cbo_enable_low_cardinality_optimize = true;
set new_planner_agg_stage = 2;

explain costs select count(distinct c1), count(distinct c2) from t0 having count(1) > 0;
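
To check which plan shape you got without reading the full cost output, a plain explain is enough. This is a quick-check sketch, assuming the textual plan prints the node names used in this report:

explain select count(distinct c1), count(distinct c2) from t0 having count(1) > 0;
-- bad plan : a Decode node appears between the two AGGREGATE nodes
-- good plan: only an EXCHANGE sits between them, as in the expected output below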

Expected behavior (Required)

no DecodeNode interpolated

                                                                                                                     |
|   3:AGGREGATE (merge finalize)                                                                                                                                                                                                                                                                                                                                                                    |
|   |  aggregate: multi_distinct_count[([10: count, VARCHAR, false]); args: VARCHAR; result: BIGINT; args nullable: true; result nullable: false], multi_distinct_count[([11: count, VARCHAR, false]); args: VARCHAR; result: BIGINT; args nullable: true; result nullable: false], count[([12: count, BIGINT, false]); args: TINYINT; result: BIGINT; args nullable: true; result nullable: false] |
|   |  having: [12: count, BIGINT, false] > 0, [12: count, BIGINT, false] > 0                                                                                                                                                                                                                                                                                                                       |
|   |  cardinality: 1                                                                                                                                                                                                                                                                                                                                                                               |
|   |  column statistics:                                                                                                                                                                                                                                                                                                                                                                           |
|   |  * count-->[0.0, 1.0, 0.0, 8.0, 1.0] ESTIMATE                                                                                                                                                                                                                                                                                                                                                 |
|   |  * count-->[0.0, 1.0, 0.0, 8.0, 1.0] ESTIMATE                                                                                                                                                                                                                                                                                                                                                 |
|   |  * count-->[0.0, 1.0, 0.0, 8.0, 1.0] ESTIMATE                                                                                                                                                                                                                                                                                                                                                 |
|   |                                                                                                                                                                                                                                                                                                                                                                                               |
|   2:EXCHANGE                                                                                                                                                                                                                                                                                                                                                                                      |
|      cardinality: 1                                                                                                                                                                                                                                                                                                                                                                               |
|                                                                                                                                                                                                                                                                                                                                                                                                   |
| PLAN FRAGMENT 1(F00)                                                                                                                                                                                                                                                                                                                                                                              |
|                                                                                                                                                                                                                                                                                                                                                                                                   |
|   Input Partition: RANDOM                                                                                                                                                                                                                                                                                                                                                                         |
|   OutPut Partition: UNPARTITIONED                                                                                                                                                                                                                                                                                                                                                                 |
|   OutPut Exchange Id: 02                                                                                                                                                                                                                                                                                                                                                                          |
|                                                                                                                                                                                                                                                                                                                                                                                                   |
|   1:AGGREGATE (update serialize)                                                                                                                                                                                                                                                                                                                                                                  |
|   |  aggregate: multi_distinct_count[([2: c1, VARCHAR, false]); args: VARCHAR; result: VARCHAR; args nullable: false; result nullable: false], multi_distinct_count[([3: c2, VARCHAR, false]); args: VARCHAR; result: VARCHAR; args nullable: false; result nullable: false], count[(1); args: TINYINT; result: BIGINT; args nullable: false; result nullable: false]                             |
|   |  cardinality: 1                                                                                                                                                                                                                                                                                                                                                     

Real behavior (Required)

DecodeNode interpolated


StarRocks version (Required)

  • 2.5.2
  • main