planner: rewrite avg for mpp #1

Open: wants to merge 2 commits into base: shuffle-agg

fzhedu (Owner) commented Jan 6, 2021

What problem does this PR solve?

Issue Number: close #xxx

Problem Summary: avg cannot be pushed down to TiFlash in its entirety, so we rewrite it as case count(arg) when 0 then null else sum(arg)/count(arg) end. The sum(arg) and count(arg) can then be pushed down.
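
To illustrate why the rewrite is safe to split across nodes, here is a minimal Python sketch of the idea, not the planner code in this PR. The function names (`partial_agg`, `final_agg`, `rewritten_avg`) and the sample rows are hypothetical. It assumes each node computes count(arg) and sum(arg) per group, a final stage adds the partial counts and sums together, and the case expression is evaluated last.

```python
from collections import defaultdict
from decimal import Decimal


def partial_agg(rows):
    """First stage (per node): count(arg) and sum(arg) per group, skipping NULLs."""
    acc = defaultdict(lambda: [0, None])
    for key, arg in rows:
        state = acc[key]  # the group exists even if every arg in it is NULL
        if arg is not None:
            state[0] += 1
            state[1] = arg if state[1] is None else state[1] + arg
    return {k: tuple(v) for k, v in acc.items()}


def final_agg(partials):
    """Second stage: merge partial results by summing counts and summing sums."""
    merged = defaultdict(lambda: [0, None])
    for part in partials:
        for key, (cnt, total) in part.items():
            state = merged[key]
            state[0] += cnt
            if total is not None:
                state[1] = total if state[1] is None else state[1] + total
    return {k: tuple(v) for k, v in merged.items()}


def rewritten_avg(cnt, total):
    """case count(arg) when 0 then null else sum(arg)/count(arg) end"""
    return None if cnt == 0 else Decimal(total) / Decimal(cnt)


# Hypothetical (value, id) rows spread over two nodes.
node_a = [(10, 9), (6, 4), (11, None)]
node_b = [(6, 4), (10, 9)]
merged = final_agg([partial_agg(node_a), partial_agg(node_b)])
result = {value: rewritten_avg(cnt, total) for value, (cnt, total) in merged.items()}
```

In this sketch a group whose arguments are all NULL ends up with a count of 0, so the case expression returns NULL instead of dividing by zero, which is the behaviour avg itself has.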

What is changed and how it works?

Proposal: xxx

What's Changed:

How it Works:

Related changes

  • PR to update pingcap/docs/pingcap/docs-cn:
  • Need to cherry-pick to the release branch

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)

mysql>  set @@tidb_allow_mpp=1; set @@tidb_opt_broadcast_join=1;
Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

mysql> select value, avg(id)+1 as b from t1 group by value;
+-------+---------+
| value | b       |
+-------+---------+
|    11 |    NULL |
|    12 |    NULL |
|    10 | 10.0000 |
|     6 |  5.0000 |
|     2 |  2.0000 |
|     5 |  4.0000 |
|     8 |  8.0000 |
|     4 |  4.0000 |
+-------+---------+
8 rows in set (0.01 sec)

mysql> desc select value, avg(id)+1 as b from t1 group by value;
+------------------------------------+---------+--------------+---------------+--------------------------------------------------------------------------------------------------------------------------------------+
| id                                 | estRows | task         | access object | operator info                                                                                                                        |
+------------------------------------+---------+--------------+---------------+--------------------------------------------------------------------------------------------------------------------------------------+
| Projection_4                       | 7.20    | root         |               | test.t1.value, plus(case(eq(Column#4, 0), <nil>, div(Column#5, cast(Column#4, decimal(20,0) BINARY))), 1)->Column#6                  |
| └─TableReader_32                   | 7.20    | root         |               | data:ExchangeSender_31                                                                                                               |
|   └─ExchangeSender_31              | 7.20    | cop[tiflash] |               | ExchangeType: PassThrough                                                                                                            |
|     └─HashAgg_28                   | 7.20    | cop[tiflash] |               | group by:test.t1.value, funcs:sum(Column#14)->Column#4, funcs:sum(Column#15)->Column#5, funcs:firstrow(test.t1.value)->test.t1.value |
|       └─ExchangeReceiver_30        | 7.20    | cop[tiflash] |               |                                                                                                                                      |
|         └─ExchangeSender_29        | 7.20    | cop[tiflash] |               | ExchangeType: HashPartition, Hash Cols: test.t1.value                                                                                |
|           └─HashAgg_10             | 7.20    | cop[tiflash] |               | group by:test.t1.value, funcs:count(test.t1.id)->Column#14, funcs:sum(test.t1.id)->Column#15                                         |
|             └─TableFullScan_27     | 9.00    | cop[tiflash] | table:t1      | keep order:false, stats:pseudo                                                                                                       |
+------------------------------------+---------+--------------+---------------+--------------------------------------------------------------------------------------------------------------------------------------+
8 rows in set (0.00 sec)

mysql> select  avg(id)+1 from t1 where value=-1;
+-----------+
| avg(id)+1 |
+-----------+
|      NULL |
+-----------+
1 row in set (0.01 sec)

mysql> desc select  avg(id)+1 from t1 where value=-1;
+--------------------------------+---------+--------------+---------------+------------------------------------------------------------------------------------------------------+
| id                             | estRows | task         | access object | operator info                                                                                        |
+--------------------------------+---------+--------------+---------------+------------------------------------------------------------------------------------------------------+
| Projection_5                   | 1.00    | root         |               | plus(case(eq(Column#4, 0), <nil>, div(Column#5, cast(Column#4, decimal(20,0) BINARY))), 1)->Column#6 |
| └─StreamAgg_12                 | 1.00    | root         |               | funcs:count(Column#16)->Column#4, funcs:sum(Column#17)->Column#5                                     |
|   └─Projection_36              | 0.01    | root         |               | test.t1.id, cast(test.t1.id, decimal(32,0) BINARY)->Column#17                                        |
|     └─TableReader_28           | 0.01    | root         |               | data:Selection_27                                                                                    |
|       └─Selection_27           | 0.01    | cop[tiflash] |               | eq(test.t1.value, -1)                                                                                |
|         └─TableFullScan_26     | 9.00    | cop[tiflash] | table:t1      | keep order:false, stats:pseudo                                                                       |
+--------------------------------+---------+--------------+---------------+------------------------------------------------------------------------------------------------------+
6 rows in set (0.00 sec)

mysql> select value, avg(id) from t1 where value=-1 group by value;
Empty set (0.01 sec)

mysql> desc select value, avg(id) from t1 where value=-1 group by value;
+--------------------------------------+---------+--------------+---------------+--------------------------------------------------------------------------------------------------------------------------------------+
| id                                   | estRows | task         | access object | operator info                                                                                                                        |
+--------------------------------------+---------+--------------+---------------+--------------------------------------------------------------------------------------------------------------------------------------+
| Projection_5                         | 1.00    | root         |               | test.t1.value, case(eq(Column#4, 0), <nil>, div(Column#5, cast(Column#4, decimal(20,0) BINARY)))->Column#6                           |
| └─TableReader_40                     | 1.00    | root         |               | data:ExchangeSender_39                                                                                                               |
|   └─ExchangeSender_39                | 1.00    | cop[tiflash] |               | ExchangeType: PassThrough                                                                                                            |
|     └─HashAgg_36                     | 1.00    | cop[tiflash] |               | group by:test.t1.value, funcs:sum(Column#14)->Column#4, funcs:sum(Column#15)->Column#5, funcs:firstrow(test.t1.value)->test.t1.value |
|       └─ExchangeReceiver_38          | 1.00    | cop[tiflash] |               |                                                                                                                                      |
|         └─ExchangeSender_37          | 1.00    | cop[tiflash] |               | ExchangeType: HashPartition, Hash Cols: test.t1.value                                                                                |
|           └─HashAgg_11               | 1.00    | cop[tiflash] |               | group by:test.t1.value, funcs:count(test.t1.id)->Column#14, funcs:sum(test.t1.id)->Column#15                                         |
|             └─Selection_35           | 0.01    | cop[tiflash] |               | eq(test.t1.value, -1)                                                                                                                |
|               └─TableFullScan_34     | 9.00    | cop[tiflash] | table:t1      | keep order:false, stats:pseudo                                                                                                       |
+--------------------------------------+---------+--------------+---------------+--------------------------------------------------------------------------------------------------------------------------------------+
9 rows in set (0.00 sec)

Side effects

  • Performance regression
    • Consumes more CPU
    • Consumes more MEM
  • Breaking backward compatibility

Release note

  • No release note
