Conversation

@keanji-x
Contributor

@keanji-x keanji-x commented Nov 23, 2023

Proposed changes

select count(null) --> select 0
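A minimal sketch of why this rewrite is safe, assuming standard SQL semantics in which count(expr) only counts rows where expr is non-NULL. The class and method names below are hypothetical and are not taken from the Doris code base.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; none of these names come from Doris.
final class CountNullSketch {
    // count(expr): number of rows for which expr evaluates to a non-NULL value.
    static long count(List<?> argumentPerRow) {
        long n = 0;
        for (Object value : argumentPerRow) {
            if (value != null) {
                n++;
            }
        }
        return n;
    }

    // Evaluates count(NULL) over a table with the given number of rows:
    // the argument is NULL on every row, whatever the table contains.
    static long countNullOver(int rowCount) {
        List<Object> argumentPerRow = Collections.nCopies(rowCount, null);
        return count(argumentPerRow);
    }

    public static void main(String[] args) {
        for (int rows : new int[] {0, 1, 1000}) {
            System.out.println(rows + " rows -> count(NULL) = " + countNullOver(rows));
        }
    }
}
```

Since the result does not depend on the input rows, a scalar count(null) can be replaced by the literal 0 without reading the table.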


@keanji-x keanji-x force-pushed the count_null branch 2 times, most recently from 70bb4e5 to 5d8f456, November 23, 2023 08:24
@keanji-x
Contributor Author

run buildall

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit 5d8f456f071885ac64983670f855eb93bb193bb3, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	4870	4669	4642	4642
q2	361	166	166	166
q3	2056	1931	1956	1931
q4	1389	1288	1223	1223
q5	3986	3959	4068	3959
q6	253	129	129	129
q7	1447	872	898	872
q8	2814	2821	2805	2805
q9	9824	9566	9532	9532
q10	3641	3544	3546	3544
q11	381	254	252	252
q12	435	284	290	284
q13	4579	3840	3825	3825
q14	319	298	307	298
q15	587	533	519	519
q16	653	583	584	583
q17	1146	971	954	954
q18	7908	7494	7330	7330
q19	1677	1686	1686	1686
q20	558	323	305	305
q21	4390	3949	4015	3949
q22	468	377	377	377
Total cold run time: 53742 ms
Total hot run time: 49165 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	4571	4601	4585	4585
q2	340	226	267	226
q3	4019	4010	4022	4010
q4	2716	2709	2699	2699
q5	9644	9626	9563	9563
q6	245	122	123	122
q7	3013	2473	2485	2473
q8	4440	4491	4492	4491
q9	12949	12848	12869	12848
q10	4077	4165	4157	4157
q11	848	625	654	625
q12	967	822	810	810
q13	4318	3543	3562	3543
q14	371	357	367	357
q15	578	521	521	521
q16	738	701	678	678
q17	3872	3969	3863	3863
q18	9608	9147	9167	9147
q19	1822	1775	1801	1775
q20	2387	2065	2077	2065
q21	8857	8705	8670	8670
q22	951	787	822	787
Total cold run time: 81331 ms
Total hot run time: 78015 ms

@doris-robot

(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 43.68 seconds
stream load tsv: 572 seconds loaded 74807831229 Bytes, about 124 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.6 seconds inserted 10000000 Rows, about 349K ops/s
storage size: 17098987900 Bytes

@keanji-x
Contributor Author

run buildall

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit b1df75960cd2296442b210a580770b4bf9f2064f, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	4920	4663	4656	4656
q2	364	159	159	159
q3	2035	1923	1923	1923
q4	1377	1250	1245	1245
q5	3963	3975	4056	3975
q6	249	131	134	131
q7	1441	892	864	864
q8	2804	2802	2794	2794
q9	9705	9584	9422	9422
q10	3455	3529	3538	3529
q11	379	252	237	237
q12	443	292	292	292
q13	4578	3784	3780	3780
q14	319	287	287	287
q15	584	536	523	523
q16	660	573	586	573
q17	1142	964	937	937
q18	7928	7357	7416	7357
q19	1682	1698	1684	1684
q20	569	319	293	293
q21	4356	3988	3943	3943
q22	472	374	363	363
Total cold run time: 53425 ms
Total hot run time: 48967 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	4562	4585	4579	4579
q2	338	234	249	234
q3	4006	3979	4013	3979
q4	2710	2700	2706	2700
q5	9644	9624	9632	9624
q6	247	122	127	122
q7	3042	2509	2503	2503
q8	4442	4504	4473	4473
q9	12998	12822	12909	12822
q10	4082	4172	4190	4172
q11	763	630	631	630
q12	977	822	806	806
q13	4271	3539	3595	3539
q14	375	343	357	343
q15	575	520	528	520
q16	730	688	667	667
q17	3898	3878	3882	3878
q18	9616	9065	9108	9065
q19	1783	1822	1776	1776
q20	2407	2060	2040	2040
q21	8825	8450	8753	8450
q22	863	777	776	776
Total cold run time: 81154 ms
Total hot run time: 77698 ms

@doris-robot

(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.27 seconds
stream load tsv: 581 seconds loaded 74807831229 Bytes, about 122 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 33 seconds loaded 861443392 Bytes, about 24 MB/s
insert into select: 28.3 seconds inserted 10000000 Rows, about 353K ops/s
storage size: 17101303615 Bytes

@keanji-x keanji-x force-pushed the count_null branch 2 times, most recently from 2cf4f86 to 9e7f82c, November 23, 2023 14:00
@keanji-x
Contributor Author

run buildall

@doris-robot

(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.1 seconds
stream load tsv: 584 seconds loaded 74807831229 Bytes, about 122 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.9 seconds inserted 10000000 Rows, about 346K ops/s
storage size: 17101340515 Bytes

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit 9e7f82c2e91e687d9da113831c99f08fa356ce69, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	4921	4667	4631	4631
q2	355	161	155	155
q3	2027	1888	1920	1888
q4	1394	1302	1291	1291
q5	3972	3965	4031	3965
q6	246	131	133	131
q7	1429	886	880	880
q8	2802	2792	2792	2792
q9	9815	9554	9409	9409
q10	3472	3539	3487	3487
q11	381	241	251	241
q12	447	289	297	289
q13	4581	3848	3802	3802
q14	328	289	292	289
q15	591	535	519	519
q16	664	583	591	583
q17	1162	977	941	941
q18	7874	7384	7431	7384
q19	1675	1703	1681	1681
q20	561	303	303	303
q21	4462	3986	3969	3969
q22	477	382	380	380
Total cold run time: 53636 ms
Total hot run time: 49010 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	4579	4598	4600	4598
q2	372	239	252	239
q3	4033	4011	3979	3979
q4	2710	2705	2713	2705
q5	9726	9622	9680	9622
q6	247	124	126	124
q7	3039	2495	2511	2495
q8	4464	4445	4439	4439
q9	12990	12888	12858	12858
q10	4070	4190	4173	4173
q11	798	667	640	640
q12	976	817	820	817
q13	4321	3541	3567	3541
q14	377	348	343	343
q15	583	525	517	517
q16	740	677	664	664
q17	3852	3821	3846	3821
q18	9505	9077	9200	9077
q19	1846	1785	1799	1785
q20	2396	2072	2065	2065
q21	8841	8701	8605	8605
q22	890	808	803	803
Total cold run time: 81355 ms
Total hot run time: 77910 ms

     * Whether the expression is a constant.
     */
    public boolean isConstant() {
        if (this instanceof AggregateFunction) {
Contributor

Do we also need to handle window expressions and TVFs here?

Contributor Author

@keanji-x keanji-x Nov 28, 2023

No need. It is only used when folding scalar constants and when rewriting count(literal).
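A rough sketch of the idea behind the check, for readers who do not have the diff open. The excerpt above is cut off after the instanceof test, so the `return false` branch below is an assumption, as are all class names except `AggregateFunction` and `isConstant`. The point it illustrates: an aggregate such as count(1) has only literal arguments, yet its value depends on how many rows are read, so it must not be folded like a scalar constant.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical miniature expression tree; only the names isConstant and
// AggregateFunction echo the diff, and every method body is an assumption.
final class IsConstantSketch {
    abstract static class Expr {
        final List<Expr> children;

        Expr(Expr... children) {
            this.children = Arrays.asList(children);
        }

        boolean isConstant() {
            if (this instanceof AggregateFunction) {
                // Assumed behaviour: the value depends on the input rows,
                // even when every argument is a literal.
                return false;
            }
            for (Expr child : children) {
                if (!child.isConstant()) {
                    return false;
                }
            }
            return true;
        }
    }

    static final class Literal extends Expr {
        final Object value;

        Literal(Object value) {
            this.value = value;
        }
    }

    // A column reference; never constant.
    static final class SlotRef extends Expr {
        final String name;

        SlotRef(String name) {
            this.name = name;
        }

        @Override
        boolean isConstant() {
            return false;
        }
    }

    static final class Add extends Expr {
        Add(Expr left, Expr right) {
            super(left, right);
        }
    }

    abstract static class AggregateFunction extends Expr {
        AggregateFunction(Expr... args) {
            super(args);
        }
    }

    static final class Count extends AggregateFunction {
        Count(Expr arg) {
            super(arg);
        }
    }

    public static void main(String[] args) {
        System.out.println("1 + 2    constant? " + new Add(new Literal(1), new Literal(2)).isConstant());
        System.out.println("count(1) constant? " + new Count(new Literal(1)).isConstant());
    }
}
```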

Comment on lines 59 to 60
// if there is no group by keys and other agg func, just return the one row Relations, such as
// select count(null) from t
Contributor

This is not right when the base table is empty: a grouped query then returns no rows, not one. Consider:

select 'abc' from t group by 'abc'
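To spell out the objection, here is a small model of output cardinality under standard SQL aggregation semantics. It is only an illustration with hypothetical names and does not reflect how Doris implements aggregation. A scalar aggregate (no GROUP BY) always yields exactly one row, while a grouped aggregate yields one row per group, so an empty input yields no groups and no rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of aggregation output; not Doris code.
final class EmptyInputAggSketch {
    // select count(*) from t : no GROUP BY, so exactly one output row for any input.
    static List<Long> scalarCount(List<String> rows) {
        return Collections.singletonList((long) rows.size());
    }

    // select 'abc' from t group by 'abc' : one output row per distinct key value seen.
    static List<String> groupByConstant(List<String> rows, String constantKey) {
        Set<String> groups = new LinkedHashSet<>();
        for (String ignoredRow : rows) {
            groups.add(constantKey); // every input row falls into the same group
        }
        return new ArrayList<>(groups);
    }

    public static void main(String[] args) {
        List<String> empty = Collections.emptyList();
        List<String> twoRows = Arrays.asList("r1", "r2");
        System.out.println("scalar count, empty t:  " + scalarCount(empty));
        System.out.println("group by 'abc', empty t: " + groupByConstant(empty, "abc"));
        System.out.println("group by 'abc', 2 rows:  " + groupByConstant(twoRows, "abc"));
    }
}
```

Replacing the grouped query with a one-row relation would therefore return one row too many whenever t is empty.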

Contributor Author

fixed

@keanji-x keanji-x force-pushed the count_null branch 2 times, most recently from 40e44eb to 555c9e7, November 28, 2023 02:20
@keanji-x
Contributor Author

run buildall

@doris-robot

(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 43.87 seconds
stream load tsv: 581 seconds loaded 74807831229 Bytes, about 122 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 29.1 seconds inserted 10000000 Rows, about 343K ops/s
storage size: 17099299249 Bytes

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit ad6eaa561952eae0683e14b3be67282e8b883f7f, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	4927	4676	4647	4647
q2	364	154	157	154
q3	2038	1887	1904	1887
q4	1399	1268	1255	1255
q5	3992	4032	3995	3995
q6	256	131	128	128
q7	1348	852	849	849
q8	2812	2825	2771	2771
q9	9808	11089	9800	9800
q10	3471	3520	3532	3520
q11	380	262	241	241
q12	441	288	304	288
q13	4563	3860	3821	3821
q14	325	295	291	291
q15	591	539	513	513
q16	493	464	464	464
q17	1142	974	949	949
q18	7890	7403	7392	7392
q19	1697	1652	1678	1652
q20	570	334	350	334
q21	4454	3997	4200	3997
q22	481	371	384	371
Total cold run time: 53442 ms
Total hot run time: 49319 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	4600	4588	4590	4588
q2	343	232	256	232
q3	4048	4013	4012	4012
q4	2727	2727	2724	2724
q5	9677	9570	9728	9570
q6	243	119	120	119
q7	2993	2502	2494	2494
q8	4486	4521	4458	4458
q9	13273	13092	13058	13058
q10	4046	4148	4145	4145
q11	809	636	675	636
q12	967	829	789	789
q13	4306	3582	3592	3582
q14	384	343	343	343
q15	577	521	523	521
q16	603	568	564	564
q17	3916	3788	3875	3788
q18	9756	9190	9112	9112
q19	1822	1791	1766	1766
q20	2398	2057	2047	2047
q21	8763	8499	8737	8499
q22	877	770	781	770
Total cold run time: 81614 ms
Total hot run time: 77817 ms

@keanji-x keanji-x requested a review from morrySnow November 28, 2023 05:29
@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Nov 28, 2023
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@jackwener jackwener merged commit 3d46643 into apache:master Nov 28, 2023
XuJianxu pushed a commit to XuJianxu/doris that referenced this pull request Dec 14, 2023
Labels

approved Indicates a PR has been approved by one committer. dev/2.0.4 reviewed

5 participants