[improvement](statistics) Remove collect external table row count task. #34487

Merged
morrySnow merged 1 commit into apache:master from Jibing-Li:externalrowcount
May 8, 2024

Conversation

@Jibing-Li
Contributor

Previously, we created an analyze task for each external table to collect the table's row count. This is resource-consuming and unnecessary: the planner doesn't use the table row count, it fetches the row count from the column row count. This PR removes the row count task for external tables.
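
To illustrate the reasoning in the description, here is a toy model in Python, not Doris code. The `ColumnStats` record, the `estimate_row_count` helper, and the choice of taking the largest per-column count are all hypothetical; the only point carried over from the PR is that the row count is read from column statistics, so a separate table-level row count is never consulted.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ColumnStats:
    """Hypothetical per-column statistics record (not the Doris class)."""
    row_count: int  # rows seen when this column was analyzed
    ndv: int        # number of distinct values


def estimate_row_count(column_stats: Dict[str, ColumnStats]) -> Optional[int]:
    """Return a row-count estimate taken only from column statistics."""
    if not column_stats:
        return None  # nothing analyzed yet; the caller must fall back
    # Assumption for this sketch: use the largest per-column count.
    return max(s.row_count for s in column_stats.values())


# Made-up statistics for two columns of an external table.
stats = {
    "o_orderkey": ColumnStats(row_count=1_500_000, ndv=1_500_000),
    "o_orderstatus": ColumnStats(row_count=1_500_000, ndv=3),
}
print(estimate_row_count(stats))
```

In this model a dedicated row count task would produce a number that nothing reads, which is the redundancy the PR removes.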

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@Jibing-Li
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 40140 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit dbb37359542f4b109a43ea9cc55d056ac8cab758, data reload: false

------ Round 1 ----------------------------------
q1	18007	4445	4356	4356
q2	2480	200	197	197
q3	10963	1311	1199	1199
q4	10523	736	789	736
q5	7502	2738	2635	2635
q6	216	134	139	134
q7	1043	625	597	597
q8	9424	2126	2056	2056
q9	8971	6606	6551	6551
q10	8913	3709	3685	3685
q11	462	252	253	252
q12	518	219	222	219
q13	18961	2919	2944	2919
q14	256	216	218	216
q15	527	473	475	473
q16	513	390	376	376
q17	963	699	745	699
q18	7880	7467	7405	7405
q19	2469	1542	1505	1505
q20	647	317	310	310
q21	5098	3332	4243	3332
q22	371	290	288	288
Total cold run time: 116707 ms
Total hot run time: 40140 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4353	4214	4212	4212
q2	378	270	299	270
q3	2947	2781	2687	2687
q4	1870	1582	1530	1530
q5	5277	5253	5271	5253
q6	213	126	126	126
q7	2242	1893	1872	1872
q8	3181	3387	3365	3365
q9	8420	8426	8425	8425
q10	3876	3699	3695	3695
q11	609	485	481	481
q12	768	587	597	587
q13	16254	3010	2942	2942
q14	300	268	255	255
q15	509	475	477	475
q16	473	417	413	413
q17	1759	1491	1466	1466
q18	7680	7570	7408	7408
q19	3901	1574	1579	1574
q20	1962	1751	1750	1750
q21	5136	4914	4850	4850
q22	576	498	506	498
Total cold run time: 72684 ms
Total hot run time: 54134 ms
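
The report does not label its columns. Reading each row as query name, cold run, two hot runs, and the faster of the two hot runs (an interpretation, not something the bot states), the Round 1 totals can be reproduced with a short Python sketch. The `summarize` helper is a name made up for this example.

```python
# Round 1 rows copied from the TPC-H report: query name, then four timings in ms.
ROUND_1 = """
q1 18007 4445 4356 4356
q2 2480 200 197 197
q3 10963 1311 1199 1199
q4 10523 736 789 736
q5 7502 2738 2635 2635
q6 216 134 139 134
q7 1043 625 597 597
q8 9424 2126 2056 2056
q9 8971 6606 6551 6551
q10 8913 3709 3685 3685
q11 462 252 253 252
q12 518 219 222 219
q13 18961 2919 2944 2919
q14 256 216 218 216
q15 527 473 475 473
q16 513 390 376 376
q17 963 699 745 699
q18 7880 7467 7405 7405
q19 2469 1542 1505 1505
q20 647 317 310 310
q21 5098 3332 4243 3332
q22 371 290 288 288
"""


def summarize(report: str):
    """Return (cold_total, hot_total) in ms for a block of report rows."""
    cold_total = 0
    hot_total = 0
    for line in report.strip().splitlines():
        name, cold, hot1, hot2, best = line.split()
        # Assumed column meaning: the last value is the faster hot run.
        if int(best) != min(int(hot1), int(hot2)):
            raise ValueError(f"unexpected column layout in row {name}")
        cold_total += int(cold)
        hot_total += int(best)
    return cold_total, hot_total


cold_total, hot_total = summarize(ROUND_1)
print(cold_total, hot_total)  # 116707 40140
```

Both sums match the "Total cold run time" and "Total hot run time" lines reported for Round 1, which supports this reading of the columns.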

@doris-robot

TPC-DS: Total hot run time: 185191 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit dbb37359542f4b109a43ea9cc55d056ac8cab758, data reload: false

query1	932	375	348	348
query2	6476	2356	2346	2346
query3	6655	207	212	207
query4	23056	21168	21276	21168
query5	4141	410	432	410
query6	294	177	175	175
query7	4583	292	296	292
query8	244	195	198	195
query9	8401	2441	2419	2419
query10	450	253	260	253
query11	14914	14146	14097	14097
query12	139	92	89	89
query13	1641	384	372	372
query14	10538	8007	6711	6711
query15	250	167	177	167
query16	8170	267	273	267
query17	1864	578	564	564
query18	2117	290	282	282
query19	332	154	158	154
query20	94	93	86	86
query21	198	127	130	127
query22	4978	4810	4808	4808
query23	33942	33236	33198	33198
query24	10568	2896	2895	2895
query25	645	393	378	378
query26	1494	168	158	158
query27	2922	327	336	327
query28	7628	2076	2071	2071
query29	942	641	620	620
query30	294	153	164	153
query31	945	738	738	738
query32	94	59	62	59
query33	759	263	251	251
query34	1033	497	495	495
query35	812	692	671	671
query36	1092	899	937	899
query37	143	69	69	69
query38	2891	2750	2755	2750
query39	1580	1542	1542	1542
query40	282	136	126	126
query41	44	42	41	41
query42	103	97	98	97
query43	575	534	532	532
query44	1182	742	752	742
query45	269	242	251	242
query46	1085	711	720	711
query47	1929	1839	1940	1839
query48	371	293	297	293
query49	1137	397	396	396
query50	784	409	394	394
query51	6698	6602	6734	6602
query52	101	91	89	89
query53	360	276	290	276
query54	320	237	244	237
query55	81	75	71	71
query56	237	234	220	220
query57	1223	1137	1124	1124
query58	217	199	209	199
query59	3376	3117	2937	2937
query60	271	235	231	231
query61	90	87	88	87
query62	692	455	440	440
query63	307	282	287	282
query64	9615	7256	7136	7136
query65	3139	3074	3033	3033
query66	1385	364	342	342
query67	15706	14993	14972	14972
query68	9042	539	549	539
query69	526	317	308	308
query70	1144	1069	1118	1069
query71	493	271	283	271
query72	7938	2561	2377	2377
query73	811	325	328	325
query74	6518	6088	6129	6088
query75	4358	2644	2656	2644
query76	5373	1059	976	976
query77	623	261	261	261
query78	11014	10297	10268	10268
query79	12277	526	520	520
query80	1643	449	431	431
query81	498	219	225	219
query82	634	95	91	91
query83	201	165	161	161
query84	264	85	84	84
query85	1324	268	258	258
query86	387	297	296	296
query87	3302	3127	3081	3081
query88	5261	2434	2449	2434
query89	534	378	374	374
query90	2015	185	189	185
query91	127	97	98	97
query92	59	49	51	49
query93	7556	520	503	503
query94	1263	186	185	185
query95	396	306	310	306
query96	595	268	266	266
query97	3168	2928	2940	2928
query98	240	217	211	211
query99	1254	884	899	884
Total cold run time: 310668 ms
Total hot run time: 185191 ms

@Jibing-Li
Contributor Author

run buildall

@Jibing-Li marked this pull request as ready for review May 7, 2024 14:30
@Jibing-Li force-pushed the externalrowcount branch 2 times, most recently from 8b7eca5 to 2de7018 May 8, 2024 03:54
@Jibing-Li force-pushed the externalrowcount branch from 2de7018 to 3b01346 May 8, 2024 04:08

@morningman (Contributor) left a comment

LGTM

@github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) May 8, 2024

@github-actions
Contributor

github-actions bot commented May 8, 2024

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

github-actions bot commented May 8, 2024

PR approved by anyone and no changes requested.

@Jibing-Li
Contributor Author

run buildall

@Jibing-Li
Contributor Author

run buildall

@morrySnow merged commit 3713bb9 into apache:master May 8, 2024
@Jibing-Li deleted the externalrowcount branch May 8, 2024 11:18
ByteYue pushed a commit to ByteYue/doris that referenced this pull request May 15, 2024
Previously, we created an analyze task
for each external table to collect the table's row count.
This is resource-consuming and unnecessary:
the planner doesn't use the table row count,
it fetches the row count from the column row count.
This PR removes the row count task for external tables.

Labels

approved: Indicates a PR has been approved by one committer.
dev/3.0.0-merged
not-merge/2.0: do not merge into 2.0 branch
not-merge/2.1
reviewed

6 participants