[fix](statistics) Cast min/max in partition stats table to double for numeric column to get correct table level min/max value. #34919

Merged
Jibing-Li merged 1 commit into apache:master from Jibing-Li:numminmax
May 17, 2024

@Jibing-Li Jibing-Li commented May 15, 2024

Min/max values in the partition statistics table are stored as strings. When partition stats are merged to produce table stats, the values for numeric columns need to be cast to double first; otherwise the table-level min/max is incorrect, because the strings are compared in lexicographic (alphabetical) order.

This feature is currently disabled by default. It will be enabled later, once more test cases have been added.
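
To illustrate the problem described above, here is a minimal Python sketch. It is not Doris code: the partition values and the helper name `merge_min_max` are hypothetical, chosen only to show how merging string-typed min/max values gives lexicographic results, while casting to a double-precision number first gives the numeric result.

```python
# Hypothetical per-partition (min, max) pairs, stored as strings, as the PR
# description says the partition statistics table stores them.
partition_stats = [("9", "20"), ("100", "150"), ("3", "8")]


def merge_min_max(stats, cast=None):
    """Merge per-partition (min, max) pairs into one table-level pair.

    If `cast` is given, every value is converted with it before comparison;
    otherwise the raw strings are compared as they are.
    """
    convert = cast if cast is not None else (lambda value: value)
    mins = [convert(low) for low, _ in stats]
    maxs = [convert(high) for _, high in stats]
    return min(mins), max(maxs)


# Without a cast, strings are compared character by character.
string_result = merge_min_max(partition_stats)
# Python's float is an IEEE 754 double, standing in for a cast to double.
numeric_result = merge_min_max(partition_stats, cast=float)

print(string_result)   # ('100', '8')
print(numeric_result)  # (3.0, 150.0)
```

In the string case the merged "min" is `'100'` and the merged "max" is `'8'`, because `'1'` sorts before `'3'` and `'9'`; after the cast the merged range is the expected 3 to 150.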

@Jibing-Li Jibing-Li marked this pull request as ready for review May 15, 2024 11:22
@Jibing-Li
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 41710 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2f8a32d8cd2ca935839ec126f97a1d082634ee93, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17652	4323	4261	4261
q2	2028	191	186	186
q3	10459	1267	1157	1157
q4	10204	748	775	748
q5	7501	2741	2718	2718
q6	220	137	131	131
q7	1021	618	585	585
q8	9223	2143	2103	2103
q9	9434	6765	6667	6667
q10	9251	3920	3912	3912
q11	446	242	244	242
q12	475	217	218	217
q13	17221	3139	3243	3139
q14	283	234	219	219
q15	521	460	479	460
q16	496	387	385	385
q17	996	726	675	675
q18	8289	7925	7754	7754
q19	5597	1576	1583	1576
q20	635	310	303	303
q21	5189	4199	3999	3999
q22	352	276	273	273
Total cold run time: 117493 ms
Total hot run time: 41710 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4544	4422	4433	4422
q2	382	263	270	263
q3	3153	2948	2822	2822
q4	1860	1599	1625	1599
q5	5483	5471	5527	5471
q6	211	121	124	121
q7	2324	1987	1983	1983
q8	3277	3429	3370	3370
q9	8668	8708	8696	8696
q10	3875	3735	3860	3735
q11	582	490	487	487
q12	805	614	619	614
q13	16868	3116	3169	3116
q14	303	273	269	269
q15	532	483	481	481
q16	478	425	432	425
q17	1766	1480	1482	1480
q18	7741	7590	7414	7414
q19	1660	1563	1579	1563
q20	1931	1748	1774	1748
q21	7845	4798	4874	4798
q22	614	497	478	478
Total cold run time: 74902 ms
Total hot run time: 55355 ms

@doris-robot

TPC-DS: Total hot run time: 187673 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2f8a32d8cd2ca935839ec126f97a1d082634ee93, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	925	378	366	366
query2	6437	2422	2428	2422
query3	6643	223	216	216
query4	23057	21340	21107	21107
query5	4112	415	415	415
query6	257	172	173	172
query7	4582	301	293	293
query8	230	191	188	188
query9	8482	2393	2398	2393
query10	425	243	236	236
query11	14772	14146	14232	14146
query12	139	86	86	86
query13	1644	371	365	365
query14	10942	7996	8324	7996
query15	267	166	167	166
query16	8071	252	252	252
query17	1748	581	547	547
query18	2099	271	266	266
query19	197	145	165	145
query20	89	90	83	83
query21	191	130	130	130
query22	5202	4937	4945	4937
query23	34340	33738	33693	33693
query24	10676	2904	2845	2845
query25	554	379	394	379
query26	686	155	148	148
query27	2204	318	318	318
query28	5689	2035	2036	2035
query29	838	588	594	588
query30	262	173	181	173
query31	956	752	725	725
query32	96	51	55	51
query33	626	239	238	238
query34	872	476	474	474
query35	801	673	656	656
query36	1123	897	888	888
query37	104	71	69	69
query38	2905	2791	2768	2768
query39	1651	1563	1539	1539
query40	196	122	123	122
query41	45	45	44	44
query42	100	92	93	92
query43	570	559	567	559
query44	1047	728	735	728
query45	264	251	261	251
query46	1070	721	689	689
query47	1986	1887	1917	1887
query48	374	297	290	290
query49	837	407	389	389
query50	773	378	376	376
query51	6811	6815	6776	6776
query52	103	93	90	90
query53	350	282	284	282
query54	825	428	430	428
query55	75	72	70	70
query56	235	224	219	219
query57	1256	1154	1153	1153
query58	210	198	195	195
query59	3421	3203	3129	3129
query60	249	232	227	227
query61	88	88	87	87
query62	649	474	484	474
query63	305	276	277	276
query64	8468	7465	7315	7315
query65	3106	3082	3072	3072
query66	771	339	321	321
query67	15307	15131	15021	15021
query68	4504	522	532	522
query69	472	318	301	301
query70	1176	1098	1151	1098
query71	375	267	262	262
query72	7223	2510	2360	2360
query73	706	319	314	314
query74	6512	6223	6151	6151
query75	3337	2613	2606	2606
query76	2362	1017	971	971
query77	406	262	266	262
query78	10543	10173	10188	10173
query79	2446	504	509	504
query80	1085	419	415	415
query81	537	243	239	239
query82	933	95	97	95
query83	234	165	166	165
query84	250	86	84	84
query85	1386	267	256	256
query86	489	324	315	315
query87	3337	3148	3141	3141
query88	4060	2343	2362	2343
query89	471	371	384	371
query90	1963	184	191	184
query91	129	108	105	105
query92	57	49	50	49
query93	1658	494	482	482
query94	1229	182	184	182
query95	451	291	294	291
query96	587	269	263	263
query97	3185	2990	2962	2962
query98	236	225	217	217
query99	1178	914	918	914
Total cold run time: 277902 ms
Total hot run time: 187673 ms

@doris-robot

ClickBench: Total hot run time: 30.74 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2f8a32d8cd2ca935839ec126f97a1d082634ee93, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.03	0.03
query2	0.08	0.04	0.04
query3	0.23	0.06	0.05
query4	1.68	0.07	0.08
query5	0.50	0.50	0.50
query6	1.12	0.72	0.72
query7	0.02	0.01	0.02
query8	0.05	0.04	0.04
query9	0.55	0.49	0.48
query10	0.53	0.55	0.53
query11	0.16	0.10	0.11
query12	0.15	0.12	0.12
query13	0.59	0.58	0.60
query14	0.79	0.78	0.78
query15	0.82	0.80	0.80
query16	0.37	0.36	0.36
query17	0.95	1.01	0.96
query18	0.23	0.23	0.26
query19	1.86	1.75	1.68
query20	0.01	0.01	0.01
query21	15.48	0.68	0.68
query22	3.92	7.62	2.34
query23	18.26	1.32	1.20
query24	1.42	0.36	0.22
query25	0.14	0.09	0.08
query26	0.26	0.17	0.18
query27	0.08	0.08	0.07
query28	13.36	1.01	1.00
query29	13.22	3.27	3.20
query30	0.24	0.05	0.05
query31	2.87	0.39	0.38
query32	3.28	0.48	0.47
query33	2.82	2.81	2.82
query34	17.22	4.44	4.44
query35	4.55	4.53	4.60
query36	0.69	0.46	0.45
query37	0.17	0.15	0.16
query38	0.14	0.15	0.13
query39	0.04	0.04	0.03
query40	0.17	0.14	0.14
query41	0.09	0.04	0.04
query42	0.06	0.05	0.04
query43	0.05	0.03	0.04
Total cold run time: 109.26 s
Total hot run time: 30.74 s

@Jibing-Li
Contributor Author

run performance

morningman previously approved these changes May 16, 2024

@morningman morningman left a comment

LGTM

@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved and reviewed labels May 16, 2024
@github-actions
Contributor

PR approved by anyone and no changes requested.

@github-actions github-actions bot removed the approved label May 17, 2024
@Jibing-Li
Contributor Author

run buildall

@github-actions github-actions bot added the approved label May 17, 2024
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@Jibing-Li
Contributor Author

run buildall

@Jibing-Li
Contributor Author

run feut

@Jibing-Li Jibing-Li merged commit 1537a05 into apache:master May 17, 2024
@Jibing-Li Jibing-Li deleted the numminmax branch May 17, 2024 14:34
dataroaring pushed a commit that referenced this pull request May 26, 2024
Labels

approved, dev/3.0.0-merged, reviewed

5 participants