Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix](compaction) fix the longest continuous rowsets cannot be selected when missing rowsets #38728

Merged
merged 3 commits into from
Aug 4, 2024

Conversation

luwei16
Copy link
Contributor

@luwei16 luwei16 commented Aug 1, 2024

problem

When version is missing, the code for selecting the longest continuous version has a bug. Only the version before the missing version will be selected.
For example: the current version is version [1-1], version [2-2], version [4-4], version [5-5], version [6-6], version [7-7], and version [3-3] is missing.
The current result is to return version [1-1], version [2-2] instead of version [4-4], version [5-5], version [6-6], version [7-7]

@doris-robot
Copy link

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@github-actions github-actions bot added the doing label Aug 1, 2024
@luwei16
Copy link
Contributor Author

luwei16 commented Aug 1, 2024

run buildall

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

// specific language governing permissions and limitations
// under the License.

#include "olap/cumulative_compaction.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: 'olap/cumulative_compaction.h' file not found [clang-diagnostic-error]

#include "olap/cumulative_compaction.h"
         ^

return rowset;
}

TEST_F(CumulativeCompactionTest, TestConsecutiveVersion) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: function 'TEST_F' exceeds recommended size/complexity thresholds [readability-function-size]

TEST_F(CumulativeCompactionTest, TestConsecutiveVersion) {
^
Additional context

be/test/olap/cumulative_compaction_test.cpp:66: 142 lines including whitespace and comments (threshold 80)

TEST_F(CumulativeCompactionTest, TestConsecutiveVersion) {
^

dataroaring
dataroaring previously approved these changes Aug 1, 2024
Copy link
Contributor

@dataroaring dataroaring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

github-actions bot commented Aug 1, 2024

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added approved Indicates a PR has been approved by one committer. reviewed labels Aug 1, 2024
Copy link
Contributor

github-actions bot commented Aug 1, 2024

PR approved by anyone and no changes requested.

@doris-robot
Copy link

TPC-H: Total hot run time: 41477 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2c8d52290b0e6e3c8e0c8a34dd91c69868c3aa7c, data reload: false

------ Round 1 ----------------------------------
q1	17635	4053	4087	4053
q2	2019	198	199	198
q3	10459	1306	1331	1306
q4	10177	828	921	828
q5	7632	2938	2940	2938
q6	219	136	138	136
q7	1021	617	609	609
q8	9422	1834	1910	1834
q9	8499	6575	6559	6559
q10	8740	3834	3797	3797
q11	428	243	246	243
q12	409	224	221	221
q13	17772	2921	2939	2921
q14	272	237	251	237
q15	516	470	475	470
q16	515	388	385	385
q17	956	930	881	881
q18	7993	7299	7269	7269
q19	1427	1220	1211	1211
q20	545	314	320	314
q21	5282	4789	4782	4782
q22	352	290	285	285
Total cold run time: 112290 ms
Total hot run time: 41477 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4092	4039	4036	4036
q2	329	220	222	220
q3	2982	2988	3089	2988
q4	1985	1962	1973	1962
q5	5490	5523	5434	5434
q6	238	137	130	130
q7	2084	1781	1751	1751
q8	3319	3378	3320	3320
q9	8597	8593	8686	8593
q10	3976	3993	4016	3993
q11	534	437	449	437
q12	734	602	589	589
q13	15314	3121	3115	3115
q14	306	272	281	272
q15	536	486	485	485
q16	453	412	419	412
q17	1773	1739	1700	1700
q18	8260	7704	7737	7704
q19	1905	1751	1710	1710
q20	2047	1867	1836	1836
q21	5634	5263	5397	5263
q22	539	477	463	463
Total cold run time: 71127 ms
Total hot run time: 56413 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 169981 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2c8d52290b0e6e3c8e0c8a34dd91c69868c3aa7c, data reload: false

query1	907	364	362	362
query2	6465	1721	1686	1686
query3	6659	217	228	217
query4	20006	17618	17102	17102
query5	3620	501	512	501
query6	285	186	162	162
query7	4607	288	287	287
query8	262	203	187	187
query9	8488	2340	2340	2340
query10	425	263	264	263
query11	10483	10120	10227	10120
query12	119	90	91	90
query13	1644	382	356	356
query14	9009	7504	7020	7020
query15	202	161	162	161
query16	6940	505	449	449
query17	965	545	547	545
query18	1805	271	273	271
query19	186	140	143	140
query20	96	87	83	83
query21	212	95	95	95
query22	4337	4213	4041	4041
query23	33889	33455	33550	33455
query24	9310	3178	3075	3075
query25	676	397	416	397
query26	956	149	156	149
query27	2243	284	281	281
query28	6680	1984	1997	1984
query29	983	438	438	438
query30	237	157	159	157
query31	939	764	742	742
query32	97	55	55	55
query33	668	326	331	326
query34	928	506	502	502
query35	858	770	754	754
query36	1031	889	873	873
query37	179	92	83	83
query38	2959	2945	2869	2869
query39	916	876	841	841
query40	214	116	114	114
query41	49	45	46	45
query42	113	99	99	99
query43	450	422	424	422
query44	1169	718	726	718
query45	205	179	181	179
query46	1084	838	788	788
query47	1801	1754	1711	1711
query48	361	286	291	286
query49	862	429	430	429
query50	891	432	432	432
query51	6862	6659	6670	6659
query52	97	96	92	92
query53	255	184	188	184
query54	628	459	452	452
query55	81	79	78	78
query56	292	258	273	258
query57	1140	1031	1033	1031
query58	293	303	268	268
query59	2469	2541	2550	2541
query60	308	294	276	276
query61	117	112	113	112
query62	888	668	663	663
query63	217	179	185	179
query64	4818	1945	1956	1945
query65	3149	3349	3111	3111
query66	1165	344	338	338
query67	15320	14849	14906	14849
query68	6029	574	572	572
query69	684	386	332	332
query70	1156	1122	1047	1047
query71	465	287	280	280
query72	7755	2812	2714	2714
query73	929	320	330	320
query74	6050	5631	5564	5564
query75	3657	2726	2747	2726
query76	3301	1241	1206	1206
query77	701	319	329	319
query78	9614	8897	8909	8897
query79	4263	526	536	526
query80	2430	558	500	500
query81	560	226	230	226
query82	1429	136	131	131
query83	257	177	176	176
query84	279	84	76	76
query85	1435	305	297	297
query86	476	272	294	272
query87	3243	3136	3127	3127
query88	3980	2401	2371	2371
query89	453	288	293	288
query90	1826	192	189	189
query91	123	100	102	100
query92	59	49	50	49
query93	5300	609	600	600
query94	822	291	280	280
query95	371	269	258	258
query96	616	285	278	278
query97	3206	3045	3047	3045
query98	218	210	198	198
query99	1707	1299	1307	1299
Total cold run time: 269963 ms
Total hot run time: 169981 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 29.99 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2c8d52290b0e6e3c8e0c8a34dd91c69868c3aa7c, data reload: false

query1	0.04	0.04	0.03
query2	0.08	0.04	0.04
query3	0.22	0.04	0.04
query4	1.69	0.07	0.08
query5	0.49	0.48	0.47
query6	1.13	0.71	0.72
query7	0.02	0.02	0.01
query8	0.05	0.05	0.04
query9	0.58	0.53	0.50
query10	0.57	0.56	0.56
query11	0.16	0.12	0.12
query12	0.15	0.12	0.12
query13	0.61	0.60	0.60
query14	0.76	0.82	0.79
query15	0.90	0.85	0.85
query16	0.35	0.35	0.35
query17	1.03	0.96	1.00
query18	0.22	0.21	0.22
query19	1.79	1.74	1.78
query20	0.01	0.01	0.01
query21	15.38	0.75	0.65
query22	4.12	7.55	1.43
query23	18.07	1.35	1.33
query24	2.22	0.22	0.22
query25	0.18	0.08	0.09
query26	0.31	0.21	0.21
query27	0.47	0.24	0.23
query28	13.15	1.00	0.96
query29	12.68	3.28	3.28
query30	0.25	0.05	0.06
query31	2.86	0.41	0.40
query32	3.26	0.49	0.47
query33	2.96	2.97	2.88
query34	15.42	4.26	4.24
query35	4.29	4.27	4.31
query36	0.67	0.47	0.48
query37	0.18	0.16	0.17
query38	0.16	0.15	0.15
query39	0.04	0.03	0.04
query40	0.15	0.13	0.13
query41	0.09	0.05	0.05
query42	0.06	0.04	0.05
query43	0.04	0.04	0.04
Total cold run time: 107.86 s
Total hot run time: 29.99 s

@luwei16
Copy link
Contributor Author

luwei16 commented Aug 1, 2024

run buildall

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Aug 1, 2024
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

return rowset;
}

TEST_F(CumulativeCompactionTest, TestConsecutiveVersion) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: function 'TEST_F' exceeds recommended size/complexity thresholds [readability-function-size]

TEST_F(CumulativeCompactionTest, TestConsecutiveVersion) {
^
Additional context

be/test/olap/cumulative_compaction_test.cpp:66: 197 lines including whitespace and comments (threshold 80)

TEST_F(CumulativeCompactionTest, TestConsecutiveVersion) {
^

@doris-robot
Copy link

TPC-H: Total hot run time: 41558 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 77c99653ca5e54f5eed67f69c4f8e5e4dee30cfb, data reload: false

------ Round 1 ----------------------------------
q1	17610	4132	4036	4036
q2	2021	199	190	190
q3	10633	1278	1312	1278
q4	10254	867	906	867
q5	7681	2957	2991	2957
q6	224	143	139	139
q7	1052	621	625	621
q8	9967	1964	1970	1964
q9	8384	6570	6602	6570
q10	8681	3830	3822	3822
q11	436	247	251	247
q12	404	224	224	224
q13	17753	2915	2932	2915
q14	271	245	241	241
q15	530	486	489	486
q16	537	409	381	381
q17	955	934	923	923
q18	8074	7348	7199	7199
q19	1438	1209	1216	1209
q20	568	340	344	340
q21	5250	4675	4725	4675
q22	348	274	280	274
Total cold run time: 113071 ms
Total hot run time: 41558 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4071	4026	3979	3979
q2	327	221	220	220
q3	2975	2998	2983	2983
q4	1876	1868	1847	1847
q5	5227	5223	5230	5223
q6	214	131	128	128
q7	2066	1629	1697	1629
q8	3169	3272	3263	3263
q9	8268	8259	8247	8247
q10	3746	3806	3842	3806
q11	542	452	440	440
q12	708	572	528	528
q13	15531	2958	2965	2958
q14	291	255	252	252
q15	515	478	475	475
q16	438	399	398	398
q17	1710	1693	1675	1675
q18	7757	7315	7116	7116
q19	1666	1668	1660	1660
q20	1964	1760	1773	1760
q21	5348	5209	5140	5140
q22	515	464	471	464
Total cold run time: 68924 ms
Total hot run time: 54191 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 169386 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 77c99653ca5e54f5eed67f69c4f8e5e4dee30cfb, data reload: false

query1	922	382	371	371
query2	6476	1684	1663	1663
query3	6664	220	247	220
query4	20015	17654	17353	17353
query5	4301	518	517	517
query6	284	166	165	165
query7	4603	301	293	293
query8	254	212	191	191
query9	8504	2357	2366	2357
query10	440	282	289	282
query11	10423	9992	9901	9901
query12	140	90	87	87
query13	1643	386	377	377
query14	10102	7682	7618	7618
query15	205	162	171	162
query16	7095	514	484	484
query17	939	568	565	565
query18	1833	287	292	287
query19	190	150	147	147
query20	99	88	89	88
query21	207	101	99	99
query22	4372	4149	4121	4121
query23	33722	32780	32828	32780
query24	10410	3100	3055	3055
query25	678	383	389	383
query26	1756	156	155	155
query27	2896	282	278	278
query28	6926	1966	1940	1940
query29	1266	431	426	426
query30	294	147	148	147
query31	923	756	745	745
query32	105	53	55	53
query33	701	341	326	326
query34	925	490	496	490
query35	838	725	723	723
query36	993	867	861	861
query37	295	81	80	80
query38	2840	2785	2779	2779
query39	865	817	838	817
query40	289	112	112	112
query41	49	45	44	44
query42	119	99	104	99
query43	454	451	429	429
query44	1195	728	716	716
query45	210	177	175	175
query46	1072	804	778	778
query47	1802	1676	1710	1676
query48	362	295	297	295
query49	1182	441	433	433
query50	884	442	435	435
query51	6842	6793	6698	6698
query52	104	95	96	95
query53	253	185	185	185
query54	636	454	467	454
query55	79	74	79	74
query56	271	296	267	267
query57	1151	1035	1033	1033
query58	280	269	288	269
query59	2657	2528	2394	2394
query60	294	276	288	276
query61	96	94	101	94
query62	918	673	677	673
query63	223	190	190	190
query64	5880	1939	1887	1887
query65	3142	3115	3097	3097
query66	1442	331	361	331
query67	15330	14718	14841	14718
query68	4318	569	572	569
query69	617	385	304	304
query70	1067	1037	1081	1037
query71	459	285	288	285
query72	7142	2669	2501	2501
query73	765	334	328	328
query74	5985	5656	5561	5561
query75	3413	2759	2776	2759
query76	2571	1191	1272	1191
query77	571	321	322	321
query78	9462	8940	8896	8896
query79	1876	549	546	546
query80	946	503	511	503
query81	552	231	233	231
query82	948	136	130	130
query83	258	168	175	168
query84	271	79	81	79
query85	1226	330	300	300
query86	396	295	347	295
query87	3262	3099	3083	3083
query88	2914	2418	2416	2416
query89	387	286	291	286
query90	1830	200	194	194
query91	130	101	104	101
query92	66	51	55	51
query93	1446	604	624	604
query94	919	309	280	280
query95	382	273	272	272
query96	605	279	279	279
query97	3213	3060	3046	3046
query98	218	197	194	194
query99	1623	1287	1263	1263
Total cold run time: 263198 ms
Total hot run time: 169386 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 30.07 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 77c99653ca5e54f5eed67f69c4f8e5e4dee30cfb, data reload: false

query1	0.05	0.04	0.04
query2	0.07	0.04	0.04
query3	0.22	0.05	0.05
query4	1.69	0.08	0.09
query5	0.51	0.48	0.48
query6	1.13	0.71	0.71
query7	0.02	0.02	0.01
query8	0.05	0.05	0.05
query9	0.58	0.55	0.51
query10	0.57	0.58	0.57
query11	0.15	0.11	0.12
query12	0.15	0.12	0.13
query13	0.62	0.60	0.60
query14	0.78	0.79	0.82
query15	0.88	0.86	0.85
query16	0.37	0.36	0.35
query17	1.00	0.97	1.00
query18	0.23	0.21	0.21
query19	1.82	1.72	1.70
query20	0.02	0.01	0.01
query21	15.40	0.73	0.65
query22	4.10	7.43	1.32
query23	18.17	1.35	1.32
query24	2.27	0.22	0.22
query25	0.18	0.09	0.08
query26	0.32	0.21	0.22
query27	0.46	0.23	0.23
query28	13.16	1.00	0.98
query29	12.59	3.31	3.31
query30	0.26	0.06	0.06
query31	2.87	0.40	0.42
query32	3.26	0.49	0.49
query33	2.93	2.98	2.97
query34	15.40	4.27	4.25
query35	4.30	4.27	4.30
query36	0.68	0.47	0.50
query37	0.19	0.16	0.16
query38	0.16	0.16	0.15
query39	0.04	0.04	0.03
query40	0.16	0.13	0.13
query41	0.09	0.04	0.04
query42	0.05	0.04	0.05
query43	0.05	0.04	0.04
Total cold run time: 108 s
Total hot run time: 30.07 s

Copy link
Contributor

@dataroaring dataroaring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Aug 4, 2024
Copy link
Contributor

github-actions bot commented Aug 4, 2024

PR approved by at least one committer and no changes requested.

@dataroaring dataroaring merged commit 4c33839 into apache:master Aug 4, 2024
30 of 31 checks passed
dataroaring pushed a commit that referenced this pull request Aug 11, 2024
…ed when missing rowsets (#38728)

### problem
When version is missing, the code for selecting the longest continuous
version has a bug. Only the version before the missing version will be
selected.
For example: the current version is version [1-1], version [2-2],
version [4-4], version [5-5], version [6-6], version [7-7], and version
[3-3] is missing.
The current result is to return version [1-1], version [2-2] instead of
version [4-4], version [5-5], version [6-6], version [7-7]
luwei16 added a commit to luwei16/incubator-doris that referenced this pull request Aug 13, 2024
…ed when missing rowsets (apache#38728)

When version is missing, the code for selecting the longest continuous
version has a bug. Only the version before the missing version will be
selected.
For example: the current version is version [1-1], version [2-2],
version [4-4], version [5-5], version [6-6], version [7-7], and version
[3-3] is missing.
The current result is to return version [1-1], version [2-2] instead of
version [4-4], version [5-5], version [6-6], version [7-7]
yiguolei pushed a commit that referenced this pull request Aug 13, 2024
dataroaring pushed a commit that referenced this pull request Aug 16, 2024
…ed when missing rowsets (#38728)

### problem
When version is missing, the code for selecting the longest continuous
version has a bug. Only the version before the missing version will be
selected.
For example: the current version is version [1-1], version [2-2],
version [4-4], version [5-5], version [6-6], version [7-7], and version
[3-3] is missing.
The current result is to return version [1-1], version [2-2] instead of
version [4-4], version [5-5], version [6-6], version [7-7]
luwei16 added a commit to luwei16/incubator-doris that referenced this pull request Sep 4, 2024
…e selected when missing rowsets (apache#38728) (apache#39262)"

This reverts commit c9949f2.

This pr may increase the probability of full clone failure, so revert it first.
yiguolei pushed a commit that referenced this pull request Sep 4, 2024
…e selected when missing rowsets (#38728) (#39262)" (#40375)

This reverts commit c9949f2.

This pr may increase the probability of full clone failure, so revert it
first.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. dev/2.1.6-merged dev/3.0.2-merged doing reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants