Skip to content

[Fix](cloud-mow) Fix race between CloudMetaMgr::sync_tablet_rowsets and CloudSchemaChangeJob::_convert_historical_rowsets #50051

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 16, 2025

Conversation

bobhan1
Copy link
Contributor

@bobhan1 bobhan1 commented Apr 15, 2025

What problem does this PR solve?

Considering the following race situation:

  1. thread 1 finish convert historical rowsets before version Y on tablet A and write tmp converted rowset metas on MS
  2. thread 2 begins sync_rowsets on tablet A with version Y and get visible rowsets before version Y.
  3. thread 1 commit heavy schema change job on MS and turn tmp converted historical rowsets to visible rowsets on tablet Y.
  4. thread 1 add converted historical rowset metas to tablet Y's BE local tablet meta and remove all delete bitmaps of new tablet before version Y.
  5. thread 2 add rowsets to tablet Y's BE local tablet meta which overwrites schema change's converted rowsets.

This will cause correctness problem. This PR add a lock to avoid this situation.

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Copy link
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@bobhan1 bobhan1 force-pushed the fix-sc-sync_tablet_rowsets-race branch from 6c12861 to 782df85 Compare April 15, 2025 08:14
@bobhan1 bobhan1 force-pushed the fix-sc-sync_tablet_rowsets-race branch 2 times, most recently from 7855598 to 88266ab Compare April 15, 2025 08:43
@bobhan1
Copy link
Contributor Author

bobhan1 commented Apr 15, 2025

run buildall

@doris-robot
Copy link

TPC-H: Total hot run time: 34425 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 88266ab52c4ca7a4f574794f5cd3a34297890994, data reload: false

------ Round 1 ----------------------------------
q1	26412	5103	5074	5074
q2	2068	272	185	185
q3	10396	1252	724	724
q4	10221	1033	544	544
q5	7541	2427	2351	2351
q6	192	163	134	134
q7	929	750	631	631
q8	9339	1313	1145	1145
q9	6840	5109	5085	5085
q10	6828	2294	1890	1890
q11	476	277	275	275
q12	349	360	219	219
q13	17787	3684	3100	3100
q14	218	224	205	205
q15	552	508	501	501
q16	628	613	578	578
q17	572	871	360	360
q18	7522	7307	7170	7170
q19	1101	1079	555	555
q20	338	336	223	223
q21	4155	3377	2504	2504
q22	1052	1008	972	972
Total cold run time: 115516 ms
Total hot run time: 34425 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5149	5084	5101	5084
q2	245	330	221	221
q3	2163	2664	2238	2238
q4	1411	1877	1489	1489
q5	4616	4510	4367	4367
q6	212	164	125	125
q7	1941	1944	1722	1722
q8	2579	2649	2557	2557
q9	7117	7134	7052	7052
q10	2992	3215	2732	2732
q11	584	506	486	486
q12	715	772	609	609
q13	3520	3894	3346	3346
q14	273	295	287	287
q15	541	483	489	483
q16	664	679	645	645
q17	1152	1496	1411	1411
q18	7789	7440	7504	7440
q19	823	821	858	821
q20	2002	2050	1900	1900
q21	5223	4654	4787	4654
q22	1059	1017	990	990
Total cold run time: 52770 ms
Total hot run time: 50659 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 187353 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 88266ab52c4ca7a4f574794f5cd3a34297890994, data reload: false

query1	1013	510	501	501
query2	6561	1917	1877	1877
query3	6758	218	219	218
query4	26715	23900	23526	23526
query5	4728	604	463	463
query6	301	199	211	199
query7	4621	490	273	273
query8	302	230	236	230
query9	8630	2584	2599	2584
query10	481	322	276	276
query11	15874	15169	14868	14868
query12	180	110	107	107
query13	1654	526	419	419
query14	9462	6216	6234	6216
query15	220	200	172	172
query16	7253	644	484	484
query17	1151	756	575	575
query18	1973	421	310	310
query19	198	197	165	165
query20	133	120	114	114
query21	228	124	106	106
query22	4037	4152	4044	4044
query23	34186	33093	33090	33090
query24	8508	2423	2410	2410
query25	581	448	384	384
query26	1238	269	147	147
query27	2749	496	329	329
query28	4336	2441	2415	2415
query29	763	563	425	425
query30	286	221	191	191
query31	1007	834	793	793
query32	72	70	73	70
query33	566	365	327	327
query34	813	863	525	525
query35	800	831	760	760
query36	934	984	908	908
query37	120	97	75	75
query38	4180	4251	4123	4123
query39	1474	1400	1400	1400
query40	216	119	108	108
query41	58	55	51	51
query42	118	103	110	103
query43	509	505	477	477
query44	1336	811	800	800
query45	180	173	163	163
query46	834	1027	617	617
query47	1766	1811	1717	1717
query48	373	396	285	285
query49	785	511	418	418
query50	645	686	402	402
query51	4148	4161	4076	4076
query52	108	107	101	101
query53	230	255	193	193
query54	586	569	507	507
query55	88	81	84	81
query56	302	310	287	287
query57	1148	1166	1081	1081
query58	267	268	250	250
query59	2616	2653	2583	2583
query60	331	323	304	304
query61	130	124	130	124
query62	794	705	646	646
query63	223	192	187	187
query64	4384	1012	681	681
query65	4309	4266	4292	4266
query66	1139	409	322	322
query67	15832	15570	15406	15406
query68	8305	890	513	513
query69	469	298	271	271
query70	1190	1119	1113	1113
query71	461	324	291	291
query72	5413	4747	4850	4747
query73	746	640	344	344
query74	8976	9037	8954	8954
query75	3886	3156	2703	2703
query76	3708	1184	769	769
query77	790	393	314	314
query78	10139	10155	9300	9300
query79	2108	812	564	564
query80	585	510	431	431
query81	506	260	224	224
query82	479	134	101	101
query83	258	249	234	234
query84	251	109	81	81
query85	795	358	317	317
query86	382	272	280	272
query87	4477	4432	4446	4432
query88	3493	2312	2257	2257
query89	395	319	290	290
query90	1926	221	218	218
query91	147	149	116	116
query92	75	63	60	60
query93	1628	950	585	585
query94	678	409	303	303
query95	378	291	290	290
query96	494	558	281	281
query97	3219	3258	3101	3101
query98	238	224	272	224
query99	1411	1419	1318	1318
Total cold run time: 276547 ms
Total hot run time: 187353 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 31.39 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 88266ab52c4ca7a4f574794f5cd3a34297890994, data reload: false

query1	0.04	0.04	0.04
query2	0.14	0.10	0.11
query3	0.25	0.20	0.18
query4	1.60	0.20	0.19
query5	0.60	0.58	0.59
query6	1.19	0.72	0.73
query7	0.03	0.02	0.01
query8	0.04	0.04	0.03
query9	0.59	0.52	0.51
query10	0.58	0.57	0.57
query11	0.15	0.11	0.11
query12	0.15	0.11	0.12
query13	0.61	0.60	0.60
query14	2.65	2.70	2.80
query15	0.94	0.85	0.88
query16	0.38	0.38	0.38
query17	0.99	1.02	1.04
query18	0.21	0.20	0.20
query19	1.91	1.96	1.85
query20	0.01	0.01	0.01
query21	15.35	0.90	0.54
query22	0.74	1.14	0.76
query23	14.87	1.41	0.66
query24	7.29	1.13	0.83
query25	0.44	0.14	0.15
query26	0.61	0.18	0.13
query27	0.05	0.06	0.05
query28	9.70	0.90	0.43
query29	12.56	3.99	3.34
query30	0.24	0.09	0.06
query31	2.83	0.59	0.40
query32	3.23	0.55	0.47
query33	3.04	3.02	3.11
query34	15.81	5.14	4.46
query35	4.46	4.51	4.55
query36	0.65	0.50	0.50
query37	0.09	0.07	0.07
query38	0.06	0.04	0.04
query39	0.04	0.03	0.02
query40	0.18	0.14	0.13
query41	0.08	0.02	0.02
query42	0.04	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 105.46 s
Total hot run time: 31.39 s

@hello-stephen
Copy link
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/51) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.42% (14107/26909)
Line Coverage 41.34% (122313/295884)
Region Coverage 40.06% (62241/155369)
Branch Coverage 34.76% (31244/89872)

@bobhan1
Copy link
Contributor Author

bobhan1 commented Apr 15, 2025

run cloud_p0

@bobhan1 bobhan1 force-pushed the fix-sc-sync_tablet_rowsets-race branch from 88266ab to ae29a47 Compare April 15, 2025 11:35
@bobhan1 bobhan1 force-pushed the fix-sc-sync_tablet_rowsets-race branch from ae29a47 to 444f407 Compare April 15, 2025 11:36
@bobhan1
Copy link
Contributor Author

bobhan1 commented Apr 15, 2025

run buildall

@bobhan1
Copy link
Contributor Author

bobhan1 commented Apr 15, 2025

run buildall

@doris-robot
Copy link

TPC-H: Total hot run time: 34354 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 9b1862387a82fc862feab6e99558ba8e5e7ee61c, data reload: false

------ Round 1 ----------------------------------
q1	25916	5721	5094	5094
q2	2071	304	210	210
q3	10333	1242	698	698
q4	10235	1009	529	529
q5	7530	2354	2359	2354
q6	186	162	130	130
q7	932	757	622	622
q8	9322	1318	1144	1144
q9	6896	5046	5137	5046
q10	6878	2332	1900	1900
q11	486	299	275	275
q12	354	354	214	214
q13	17771	3646	3082	3082
q14	231	221	218	218
q15	522	494	491	491
q16	646	617	583	583
q17	593	875	355	355
q18	7553	7281	7152	7152
q19	1692	970	562	562
q20	343	330	208	208
q21	4538	3378	2512	2512
q22	1067	1010	975	975
Total cold run time: 116095 ms
Total hot run time: 34354 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5208	5103	5083	5083
q2	243	328	231	231
q3	2184	2657	2305	2305
q4	1464	1959	1469	1469
q5	4443	4461	4382	4382
q6	214	170	124	124
q7	1984	1981	1732	1732
q8	2602	2474	2508	2474
q9	7212	7082	7138	7082
q10	3027	3155	2765	2765
q11	596	520	483	483
q12	675	798	605	605
q13	3560	3849	3390	3390
q14	281	293	267	267
q15	520	466	480	466
q16	650	666	660	660
q17	1172	1552	1379	1379
q18	7791	7697	7363	7363
q19	842	900	1067	900
q20	1912	1978	1871	1871
q21	5455	4832	4737	4737
q22	1146	1131	1015	1015
Total cold run time: 53181 ms
Total hot run time: 50783 ms

Copy link
Contributor

@zhannngchen zhannngchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 15, 2025
Copy link
Contributor

PR approved by at least one committer and no changes requested.

Copy link
Contributor

PR approved by anyone and no changes requested.

@doris-robot
Copy link

TPC-DS: Total hot run time: 192941 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 9b1862387a82fc862feab6e99558ba8e5e7ee61c, data reload: false

query1	1407	1054	1075	1054
query2	6237	1862	1898	1862
query3	11186	4585	4647	4585
query4	25335	23654	23090	23090
query5	3934	619	448	448
query6	288	203	192	192
query7	3986	474	285	285
query8	294	253	239	239
query9	8496	2587	2584	2584
query10	455	326	265	265
query11	15392	15059	14766	14766
query12	162	112	109	109
query13	1564	522	403	403
query14	9011	6193	6213	6193
query15	218	196	174	174
query16	7657	673	514	514
query17	1201	719	580	580
query18	2025	422	320	320
query19	206	204	171	171
query20	136	133	124	124
query21	210	123	110	110
query22	4400	4452	4295	4295
query23	34828	33588	33433	33433
query24	8375	2427	2425	2425
query25	496	495	402	402
query26	734	277	154	154
query27	2926	502	334	334
query28	4598	2453	2409	2409
query29	640	583	438	438
query30	279	224	193	193
query31	862	894	835	835
query32	80	66	59	59
query33	546	358	312	312
query34	802	884	535	535
query35	812	839	790	790
query36	972	1018	913	913
query37	123	99	80	80
query38	4171	4263	4234	4234
query39	1534	1456	1443	1443
query40	220	114	109	109
query41	53	52	54	52
query42	119	104	105	104
query43	497	532	484	484
query44	1309	836	847	836
query45	181	175	168	168
query46	849	1018	626	626
query47	1838	1866	1796	1796
query48	404	416	320	320
query49	702	518	441	441
query50	654	699	403	403
query51	4332	4284	4278	4278
query52	110	117	104	104
query53	223	253	187	187
query54	581	561	540	540
query55	91	87	83	83
query56	299	291	295	291
query57	1198	1170	1117	1117
query58	271	268	268	268
query59	2649	2829	2602	2602
query60	323	314	320	314
query61	127	124	126	124
query62	771	748	687	687
query63	241	194	185	185
query64	3013	1047	717	717
query65	4460	4403	4374	4374
query66	794	405	313	313
query67	16004	15496	15361	15361
query68	8973	880	511	511
query69	481	311	275	275
query70	1229	1052	1079	1052
query71	471	313	289	289
query72	5668	4804	4739	4739
query73	724	609	343	343
query74	8823	9154	8726	8726
query75	4263	3191	2691	2691
query76	3622	1180	756	756
query77	792	359	286	286
query78	10000	10104	9293	9293
query79	2150	870	584	584
query80	600	501	504	501
query81	503	256	221	221
query82	442	128	97	97
query83	285	243	237	237
query84	288	111	82	82
query85	778	348	306	306
query86	348	303	277	277
query87	4472	4499	4392	4392
query88	2955	2241	2264	2241
query89	406	338	297	297
query90	1839	211	217	211
query91	141	139	114	114
query92	72	56	54	54
query93	1111	920	580	580
query94	648	413	306	306
query95	376	293	291	291
query96	484	555	274	274
query97	3166	3239	3174	3174
query98	226	208	205	205
query99	1447	1405	1300	1300
Total cold run time: 276674 ms
Total hot run time: 192941 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 31.13 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 9b1862387a82fc862feab6e99558ba8e5e7ee61c, data reload: false

query1	0.04	0.04	0.03
query2	0.12	0.11	0.12
query3	0.24	0.20	0.20
query4	1.58	0.20	0.19
query5	0.59	0.59	0.58
query6	1.20	0.71	0.73
query7	0.02	0.02	0.02
query8	0.04	0.04	0.03
query9	0.57	0.52	0.51
query10	0.56	0.58	0.57
query11	0.15	0.10	0.10
query12	0.14	0.11	0.11
query13	0.62	0.59	0.59
query14	2.67	2.82	2.81
query15	0.93	0.85	0.84
query16	0.41	0.38	0.38
query17	1.01	1.02	1.03
query18	0.20	0.20	0.20
query19	1.92	1.93	1.79
query20	0.01	0.01	0.02
query21	15.35	0.90	0.55
query22	0.77	1.04	0.78
query23	14.93	1.40	0.67
query24	6.74	2.04	0.48
query25	0.49	0.19	0.09
query26	0.75	0.16	0.13
query27	0.05	0.05	0.04
query28	9.31	0.88	0.43
query29	12.54	3.98	3.32
query30	0.25	0.09	0.06
query31	2.83	0.61	0.38
query32	3.23	0.56	0.47
query33	3.02	3.07	3.12
query34	15.86	5.11	4.50
query35	4.55	4.59	4.57
query36	0.69	0.49	0.49
query37	0.09	0.06	0.06
query38	0.06	0.05	0.03
query39	0.03	0.02	0.03
query40	0.17	0.14	0.12
query41	0.08	0.03	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 104.88 s
Total hot run time: 31.13 s

@doris-robot
Copy link

BE UT Coverage Report

Increment line coverage 0.00% (0/19) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.43% (14109/26910)
Line Coverage 41.35% (122337/295857)
Region Coverage 40.06% (62246/155370)
Branch Coverage 34.77% (31248/89872)

Copy link
Contributor

@dataroaring dataroaring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dataroaring dataroaring merged commit 98cd61a into apache:master Apr 16, 2025
25 of 26 checks passed
bobhan1 added a commit to bobhan1/doris that referenced this pull request Apr 16, 2025
… and `CloudSchemaChangeJob::_convert_historical_rowsets` (apache#50051)

Considering the following race situation:
1. thread 1 finish convert historical rowsets before version Y on tablet
A and write tmp converted rowset metas on MS
2. thread 2 begins sync_rowsets on tablet A with version Y and get
visible rowsets before version Y.
3. thread 1 commit heavy schema change job on MS and turn tmp converted
historical rowsets to visible rowsets on tablet Y.
4. thread 1 add converted historical rowset metas to tablet Y's BE local
tablet meta and remove all delete bitmaps of new tablet before version
Y.
5. thread 2 **add rowsets to tablet Y's BE local tablet meta which
overwrites schema change's converted rowsets**.

This will cause correctness problem. This PR add a lock to avoid this
situation.
bobhan1 added a commit to bobhan1/doris that referenced this pull request Apr 16, 2025
… and `CloudSchemaChangeJob::_convert_historical_rowsets` (apache#50051)

Considering the following race situation:
1. thread 1 finish convert historical rowsets before version Y on tablet
A and write tmp converted rowset metas on MS
2. thread 2 begins sync_rowsets on tablet A with version Y and get
visible rowsets before version Y.
3. thread 1 commit heavy schema change job on MS and turn tmp converted
historical rowsets to visible rowsets on tablet Y.
4. thread 1 add converted historical rowset metas to tablet Y's BE local
tablet meta and remove all delete bitmaps of new tablet before version
Y.
5. thread 2 **add rowsets to tablet Y's BE local tablet meta which
overwrites schema change's converted rowsets**.

This will cause correctness problem. This PR add a lock to avoid this
situation.
seawinde pushed a commit to seawinde/doris that referenced this pull request Apr 17, 2025
… and `CloudSchemaChangeJob::_convert_historical_rowsets` (apache#50051)

Considering the following race situation:
1. thread 1 finish convert historical rowsets before version Y on tablet
A and write tmp converted rowset metas on MS
2. thread 2 begins sync_rowsets on tablet A with version Y and get
visible rowsets before version Y.
3. thread 1 commit heavy schema change job on MS and turn tmp converted
historical rowsets to visible rowsets on tablet Y.
4. thread 1 add converted historical rowset metas to tablet Y's BE local
tablet meta and remove all delete bitmaps of new tablet before version
Y.
5. thread 2 **add rowsets to tablet Y's BE local tablet meta which
overwrites schema change's converted rowsets**.

This will cause correctness problem. This PR add a lock to avoid this
situation.
dataroaring pushed a commit that referenced this pull request Apr 22, 2025
…c_tablet_rowsets` and `CloudSchemaChangeJob::_convert_historical_rowsets` (#50051) (#50081)

pick #50051
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
… and `CloudSchemaChangeJob::_convert_historical_rowsets` (apache#50051)

Considering the following race situation:
1. thread 1 finish convert historical rowsets before version Y on tablet
A and write tmp converted rowset metas on MS
2. thread 2 begins sync_rowsets on tablet A with version Y and get
visible rowsets before version Y.
3. thread 1 commit heavy schema change job on MS and turn tmp converted
historical rowsets to visible rowsets on tablet Y.
4. thread 1 add converted historical rowset metas to tablet Y's BE local
tablet meta and remove all delete bitmaps of new tablet before version
Y.
5. thread 2 **add rowsets to tablet Y's BE local tablet meta which
overwrites schema change's converted rowsets**.

This will cause correctness problem. This PR add a lock to avoid this
situation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. dev/3.0.6-merged p0_w reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants