[Fix](Compaction) Fix delete rowset sleeping when compaction #51482

Merged
1 commit merged into apache:master on Jun 6, 2025

Conversation

Yukang-Lian
Collaborator

@Yukang-Lian Yukang-Lian commented Jun 4, 2025

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

#50950 improved the observability of compaction, but in some places it set last_cumu_compaction_failure_time incorrectly: the field was also updated when compaction encountered a delete rowset. As a result, compaction treated a delete rowset as a failure and slept for 5 seconds. This PR fixes that, so encountering a delete rowset is no longer recorded as a failure.
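
To make the failure mode concrete, here is a minimal sketch of the behavior described above. It is not the Doris code: `TabletState`, `CumuResult`, `record_result_buggy`, `record_result_fixed`, and `in_backoff` are hypothetical names invented for illustration, and only the field name `last_cumu_compaction_failure_time` and the 5-second delay come from the description.

```cpp
#include <cstdint>

// Hypothetical outcome of one cumulative compaction attempt.
enum class CumuResult { kSuccess, kHitDeleteRowset, kError };

// Hypothetical stand-in for per-tablet state. Only the field name is taken
// from the PR description; the type and units are assumptions.
struct TabletState {
    int64_t last_cumu_compaction_failure_time = 0;  // ms; 0 means "never failed"
};

// The 5-second sleep mentioned in the description, expressed in ms.
constexpr int64_t kFailureBackoffMs = 5000;

// Behavior described as the bug: any non-success outcome, including a
// delete rowset, stamps the failure time.
inline void record_result_buggy(TabletState& t, CumuResult r, int64_t now_ms) {
    if (r != CumuResult::kSuccess) {
        t.last_cumu_compaction_failure_time = now_ms;
    }
}

// Behavior after the fix, as described: only a real error stamps the
// failure time; a delete rowset leaves it untouched.
inline void record_result_fixed(TabletState& t, CumuResult r, int64_t now_ms) {
    if (r == CumuResult::kError) {
        t.last_cumu_compaction_failure_time = now_ms;
    }
}

// A scheduler would hold the tablet back while it is inside the backoff window.
inline bool in_backoff(const TabletState& t, int64_t now_ms) {
    return t.last_cumu_compaction_failure_time != 0 &&
           now_ms - t.last_cumu_compaction_failure_time < kFailureBackoffMs;
}
```

With the buggy recorder, a tablet that merely hit a delete rowset stays in backoff for the full window; with the fixed recorder it can be scheduled again immediately, while genuine errors still trigger the delay.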

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Yukang-Lian
Collaborator Author

run buildall

@Thearas
Contributor

Thearas commented Jun 4, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Yukang-Lian
Collaborator Author

run buildall

Contributor

@dataroaring dataroaring left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jun 4, 2025
Contributor

github-actions bot commented Jun 4, 2025

PR approved by at least one committer and no changes requested.

Contributor

github-actions bot commented Jun 4, 2025

PR approved by anyone and no changes requested.

@Yukang-Lian
Collaborator Author

run beut

@dataroaring dataroaring force-pushed the Fix-Compaction-Delete branch from 28890bb to a30fe4f Compare June 5, 2025 13:44
@dataroaring
Contributor

run buildall

@doris-robot

TPC-H: Total hot run time: 33734 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a30fe4f03d1999832155c2d291e1c542f4f13731, data reload: false

------ Round 1 ----------------------------------
q1	26499	5043	5029	5029
q2	1978	280	195	195
q3	10391	1232	710	710
q4	10231	1014	519	519
q5	7641	2399	2322	2322
q6	185	161	130	130
q7	934	721	616	616
q8	9313	1272	1104	1104
q9	6833	5109	5150	5109
q10	6883	2292	1872	1872
q11	510	294	270	270
q12	348	351	212	212
q13	17787	3623	3024	3024
q14	233	222	215	215
q15	567	482	490	482
q16	439	429	367	367
q17	604	871	365	365
q18	7788	7306	7062	7062
q19	2196	1057	561	561
q20	330	345	222	222
q21	3646	3223	2367	2367
q22	1072	1039	981	981
Total cold run time: 116408 ms
Total hot run time: 33734 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5214	5445	5064	5064
q2	247	319	222	222
q3	2159	2735	2257	2257
q4	1377	1816	1501	1501
q5	4430	4471	4419	4419
q6	222	168	125	125
q7	2071	1918	1798	1798
q8	2623	2548	2527	2527
q9	7294	6994	7185	6994
q10	2977	3165	2756	2756
q11	570	534	496	496
q12	720	766	657	657
q13	3509	3901	3280	3280
q14	292	297	255	255
q15	528	492	504	492
q16	441	495	475	475
q17	1152	1541	1394	1394
q18	7746	7650	7513	7513
q19	834	932	1018	932
q20	1912	2012	1849	1849
q21	4968	4473	4500	4473
q22	1148	1065	1044	1044
Total cold run time: 52434 ms
Total hot run time: 50523 ms

@doris-robot

TPC-DS: Total hot run time: 193115 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a30fe4f03d1999832155c2d291e1c542f4f13731, data reload: false

query1	1401	1092	1039	1039
query2	6254	1829	1826	1826
query3	11037	4433	4628	4433
query4	55223	25216	23345	23345
query5	4955	504	475	475
query6	342	222	200	200
query7	4927	522	294	294
query8	288	232	225	225
query9	5524	2687	2684	2684
query10	438	329	277	277
query11	15071	15032	14952	14952
query12	152	110	110	110
query13	1023	539	406	406
query14	9982	6380	6250	6250
query15	212	199	178	178
query16	7113	635	512	512
query17	1089	708	569	569
query18	1599	400	319	319
query19	205	188	175	175
query20	132	137	120	120
query21	206	124	109	109
query22	4379	4384	4271	4271
query23	34522	33834	33563	33563
query24	6633	2453	2439	2439
query25	482	465	396	396
query26	679	280	148	148
query27	2232	525	344	344
query28	3080	2217	2184	2184
query29	582	585	446	446
query30	271	225	192	192
query31	852	870	772	772
query32	77	63	61	61
query33	459	379	369	369
query34	779	880	550	550
query35	790	816	749	749
query36	954	1011	928	928
query37	117	103	74	74
query38	4214	4287	4130	4130
query39	1489	1436	1454	1436
query40	214	128	119	119
query41	61	61	60	60
query42	126	109	110	109
query43	516	512	493	493
query44	1367	850	841	841
query45	192	177	174	174
query46	875	1058	637	637
query47	1828	1885	1808	1808
query48	419	445	330	330
query49	672	499	422	422
query50	724	711	419	419
query51	4280	4276	4277	4276
query52	118	109	100	100
query53	263	261	185	185
query54	581	587	522	522
query55	94	88	94	88
query56	312	326	294	294
query57	1203	1164	1115	1115
query58	273	279	253	253
query59	2683	2839	2705	2705
query60	326	334	311	311
query61	127	134	129	129
query62	743	744	680	680
query63	229	208	193	193
query64	1618	1036	799	799
query65	4426	4189	4194	4189
query66	727	422	329	329
query67	15991	15704	15206	15206
query68	5657	905	545	545
query69	549	326	275	275
query70	1227	1174	1122	1122
query71	492	328	308	308
query72	6061	4769	4787	4769
query73	1305	621	353	353
query74	8936	9109	8931	8931
query75	3486	3197	2685	2685
query76	3803	1182	809	809
query77	556	385	296	296
query78	10050	10177	9445	9445
query79	3094	825	581	581
query80	872	527	447	447
query81	513	260	218	218
query82	707	127	98	98
query83	405	268	239	239
query84	299	111	87	87
query85	783	348	370	348
query86	453	327	287	287
query87	4321	4480	4382	4382
query88	3367	2292	2265	2265
query89	404	329	301	301
query90	1885	211	211	211
query91	149	147	113	113
query92	73	64	62	62
query93	2764	974	586	586
query94	699	388	306	306
query95	374	294	283	283
query96	503	569	279	279
query97	2751	2739	2635	2635
query98	235	205	206	205
query99	1461	1385	1308	1308
Total cold run time: 298986 ms
Total hot run time: 193115 ms

@doris-robot

ClickBench: Total hot run time: 29.15 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit a30fe4f03d1999832155c2d291e1c542f4f13731, data reload: false

query1	0.04	0.04	0.03
query2	0.13	0.10	0.13
query3	0.25	0.20	0.20
query4	1.60	0.20	0.11
query5	0.43	0.42	0.43
query6	1.15	0.67	0.67
query7	0.02	0.03	0.02
query8	0.05	0.03	0.04
query9	0.58	0.52	0.52
query10	0.58	0.57	0.58
query11	0.15	0.11	0.11
query12	0.15	0.12	0.12
query13	0.62	0.60	0.60
query14	0.78	0.80	0.81
query15	0.88	0.87	0.86
query16	0.39	0.37	0.36
query17	1.02	1.02	1.06
query18	0.23	0.21	0.21
query19	1.94	1.85	1.85
query20	0.01	0.00	0.01
query21	15.39	0.87	0.53
query22	0.77	1.28	0.78
query23	14.73	1.40	0.66
query24	7.71	0.85	0.71
query25	0.49	0.19	0.08
query26	0.58	0.16	0.15
query27	0.05	0.05	0.05
query28	9.43	0.88	0.45
query29	12.62	4.05	3.35
query30	0.25	0.09	0.07
query31	2.83	0.60	0.40
query32	3.24	0.56	0.48
query33	3.09	3.06	3.10
query34	15.84	5.13	4.49
query35	4.50	4.55	4.50
query36	0.67	0.50	0.48
query37	0.08	0.06	0.06
query38	0.05	0.04	0.03
query39	0.03	0.02	0.02
query40	0.17	0.14	0.13
query41	0.08	0.02	0.02
query42	0.03	0.02	0.03
query43	0.04	0.03	0.03
Total cold run time: 103.67 s
Total hot run time: 29.15 s

@doris-robot

BE UT Coverage Report

Increment line coverage 0.00% (0/2) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 56.39% (15085/26750)
Line Coverage 45.12% (134645/298434)
Region Coverage 44.21% (67758/153262)
Branch Coverage 38.73% (34714/89628)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (2/2) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.50% (20975/26383)
Line Coverage 72.53% (216603/298619)
Region Coverage 70.77% (127567/180251)
Branch Coverage 64.41% (65988/102452)

@dataroaring dataroaring merged commit 2d9f448 into apache:master Jun 6, 2025
23 of 24 checks passed
dataroaring pushed a commit that referenced this pull request Jun 8, 2025
…ion #51482 (#51553)

Cherry-picked from #51482

Co-authored-by: abmdocrt <lianyukang@selectdb.com>
Labels
approved Indicates a PR has been approved by one committer. dev/3.0.6-merged reviewed
6 participants