[Enhancement] (binlog<row>) move write lsn on sink, ensure lsn same on all replica #64133

Open
Userwhite wants to merge 2 commits into apache:master from Userwhite:lsn_allocate_on_skn

@Userwhite
Contributor

What problem does this PR solve?

Issue Number: close #61956

Related PR: #63110

Problem Summary:

Currently, LSNs are assigned in the BE TabletWriter during writes, and each node allocates them independently.
As a result, replicas may end up with inconsistent LSNs for the same data when it is read.

This PR moves LSN allocation to the sink side so that a given batch of data gets one fixed LSN on every replica.
For publish-conflict cases, consistency is preserved by proactively reading the old LSN instead of allocating a new one.
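
The difference between the two allocation schemes can be illustrated with a small toy model. This is only a sketch and not Doris code: `Replica`, `Sink`, `write_local`, `write_assigned`, and `lsn_for` are hypothetical names, and the assumption that divergence comes from batches arriving in different orders on different replicas is one plausible cause, not something stated in this PR.

```python
from itertools import count


class Replica:
    """Hypothetical stand-in for one replica; not a Doris class."""

    def __init__(self):
        self._local = count(1)   # models the old per-node allocator
        self.lsn_of = {}         # batch id -> LSN recorded on this replica

    def write_local(self, batch):
        # Old scheme: the replica picks the LSN itself, in arrival order.
        self.lsn_of[batch] = next(self._local)

    def write_assigned(self, batch, lsn):
        # New scheme: the LSN arrives together with the batch.
        self.lsn_of[batch] = lsn


class Sink:
    """Hypothetical sink-side allocator: one LSN per batch, remembered."""

    def __init__(self):
        self._next = count(1)
        self._assigned = {}

    def lsn_for(self, batch):
        # Reuse the old LSN if the batch was seen before (a stand-in for
        # the publish-conflict path); otherwise allocate a fresh one.
        if batch not in self._assigned:
            self._assigned[batch] = next(self._next)
        return self._assigned[batch]


# Assumed scenario: two batches reach the replicas in different orders.
order_a, order_b = ["b1", "b2"], ["b2", "b1"]

old_a, old_b = Replica(), Replica()
for batch in order_a:
    old_a.write_local(batch)
for batch in order_b:
    old_b.write_local(batch)

sink = Sink()
new_a, new_b = Replica(), Replica()
for batch in order_a:
    new_a.write_assigned(batch, sink.lsn_for(batch))
for batch in order_b:
    new_b.write_assigned(batch, sink.lsn_for(batch))

print(old_a.lsn_of == old_b.lsn_of)  # per-node allocation: do replicas agree?
print(new_a.lsn_of == new_b.lsn_of)  # sink-side allocation: do replicas agree?
```

In this model the per-node scheme ties an LSN to arrival order on each replica, while the sink-side scheme ties it to the batch itself, so asking the sink again for an already-seen batch returns the LSN it was first given.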

In addition, this PR fixes two binlog bugs:

  1. Binlog compaction did not get the cumulative policy, which causes a DCHECK failure.
  2. Reading the binlog could return an error because preaggregation was false.

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Userwhite Userwhite changed the title [Enhancement] move write lsn on sink, ensure lsn same on all replica [Enhancement] (binlog<row>) move write lsn on sink, ensure lsn same on all replica Jun 5, 2026
@Userwhite
Contributor Author

/review

@Userwhite
Contributor Author

run buildall

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 1.56% (1/64) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

TPC-H: Total hot run time: 29456 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ae290af803111d05d3fcaf7b2df40f4da5371aee, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17898	4102	4025	4025
q2	q3	10893	1397	856	856
q4	4751	485	358	358
q5	8349	880	592	592
q6	318	175	138	138
q7	929	882	621	621
q8	10881	1592	1574	1574
q9	7417	4585	4558	4558
q10	6820	1825	1548	1548
q11	436	281	253	253
q12	638	435	287	287
q13	18143	3482	2742	2742
q14	267	261	238	238
q15	q16	829	783	709	709
q17	1023	954	946	946
q18	6922	5813	5637	5637
q19	1202	1221	1265	1221
q20	586	441	288	288
q21	5870	2875	2558	2558
q22	497	370	307	307
Total cold run time: 104669 ms
Total hot run time: 29456 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4841	4907	4744	4744
q2	q3	4981	5256	4696	4696
q4	2164	2177	1412	1412
q5	5006	4777	4648	4648
q6	228	175	135	135
q7	1987	1757	1539	1539
q8	2442	2112	2172	2112
q9	7805	7450	7356	7356
q10	4746	4690	4200	4200
q11	531	389	359	359
q12	742	739	522	522
q13	3051	3397	2850	2850
q14	276	272	261	261
q15	q16	682	702	616	616
q17	1304	1265	1261	1261
q18	7586	6898	6990	6898
q19	1158	1139	1131	1131
q20	2228	2213	1939	1939
q21	5307	4565	4488	4488
q22	530	448	414	414
Total cold run time: 57595 ms
Total hot run time: 51581 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 170843 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ae290af803111d05d3fcaf7b2df40f4da5371aee, data reload: false

query5	4344	628	466	466
query6	460	201	184	184
query7	4956	553	318	318
query8	382	219	191	191
query9	8774	4039	4065	4039
query10	472	318	264	264
query11	5899	2320	2237	2237
query12	155	106	101	101
query13	1314	630	427	427
query14	6469	5430	5077	5077
query14_1	4438	4414	4361	4361
query15	206	195	178	178
query16	1039	449	435	435
query17	1108	690	568	568
query18	2593	463	333	333
query19	193	179	144	144
query20	117	108	107	107
query21	215	134	114	114
query22	13679	13548	13433	13433
query23	17437	16512	16200	16200
query23_1	16296	16311	16263	16263
query24	7633	1792	1319	1319
query24_1	1323	1315	1343	1315
query25	563	452	375	375
query26	1317	321	170	170
query27	2748	552	344	344
query28	4458	2069	2033	2033
query29	1080	612	531	531
query30	329	241	199	199
query31	1142	1083	982	982
query32	109	64	62	62
query33	532	340	266	266
query34	1173	1149	646	646
query35	783	788	698	698
query36	1417	1449	1290	1290
query37	159	111	100	100
query38	3229	3160	3016	3016
query39	970	918	896	896
query39_1	910	872	879	872
query40	231	135	109	109
query41	72	72	70	70
query42	99	97	96	96
query43	323	329	280	280
query44	
query45	202	194	187	187
query46	1110	1187	756	756
query47	2442	2449	2272	2272
query48	404	407	281	281
query49	621	484	346	346
query50	974	350	248	248
query51	4354	4326	4280	4280
query52	86	87	76	76
query53	242	271	192	192
query54	262	219	211	211
query55	77	81	72	72
query56	253	234	231	231
query57	1448	1486	1344	1344
query58	255	211	215	211
query59	1582	1656	1426	1426
query60	277	250	233	233
query61	159	161	156	156
query62	706	654	588	588
query63	233	182	185	182
query64	2572	829	628	628
query65	
query66	1770	465	350	350
query67	29815	29711	29655	29655
query68	
query69	426	311	268	268
query70	957	952	972	952
query71	307	228	205	205
query72	3056	2743	2388	2388
query73	849	746	441	441
query74	5155	4975	4761	4761
query75	2705	2593	2233	2233
query76	2322	1170	782	782
query77	370	378	290	290
query78	12451	12461	11898	11898
query79	1298	1013	725	725
query80	587	465	404	404
query81	449	288	243	243
query82	241	159	123	123
query83	273	284	256	256
query84	256	147	110	110
query85	884	538	455	455
query86	326	305	276	276
query87	3390	3326	3175	3175
query88	3611	2733	2732	2732
query89	408	373	330	330
query90	2163	202	176	176
query91	177	176	139	139
query92	64	64	57	57
query93	1458	1446	864	864
query94	530	359	322	322
query95	692	394	438	394
query96	1081	776	336	336
query97	2720	2729	2584	2584
query98	210	207	201	201
query99	1171	1185	1065	1065
Total cold run time: 252077 ms
Total hot run time: 170843 ms

Development

Successfully merging this pull request may close these issues.

[Feature] add row type for doris binlog

2 participants