[opt](cloud) load data no call partition.getVisibleVersion in cloud mode #51111

Merged
merged 3 commits into from
May 22, 2025

yujun777
Contributor

@yujun777 yujun777 commented May 21, 2025

What problem does this PR solve?

For the OLAP sink, when there are not enough available replicas, the load fails and returns a hint describing the replicas in detail. That detail includes the partition's visible version. In cloud mode, however, getting a partition's visible version is an RPC, and the visible version does not affect whether the load succeeds or fails. Cloud mode therefore does not need the partition's visible version in the hint, so this PR removes it.
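
The change described above can be pictured with a minimal sketch. This is not the actual Doris code: `ReplicaHintSketch`, `PartitionInfo`, `buildHint`, `CountingPartition`, and the hint wording are all hypothetical names chosen for illustration. It only assumes, as the description says, that reading the visible version is costly in cloud mode and is not needed for the hint.

```java
// Hypothetical sketch; none of these types are the real Doris classes.
final class ReplicaHintSketch {

    /** Stand-in for a partition. In cloud mode, getVisibleVersion() is assumed to be an RPC. */
    interface PartitionInfo {
        long getId();
        long getVisibleVersion();
    }

    /** Builds the diagnostic text returned when a load lacks enough available replicas. */
    static String buildHint(PartitionInfo partition,
                            java.util.List<String> replicaDetails,
                            boolean cloudMode) {
        StringBuilder sb = new StringBuilder();
        sb.append("partition ").append(partition.getId());
        if (!cloudMode) {
            // Only the non-cloud path reads the visible version, so cloud mode
            // never pays for the remote call just to format an error hint.
            sb.append(", visible version ").append(partition.getVisibleVersion());
        }
        sb.append(", replicas: ").append(String.join("; ", replicaDetails));
        return sb.toString();
    }

    /** Test double that counts how often the visible version is requested. */
    static final class CountingPartition implements PartitionInfo {
        int versionCalls = 0;

        public long getId() {
            return 10001L;
        }

        public long getVisibleVersion() {
            versionCalls++;
            return 7L;
        }
    }

    public static void main(String[] args) {
        java.util.List<String> replicas =
                java.util.Arrays.asList("replica 1: backend not alive", "replica 2: bad");
        CountingPartition partition = new CountingPartition();
        System.out.println(buildHint(partition, replicas, false));
        System.out.println(buildHint(partition, replicas, true));
        System.out.println("visible version lookups: " + partition.versionCalls);
    }
}
```

In this sketch the cloud-mode call produces the same replica detail without touching `getVisibleVersion()`, which is the behavior the PR description asks for.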

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@yujun777
Contributor Author

run buildall

@yujun777
Contributor Author

run buildall

gavinchou
gavinchou previously approved these changes May 21, 2025
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved label (Indicates a PR has been approved by one committer.) May 21, 2025
Contributor

PR approved by anyone and no changes requested.

morrySnow
morrySnow previously approved these changes May 21, 2025
@yujun777
Contributor Author

run performance

Contributor

@deardeng deardeng left a comment

LGTM

@yujun777 yujun777 dismissed stale reviews from morrySnow and gavinchou via 87dc9d6 May 21, 2025 07:39
@yujun777
Contributor Author

run buildall

@github-actions github-actions bot removed the approved label (Indicates a PR has been approved by one committer.) May 21, 2025
@doris-robot

TPC-H: Total hot run time: 33622 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 87dc9d6bf8fa5dde769d4febda1983d6c86ec4c9, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	26215	5100	4975	4975
q2	2069	270	179	179
q3	10404	1242	689	689
q4	10219	990	505	505
q5	7556	2389	2267	2267
q6	181	159	132	132
q7	888	750	611	611
q8	9321	1297	1074	1074
q9	6961	5219	5077	5077
q10	6877	2303	1890	1890
q11	484	281	266	266
q12	344	345	213	213
q13	17798	3623	3146	3146
q14	236	228	211	211
q15	527	509	481	481
q16	420	442	381	381
q17	593	848	352	352
q18	7654	7133	7230	7133
q19	1820	977	540	540
q20	323	329	217	217
q21	3711	2482	2320	2320
q22	1022	1016	963	963
Total cold run time: 115623 ms
Total hot run time: 33622 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	5159	5034	5011	5011
q2	238	330	234	234
q3	2159	2643	2240	2240
q4	1315	1761	1346	1346
q5	4387	4301	4402	4301
q6	215	172	135	135
q7	2012	1914	1740	1740
q8	2573	2516	2480	2480
q9	7277	7225	6897	6897
q10	3078	3172	2758	2758
q11	609	503	514	503
q12	687	773	620	620
q13	3491	3892	3278	3278
q14	278	301	287	287
q15	530	475	468	468
q16	433	472	448	448
q17	1139	1564	1340	1340
q18	7741	7593	7464	7464
q19	774	823	912	823
q20	1986	1971	1899	1899
q21	4844	4454	4521	4454
q22	1089	1044	1045	1044
Total cold run time: 52014 ms
Total hot run time: 49770 ms

@doris-robot

TPC-DS: Total hot run time: 192757 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 87dc9d6bf8fa5dde769d4febda1983d6c86ec4c9, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query1	1440	1102	1021	1021
query2	6318	1878	1835	1835
query3	11019	4378	4457	4378
query4	52939	25466	23114	23114
query5	5187	540	457	457
query6	360	214	192	192
query7	4939	528	283	283
query8	305	261	230	230
query9	5703	2633	2634	2633
query10	435	344	267	267
query11	15028	15057	14774	14774
query12	162	113	111	111
query13	1075	534	427	427
query14	10120	6256	6285	6256
query15	211	198	182	182
query16	7054	670	465	465
query17	1079	772	618	618
query18	1550	432	328	328
query19	218	202	178	178
query20	135	135	124	124
query21	214	129	110	110
query22	4559	4650	4465	4465
query23	34575	33712	33549	33549
query24	6736	2376	2474	2376
query25	493	499	445	445
query26	725	277	158	158
query27	2385	526	348	348
query28	3410	2146	2165	2146
query29	584	584	443	443
query30	273	221	201	201
query31	867	857	788	788
query32	78	60	61	60
query33	456	361	336	336
query34	786	879	543	543
query35	794	837	742	742
query36	974	998	866	866
query37	109	100	81	81
query38	4238	4277	4172	4172
query39	1513	1469	1485	1469
query40	218	127	108	108
query41	61	53	52	52
query42	131	115	109	109
query43	507	535	473	473
query44	1356	831	828	828
query45	183	171	164	164
query46	860	1033	664	664
query47	1839	1900	1832	1832
query48	420	440	317	317
query49	708	531	472	472
query50	667	695	408	408
query51	4188	4238	4241	4238
query52	113	117	104	104
query53	228	265	187	187
query54	618	576	522	522
query55	89	90	91	90
query56	327	315	303	303
query57	1202	1195	1132	1132
query58	271	256	266	256
query59	2682	2935	2841	2841
query60	348	329	328	328
query61	137	127	126	126
query62	699	751	669	669
query63	234	199	197	197
query64	1836	1037	664	664
query65	4296	4205	4294	4205
query66	734	399	303	303
query67	15736	15602	15698	15602
query68	7025	881	528	528
query69	547	317	272	272
query70	1219	1094	1126	1094
query71	529	318	295	295
query72	5547	4773	4812	4773
query73	1372	637	363	363
query74	8918	9272	8670	8670
query75	3802	3203	2694	2694
query76	4240	1342	751	751
query77	623	360	286	286
query78	9997	10242	9405	9405
query79	1981	832	573	573
query80	685	513	452	452
query81	470	268	223	223
query82	412	127	96	96
query83	343	256	243	243
query84	292	111	87	87
query85	795	355	325	325
query86	368	315	292	292
query87	4528	4420	4354	4354
query88	3412	2334	2314	2314
query89	416	306	283	283
query90	1923	218	210	210
query91	145	159	114	114
query92	74	66	58	58
query93	1142	937	588	588
query94	663	387	311	311
query95	377	300	284	284
query96	519	580	297	297
query97	2719	2736	2689	2689
query98	226	215	194	194
query99	1446	1384	1274	1274
Total cold run time: 296489 ms
Total hot run time: 192757 ms

@doris-robot

ClickBench: Total hot run time: 29.22 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 87dc9d6bf8fa5dde769d4febda1983d6c86ec4c9, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.03	0.03	0.03
query2	0.14	0.11	0.12
query3	0.25	0.20	0.20
query4	1.60	0.19	0.19
query5	0.46	0.43	0.43
query6	1.59	0.66	0.68
query7	0.02	0.02	0.01
query8	0.05	0.04	0.04
query9	0.56	0.51	0.51
query10	0.56	0.57	0.56
query11	0.17	0.11	0.11
query12	0.15	0.12	0.12
query13	0.61	0.60	0.60
query14	0.79	0.81	0.80
query15	0.88	0.86	0.86
query16	0.40	0.38	0.39
query17	1.02	1.02	1.03
query18	0.22	0.21	0.21
query19	1.99	1.95	1.80
query20	0.01	0.01	0.01
query21	15.40	0.90	0.56
query22	0.75	1.14	0.88
query23	14.70	1.40	0.66
query24	6.85	1.45	0.74
query25	0.53	0.22	0.08
query26	0.55	0.17	0.16
query27	0.05	0.05	0.05
query28	9.99	0.90	0.45
query29	12.55	4.05	3.32
query30	0.25	0.09	0.07
query31	2.81	0.60	0.39
query32	3.25	0.55	0.47
query33	3.09	2.99	3.06
query34	15.71	5.11	4.46
query35	4.51	4.49	4.52
query36	0.66	0.50	0.48
query37	0.08	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.03
query40	0.17	0.14	0.13
query41	0.08	0.03	0.02
query42	0.04	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 103.64 s
Total hot run time: 29.22 s

@github-actions github-actions bot added the approved label (Indicates a PR has been approved by one committer.) May 21, 2025
Contributor

PR approved by at least one committer and no changes requested.

@yujun777
Contributor Author

run p0

@yujun777
Contributor Author

run external

Contributor

@dataroaring dataroaring left a comment

LGTM

@dataroaring dataroaring merged commit a62e651 into apache:master May 22, 2025
29 checks passed
github-actions bot pushed a commit that referenced this pull request May 22, 2025
…ode (#51111)

dataroaring pushed a commit that referenced this pull request May 24, 2025
…n in cloud mode #51111 (#51150)

Cherry-picked from #51111

Co-authored-by: yujun <yujun@selectdb.com>
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
…ode (apache#51111)

Labels
approved (Indicates a PR has been approved by one committer.), dev/3.0.6-merged, reviewed
7 participants