@yujun777
Contributor

@yujun777 yujun777 commented Jul 11, 2024

A long time ago, Doris could lose data because of program bugs:

  1. during migration or clone, the new replica could lose data;
  2. a tablet could delete all of its replicas.

All of these bugs have since been fixed, so trash is no longer useful. Hardly any users rely on trash to restore data in production. Even if a user restores data from trash through the HTTP interface, the data cannot be used immediately because the FE does not contain its metadata.

On the other hand, trash causes other problems:

  1. it leaves disks unbalanced, because the trash data is not deleted;
  2. the BE does not choose disks in round-robin (RR) order, because a disk may suddenly delete a large amount of trash data once its usage exceeds 80%;
  3. it adds disk load.

Considering all this, we change the default of trash_file_expire_time_sec to 0. Users who really want to keep trash need to set trash_file_expire_time_sec > 0 manually.
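
To illustrate the intended effect of the setting, here is a minimal Python sketch of an expiry check. It is a simplified model written for this description, not the BE's actual code: `TrashEntry`, `expired_entries`, and the example paths are hypothetical names, and treating a value of 0 as "eligible for deletion immediately" is an assumption based on the wording above.

```python
import time
from dataclasses import dataclass


@dataclass
class TrashEntry:
    # Hypothetical record for one trashed directory; not a Doris type.
    path: str
    trashed_at: float  # Unix timestamp at which the data was moved to trash


def expired_entries(entries, expire_sec, now=None):
    """Return the entries whose age has reached expire_sec.

    Assumption: expire_sec <= 0 means nothing is retained, so every
    entry is eligible for deletion right away.
    """
    now = time.time() if now is None else now
    if expire_sec <= 0:
        return list(entries)
    return [e for e in entries if now - e.trashed_at >= expire_sec]


if __name__ == "__main__":
    entries = [TrashEntry("trash/a", 0.0), TrashEntry("trash/b", 900.0)]
    # With the new default (0), both entries are eligible immediately.
    print([e.path for e in expired_entries(entries, 0, now=1000.0)])
    # With a positive value, only entries at least that old are eligible.
    print([e.path for e in expired_entries(entries, 500, now=1000.0)])
```

Under this model, a positive trash_file_expire_time_sec keeps recently trashed data on disk until it ages out, which is the retention behavior users now have to opt into.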

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@yujun777
Contributor Author

run buildall

Contributor

@dataroaring dataroaring left a comment

LGTM

@github-actions github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) on Jul 11, 2024
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

Contributor

@deardeng deardeng left a comment

LGTM

@github-actions
Contributor

PR approved by anyone and no changes requested.

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TPC-H: Total hot run time: 39941 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit eb5d3c4ef0e3efead11a5614b648841e8c9278c2, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	18247	4415	4366	4366
q2	2565	198	205	198
q3	11211	1190	1147	1147
q4	10853	785	761	761
q5	7597	2768	2706	2706
q6	226	140	142	140
q7	972	610	604	604
q8	9213	2040	2070	2040
q9	8628	6547	6518	6518
q10	8631	3784	3718	3718
q11	469	238	239	238
q12	398	236	223	223
q13	18472	3012	2965	2965
q14	276	234	242	234
q15	527	481	496	481
q16	485	381	375	375
q17	966	677	634	634
q18	8002	7517	7399	7399
q19	2934	1486	1419	1419
q20	697	325	337	325
q21	4919	3109	3191	3109
q22	385	341	349	341
Total cold run time: 116673 ms
Total hot run time: 39941 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	4309	4252	4279	4252
q2	382	266	272	266
q3	3030	2740	2718	2718
q4	1903	1660	1620	1620
q5	5272	5309	5317	5309
q6	221	128	129	128
q7	2132	1714	1750	1714
q8	3195	3303	3271	3271
q9	8368	8301	8342	8301
q10	3911	3691	3666	3666
q11	577	493	496	493
q12	818	642	621	621
q13	17277	2985	2971	2971
q14	287	274	288	274
q15	511	472	488	472
q16	467	427	439	427
q17	1761	1468	1482	1468
q18	7692	7456	7483	7456
q19	1697	1539	1669	1539
q20	1989	1768	1773	1768
q21	4896	4795	4715	4715
q22	585	557	513	513
Total cold run time: 71280 ms
Total hot run time: 53962 ms

@doris-robot

TPC-DS: Total hot run time: 174509 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit eb5d3c4ef0e3efead11a5614b648841e8c9278c2, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	915	366	364	364
query2	6455	2517	2459	2459
query3	6661	208	220	208
query4	28631	17664	17454	17454
query5	4145	489	497	489
query6	266	183	178	178
query7	4582	292	285	285
query8	323	306	311	306
query9	8436	2432	2420	2420
query10	449	280	269	269
query11	11207	10142	10093	10093
query12	135	85	88	85
query13	1648	382	377	377
query14	9472	7745	7720	7720
query15	294	190	198	190
query16	8153	328	312	312
query17	1822	555	546	546
query18	2110	282	285	282
query19	203	154	164	154
query20	94	85	82	82
query21	209	130	130	130
query22	4211	3986	4033	3986
query23	34088	33145	33127	33127
query24	11979	2933	2886	2886
query25	673	382	397	382
query26	1758	158	156	156
query27	2925	275	275	275
query28	7397	2089	2072	2072
query29	1064	661	645	645
query30	287	152	151	151
query31	983	748	777	748
query32	97	56	59	56
query33	786	330	316	316
query34	977	475	501	475
query35	683	569	572	569
query36	1102	948	913	913
query37	238	78	79	78
query38	2885	2752	2753	2752
query39	867	814	829	814
query40	282	121	118	118
query41	55	50	51	50
query42	123	99	103	99
query43	585	533	537	533
query44	1325	751	733	733
query45	198	164	162	162
query46	1097	731	746	731
query47	1857	1772	1778	1772
query48	376	300	306	300
query49	1208	422	410	410
query50	769	402	393	393
query51	6845	6791	6826	6791
query52	106	96	99	96
query53	357	297	292	292
query54	1056	447	456	447
query55	77	74	75	74
query56	292	271	273	271
query57	1143	1050	1042	1042
query58	251	258	248	248
query59	3529	3275	3294	3275
query60	308	286	272	272
query61	98	93	94	93
query62	843	670	654	654
query63	331	291	285	285
query64	10478	2232	1663	1663
query65	3181	3108	3091	3091
query66	1371	340	337	337
query67	15528	15057	14975	14975
query68	6542	537	541	537
query69	679	445	350	350
query70	1184	1084	1141	1084
query71	515	318	287	287
query72	8708	5393	5421	5393
query73	809	326	321	321
query74	6046	5605	5451	5451
query75	4623	2687	2714	2687
query76	4956	937	950	937
query77	775	313	311	311
query78	9649	9254	8909	8909
query79	7081	528	535	528
query80	985	491	474	474
query81	587	221	225	221
query82	728	144	135	135
query83	315	172	175	172
query84	285	86	93	86
query85	1427	306	301	301
query86	412	330	331	330
query87	3376	3107	3186	3107
query88	4319	2369	2357	2357
query89	505	398	392	392
query90	2006	194	199	194
query91	130	104	103	103
query92	63	52	50	50
query93	5306	508	515	508
query94	1261	210	209	209
query95	409	325	324	324
query96	610	270	271	270
query97	3238	3032	3039	3032
query98	251	192	199	192
query99	1579	1356	1246	1246
Total cold run time: 302238 ms
Total hot run time: 174509 ms

@doris-robot

ClickBench: Total hot run time: 30.65 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit eb5d3c4ef0e3efead11a5614b648841e8c9278c2, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.04	0.04	0.03
query2	0.08	0.04	0.04
query3	0.23	0.05	0.05
query4	1.66	0.07	0.09
query5	0.51	0.50	0.50
query6	1.14	0.73	0.73
query7	0.02	0.02	0.01
query8	0.06	0.04	0.05
query9	0.55	0.49	0.49
query10	0.54	0.55	0.54
query11	0.15	0.12	0.11
query12	0.14	0.12	0.12
query13	0.59	0.58	0.58
query14	0.75	0.79	0.76
query15	0.84	0.81	0.82
query16	0.35	0.38	0.37
query17	0.97	0.95	0.99
query18	0.22	0.22	0.21
query19	1.78	1.71	1.72
query20	0.01	0.01	0.01
query21	15.40	0.76	0.66
query22	4.54	6.69	1.92
query23	18.32	1.37	1.22
query24	2.10	0.26	0.22
query25	0.16	0.08	0.09
query26	0.31	0.21	0.21
query27	0.45	0.23	0.24
query28	13.23	1.02	1.00
query29	12.66	3.35	3.33
query30	0.25	0.06	0.05
query31	2.87	0.39	0.40
query32	3.26	0.48	0.47
query33	2.91	2.94	2.93
query34	17.05	4.34	4.34
query35	4.42	4.42	4.42
query36	0.65	0.46	0.47
query37	0.19	0.16	0.15
query38	0.15	0.14	0.16
query39	0.05	0.03	0.04
query40	0.15	0.12	0.12
query41	0.10	0.06	0.06
query42	0.06	0.05	0.05
query43	0.04	0.04	0.04
Total cold run time: 109.95 s
Total hot run time: 30.65 s

@yujun777
Contributor Author

run p0

Contributor

@zhannngchen zhannngchen left a comment

LGTM

@dataroaring dataroaring merged commit 19a3d93 into apache:master Jul 13, 2024
seawinde pushed a commit to seawinde/doris that referenced this pull request Jul 17, 2024
dataroaring pushed a commit that referenced this pull request Jul 17, 2024
@gavinchou gavinchou mentioned this pull request Aug 19, 2024
yujun777 added a commit to yujun777/doris that referenced this pull request Dec 25, 2024
Labels

approved Indicates a PR has been approved by one committer. dev/3.0.1-merged kind/behavior-changed reviewed

6 participants