Conversation

@freemandealer
Contributor

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

Signed-off-by: zhengyu <zhangzhengyu@selectdb.com>
@Thearas
Contributor

Thearas commented Nov 7, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

Signed-off-by: zhengyu <zhangzhengyu@selectdb.com>
@dataroaring
Contributor

run buildall

)
.put("file_cache_info",
new SchemaTable(SystemIdGenerator.getNextId(), "file_cache_info", TableType.SCHEMA,
builder().column("HASH", ScalarType.createStringType())
Contributor

Name 'Hash' is ambiguous.

.column("TYPE", ScalarType.createStringType())
.column("REMOTE_PATH", ScalarType.createStringType())
.column("CACHE_PATH", ScalarType.createStringType())
.column("BE_ID", ScalarType.createType(PrimitiveType.BIGINT))
Contributor

Is there a table which contains BE_ID and COMPUTE_GROUP_NAME?

Contributor Author

Not sure, but there should be one, or at least some alternative source, because the 'show backends' results have to be built from something.

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 80.51% (1677/2083)
Line Coverage 66.64% (29543/44329)
Region Coverage 67.24% (14737/21917)
Branch Coverage 57.49% (7833/13626)

@doris-robot

TPC-H: Total hot run time: 34483 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 591d9793561dd4c00a896cfa2e0739020d1421c4, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17446	5194	5109	5109
q2	2044	321	205	205
q3	10157	1267	715	715
q4	10222	875	376	376
q5	7541	2346	2461	2346
q6	189	185	142	142
q7	937	800	621	621
q8	9356	1353	1108	1108
q9	6992	5233	5200	5200
q10	6913	2236	1836	1836
q11	503	308	294	294
q12	391	374	236	236
q13	17802	3628	3031	3031
q14	234	234	215	215
q15	587	494	505	494
q16	1029	1003	952	952
q17	605	899	361	361
q18	7847	7160	7135	7135
q19	1093	962	582	582
q20	353	343	227	227
q21	4001	3221	2323	2323
q22	1085	1035	975	975
Total cold run time: 107327 ms
Total hot run time: 34483 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5177	5166	5135	5135
q2	254	333	235	235
q3	2202	2809	2326	2326
q4	1355	1795	1372	1372
q5	4220	4538	4621	4538
q6	223	179	136	136
q7	2031	2009	1871	1871
q8	2678	2657	2596	2596
q9	7351	7321	7362	7321
q10	3077	3290	2838	2838
q11	614	550	516	516
q12	690	800	657	657
q13	3591	3957	3577	3577
q14	279	310	290	290
q15	544	521	510	510
q16	1036	1115	1043	1043
q17	1194	1636	1390	1390
q18	7895	7629	7628	7628
q19	788	859	1056	859
q20	1884	1939	1851	1851
q21	4692	4414	4415	4414
q22	1040	1048	1008	1008
Total cold run time: 52815 ms
Total hot run time: 52111 ms

@doris-robot

TPC-DS: Total hot run time: 188140 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 591d9793561dd4c00a896cfa2e0739020d1421c4, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1050	408	391	391
query2	6564	1673	1728	1673
query3	6765	227	225	225
query4	26734	23355	23641	23355
query5	4427	642	477	477
query6	348	248	230	230
query7	4645	526	309	309
query8	306	275	253	253
query9	8714	2609	2587	2587
query10	501	359	294	294
query11	15574	15002	14864	14864
query12	172	122	115	115
query13	1701	586	471	471
query14	10518	9184	9223	9184
query15	202	194	178	178
query16	7621	723	530	530
query17	1276	783	643	643
query18	2031	433	334	334
query19	223	212	193	193
query20	135	126	129	126
query21	215	136	117	117
query22	3952	4251	4035	4035
query23	34080	33116	32933	32933
query24	8539	2470	2445	2445
query25	619	551	485	485
query26	1253	299	168	168
query27	2712	568	352	352
query28	4382	2204	2194	2194
query29	808	629	499	499
query30	299	226	195	195
query31	911	811	718	718
query32	85	74	73	73
query33	596	377	335	335
query34	812	851	524	524
query35	821	844	763	763
query36	959	994	869	869
query37	126	117	93	93
query38	3502	3468	3527	3468
query39	1488	1414	1418	1414
query40	226	135	118	118
query41	63	60	63	60
query42	124	114	112	112
query43	498	488	460	460
query44	1278	757	747	747
query45	188	182	174	174
query46	964	1045	640	640
query47	1758	1795	1682	1682
query48	391	437	322	322
query49	788	547	420	420
query50	677	728	399	399
query51	3911	3882	3874	3874
query52	112	117	107	107
query53	246	295	203	203
query54	341	302	285	285
query55	88	88	84	84
query56	333	320	337	320
query57	1226	1194	1101	1101
query58	295	273	276	273
query59	2547	2638	2597	2597
query60	347	340	312	312
query61	168	174	154	154
query62	792	734	648	648
query63	238	198	198	198
query64	4502	1161	865	865
query65	4009	3923	3929	3923
query66	1162	441	344	344
query67	15384	15049	15006	15006
query68	8357	949	597	597
query69	499	337	291	291
query70	1365	1304	1298	1298
query71	507	344	318	318
query72	5854	5000	5135	5000
query73	701	643	368	368
query74	8861	9130	8781	8781
query75	3903	3346	2833	2833
query76	3743	1274	781	781
query77	804	426	305	305
query78	9638	9709	8818	8818
query79	2605	878	609	609
query80	703	598	511	511
query81	520	265	228	228
query82	468	177	129	129
query83	266	274	255	255
query84	254	109	96	96
query85	954	490	446	446
query86	392	301	313	301
query87	3700	3767	3632	3632
query88	3591	2301	2273	2273
query89	423	331	292	292
query90	1908	230	223	223
query91	163	163	136	136
query92	83	71	66	66
query93	2024	989	648	648
query94	709	476	347	347
query95	427	324	316	316
query96	505	631	288	288
query97	2957	2986	2879	2879
query98	257	217	213	213
query99	1655	1414	1322	1322
Total cold run time: 277118 ms
Total hot run time: 188140 ms

@doris-robot

ClickBench: Total hot run time: 28.22 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 591d9793561dd4c00a896cfa2e0739020d1421c4, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.05	0.05	0.05
query2	0.12	0.06	0.06
query3	0.31	0.07	0.07
query4	1.60	0.10	0.09
query5	0.27	0.25	0.24
query6	1.16	0.66	0.64
query7	0.03	0.03	0.02
query8	0.08	0.06	0.05
query9	0.65	0.53	0.54
query10	0.61	0.58	0.59
query11	0.25	0.14	0.13
query12	0.26	0.15	0.13
query13	0.63	0.63	0.62
query14	1.04	1.02	1.01
query15	0.93	0.85	0.85
query16	0.39	0.39	0.38
query17	1.04	1.06	1.05
query18	0.23	0.23	0.23
query19	1.97	1.82	1.83
query20	0.01	0.01	0.01
query21	15.42	0.28	0.23
query22	5.00	0.09	0.10
query23	15.38	0.38	0.23
query24	2.95	0.49	0.30
query25	0.10	0.09	0.09
query26	0.19	0.17	0.18
query27	0.09	0.09	0.09
query28	3.71	1.25	1.06
query29	12.58	4.01	3.28
query30	0.33	0.12	0.10
query31	2.82	0.61	0.44
query32	3.25	0.58	0.50
query33	3.06	3.13	3.06
query34	16.33	5.18	4.49
query35	4.46	4.52	4.54
query36	0.66	0.52	0.51
query37	0.22	0.09	0.08
query38	0.20	0.06	0.05
query39	0.06	0.05	0.05
query40	0.20	0.18	0.17
query41	0.12	0.06	0.06
query42	0.07	0.04	0.04
query43	0.06	0.05	0.05
Total cold run time: 98.89 s
Total hot run time: 28.22 s

@doris-robot

BE UT Coverage Report

Increment line coverage 0.00% (0/117) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.77% (18226/34540)
Line Coverage 38.12% (165764/434847)
Region Coverage 33.12% (128938/389291)
Branch Coverage 33.86% (55319/163387)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 76.07% (89/117) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.51% (24282/33954)
Line Coverage 58.02% (252719/435567)
Region Coverage 53.51% (211245/394812)
Branch Coverage 54.73% (90015/164470)

@dataroaring
Contributor

Please create a PR for the documentation.


// Get all cache instances for inspection
const std::vector<std::unique_ptr<BlockFileCache>>& get_caches() const { return _caches; }

Contributor

get_all_caches

@lilyq

lilyq commented Nov 11, 2025

Please create a PR for the documentation.

Sure. This PR only introduces the basic functionality. The user manual will be updated later.

std::string hash_str = key.hash.to_string();

// Add to cache entries
cache_entries.emplace_back(hash_str, key.tablet_id, value.size, value.type, cache_path);
Contributor

There may be a memory issue here: could the collected entries be too large to hold in memory?
We can store the hash compactly as two int64 values instead of a hex string.
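
A minimal sketch of the compaction suggested above, assuming the cache key hash is 128 bits wide (which is what "2 int64" implies). `Hash128`, `to_hex`, and `CompactEntry` are hypothetical names for illustration, not types from the Doris code base. The hash stays in binary form while entries are collected and is rendered to hex only when a row is emitted.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a 128-bit cache key hash; not the Doris type.
struct Hash128 {
    uint64_t high;
    uint64_t low;
};

static_assert(sizeof(Hash128) == 16, "two 64-bit words, no padding");

// Render the hash as 32 lowercase hex characters, only when a row is emitted.
inline std::string to_hex(const Hash128& h) {
    char buf[33]; // 32 hex digits plus the terminating NUL
    std::snprintf(buf, sizeof(buf), "%016llx%016llx",
                  static_cast<unsigned long long>(h.high),
                  static_cast<unsigned long long>(h.low));
    return std::string(buf);
}

// Entry that keeps the hash in binary form (16 bytes) instead of a std::string.
struct CompactEntry {
    Hash128 hash;
    int64_t tablet_id;
    int64_t size;
    int type;
};
```

Each entry then carries 16 bytes of hash data rather than a heap-allocated 32-character string plus the `std::string` object itself.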

Contributor

We may need to support predicate pushdown ...

Contributor

Use an integer index for cache_path to save memory, and replace the index with the real path when filling column_values.
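
The index idea could look roughly like the sketch below, assuming many entries share a small number of distinct cache paths. `PathTable`, `intern`, and `resolve` are hypothetical names, not part of the PR. Each entry would store the `uint32_t` index, and the scanner would call `resolve` only while filling the output column.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: stores each distinct cache path once and hands out a small index.
class PathTable {
public:
    // Return the index for `path`, inserting it on first use.
    uint32_t intern(const std::string& path) {
        auto it = _index.find(path);
        if (it != _index.end()) {
            return it->second;
        }
        const auto id = static_cast<uint32_t>(_paths.size());
        _paths.push_back(path);
        _index.emplace(path, id);
        return id;
    }

    // Resolve an index back to the real path when filling column values.
    // Throws std::out_of_range for an index that was never handed out.
    const std::string& resolve(uint32_t id) const { return _paths.at(id); }

    std::size_t size() const { return _paths.size(); }

private:
    std::vector<std::string> _paths;                  // index -> path
    std::unordered_map<std::string, uint32_t> _index; // path -> index
};
```

The table keeps two copies of every distinct path (one in the vector, one as a map key), so it only pays off when the number of distinct paths is far smaller than the number of entries.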

}

Status SchemaFileCacheInfoScanner::_fill_block_impl(vectorized::Block* block) {
SCOPED_TIMER(_fill_block_timer);
Contributor

where is this timer used?

}

*eos = true;
return _fill_block_impl(block);
Contributor

There may be something wrong here if we return a single block that contains the entire result.
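
One way to address this concern is to hand rows out in bounded batches and report end-of-stream only after the last batch. The sketch below assumes that pattern; `BatchedRows` and `next_batch` are hypothetical names, `int` stands in for a real cache-entry row, and none of it is the scanner API from the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical row source; `int` stands in for a real cache-entry row.
class BatchedRows {
public:
    BatchedRows(std::vector<int> rows, std::size_t batch_size)
            : _rows(std::move(rows)), _batch_size(batch_size) {
        assert(_batch_size > 0); // a zero batch size would never make progress
    }

    // Copy at most batch_size rows into `out`; set *eos once the last row is handed out.
    void next_batch(std::vector<int>* out, bool* eos) {
        const std::size_t end = std::min(_pos + _batch_size, _rows.size());
        out->assign(_rows.begin() + static_cast<std::ptrdiff_t>(_pos),
                    _rows.begin() + static_cast<std::ptrdiff_t>(end));
        _pos = end;
        *eos = (_pos == _rows.size());
    }

private:
    std::vector<int> _rows;
    std::size_t _batch_size;
    std::size_t _pos = 0;
};
```

A caller would loop on `next_batch` until `eos` is true, so no single block has to hold the whole result.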

Contributor

@dataroaring left a comment

LGTM

}

// Collect all cache entries from all file cache instances
std::vector<std::tuple<std::string, int64_t, int64_t, int, std::string>> cache_entries;
Contributor

It's better to define a struct instead of tuple.

Contributor

+1, use a struct scoped inside the function definition.
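
A sketch of what the function-scoped struct might look like. The field names mirror the arguments of the `emplace_back` call in the diff; `describe_entries` and the sample values are hypothetical and exist only to make the example self-contained.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical function; only the struct layout is taken from the diff.
inline std::vector<std::string> describe_entries() {
    // Function-local struct: named fields instead of std::get<N> on a tuple.
    struct CacheEntry {
        std::string hash;
        int64_t tablet_id;
        int64_t size;
        int type;
        std::string cache_path;
    };

    std::vector<CacheEntry> entries;
    // Sample values for illustration only.
    entries.push_back(CacheEntry{"abc", 10001, 4096, 0, "/cache"});

    std::vector<std::string> lines;
    for (const CacheEntry& e : entries) {
        lines.push_back(e.hash + " tablet=" + std::to_string(e.tablet_id) +
                        " size=" + std::to_string(e.size));
    }
    return lines;
}
```

Compared with the five-element tuple, the two adjacent `int64_t` fields can no longer be swapped silently, because each access names the field it reads.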

@github-actions bot added the approved label (Indicates a PR has been approved by one committer.) Nov 12, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@gavinchou gavinchou merged commit 4cbae8a into apache:master Nov 13, 2025
38 of 49 checks passed
wyxxxcat pushed a commit to wyxxxcat/doris that referenced this pull request Nov 13, 2025
freemandealer added a commit to freemandealer/doris that referenced this pull request Nov 14, 2025
freemandealer added a commit to freemandealer/doris that referenced this pull request Nov 17, 2025
wyxxxcat pushed a commit to wyxxxcat/doris that referenced this pull request Nov 18, 2025
freemandealer added a commit to freemandealer/doris that referenced this pull request Dec 9, 2025
freemandealer added a commit to freemandealer/doris that referenced this pull request Dec 24, 2025

Labels

approved (Indicates a PR has been approved by one committer.), reviewed

7 participants