[opt](file-meta-cache) reduce file meta cache size and disable cache for some cases by morningman · Pull Request #32340 · apache/doris

morningman · 2024-03-17T10:01:44Z

Proposed changes

The file meta cache on BE caches metadata for external table files, such as Parquet footers.
The capacity of this cache is counted in number of entries, not in memory consumption.
So if the cached objects are big (e.g., a large Parquet footer), the total memory consumption of this cache
can grow large and cause OOM.
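
To make the problem concrete, here is a back-of-the-envelope sketch. The 1 MiB footer size is a hypothetical figure chosen for illustration, not a measurement from this PR, and `worst_case_bytes` is an illustrative helper, not a Doris function.

```cpp
#include <cstdint>

// Worst-case memory held by a cache whose capacity is a number of entries.
// avg_entry_bytes is an assumed figure; real Parquet footers vary widely.
constexpr std::uint64_t worst_case_bytes(std::uint64_t capacity_entries,
                                         std::uint64_t avg_entry_bytes) {
    return capacity_entries * avg_entry_bytes;
}

// Assuming 1 MiB per cached footer (hypothetical):
//   20000 entries -> 20000 * 1 MiB, roughly 19.5 GiB
//    1000 entries ->  1000 * 1 MiB, roughly 0.98 GiB
constexpr std::uint64_t kAssumedFooterBytes = 1ULL << 20;
constexpr std::uint64_t kOldWorstCase = worst_case_bytes(20000, kAssumedFooterBytes);
constexpr std::uint64_t kNewWorstCase = worst_case_bytes(1000, kAssumedFooterBytes);
```

Under that assumption, an entry-count limit alone lets the cache grow well past what a BE node can spare, which is why the default capacity is lowered below.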

This PR mainly changes:

1. Add a new method `exceed_prune_limit()` to `CachePolicy`.
   For `ObjLRUCache`, it always returns true, so that every minor or full GC on BE prunes the cache.
2. Reduce the default capacity of the file meta cache from 20000 to 1000.
   Also reduce the default capacity of the HDFS file handle cache from 20000 to 1000.
3. Change how we decide whether to enable the file meta cache for a query.
   If the number of files to be read is larger than 1/3 of the file meta cache's capacity, the file meta cache
   is disabled for this query, because the cache is of little use when there are too many files.
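
The first and last changes can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `CachePolicySketch`, `ObjLRUCacheSketch`, and `should_use_file_meta_cache` are hypothetical names, and the real `CachePolicy` in `be/src/runtime/memory/` has more members than shown here.

```cpp
#include <cstddef>

// Hypothetical stand-in for the cache policy interface.
class CachePolicySketch {
public:
    virtual ~CachePolicySketch() = default;
    // Asked by the GC path: should this cache be pruned in this round?
    virtual bool exceed_prune_limit() const = 0;
};

// An object cache bounded by entry count. Its memory use is not tracked,
// so it reports itself as prunable on every minor or full GC.
class ObjLRUCacheSketch : public CachePolicySketch {
public:
    explicit ObjLRUCacheSketch(std::size_t capacity) : _capacity(capacity) {}
    bool exceed_prune_limit() const override { return true; }
    std::size_t capacity() const { return _capacity; }

private:
    std::size_t _capacity;
};

// The cache is used only when the number of files to read is at most
// 1/3 of its capacity; a larger scan would mostly evict its own entries.
inline bool should_use_file_meta_cache(std::size_t num_files,
                                       std::size_t cache_capacity) {
    return num_files <= cache_capacity / 3;
}
```

With the new default capacity of 1000, a query reading 333 files would still use the cache in this sketch, while one reading 334 files would bypass it.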

doris-robot · 2024-03-17T10:01:49Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

github-actions · 2024-03-17T10:07:37Z

clang-tidy review says "All clean, LGTM! 👍"

github-actions · 2024-03-17T12:25:29Z

PR approved by at least one committer and no changes requested.

github-actions · 2024-03-17T12:25:31Z

PR approved by anyone and no changes requested.

morningman · 2024-03-17T12:29:00Z

run buildall

doris-robot · 2024-03-17T13:07:38Z

TPC-H: Total hot run time: 38705 ms

machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 5501fa18588c0af7fcc353da466a681237de3404, data reload: false

------ Round 1 ----------------------------------
q1	17671	4584	4130	4130
q2	2024	151	147	147
q3	10612	1132	929	929
q4	7775	753	767	753
q5	7496	2781	2756	2756
q6	188	123	121	121
q7	1190	844	819	819
q8	9332	2039	1995	1995
q9	7232	6446	6455	6446
q10	8542	3567	3678	3567
q11	433	222	219	219
q12	645	306	292	292
q13	17798	2914	2867	2867
q14	287	251	247	247
q15	498	458	450	450
q16	501	397	399	397
q17	963	569	545	545
q18	7259	6461	6462	6461
q19	3466	1459	1482	1459
q20	551	286	283	283
q21	6312	3513	3543	3513
q22	354	317	309	309
Total cold run time: 111129 ms
Total hot run time: 38705 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4120	4095	4058	4058
q2	321	219	215	215
q3	3016	2896	2863	2863
q4	1896	1549	1654	1549
q5	5223	5247	5294	5247
q6	193	116	119	116
q7	2287	1833	1892	1833
q8	3141	3315	3309	3309
q9	8548	8581	8553	8553
q10	3752	3676	3686	3676
q11	537	443	437	437
q12	737	563	579	563
q13	16921	2854	2866	2854
q14	273	251	255	251
q15	489	443	441	441
q16	456	414	411	411
q17	1738	1499	1459	1459
q18	7586	7065	7280	7065
q19	1618	1517	1469	1469
q20	1921	1708	1734	1708
q21	4790	4776	4703	4703
q22	521	443	450	443
Total cold run time: 70084 ms
Total hot run time: 53223 ms

doris-robot · 2024-03-17T13:17:57Z

TeamCity be ut coverage result:
Function Coverage: 34.94% (8576/24542)
Line Coverage: 26.66% (69544/260818)
Region Coverage: 25.94% (36114/139205)
Branch Coverage: 22.89% (18446/80582)
Coverage Report: http://coverage.selectdb-in.cc/coverage/5501fa18588c0af7fcc353da466a681237de3404_5501fa18588c0af7fcc353da466a681237de3404/report/index.html

be/src/runtime/memory/lru_cache_policy.h

be/src/util/obj_lru_cache.cpp

xinyiZzz

LGTM

[opt](file-meta-cache) reduce file meta cache size and disable cache for some cases (5501fa1)

AshinGau approved these changes Mar 17, 2024

github-actions bot added the approved Indicates a PR has been approved by one committer. label Mar 17, 2024

github-actions bot added the reviewed label Mar 17, 2024

xinyiZzz reviewed Mar 17, 2024

be/src/runtime/memory/lru_cache_policy.h (resolved)

xinyiZzz reviewed Mar 17, 2024

be/src/util/obj_lru_cache.cpp (resolved)

xinyiZzz approved these changes Mar 18, 2024

morningman merged commit 6a36d61 into apache:master Mar 18, 2024

morningman added the dev/2.0.x label Mar 18, 2024

morningman mentioned this pull request Mar 18, 2024

[opt](file-meta-cache) reduce file meta cache size and disable cache for some cases (#32340) #32367

Merged

morningman added dev/2.0.7-merged and removed dev/2.0.x labels Mar 18, 2024

xiaokang mentioned this pull request Mar 22, 2024

Release Note 2.0.7 #32677

Closed

morningman merged 1 commit into apache:master from morningman:reduce_file_meta_cache
