branch-3.0: [Fix](catalog) Remove the fs.disable.cache parameter to prevent excessive FS-associated objects and memory leaks #46184 #46189
Merged
morningman merged 1 commit into branch-3.0 on Dec 31, 2024
…ive FS-associated objects and memory leaks (#46184)

### Background

In the current file system implementation, the fs.disable.cache parameter allows disabling FS caching. While this provides flexibility, it introduces several critical issues:

```
 1:  22537201  721190432  java.util.HashMap$Node
 2:  21559238  689895616  javax.management.MBeanAttributeInfo
 3:  21559098  517418352  javax.management.Attribute
 4:  19380247  465125928  org.apache.hadoop.metrics2.impl.MetricCounterLong
 5:    122603  461180096  [J
 6:    294309  255533536  [B
 7:    724598  252264048  [Ljava.lang.Object;
 8:   2012368  189047432  [C
 9:    159442  131064400  [Ljava.util.HashMap$Node;
10:    114752   88075072  [Ljavax.management.MBeanAttributeInfo;
11:   1899581   45589944  java.lang.String
12:   1720140   41283360  org.apache.hadoop.metrics2.impl.MetricGaugeLong
```

#### Unbounded FS Instance Creation

When fs.disable.cache=true, a new FS instance is created for every access, preventing instance reuse.

```
String disableCacheName = String.format("fs.%s.impl.disable.cache", scheme);
if (conf.getBoolean(disableCacheName, false)) {
    LOGGER.debug("Bypassing cache to create filesystem {}", uri);
    return createFileSystem(uri, conf);
}
```

#### Resource Leakage

Associated objects, such as thread metrics and connection pools, are not properly released due to excessive FS instance creation, leading to memory leaks.

#### Performance Degradation

Frequent creation and destruction of FS instances impose significant overhead, especially in high-concurrency scenarios.

### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [x] Manual test (add detailed scripts or steps below)

```
CREATE CATALOG `iceberg_cos` PROPERTIES (
    "warehouse" = "cosn://ha/ha/ha/stress/multi_fs",
    "type" = "iceberg",
    "iceberg.catalog.type" = "hadoop",
    "cos.secret_key" = "*XXX",
    "cos.region" = "ap-beijing",
    "cos.endpoint" = "cos.ap-beijing.myqcloud.com",
    "cos.access_key" = "**************"
);
```

Create a catalog using object storage, then write a scheduled script to continuously refresh the catalog. Query the catalog periodically and monitor whether the thread memory behaves as expected.

<img width="1131" alt="image" src="https://github.com/user-attachments/assets/c7b04a5a-449f-432c-975b-524fdb81247a" />

At 22:30, I replaced it with the fixed version.
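For reviewers who want to see the "Unbounded FS Instance Creation" effect in isolation, here is a minimal sketch that does not depend on Hadoop or Doris. `FakeFs`, `FsCacheSketch`, `get`, and `instancesCreated` are hypothetical names, the cache key (scheme plus authority) is a simplification, and the `cosn://bucket/warehouse` URI is a placeholder. It only mimics the cache-or-bypass branch quoted above and counts how many instances each mode constructs.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a file system object; not a real Hadoop class.
class FakeFs {
    static int created = 0;   // total number of instances ever constructed
    final URI uri;

    FakeFs(URI uri) {
        this.uri = uri;
        created++;
    }
}

class FsCacheSketch {
    // Simplified cache keyed by scheme + authority (an assumption for this sketch).
    private final Map<String, FakeFs> cache = new HashMap<>();

    FakeFs get(URI uri, Map<String, String> conf) {
        String disableCacheName = String.format("fs.%s.impl.disable.cache", uri.getScheme());
        if (Boolean.parseBoolean(conf.getOrDefault(disableCacheName, "false"))) {
            // Cache bypassed: every call constructs a fresh instance.
            return new FakeFs(uri);
        }
        String key = uri.getScheme() + "://" + uri.getAuthority();
        // Cache enabled: the instance is constructed once and then reused.
        return cache.computeIfAbsent(key, k -> new FakeFs(uri));
    }

    // Calls get() `accesses` times and returns how many instances were constructed.
    static int instancesCreated(boolean disableCache, int accesses) {
        int before = FakeFs.created;
        FsCacheSketch fs = new FsCacheSketch();
        Map<String, String> conf = new HashMap<>();
        conf.put("fs.cosn.impl.disable.cache", String.valueOf(disableCache));
        URI uri = URI.create("cosn://bucket/warehouse"); // placeholder URI
        for (int i = 0; i < accesses; i++) {
            fs.get(uri, conf);
        }
        return FakeFs.created - before;
    }

    public static void main(String[] args) {
        System.out.println("cache enabled:  " + instancesCreated(false, 1000));
        System.out.println("cache disabled: " + instancesCreated(true, 1000));
    }
}
```

In this model, the number of live objects grows linearly with the number of accesses once the cache is bypassed, unless each caller closes its instance. That matches the pattern suggested by the histogram above, where metrics-related classes dominate.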
Contributor

Thank you for your contribution to Apache Doris. Please clearly describe your PR:

Contributor

run buildall

TPC-H: Total hot run time: 40702 ms

TPC-DS: Total hot run time: 198692 ms

ClickBench: Total hot run time: 32.75 s
Cherry-picked from #46184