[ARCTIC-1057][AMS] Only load self-optimizing enabled tables into cache #1145
Thanks for your contribution! I will help to review it ASAP.
Thanks a lot for your contribution!
Okay
# Conflicts: # ams/ams-server/src/main/java/com/netease/arctic/ams/server/optimize/OptimizeService.java
@XBaith I left some comments, please take a look.
# Conflicts: # ams/ams-server/src/main/java/com/netease/arctic/ams/server/ArcticMetaStore.java
Codecov Report

Patch coverage has no change. Project coverage change:

| Coverage Diff | master | #1145 | +/- |
|---|---|---|---|
| Coverage | 29.27% | 52.68% | +23.41% |
| Complexity | 5374 | 523 | -4851 |
| Files | 695 | 43 | -652 |
| Lines | 70786 | 3705 | -67081 |
| Branches | 8180 | 354 | -7826 |
| Hits | 20723 | 1952 | -18771 |
| Misses | 48058 | 1623 | -46435 |
| Partials | 2005 | 130 | -1875 |
# Conflicts: # ams/ams-server/src/main/java/com/netease/arctic/ams/server/optimize/OptimizeService.java
Hi @wangtaohz, the changes are complete. Please review when you have time.
# Conflicts: # ams/ams-server/src/main/java/com/netease/arctic/ams/server/optimize/OptimizeService.java
It's a bit difficult to understand. I suggest changing it to:
and the process
#1145)
* [Optimize] Only load self-optimizing enabled tables into cache
* checkstyle
* fix log level
* refactor code base on the review
* load un-optimized table into db
* [Optimize] Only load self-optimizing enabled tables into cache
* checkstyle
* fix log level
* refactor code base on the review
* import CatalogLoader
* checkstyle
* avoid insert duplicate record into sysdb
* [WAP] remove table from unOptimizeTables when clear table
* rewrite base on code review
* rewrite base on code review

Co-authored-by: Xavier Bai <xuba@cisco.com>
Co-authored-by: luting <1004611953@qq.com>
Co-authored-by: ZhouJinsong <zhoujinsong0505@163.com>
Why are the changes needed?
This PR was inspired by the related issue #1057, but it was not created to implement that issue.
AMS cannot handle large-scale Iceberg tables that have a significant number of small files or delete files. Some tables in the Iceberg catalog do not have self-optimizing enabled, so AMS does not need to scan them or load them into memory.
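The idea can be illustrated with a minimal sketch. It is not the code in this PR: the class `SelfOptimizingCacheSketch`, its methods, and the use of plain property maps in place of real table objects are all hypothetical, and the property key `self-optimizing.enabled` with a default of `true` is an assumption that should be checked against `TableProperties` in your Arctic version.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: cache only tables that have self-optimizing enabled. */
public class SelfOptimizingCacheSketch {
  // Assumed property key and default value, not taken from this PR.
  static final String ENABLED_KEY = "self-optimizing.enabled";
  static final boolean ENABLED_DEFAULT = true;

  // Stand-in for the in-memory table cache: table identifier -> table properties.
  private final Map<String, Map<String, String>> cache = new ConcurrentHashMap<>();

  /** Returns true when the property is unset (default) or parses to true. */
  static boolean isSelfOptimizingEnabled(Map<String, String> properties) {
    String value = properties.get(ENABLED_KEY);
    return value == null ? ENABLED_DEFAULT : Boolean.parseBoolean(value.trim());
  }

  /**
   * Synchronizes the cache with the tables listed by the catalog.
   * Enabled tables are cached; disabled tables are evicted or never loaded.
   * Returns the number of tables skipped because self-optimizing is disabled.
   */
  public int refresh(Map<String, Map<String, String>> catalogTables) {
    int skipped = 0;
    for (Map.Entry<String, Map<String, String>> entry : catalogTables.entrySet()) {
      if (isSelfOptimizingEnabled(entry.getValue())) {
        cache.put(entry.getKey(), entry.getValue());
      } else {
        cache.remove(entry.getKey());
        skipped++;
      }
    }
    // Evict tables that no longer exist in the catalog so they do not retain heap.
    cache.keySet().retainAll(catalogTables.keySet());
    return skipped;
  }

  public boolean isCached(String tableIdentifier) {
    return cache.containsKey(tableIdentifier);
  }

  public int size() {
    return cache.size();
  }
}
```

Calling `refresh` again after a table's property changes moves that table in or out of the cache, so a disabled table stops retaining heap on the next refresh.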
I assumed that tables with self-optimizing disabled take up a considerable amount of memory, and compared the GC times after removing them:
The screenshot below lists the number of instances and the retained heap per class:
![image](https://user-images.githubusercontent.com/54210725/220833294-d6e733c7-eaa6-49a4-a3cf-7c8f6a5a26b4.png)
I also took a heap dump and found that the `ConcurrentHashMap` in `OptimizeService` retained too much heap.
case 1: Too many snapshot entries
![image](https://user-images.githubusercontent.com/54210725/220832721-dae0e1d8-d859-4ddb-8214-0402015a84d2.png)
![image](https://user-images.githubusercontent.com/54210725/220833017-9388fd0f-ee05-4efe-a32c-1994c33adb1e.png)
case 2: Too many delete files
![image](https://user-images.githubusercontent.com/54210725/220833403-e9ea18f0-9576-4dc0-a474-dbe47b3cd527.png)
Brief change log
How was this patch tested?
Add some test cases that check the changes thoroughly including negative and positive cases if possible
Add screenshots for manual tests if appropriate
Run test locally before making a pull request
Documentation