[Improvement] LocalStorage init: use multiple threads #71
Comments
How much time does the operation take? Could you provide some concrete data?
@xianjingfeng I think clearing unnecessary data with a Linux command before starting the shuffle server is the better way, with cleanup during shuffle server startup as a backup solution.
More than 30 seconds per disk; it depends on disk performance and usage.
What type of disk do you use? The disk performance is a little strange. We only delete one directory on each disk, so it shouldn't take that long.
HDD, about 1 TB used per disk.
If we do this in the start script, we need to parse the conf file there. I think that is too heavy for a start script and difficult to maintain.
FileUtils.deleteDirectory deletes a directory's children first, so the cost grows with the number of files under it.
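To illustrate why deleting a single directory can still be slow, here is a minimal sketch of a children-first recursive delete built on `java.nio.file`. It is not the Commons IO implementation; the class name `DepthFirstDelete` and the returned entry count are assumptions added for illustration. Every file and subdirectory is removed individually before its parent, so the work is proportional to the number of entries.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper, for illustration only.
public class DepthFirstDelete {

    // Deletes root and everything below it; returns how many entries were removed.
    // Throws NoSuchFileException if root does not exist.
    public static long deleteRecursively(Path root) throws IOException {
        final long[] deleted = {0};
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                // Regular files are removed as soon as they are visited.
                Files.delete(file);
                deleted[0]++;
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc;
                }
                // Called after all children were visited, so the directory is empty here.
                Files.delete(dir);
                deleted[0]++;
                return FileVisitResult.CONTINUE;
            }
        });
        return deleted[0];
    }
}
```

The returned count gives a rough answer to "how many files were deleted", which helps when comparing startup times across disks.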
OK, I got it. It's fine with me to use multiple threads for the cleanup operation.
How many files are deleted on your disk? Uniffle merges the data of each partition, so there should not be a very large number of files. Can you do a simple test by using
I have neither counted the files nor tested with rm -rf; I will try to do this soon.
### **What changes were proposed in this pull request?**
Resolves issue #71: use multiple threads to clean local storage.

### **Why are the changes needed?**
If the shuffle server exits abnormally, many files need to be cleared when it starts again, and this operation takes a long time.

### **Does this PR introduce any user-facing change?**
No

### **How was this patch tested?**
Added a new unit test.
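The idea of cleaning each disk in its own thread can be sketched as follows. This is an illustrative example under stated assumptions, not the code from the pull request: the class `ParallelStorageCleaner`, the method `cleanAll`, and the one-thread-per-base-directory pool sizing are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Stream;

// Hypothetical helper, for illustration only.
public class ParallelStorageCleaner {

    // Recursively deletes one directory tree; a missing directory is a no-op.
    static void deleteTree(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse order visits children before their parent directory.
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    // Cleans all base directories concurrently, one task per directory (assumed: one per disk).
    public static void cleanAll(List<Path> baseDirs)
            throws InterruptedException, ExecutionException {
        if (baseDirs.isEmpty()) {
            return;
        }
        ExecutorService pool = Executors.newFixedThreadPool(baseDirs.size());
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (Path dir : baseDirs) {
                futures.add(pool.submit(() -> {
                    deleteTree(dir);
                    return null;
                }));
            }
            // Wait for every disk; a failure surfaces as ExecutionException.
            for (Future<?> f : futures) {
                f.get();
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

Because each HDD is an independent device, running the deletes in parallel should bring the total startup delay closer to the slowest single disk than to the sum over all disks, assuming the disks do not share a bottleneck.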
Solved by PR #72.
We have multiple disks per host. If the shuffle server exits abnormally, many files need to be cleared when it starts again, and this operation takes a long time.