This repository was archived by the owner on Jan 9, 2023. It is now read-only.

Conversation

@BrettKercher
Contributor

The memory optimized cleanup algorithm breaks the process down into two passes:

On the first pass, any expired files are detected and unlinked in the same block.

For the second pass, we return immediately if Max Cache Size is unused or already satisfied. Otherwise, we maintain a sorted list of deletion candidates that is bounded by the difference between Total Cache Size and Max Cache Size, so that at the end of the pass it holds the least recently used items. The items in the list are then deleted.
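A minimal sketch of the two-pass idea (hypothetical names, with an in-memory item list standing in for the real file system; this is not the actual implementation):

```javascript
// Pass 1 unlinks expired items and totals the survivors' size.
// Pass 2 keeps only a bounded, sorted candidate list of the least
// recently used items needed to cover the overage, then deletes them.
function lowMemoryCleanup(items, now, expireDuration, maxCacheSize) {
  let totalSize = 0;
  const survivors = [];
  for (const item of items) {
    if (now - item.lastAccess > expireDuration) continue; // "unlink" expired
    totalSize += item.size;
    survivors.push(item);
  }

  // Bail out early if no size cap is set, or the cap is already satisfied.
  if (!maxCacheSize || totalSize <= maxCacheSize) return survivors;

  const overage = totalSize - maxCacheSize;
  const candidates = []; // sorted by lastAccess ascending (LRU first)
  let candidateBytes = 0;
  for (const item of survivors) {
    // Insert in sorted position.
    let i = candidates.findIndex(c => c.lastAccess > item.lastAccess);
    if (i < 0) i = candidates.length;
    candidates.splice(i, 0, item);
    candidateBytes += item.size;
    // Trim the newest candidates while older ones already cover the overage,
    // keeping the list's memory footprint bounded.
    while (candidates.length > 1 &&
           candidateBytes - candidates[candidates.length - 1].size >= overage) {
      candidateBytes -= candidates.pop().size;
    }
  }

  const doomed = new Set(candidates);
  return survivors.filter(item => !doomed.has(item));
}
```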

Memory Profiling with ~1million files:

============================================

Not memory optimized

Before Cleanup : 13.61 MB
After File Collection : 76.73 MB
After Completion : 460.99 MB

============================================

Low Memory: Max Cache Size > Cache Size (Best Case)

Before Cleanup : 13.57 MB
First Pass:
After File Collection : 61.81 MB
After Completion : 66.63 MB
Second Pass (Doesn't happen): 
After File Collection : -- MB
After Completion : -- MB

============================================

Memory Optimized: Max Cache Size == 0 (Worst Case)

Before Cleanup : 13.57 MB
First Pass:
After File Collection : 76.92 MB
After Completion : 85.08 MB
Second Pass : 
After File Collection : 78.81 MB
After Completion : 384.03 MB

============================================

Low Memory: Max Cache Size in use, 19% of total files are expired

Before Cleanup : 13.32 MB
First Pass:
After File Collection : 76.56 MB
After Completion : 97.76 MB
Second Pass : 
After File Collection : 72.46 MB
After Completion : 131.41 MB

============================================

@coveralls

coveralls commented Jun 11, 2018

Coverage Status

Coverage decreased (-0.1%) to 90.219% when pulling 8973523 on feature/low-memory-option into bf012ab on master.

@pullrequest pullrequest bot left a comment

Looks pretty good; a couple of comments on self redundancy.




cleanup_low_mem(dryRun = true, expireDuration, minFileAccessTime, maxCacheSize) {

const self = this;

This self isn't needed: all of its usages are inside arrow functions, so they can use this directly.

}

cleanup_fast(dryRun = true, expireDuration, minFileAccessTime, maxCacheSize) {
const self = this;

I can't find an instance of self here that couldn't just use this directly. There are no function expressions, only arrow functions, which don't have their own this binding. This line should be safe to delete once you change the self usages to this.
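To illustrate the review point (a generic sketch, not the PR's actual code): function expressions get their own this, which is why older code aliased it as self, while arrow functions inherit this lexically and make the alias unnecessary.

```javascript
class Cache {
  constructor() { this.count = 0; }

  cleanupOld() {
    const self = this; // needed: the function expression below rebinds `this`
    [1, 2, 3].forEach(function () { self.count++; });
  }

  cleanupNew() {
    // Arrow functions close over the enclosing `this`, so no alias is needed.
    [1, 2, 3].forEach(() => { this.count++; });
  }
}
```

Both methods behave identically; the arrow-function version simply drops the redundant alias.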

@BrettKercher
Contributor Author

Fixed the self redundancy issues and integrated the changes from @stephen-palmer's reliability manager: I created the delete_cache_item method to perform the file deletion process, since it became a little more involved than just calling fs.unlink.

@stephen-palmer
Contributor

My general question/feedback: I would prefer that we have just one way of doing things, instead of adding this as a "low memory" option. Is the algorithm so much slower that we need to keep a "fast" path?

@BrettKercher
Contributor Author

The low memory algorithm uses two passes, but each pass takes about as long as the "fast" method, so it takes roughly twice the time overall. I'll take the old way out if you think that's an acceptable increase.

Realistically, this will be running in daemon mode and you'll never really know the difference anyway.

@stephen-palmer
Contributor

I might be missing something, but it seems like you could combine the passes so you only read the directory once: if the file should be deleted based on age, delete it and continue to the next; otherwise, consider it for deletion based on maxSize.

@BrettKercher
Contributor Author

The first pass is mainly to get the total CacheSize. Is there a better way to get that?

@stephen-palmer
Contributor

Right, that's required, and no, there isn't a better way to get it. So even with the two directory passes, this should be fine as the default/only code path for cleanup.

@stephen-palmer stephen-palmer merged commit 51cfa4f into master Jun 20, 2018
@stephen-palmer stephen-palmer deleted the feature/low-memory-option branch December 10, 2018 19:06