Many Disk Cache Instances Cause Inode Overflow After Consistent Use #220
Similar issue mentioned in
I like the username, @SoundsSerious. And I agree now. If it's bitten two people, then let's fix it. Can you take a look at the Cache.check code? There's a fix=True parameter that I think cleans up empty directories. It won't work for you though if you're worried about downtime: the "fix" parameter causes it to take an exclusive lock on the cache, which might lock things up for an indeterminate amount of time. So I think there are three ideas worth exploring here:
I'll need to give each of these some thought. Your input is welcome.
As a temporary workaround, consider running `find {CACHE_PATH} -type d -empty -delete`. That'll clean up the empty dirs. There may be a race condition when a key is set, though.
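Spelled out as a runnable snippet (the `CACHE_PATH` value and demo directories here are placeholders for illustration; substitute your own cache directory):

```shell
# CACHE_PATH is a stand-in for your cache directory (hypothetical path).
CACHE_PATH="${CACHE_PATH:-/tmp/demo-cache}"
mkdir -p "$CACHE_PATH/aa/bb" "$CACHE_PATH/cc"
touch "$CACHE_PATH/cc/value.bin"   # a non-empty dir should survive the sweep

# Depth-first sweep: -delete implies -depth, so nested empty
# directories are removed in a single pass.
find "$CACHE_PATH" -type d -empty -delete
```

Because `-delete` processes children before parents, a whole tree of empty subdirectories disappears in one invocation, which makes it suitable for a nightly cron job.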
How big are your caches? Are they the default 1GB or bigger? I'm thinking of using a single subdir with just 256 possibilities. How would that affect your setup?
Follow-up thought: maybe the directory structure should be a function of the total cache size and min file size. With the default settings, we need only 2 characters in the dir layout, but at a terabyte you may want the current 2x2. I'm also unsure how to clean up the dirs without causing a race condition. I think on cache set, there'd have to be retry logic for directory creation in case a different process deleted the directory in between steps.
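The arithmetic behind that follow-up can be sketched like this. The 1 GB size limit and 32 KB minimum file size below are assumed defaults for illustration, and the 512-files-per-directory target is an arbitrary comfort threshold, not anything diskcache prescribes:

```python
def needed_dir_levels(size_limit, min_file_size, per_dir_target=512):
    """Estimate how many 2-hex-character directory levels are needed.

    Each level multiplies the fan-out by 256, so with L levels the
    worst-case number of files per leaf directory is roughly
    (size_limit / min_file_size) / 256**L.
    """
    max_files = size_limit // min_file_size
    levels = 0
    while max_files / (256 ** levels) > per_dir_target:
        levels += 1
    return levels

# A 1 GB cache with a 32 KB minimum file size holds at most ~32768
# files, so one level (256 dirs, ~128 files each) is plenty.
print(needed_dir_levels(2**30, 2**15))  # -> 1

# A 1 TB cache could hold ~33M files, so two levels (the current
# 2x2 layout, 65536 leaf dirs) keep each directory manageable.
print(needed_dir_levels(2**40, 2**15))  # -> 2
```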
Hey @grantjenks! Thank you for all the thoughtful feedback! I realize this is definitely a challenging problem, since it seems like most of the disk cache work is done on access, unless I'm mistaken, especially with all the sub-optimizations that are probably going on for a wide number of platforms / use cases.

My scenario is like this: I have a web server with some complicated responses that need to be cached for a variety of sources (maybe 100 caches). None of them is all that substantial; 1GB would probably be the max size. Your temporary solution is a good one, I didn't think of using find! I'll be using that to solve the issue for now, so no rush on the other solutions. I'm also using cache.check occasionally, so maybe I can test out the fix parameter, although for bigger caches that can cause some downtime. Some thoughts on those:
Anyways, just some thoughts. I think you could close this issue if you'd like, since the cache.check fix will work, as well as a find command on a cron routine.
I created a new branch at origin/cleanup-dirs with a WIP of (1). The plan is to keep the two levels of directories but delete the lower level if it's empty. So you could end up with 256 empty dirs, but that's not as bad as 65k.
I like that, it's simple and efficient! Yeah, we can definitely handle 256 inodes! In the meantime, your find-empty-directories bash command is working great as a nightly cron command.
I'll see if I can give this a test this week. Thanks again!
Hey, finally found the time to test this. Pulled in the latest code and tried this stress test:
I get this error:
This seems like it would be a worst-case scenario with rapid sequential access, but admittedly my knowledge of the system could be better. Maybe this could be solved with a custom wrapper?
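For what it's worth, one shape such a wrapper could take is a retry on set. This is a hypothetical sketch: `set_with_retry` is not part of diskcache, `cache` is any object with a diskcache-style `set` method, and it assumes the race surfaces as a FileNotFoundError when a concurrent cleanup removes the target directory mid-write:

```python
import time

def set_with_retry(cache, key, value, retries=3, delay=0.01):
    """Retry cache.set when a concurrent cleanup deletes the target
    directory between the existence check and the file write."""
    for attempt in range(retries):
        try:
            return cache.set(key, value)
        except FileNotFoundError:
            if attempt == retries - 1:
                raise
            time.sleep(delay)  # brief backoff before trying again
```

A wrapper like this only papers over the race from the caller's side; fixing it inside the library's directory-creation path is the more robust option.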
My apologies @SoundsSerious . The origin/cleanup-dirs is a work-in-progress (WIP) and does not work yet. I may make some more progress tonight. |
I've updated the origin/cleanup-dirs with working changes. There are tests for the new functionality and both levels of directories are cleaned up. |
Fixed by #222. To be released this month.
@grantjenks Hey, nice work! Hopefully I can give this a test run in the coming week.
Hi Grant,
I've been using disk cache on my AWS systems for probably a year, and I love it quite a lot! I have many types of data I'm caching, thus many disk caches (~100), and I'm using a high-performance, low-capacity NVMe disk to keep things fast.
Because of the small size of the disk, the number of inodes is 3.2M. Today, with all the folders created by diskcache that aren't cleaned up, I got an `Error 28 - no space left on device`. Is there a recommended way to clean up a disk cache filesystem? It creates a ton of references that are never cleaned or managed. Could we ensure that all previous references are used before assigning new ones?
Deleting its contents takes quite a while, and I can't spare that downtime. Is there a cleanup callback or behavior we could add for when a file is removed from a folder and that folder no longer has references? In general, I would like better control and knowledge of what is being purged / cleaned up.