Memory leak in CachingFileSystem (possibly caused by pickle.load?)
#825
I wonder, how big is the JSON file after saving 10k+ files into it? I suspect that the caching system simply doesn't scale well to these sizes. We have thought about using other storage such as sqlite3 DB files or the filesystem itself ("sidecar files"). I assume it takes a long time just to list the cache directory.
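To make the sqlite3 idea above concrete, here is a minimal sketch of a per-file metadata index. The class name `SqliteCacheIndex`, the table layout, and the `detail` fields are all hypothetical; this is not how fsspec stores its cache metadata, only an illustration of keeping one row per cached file instead of rewriting a single large blob.

```python
import json
import sqlite3


class SqliteCacheIndex:
    """Hypothetical per-file cache metadata index backed by sqlite3."""

    def __init__(self, db_path=":memory:"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache_index ("
            "remote_path TEXT PRIMARY KEY, detail TEXT NOT NULL)"
        )
        self.conn.commit()

    def put(self, remote_path, detail):
        # One row per cached file; `detail` is stored as a JSON string.
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO cache_index VALUES (?, ?)",
                (remote_path, json.dumps(detail)),
            )

    def get(self, remote_path):
        row = self.conn.execute(
            "SELECT detail FROM cache_index WHERE remote_path = ?",
            (remote_path,),
        ).fetchone()
        return json.loads(row[0]) if row else None

    def __len__(self):
        return self.conn.execute("SELECT COUNT(*) FROM cache_index").fetchone()[0]


if __name__ == "__main__":
    index = SqliteCacheIndex()
    index.put("s3://bucket/a.bin", {"fn": "abc123", "blocks": True})
    print(index.get("s3://bucket/a.bin"), len(index))
```

With a layout like this, looking up or updating one entry would not require loading the metadata for every cached file into memory.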
The cache file size on disk is:
root@trollito-7b8b87d679-jm8c5:/usr/src/app# ls /tmp/files | wc -l
1379
root@trollito-7b8b87d679-jm8c5:/usr/src/app# du -h /tmp/files/
400M /tmp/files/
root@trollito-7b8b87d679-jm8c5:/usr/src/app# du -h /tmp/files/cache
424K /tmp/files/cache
root@trollito-7b8b87d679-jm8c5:/usr/src/app# pmap 1 | tail -n 1 | awk '/[0-9]K/{print $2}'
23044924K
BTW, the cache file is not encoded as JSON. It is encoded using the pickle module.
Sorry, this dropped off my radar. Yes, it probably makes sense not to rely on pickle for anything that might persist long term and be used by multiple Python versions. I don't know whether that does anything to fix the apparent memory issue. Would you like to make the appropriate PR? It should still be able to read an existing pickle, so that we don't break anybody's workflow.
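One way the backward-compatible read could look, as a rough sketch: try JSON first and fall back to pickle for legacy files. The function names `load_cache_metadata` and `save_cache_metadata` are hypothetical, and choosing JSON as the new format is an assumption; none of this is taken from the fsspec codebase.

```python
import json
import os
import pickle
import tempfile


def load_cache_metadata(path):
    """Read metadata stored as JSON, falling back to a legacy pickle file."""
    with open(path, "rb") as f:
        raw = f.read()
    try:
        return json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        # Legacy format. Only unpickle files from a trusted location.
        return pickle.loads(raw)


def save_cache_metadata(path, metadata):
    """Always write the new format (JSON, by assumption)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(metadata, f)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache_file = os.path.join(tmp, "cache")
        with open(cache_file, "wb") as f:
            pickle.dump({"a.bin": {"blocks": True}}, f)  # simulate an old file
        meta = load_cache_metadata(cache_file)   # read through the pickle fallback
        save_cache_metadata(cache_file, meta)    # rewritten as JSON
        print(load_cache_metadata(cache_file))
```

Existing pickle files would keep working on read and be converted the next time the metadata is saved.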
I've been using fsspec caching support:

I noticed that after opening many files (10k+) my application's memory consumption goes through the roof; only a restart frees the memory.

After some profiling with tracemalloc, these are the 2 top consumers:

If filecache is not used, memory consumption is back to normal.

I don't know much about the pickle module. Could it be causing the memory leak?