feat: Listings Caching implementation using fsspec dircache #93
cwognum
left a comment
Nice optimization @lmtroper !
There is a more official implementation that simplifies the code further. It's not very well documented, but what I did was to go through an existing implementation to understand how this can be used.
The good news - You reached a very similar solution! Great minds think alike! 😉
Implementation looks great!
However, in your experience it has been slower, right?
I'm not sure why it would be... Maybe leave the PR open for now while we investigate?
Changelogs
Implementing a listings caching system using fsspec's dircache to reduce the number of `ls` calls made during zarr upload/download.

The fsspec `info()` method was responsible for all of the `ls` calls made during upload, so I targeted it with a listings cache. When listings caching is enabled, each call to `ls` first checks the cache for a listing of the requested path and, if none is found, checks the cached listing of the parent path. The idea was to avoid checking child paths once we have already learned that the parent path is empty.

Profiling with listing caching:
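To illustrate the lookup order described in this changelog (exact path first, then the parent's listing), here is a minimal, self-contained sketch. It is not the code from this PR and it does not use fsspec's dircache API: `ListingCache`, `fake_backend_ls`, and the `store.zarr` paths are hypothetical names chosen for the example, and returning an empty list for a missing path is a simplification.

```python
from typing import Callable, Dict, List


class ListingCache:
    """Hypothetical listing cache; an illustration, not fsspec's dircache API."""

    def __init__(self, backend_ls: Callable[[str], List[str]]) -> None:
        self._backend_ls = backend_ls  # the expensive remote listing call
        self._cache: Dict[str, List[str]] = {}
        self.backend_calls = 0  # counts listings that actually hit the backend

    def ls(self, path: str) -> List[str]:
        path = path.rstrip("/")

        # 1. Exact hit: the listing for this path is already cached.
        if path in self._cache:
            return self._cache[path]

        # 2. Parent check: if the parent's listing is cached and does not
        #    mention this path, the path cannot exist, so skip the backend.
        #    (Simplification: a real filesystem would likely raise
        #    FileNotFoundError here instead of returning an empty list.)
        parent = path.rsplit("/", 1)[0] if "/" in path else ""
        if parent != path and parent in self._cache:
            if path not in self._cache[parent]:
                return []

        # 3. Fall back to the backend and remember the result.
        self.backend_calls += 1
        entries = self._backend_ls(path)
        self._cache[path] = entries
        return entries


# Hypothetical in-memory "remote" layout used only for this demonstration.
tree = {"store.zarr": ["store.zarr/a"], "store.zarr/a": []}


def fake_backend_ls(path: str) -> List[str]:
    return tree.get(path, [])


cache = ListingCache(fake_backend_ls)
cache.ls("store.zarr/a")    # goes to the backend; the directory is empty
cache.ls("store.zarr/a/0")  # answered from the cached (empty) parent listing
cache.ls("store.zarr/a")    # answered from the cache directly
print(cache.backend_calls)
```

In this sketch only the first call reaches the backend; the child lookup is resolved from the parent's cached listing, which is the saving the changelog is aiming for.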