Per-directory metadata cache. #57
Commits on Jan 26, 2018
Replace bucket listing with per-directory listing cache.
Refactor gcsfs to list file contents via prefixed bucket listing rather than a cached, exhaustive bucket listing. This is in progress, but provides basic interface compatibility for walk, glob, ls, and info. It is intended to support re-adding metadata caching via the _list_objects interface, which provides prefix-specific listing caches. Update `info` to retrieve object info via an object get. Add a per-directory listing cache to GCSFS that caches object metadata under the given directory. Resolve listing requests via the cache, supporting walk, ls, glob, etc. Resolve `info` requests via the cache if the parent directory has been listed; otherwise request the object data directly. Update the cache invalidation logic to operate on path prefixes, so that object writes invalidate their parent and sibling caches rather than the entire listing cache.
Commit 590cdae
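The per-directory cache and prefix-based invalidation described in this commit could look roughly like the sketch below. It is not the gcsfs implementation: the class name `DirListingCache`, its methods, and the `fetch` callback are hypothetical, and raising `FileNotFoundError` for an object missing from a cached listing is an assumption of this sketch.

```python
import posixpath


class DirListingCache:
    """Hypothetical per-directory listing cache (illustrative, not gcsfs code)."""

    def __init__(self):
        # directory path -> {object path: metadata dict}
        self._listings = {}

    def put(self, directory, objects):
        """Store the listing of one directory, keyed by object name."""
        self._listings[directory.rstrip("/")] = {o["name"]: o for o in objects}

    def listing(self, directory):
        """Return the cached listing for a directory, or None if not cached."""
        return self._listings.get(directory.rstrip("/"))

    def info(self, path, fetch):
        """Answer from the parent's listing if cached, else call fetch(path)."""
        path = path.rstrip("/")
        cached = self._listings.get(posixpath.dirname(path))
        if cached is None:
            return fetch(path)  # parent never listed: request the object directly
        if path in cached:
            return cached[path]
        raise FileNotFoundError(path)  # assumption: a cached listing is complete

    def invalidate(self, path):
        """Drop the parent's listing and any cached listing at or below path."""
        path = path.rstrip("/")
        self._listings.pop(posixpath.dirname(path), None)
        stale = [d for d in self._listings if d == path or d.startswith(path + "/")]
        for d in stale:
            del self._listings[d]


cache = DirListingCache()
cache.put("bucket/a", [{"name": "bucket/a/x", "size": 1}])
cache.put("bucket/b", [{"name": "bucket/b/z", "size": 2}])
cache.invalidate("bucket/a/new")  # a write under bucket/a drops only that listing
```

After the simulated write, the listing for `bucket/a` is gone while `bucket/b` stays cached, which is the difference from invalidating one exhaustive bucket listing.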
Commit 8449a38
Commits on Feb 14, 2018
Add decorator-based method tracing to `gcsfuse.GCSFS` and `core.GCSFileSystem` interface methods. Add `--verbose` command-line option to `gcsfuse` to support debug logging control.
Commit 2aa6755
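Decorator-based method tracing of the kind this commit describes can be sketched as follows. The decorator name `tracemethod`, the logger name, and the `DemoFS` class are hypothetical stand-ins; the actual decorator in the PR may log different details.

```python
import functools
import logging

logger = logging.getLogger("trace_sketch")  # hypothetical logger name


def tracemethod(func):
    """Log each call to the wrapped method at DEBUG level, then delegate."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        logger.debug("%s(args=%r, kwargs=%r)", func.__qualname__, args, kwargs)
        return func(self, *args, **kwargs)

    return wrapper


class DemoFS:
    """Stand-in for a filesystem class; only here to show the decorator."""

    @tracemethod
    def ls(self, path, detail=False):
        return [path]
```

A `--verbose` style option would then only need to raise the log level, for example with `logging.basicConfig(level=logging.DEBUG)`, for the traced calls to appear.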
Bugfix prototype gcsfuse/per_dir_cache integration.
Prototype `per_dir_cache` integration for gcsfuse. Minimal fixup to gcsfuse to support directory listing.
Commit b296fee
Fix error in GCSFS::read() cache key resolution.
Commit 360219d
Fix flush-on-small-block-size errors.
Resolve an error when writing small partitions via dask.bag.to_textfiles. The error occurs when the partition size is below the minimum GCS multipart upload size. The close logic in dask.bytes.core calls flush(force=False), followed by flush(force=True), on GCSFile. The current logic initializes a multipart upload on the non-forced flush and attempts to write a non-final block below the minimum GCS upload block size. Fix the logic to skip the flush if the buffer size is below the minimum upload size on a non-forced flush. This incidentally avoids initializing a multipart upload in cases where the final file size will be below the minimum block size, which was resulting in duplicate uploads for small output partitions. Add tracing logic to GCSFile file operations for debugging. Update `_tracemethod` to optionally log tracebacks at the `DEBUG-1` log level.
Commit 30be23c
Commit 542c7dc
Commit a55d274
First-stab attempt at fixing test/interface errors.
Update `ls` to return a non-prefix-separated prefix search; this still needs to be verified. Should this be glob-like? Fix an error from dask.bytes when a read-only file is flushed. Fix listings so that they are returned with a "path" attribute.
Commit 7ad092c
Retry on requests failing due to `google.auth.exceptions.RefreshError`, partial resolution of fsspec#71.
Commit 2ccf856
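Retrying a request when a particular exception is raised can be sketched with a small decorator. Everything named here (`retry_on`, `FakeRefreshError`, `flaky_request`) is hypothetical; in real use the exception passed in would be `google.auth.exceptions.RefreshError`, which this sketch replaces with a stand-in so that it runs without the google-auth package.

```python
import functools
import time


def retry_on(exc_types, retries=3, delay=0.0):
    """Retry the wrapped call when it raises one of exc_types."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, retries + 1):
                try:
                    return func(*args, **kwargs)
                except exc_types:
                    if attempt == retries:
                        raise  # out of attempts: surface the original error
                    time.sleep(delay)

        return wrapper

    return decorator


class FakeRefreshError(Exception):
    """Stand-in for google.auth.exceptions.RefreshError in this sketch."""


attempts = []


@retry_on(FakeRefreshError, retries=3)
def flaky_request():
    """Fail twice with the stand-in error, then succeed."""
    attempts.append(1)
    if len(attempts) < 3:
        raise FakeRefreshError("token refresh failed")
    return "ok"
```

Other exception types are not caught, so unrelated failures still propagate on the first attempt.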
Commits on Feb 16, 2018
Fix flush-on-small-block-size errors and lift block size constants.
Resolve an error when writing small partitions via dask.bag.to_textfiles when the partition size is below the minimum GCS multipart upload size. The close logic in dask.bytes.core calls flush(force=False), followed by flush(force=True), on GCSFile. The current logic initializes a multipart upload on the non-forced flush and then attempts to write a non-final block below the minimum GCS upload block size. Fix the logic to skip the flush, and instead issue a warning, if the buffer size is below the minimum upload size on a non-forced flush. This incidentally avoids initializing a multipart upload in cases where the final file size will be below the minimum block size, which was resulting in duplicate uploads for small output partitions. Update core.py to lift the GCS block size limits into module-level constants. Replace literal constants in core.py with symbolic names.
Commit 9a8537f
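The skip-on-small-buffer rule for non-forced flushes can be illustrated with a toy writer that records what it would upload. This is a sketch under stated assumptions: `SketchWriter` and `MIN_BLOCK_SIZE` are hypothetical names, the 256 KiB value is assumed rather than taken from the PR, and no GCS request is made.

```python
import io
import warnings

MIN_BLOCK_SIZE = 2 ** 18  # assumed minimum upload block size (256 KiB)


class SketchWriter:
    """Hypothetical write buffer; records uploads instead of calling GCS."""

    def __init__(self):
        self.buffer = io.BytesIO()
        self.uploads = []  # sizes of blocks that would have been sent

    def write(self, data):
        return self.buffer.write(data)

    def flush(self, force=False):
        size = self.buffer.tell()
        if not force and size < MIN_BLOCK_SIZE:
            # Too small for a non-final block: keep buffering, start no upload.
            warnings.warn("non-forced flush skipped: buffer below minimum block size")
            return
        self.uploads.append(size)  # a real implementation would upload here
        self.buffer = io.BytesIO()
```

With this rule, the `flush(force=False)` then `flush(force=True)` sequence issued on close produces a single upload for a small file instead of a multipart initialization followed by a second upload.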
Defer multipart if simple upload possible, relax read chunk size.
From fsspec#73 review. Defer the multipart upload if a simple upload may be possible at the specified block size on a non-forced flush. Reorganize the `flush` logic slightly to group error handling separately from deferral. Relax block size restrictions on fetch, no longer aligning `range`-ed fetch requests to block boundaries. Fix a minor logging error in `_fetch`.
Commit 05928e4
Commit a15272c
Commit d67798b
Update cache semantics, non-expire and no refresh on missing object.
Update the GCSFileSystem cache configuration. Set the cache as non-expiring in the default configuration, but continue to allow a configurable cache timeout. Do not bypass the cache on `_get_object` calls if the object is not present in the cached listing. Update the GCSFileSystem docstring to describe the caching semantics.
Commit ba83a10
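A cache that never expires by default but still accepts a timeout might be sketched like this. `ListingStore`, its `timeout` and `clock` parameters, and the tuple layout are hypothetical and chosen for illustration; they are not the GCSFileSystem configuration options.

```python
import time


class ListingStore:
    """Hypothetical listing store with an optional timeout (None = never expire)."""

    def __init__(self, timeout=None, clock=time.monotonic):
        self.timeout = timeout
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries = {}  # directory -> (stored_at, listing)

    def put(self, directory, listing):
        self._entries[directory] = (self._clock(), listing)

    def get(self, directory):
        """Return the listing, or None if absent or older than the timeout."""
        entry = self._entries.get(directory)
        if entry is None:
            return None
        stored_at, listing = entry
        if self.timeout is not None and self._clock() - stored_at > self.timeout:
            del self._entries[directory]  # expired: force a fresh listing
            return None
        return listing
```

Passing `timeout=None` keeps entries until they are explicitly invalidated, while a numeric timeout makes stale listings fall out on the next `get`.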
Reduce verbosity of test tracing.
Add an explicit flag to control stack-trace debugging for traced methods. This reduces log size on test failures.
Commit a4989a0
Cleanup `walk` and fix bucket `info` calls.
From review, clean up the `walk` implementation. Fix pseudodir creation on a bucket-level `info` call. Remove the `norm_path` todo.
Commit fc857a3
Martin Durant committed Feb 16, 2018
Commit 1f1f5e4
Flush on open read-only file should be no-op, not error.
`flush` on an open but read-only file should be a no-op rather than raising a ValueError; compare the builtin `open("read_only", "r").flush()`. Update the `flush` logic and add a test case covering flush behavior.
Commit 579d38f
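The intended behavior can be compared against the builtin file object in a few lines. `SketchFile` and its `flush_count` attribute are hypothetical and only mimic the mode check; the builtin part uses a temporary file so the example leaves nothing behind.

```python
import os
import tempfile


class SketchFile:
    """Hypothetical file object; counts flushes instead of uploading."""

    def __init__(self, mode="rb"):
        self.mode = mode
        self.flush_count = 0

    def flush(self, force=False):
        if self.mode not in {"wb", "ab"}:
            return  # read-only: nothing buffered, so nothing to do
        self.flush_count += 1  # a real implementation would upload here


# The builtin behaves the same way: flushing a read-only file does not raise.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "read_only")
    with open(path, "w") as f:
        f.write("data")
    with open(path, "r") as f:
        builtin_result = f.flush()
```

Callers such as generic close logic can then call `flush` unconditionally without checking the file mode first.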
Commit 3af77a1
Commit 83919c0
Commit 51c5c21
Commit 94c14d1
Commit 2129f8c
Commit 8a02558
Commit ac0b4af
Commit 1f95116