[WIP] Add fuse draft #51

Closed
wants to merge 1 commit into from

mrocklin
Contributor

This is mostly here for conversation. This is nowhere near working yet.

@mrocklin
Contributor Author

OK, some questions:

  1. What is the right way to implement getattr, a method that is expected to return information about a path? That path may point to either a file or a directory.
  2. In particular, GCSFileSystem.info on a directory doesn't return what we want it to return.
  3. Any thoughts on permissions? Are there ways to see if something is publicly visible or not? Or should we just default to 0o600 for all things?
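
One possible shape for such a getattr, as a minimal sketch rather than the implementation in this PR: ask info() first, and if that raises FileNotFoundError (as it does for directories in the session further down), fall back to a listing to decide whether the path behaves like a directory. The function name, the `size` key in the info dict, and the 0o700 directory mode are assumptions made for illustration; directories get the execute bit here only because a directory with 0o600 cannot be traversed.

```python
import errno
import stat


def getattr_sketch(fs, path, file_perm=0o600, dir_perm=0o700):
    """Return a FUSE-style stat dict for a file or a directory-like prefix.

    `fs` is assumed to offer info(path) and ls(path) in the style of
    GCSFileSystem; the name and defaults here are hypothetical.
    """
    try:
        info = fs.info(path)
    except FileNotFoundError:
        # No object with this exact name: treat the path as a directory
        # if anything is listed beneath it.
        try:
            children = fs.ls(path.rstrip('/') + '/')
        except FileNotFoundError:
            children = []
        if not children:
            raise OSError(errno.ENOENT, 'No such file or directory', path)
        return {'st_mode': stat.S_IFDIR | dir_perm,
                'st_size': 0,
                'st_nlink': 2}
    # Assumption: the info dict carries the object size under 'size'.
    return {'st_mode': stat.S_IFREG | file_perm,
            'st_size': int(info.get('size', 0)),
            'st_nlink': 1}
```

A real FUSE binding would likely want its own error type in place of the plain OSError, and the extra listing costs one more API call per directory lookup.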

        return self.fs.rmdir(self._path(path))

    def mkdir(self, path, mode):
        return os.mkdir(self._full_path(path), mode)
Member

os -> self.fs?
Will be a no-op on non-top-level directories.

@@ -59,6 +59,13 @@ def test_info(token_restore):
    assert gcs.info(a) == gcs.ls(a, detail=True)[0]


@my_vcr.use_cassette(match=['all'])
def test_info_directory(token_restore):
Member

You need to set the various env variables in the test settings module and run with recording on (will produce a YAML file) before tests will pass.

Contributor Author

I think that this won't work regardless:

In [1]: from gcsfs import GCSFileSystem

In [2]: gcs = GCSFileSystem()

In [3]: gcs.ls('pangeo-data')
Out[3]: 
['pangeo-data/newman-met-ensemble/',
 'pangeo-data/newmann-met-ensemble-netcdf/',
 'pangeo-data/test997/',
 'pangeo-data/test998/',
 'pangeo-data/test999/']

In [4]: gcs.info('pangeo-data/test997/')
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-4-210f02b52f31> in <module>()
----> 1 gcs.info('pangeo-data/test997/')

/home/mrocklin/Software/anaconda/lib/python3.6/site-packages/gcsfs/core.py in info(self, path)
    468             return out[0]
    469         else:
--> 470             raise FileNotFoundError(path)
    471 
    472     def url(self, path):

FileNotFoundError: pangeo-data/test997/

@martindurant
Member

Probably to do this right, directory listings should be made using prefix/delimiter, as happens in s3fs. Currently, the only way to get a list of "directories" is to list all files in a bucket and do some text processing on the names; this is implemented when listing, and so it could be used for info too. That complicates things though, because there is an intent in #22 to use HEAD on known full paths rather than bucket listing, to account for cases where a bucket contains a large number of keys.
In addition, all of this might conflict with the serialisation problem for the directories cache: the more information we keep, the fewer new listings (i.e., extra API calls) we need, but the bigger the file-system object gets.
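
To make the "text processing on the names" concrete, here is a rough sketch of deriving one level of files and "directories" from a flat list of object names. It is a client-side stand-in for what a prefix/delimiter listing would return, not the code gcsfs uses; the function name and signature are hypothetical.

```python
def list_directory(keys, prefix, delimiter='/'):
    """Split flat object names under `prefix` into (files, directories).

    Hypothetical helper: `keys` is the full list of object names in a
    bucket, and anything with a further delimiter after the prefix is
    reported as a directory one level down.
    """
    if prefix and not prefix.endswith(delimiter):
        prefix += delimiter
    files, dirs = [], set()
    for key in keys:
        if not key.startswith(prefix) or key == prefix:
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # Something lives below prefix/head/, so head is a directory.
            dirs.add(prefix + head + delimiter)
        else:
            files.append(key)
    return sorted(files), sorted(dirs)
```

An info() for a directory could then check whether the path appears in the second list, at the cost of needing the full bucket listing, which is exactly the tension with #22 described above.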

@martindurant martindurant mentioned this pull request Jan 2, 2018
Merged