Add FUSE capability #111
Comments
Note that there are several projects already in existence that claim to have this functionality, so I do not plan to resolve this issue until someone asks for it.
I am working with a lot of NetCDF files on S3 and am quite interested in this. How would this be different from s3fs-fuse? Similarly, how is dask/gcsfs different from gcsfuse? Say, would it lead to better I/O performance for xarray? Besides xarray, I also want to use the NetCDF Fortran API to read data on S3 (the input data for our group's GEOS-Chem model). Each NetCDF file is pretty small (~100 MB) and s3fs-fuse seems to perform OK. Do you expect dask/s3fs to increase or decrease the performance of the NetCDF Fortran/C API, compared to s3fs-fuse?
@JiaweiZhuang can you quantify how well s3fs-fuse does? I would be very curious to see numbers on how fast you can get a small bit of data and how fast you can get a large amount of data using this method. In principle there is no difference between what is proposed here and existing FUSE systems, so this would be redundant. That being said, it would be nice to have easy access to build and modify behavior. It might end up being a good idea to write code to make HDF on FUSE work decently across a few cloud object stores. Excerpt from http://matthewrocklin.com/blog/work/2018/02/06/hdf-in-the-cloud
I am testing s3fs-fuse and will keep you posted. Preliminary results: for file sizes of 100-200 MB, the latency is close to that of reading data from EBS volumes. But reading 1 GB of data is significantly slower. Even just getting the metadata by
What are you testing when you test for latency? I would expect this to be something like getting a single value from an array,
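To make the two measurements discussed above concrete (a small read versus a large read), here is a minimal sketch using only the Python standard library. It is not taken from s3fs or s3fs-fuse; the helper names `time_read` and `compare_reads` are hypothetical, and the path passed in is assumed to be a file on whatever mount is being tested. Repeated runs may be served from the OS page cache or a FUSE-side cache, so the numbers should be read with that in mind.

```python
import os
import tempfile
import time


def time_read(path, offset=0, nbytes=None):
    """Time one read from `path` (hypothetical helper, not an s3fs API).

    Reads `nbytes` bytes starting at `offset`; nbytes=None reads to end of file.
    Returns (elapsed_seconds, bytes_actually_read).
    """
    start = time.perf_counter()
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read() if nbytes is None else f.read(nbytes)
    return time.perf_counter() - start, len(data)


def compare_reads(path, small_nbytes=8):
    """Compare a tiny read from the middle of the file with a full read."""
    size = os.path.getsize(path)
    # Small read: roughly "one value from an array" (8 bytes by default).
    small_s, small_n = time_read(path, offset=size // 2, nbytes=small_nbytes)
    # Large read: the whole file, start to end.
    full_s, full_n = time_read(path)
    return {
        "small_seconds": small_s,
        "small_bytes": small_n,
        "full_seconds": full_s,
        "full_bytes": full_n,
    }


if __name__ == "__main__":
    # Demo on a local temporary file; on a real test, pass a path that
    # lives under the FUSE mount point instead (an assumption of this sketch).
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(1024 * 1024))
    try:
        print(compare_reads(tmp.name))
    finally:
        os.remove(tmp.name)
```

The small read approximates per-request latency, while the full read approximates sustained throughput; reporting both separately avoids conflating the two.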
Replication of functionality in fsspec/gcsfs#53