RFE: Object storage support #10

Open
joshmoore opened this issue Jul 30, 2024 · 2 comments

@joshmoore

Have you considered reading from and writing to object storage, e.g. S3?

In the resave.py script we are working on for the challenge, options like these:

time ./resave.py \
        zarr/v0.4/idr0001A/2551.zarr \
        --input-bucket=idr \
        --input-endpoint=https://uk1s3.embassy.ebi.ac.uk \
        --input-anon \
        ...

prevent the need to download the data locally.

I'm currently working on using zarrs_reencode by generating a script:

./resave.py zarr/v0.4/idr0001A/2551.zarr --output-script ...

which produces a script per Zarr array of the form:

zarrs_reencode --chunk-shape 1,1,1040,1376 --shard-shape 2,16,1040,1376 --dimension-names c,z,y,x --validate \
    zarr/v0.4/idr0001A/2551.zarr/C/3/0 OUTPUT/C/3/0

but this of course won't work when the source or target is on S3.
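
For reference, the per-array script generation can be sketched roughly as follows. This is only an illustration, not the actual resave.py code: `build_reencode_command` is a hypothetical helper, and it uses only the zarrs_reencode flags shown in the example above.

```python
import shlex


def build_reencode_command(src, dst, chunk_shape, shard_shape, dimension_names):
    """Return one zarrs_reencode command line for a single Zarr array.

    Hypothetical helper: only the flags from the example above are used.
    """
    def join(values):
        # zarrs_reencode takes comma-separated lists, e.g. 1,1,1040,1376
        return ",".join(str(v) for v in values)

    args = [
        "zarrs_reencode",
        "--chunk-shape", join(chunk_shape),
        "--shard-shape", join(shard_shape),
        "--dimension-names", join(dimension_names),
        "--validate",
        src,
        dst,
    ]
    # shlex.join quotes any argument that needs it for a POSIX shell
    return shlex.join(args)


print(build_reencode_command(
    "zarr/v0.4/idr0001A/2551.zarr/C/3/0",
    "OUTPUT/C/3/0",
    chunk_shape=(1, 1, 1040, 1376),
    shard_shape=(2, 16, 1040, 1376),
    dimension_names="czyx",
))
```

Both the source and the target are plain local paths here, which is exactly the limitation described above.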

@LDeakin
Owner

LDeakin commented Jul 31, 2024

I've added read support for HTTP stores in zarrs_tools version 0.5.5 with #11.

zarrs_reencode https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0001A/2551.zarr/C/3/0/0 2551.zarr/C/3/0/0
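
The URL above is just the endpoint, bucket, and key from the earlier `--input-endpoint` / `--input-bucket` options joined together. A minimal sketch of that mapping, assuming path-style addressing and an anonymously readable bucket; `http_url` is a hypothetical helper, not part of zarrs_tools or resave.py:

```python
from urllib.parse import quote


def http_url(endpoint, bucket, key):
    """Build a path-style HTTP URL: <endpoint>/<bucket>/<key>.

    Assumption: the service exposes buckets path-style and allows
    anonymous reads. Hypothetical helper for illustration only.
    """
    # quote() keeps "/" by default, so the key's hierarchy is preserved
    return f"{endpoint.rstrip('/')}/{quote(bucket)}/{quote(key.lstrip('/'))}"


print(http_url(
    "https://uk1s3.embassy.ebi.ac.uk",
    "idr",
    "zarr/v0.4/idr0001A/2551.zarr/C/3/0/0",
))
```

A generated script could then pass the resulting URL as the input argument and keep a local path as the output.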

Writing to remote stores is not supported, and I am not sure it is worth adding support given the complexity of supporting many different services + auth. Eventually zarrs itself will have a Python wrapper for more flexible usage.

P.S. That location reports missing chunks as permission denied when interpreted as an S3 endpoint. I'm not sure if that is standard. HTTP is fine.
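
To make the P.S. concrete: a reader that wants to tolerate such a store has to decide whether a 403 should count as "chunk absent". As far as I know, AWS S3 itself can answer 403 rather than 404 for a missing key when the caller lacks list permission on the bucket. The following is only a sketch of that decision, not how zarrs handles it; `chunk_or_none` and `forbidden_means_missing` are hypothetical names.

```python
from typing import Optional


def chunk_or_none(status: int, body: bytes,
                  forbidden_means_missing: bool = False) -> Optional[bytes]:
    """Map an HTTP status to chunk bytes, a missing chunk (None), or an error.

    Hypothetical helper: with forbidden_means_missing=True, a 403 is
    treated like a 404, i.e. the chunk is considered absent.
    """
    if status == 200:
        return body
    if status == 404 or (status == 403 and forbidden_means_missing):
        return None  # missing chunk: the reader would use the fill value
    if status == 403:
        raise PermissionError("access denied (HTTP 403)")
    raise OSError(f"unexpected HTTP status {status}")
```

The flag is off by default because treating every 403 as a missing chunk would also hide real permission problems.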

@joshmoore
Author

> I've added read support for HTTP stores in zarrs_tools version 0.5.5 with #11.

🤯 Amazing. I'll give it a try ASAP.

> Writing to remote stores is not supported, and I am not sure it is worth adding support given the complexity of supporting many different services + auth.

Understood. Certainly one tricky aspect of all of this.

> That location reports missing chunks as permission denied when interpreted as an S3 endpoint. I'm not sure if that is standard.

Heh. Since there's not really a standard, I agree. :) It's definitely my experience that each provider of "S3" has a slightly different take.

> HTTP is fine.

👍
