Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zarr_content_read: include sizes for the "files" (and possibly "directories") #925

Open
yarikoptic opened this issue Feb 25, 2022 · 8 comments · Fixed by dandi/dandi-schema#120 or #991
Labels
released This issue/pull request has been released.
Milestone

Comments

@yarikoptic
Copy link
Member

motivation: dandi/dandi-cli#923 (comment) -- in dandi-cli, to possibly avoid relatively expensive checksumming of the content to decide either a (changed) file within zarr needs to be reuploaded, we first check the size(s) of the file. If that one changed -- we know that we need to reupload even without computing its checksum.

It would be helpful if that API endpoint also returned sizes of the files (if it can do it without extra runtime penalty)

@dchiquito
Copy link
Contributor

I am addressing this in #937

@dchiquito
Copy link
Contributor

Reopening this to track the work required to update the dandischema version to 0.6.0, which includes dandi/dandi-schema#120

@jwodder
Copy link
Member

jwodder commented Apr 6, 2022

Relatedly, if the endpoint also included "last modified" dates and S3 keys (and possibly also S3 version IDs) for Zarr entries, that would help reduce the number of requests needed by the backup script.

@dandibot
Copy link
Member

dandibot commented Apr 6, 2022

🚀 Issue was released in v0.2.3 🚀

@dandibot dandibot added the released This issue/pull request has been released. label Apr 6, 2022
@jwodder
Copy link
Member

jwodder commented Apr 6, 2022

@dchiquito I don't see how #991 resolves the original issue here. How does one get a file's size from a /zarr/{zarr_id}.zarr/{path} response?

@yarikoptic
Copy link
Member Author

yarikoptic commented Apr 6, 2022

can't test ATM due to #1004 , will reopen just so it doesn't RiP while should be alive and chatty (feel welcome to re-close upon demonstration/confirmation of the effect achieved) ;)

@waxlamp
Copy link
Member

waxlamp commented Jul 5, 2022

@yarikoptic, @jwodder, is this still important? How high of a priority is it at the moment?

@jwodder
Copy link
Member

jwodder commented Jul 5, 2022

@waxlamp We removed the part of the upload code that checks local vs. remote file sizes (largely because this issue wasn't implemented). On the other hand, the download code still needs to makes a separate HEAD request before each Zarr entry download in order to get the size and modification time.

I'd say it's no longer a priority, but I suppose it would be nice to have.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
released This issue/pull request has been released.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants