ENH: Directly open compressed files #3443
here is a first batch of comments
For reference, here is a simplistic benchmark:
This uses the following code:
Note that the first access is much slower, as the entire tar file has to be read from disk.
@yt-fido test this please
pre-commit.ci autofix
I don't really understand why it fails on Jenkins, and it's really hard to find out without direct access to it (the archive just doesn't seem to mount properly). I'd be happy to mark the feature as experimental for the time being and see what happens with users on real cases. Wdyt @munkm @matthewturk @neutrinoceros?
I'd get behind such a plan, as long as it is impossible to use the feature without being warned that it's still experimental. It would also need to be clearly stated in the docs and release notes, but this can be done later AFAIC.
That plan sounds good to me.
@matthewturk and @neutrinoceros, I have added a warning and I am skipping the tests on Jenkins (at least, that's what I am hoping), as they fail with a timeout and I cannot understand why (there must be some permission error, ...).
I see a minor flake8 issue, so I'm disabling auto-merge while I fix that.
Now mypy is unhappy (I think this branch is older than type-checking, so it's not too surprising). @cphyc, do you want to tackle these nits yourself? Otherwise I'm happy to step in.
PR Summary
This PR allows the user to directly load a dataset stored in a .tar.gz file (or any format supported by ratarmount).
It works by first mounting the .tar file as a read-only filesystem, which can then be used directly in yt.
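To illustrate the general idea of reading data straight out of an archive without unpacking it to disk, here is a minimal sketch using only Python's standard-library tarfile module. It is not the mechanism used in this PR (which mounts the archive through ratarmount), and the archive name, member name, and helper functions below are hypothetical and exist only for the demonstration.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def read_member(archive_path, member_name):
    """Return the bytes of one archive member without unpacking the archive to disk."""
    # mode="r:*" lets tarfile detect the compression (gzip, bz2, xz, or none).
    with tarfile.open(archive_path, mode="r:*") as archive:
        handle = archive.extractfile(member_name)
        if handle is None:
            # extractfile returns None for members that are not regular files.
            raise FileNotFoundError(member_name)
        return handle.read()


def make_demo_archive(directory):
    """Build a tiny .tar.gz holding one hypothetical file, for illustration only."""
    archive_path = Path(directory) / "demo_output.tar.gz"
    payload = b"ncpu = 1\n"
    with tarfile.open(archive_path, mode="w:gz") as archive:
        info = tarfile.TarInfo(name="demo_output/info.txt")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return archive_path


with tempfile.TemporaryDirectory() as tmp:
    demo_archive = make_demo_archive(tmp)
    demo_content = read_member(demo_archive, "demo_output/info.txt")
```

A gzip-compressed tar has to be decompressed sequentially to reach a given member, which is consistent with the note in the conversation that the first access is much slower than later ones.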
PR Checklist
Example
# Create .tar file
tar czf output_00080.tar.gz output_00080
Now, one can load the dataset in the archive directly using