New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
exclude @PaxHeader from archive when exctracting #7172
base: master
Are you sure you want to change the base?
Conversation
@@ -55,7 +55,7 @@ def decompress_file(archive, dir_): | |||
if len(suffixes) > 1 and suffixes[-2] == '.tar': | |||
# we have a compressed tar file that needs to be fed through the | |||
# decompressor first | |||
cmd = '7z x {} -so | 7z x -si -ttar'.format(quote_cmdlinearg(archive)) | |||
cmd = '7z x {} -so | 7z x "-x.\.\@PaxHeader" -si -ttar'.format(quote_cmdlinearg(archive)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm under the impression that -x.\.\@PaxHeader
refers to filenames on windows. I think the \
should be os.path.separator
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bpinsard Are you planning to finish this or should I take it? I think above mentioned change is all it takes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Weirdly, I managed to make it work once, but now I can't get it to work. Not sure this is the right solution for now. I think I will be migrating to something else that tar.gz, to allow "random" access similarly to what is done with the ORA 7z archives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact maybe the second 7z
should be replaced with tar
itself to be able to handle these metadatas. Unless there is something I am missing that requires 7z
or is it to reduce number of dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- PR might be worth to be against
maint
- comment better be placed to note why that file is to be excluded etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or is it to reduce number of dependencies.
IIRC that was the point, yes. However, it seems to me it's fine to depend on tar for a dataset that uses tarballs.
PR checklist
Fixes #7171 by excluding @PaxHeader from extracted files
Thanks for contributing!