Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exclude @PaxHeader from archive when exctracting #7172

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

bpinsard
Copy link
Contributor

PR checklist

Fixes #7171 by excluding @PaxHeader from extracted files

Thanks for contributing!

@bpinsard bpinsard changed the base branch from maint to master November 11, 2022 21:05
@bpinsard bpinsard marked this pull request as draft November 11, 2022 21:10
@adswa adswa added the semver-patch Increment the patch version when merged label Nov 12, 2022
@@ -55,7 +55,7 @@ def decompress_file(archive, dir_):
if len(suffixes) > 1 and suffixes[-2] == '.tar':
# we have a compressed tar file that needs to be fed through the
# decompressor first
cmd = '7z x {} -so | 7z x -si -ttar'.format(quote_cmdlinearg(archive))
cmd = '7z x {} -so | 7z x "-x.\.\@PaxHeader" -si -ttar'.format(quote_cmdlinearg(archive))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bpinsard:

I'm under the impression that -x.\.\@PaxHeader refers to filenames on windows. I think the \ should be os.path.separator instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bpinsard Are you planning to finish this or should I take it? I think above mentioned change is all it takes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weirdly, I managed to make it work once, but now I can't get it to work. Not sure this is the right solution for now. I think I will be migrating to something else that tar.gz, to allow "random" access similarly to what is done with the ORA 7z archives.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact maybe the second 7z should be replaced with tar itself to be able to handle these metadatas. Unless there is something I am missing that requires 7z or is it to reduce number of dependencies.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • PR might be worth to be against maint
  • comment better be placed to note why that file is to be excluded etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or is it to reduce number of dependencies.

IIRC that was the point, yes. However, it seems to me it's fine to depend on tar for a dataset that uses tarballs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver-patch Increment the patch version when merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PaxHeader in tar.gz file create problems with add-archive-content
4 participants