tarfile creates output that appears to omit files #85091

Closed
mcr mannequin opened this issue Jun 8, 2020 · 2 comments
Labels
3.7 (EOL) end of life stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@mcr
Mannequin

mcr mannequin commented Jun 8, 2020

BPO 40914
Nosy @zware, @mcr

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2020-06-15.19:11:24.717>
created_at = <Date 2020-06-08.21:15:00.298>
labels = ['3.7', 'invalid', 'type-bug', 'library']
title = 'tarfile creates output that appears to omit files'
updated_at = <Date 2020-06-15.19:11:24.716>
user = 'https://github.com/mcr'

bugs.python.org fields:

activity = <Date 2020-06-15.19:11:24.716>
actor = 'zach.ware'
assignee = 'none'
closed = True
closed_date = <Date 2020-06-15.19:11:24.717>
closer = 'zach.ware'
components = ['Library (Lib)']
creation = <Date 2020-06-08.21:15:00.298>
creator = 'mcr314'
dependencies = []
files = []
hgrepos = []
issue_num = 40914
keywords = []
message_count = 2.0
messages = ['371045', '371047']
nosy_count = 2.0
nosy_names = ['zach.ware', 'mcr314']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue40914'
versions = ['Python 3.6', 'Python 3.7']

@mcr
Mannequin Author

mcr mannequin commented Jun 8, 2020

The simplest tar-copy program seems to produce output that GNU tar, bsdtar, and even Emacs tar-mode are unable to process correctly.
The resulting tar file appears to be missing files, but examination of the raw output suggests they may be present but corrupt.
GNU tar actually complains while reading the file.
https://github.com/mcr/python3-tar-copy-failure has a test case. Here is the stupid code to reproduce it:

import tarfile
out = tarfile.open(name="./t2.tar", mode="w", format=tarfile.PAX_FORMAT)
with tarfile.open("./t1.tar") as tar:
    for file in tar.getmembers():
        print (file.name)
        out.addfile(file)
out.close()

This has been confirmed on Python 3.6.9 (Ubuntu 18.04 LTS) and Python 3.7.3 (Devuan Beowulf). It seems to omit different files on 32-bit and 64-bit systems.

@mcr mcr mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error 3.7 (EOL) end of life labels Jun 8, 2020
@zware
Member

zware commented Jun 8, 2020

Note that TarFile.getmembers() is documented to return TarInfo objects, which are documented as explicitly not including file data. Try replacing out.addfile(file) with out.addfile(file, tar.extractfile(file)).
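
For illustration, the suggested change could be applied to the copy loop roughly as follows. This is only a sketch: the helper names `copy_tar` and `make_sample_tar` and the sample file names are made up for the example and do not come from the report. It assumes that only regular files carry data, so `extractfile()` is called only when `member.isreg()` is true, and other members (directories, links) are added with their header alone.

```python
import io
import tarfile


def copy_tar(src_path, dst_path):
    """Copy every member of src_path into a new PAX archive at dst_path."""
    with tarfile.open(src_path) as src, \
            tarfile.open(dst_path, mode="w", format=tarfile.PAX_FORMAT) as dst:
        for member in src.getmembers():
            # A TarInfo object holds only the header (name, size, mode, ...).
            # The file data has to be read from the source archive and
            # passed to addfile() explicitly.
            data = src.extractfile(member) if member.isreg() else None
            dst.addfile(member, data)


def make_sample_tar(path):
    """Build a small source archive (hypothetical test data)."""
    samples = [("a.txt", b"alpha\n"), ("b.txt", b"bravo bravo\n")]
    with tarfile.open(path, mode="w") as tar:
        for name, payload in samples:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
```

Calling `make_sample_tar("t1.tar")` followed by `copy_tar("t1.tar", "t2.tar")` should give a `t2.tar` whose members and contents match the source. Without the second argument to `addfile()`, each header still declares the original size but no data follows it, which is consistent with readers reporting missing or corrupt entries.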

@zware zware closed this as completed Jun 15, 2020
@zware zware added the invalid label Jun 15, 2020
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022