bpo-23228: The tarfile module crashes when tarfile contains a symlink and unpack directory contain it too
Files
symLinkBugRepro.tar.gz: Zip file containing a bash script and a python script for repro of tarfile symlink issue.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
GitHub fields:
assignee = None
closed_at = <Date 2019-01-06.09:58:26.950>
created_at = <Date 2018-12-13.15:22:39.471>
labels = ['3.7', 'type-bug', 'library']
title = 'tarfile.extractall on existing symlink in Ubuntu overwrites target file, not symlink, unlinke GNU tar'
updated_at = <Date 2019-01-06.09:58:26.949>
user = 'https://bugs.python.org/michaelbrandlaid-drivingeu'
On Ubuntu 16.04, with Python 3.5 as well as custom-built 3.6 and 3.7.1:
Given two tars: foo.tar, containing a file foo.txt (with content "foo") and a symlink myLink pointing to it, and bar.tar, containing a file bar.txt (with content "bar") and a symlink, also named myLink, pointing to it.
Unpacking the two tars into the same folder (first foo.tar, then bar.tar) leads to the following behavior:
In GNU tar, the directory will contain:
foo.txt (content "foo")
bar.txt (content "bar")
myLink -> bar.txt
Using Python's tarfile, however, calling tarfile.extractall on the two tars in the same order gives:
foo.txt (content "bar")
bar.txt (content "bar")
myLink -> foo.txt
Repro:
Unpack the attached symLinkBugRepro.tar.gz into a new folder.
Run "bash repoSymlink.bash" (it does exactly what is described above).
If the last two lines of the output are "bar" and "bar" (instead of "foo" and "bar"), then the content of foo.txt has been overwritten.
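The attached scripts are not reproduced here. As a rough, self-contained sketch of the same scenario in pure Python, the following builds the two tars in a temporary directory and extracts them in order. The helper names make_tar and reproduce are made up for this illustration, and the tars are written uncompressed as an assumption. On an affected version the returned dict should show foo.txt containing "bar" and myLink pointing at foo.txt; on a version that replaces the existing symlink, foo.txt keeps "foo" and myLink points at bar.txt.

```python
import os
import tarfile
import tempfile


def make_tar(workdir, tar_name, file_name, content):
    """Pack file_name and a symlink myLink -> file_name into tar_name (hypothetical helper)."""
    src = os.path.join(workdir, "src_" + file_name)
    os.mkdir(src)
    with open(os.path.join(src, file_name), "w") as f:
        f.write(content)
    os.symlink(file_name, os.path.join(src, "myLink"))
    tar_path = os.path.join(workdir, tar_name)
    with tarfile.open(tar_path, "w") as tar:
        # Symlinks are stored as links by default (dereference=False).
        tar.add(os.path.join(src, file_name), arcname=file_name)
        tar.add(os.path.join(src, "myLink"), arcname="myLink")
    return tar_path


def reproduce():
    """Extract foo.tar, then bar.tar, into one folder and report what ends up there."""
    with tempfile.TemporaryDirectory() as workdir:
        foo_tar = make_tar(workdir, "foo.tar", "foo.txt", "foo")
        bar_tar = make_tar(workdir, "bar.tar", "bar.txt", "bar")
        dest = os.path.join(workdir, "dest")
        os.mkdir(dest)
        for tar_path in (foo_tar, bar_tar):
            with tarfile.open(tar_path) as tar:
                tar.extractall(dest)
        result = {}
        for name in ("foo.txt", "bar.txt"):
            with open(os.path.join(dest, name)) as f:
                result[name] = f.read()
        result["myLink"] = os.readlink(os.path.join(dest, "myLink"))
        return result


if __name__ == "__main__":
    print(reproduce())
```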
None of these issues target the issue at hand, however.
The problem lies in line 2201 of https://github.com/python/cpython/blob/master/Lib/tarfile.py:
The assumption there is that any exception can only come from the OS not supporting symlinks. But here, the exception comes from the symlink already existing, which should be caught separately. The correct behavior is then NOT to extract the member to that path, but rather to overwrite the symlink itself (as GNU tar does).
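One way to get the behavior described above can be sketched as follows. This is only an illustration of the idea, not the code in tarfile.py; replace_symlink is a hypothetical helper name. It removes an existing entry at the link path before creating the new symlink, so the old link's target file is never opened or written.

```python
import os


def replace_symlink(link_target, link_path):
    """Create link_path -> link_target, replacing an existing entry (hypothetical helper).

    os.path.lexists does not follow symlinks, so it is also true for a
    dangling link. os.unlink removes the link itself, never its target.
    It raises if link_path is a directory, which this sketch does not handle.
    """
    if os.path.lexists(link_path):
        os.unlink(link_path)
    os.symlink(link_target, link_path)
```

With foo.txt, bar.txt and an existing myLink -> foo.txt in one folder, calling replace_symlink("bar.txt", path_to_myLink) leaves foo.txt untouched and repoints myLink at bar.txt.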
The second aspect is replacing existing symlinks and other directory entries. This was implemented in 2.7 in bpo-10761 and bpo-12088 (only when replacing non-subdirectories with symbolic links and hard links), and is discussed more generally in bpo-19974.