'tarfile.StreamError: seeking backwards is not allowed' when extract symlink #57009
When you call extractall() on a tarball containing a symlink in stream mode ('r|'), an exception is raised:
Traceback (most recent call last):
File "./test_extractall_stream_symlink.py", line 26, in <module>
tar.extractall(path=destdir)
File "/usr/lib/python3.2/tarfile.py", line 2134, in extractall
self.extract(tarinfo, path, set_attrs=not tarinfo.isdir())
File "/usr/lib/python3.2/tarfile.py", line 2173, in extract
set_attrs=set_attrs)
File "/usr/lib/python3.2/tarfile.py", line 2249, in _extract_member
self.makefile(tarinfo, targetpath)
File "/usr/lib/python3.2/tarfile.py", line 2289, in makefile
source.seek(tarinfo.offset_data)
File "/usr/lib/python3.2/tarfile.py", line 553, in seek
raise StreamError("seeking backwards is not allowed")
tarfile.StreamError: seeking backwards is not allowed
You can reproduce the bug with this snippet of code:
TEMPDIR='/tmp/pyton_test'
If source_file is added before target_file, no exception is raised. But the exception is still raised when you create the same tarball with GNU tar. |
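For readers unfamiliar with stream mode, here is a minimal sketch of why StreamError can appear at all. It is not the reporter's snippet: the helper names `build_archive` and `read_first_after_second` and the member names are made up for illustration. It only uses `tarfile.open`, `TarInfo`, `addfile` and `extractfile`, and shows that a tarfile opened with 'r|' cannot go back to data it has already passed, while the same archive opened with 'r' can.

```python
import io
import tarfile


def build_archive():
    """Return an in-memory tar archive with two small regular files."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in (("first", b"hello"), ("second", b"world")):
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


def read_first_after_second(mode):
    """Walk past every member, then try to read the first one again."""
    with tarfile.open(fileobj=build_archive(), mode=mode) as tar:
        members = list(tar)  # reads every header, so the stream moves forward
        # Reading the first member now needs a seek back to its data offset.
        return tar.extractfile(members[0]).read()


if __name__ == "__main__":
    print(read_first_after_second("r"))  # a seekable archive can go back
    try:
        read_first_after_second("r|")
    except tarfile.StreamError as exc:
        print("stream mode:", exc)
```

Anything that makes tarfile revisit an earlier member while in 'r|' mode ends up in this same code path.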
ping. |
Everything works for me without exception in 2.7, 3.3 and 3.4. |
This seems to be similar to bpo-10761, where symlinks are not being overwritten by TarFile.extract, but it is only an issue in streaming mode and only in Python 3. To reproduce, attempt to extract a symlink from a tarfile opened with 'r|' and overwrite an existing file. Here's a simple script, adapted from Aurélien's, that demonstrates this behavior.
#!/usr/bin/python
import os
import shutil
import sys
import tempfile
import tarfile

def main():
    tmpdir = tempfile.mkdtemp()
    try:
        os.chdir(tmpdir)
        source = 'source'
        link = 'link'
        temparchive = 'issue12800'
        # create source
        with open(source, 'wb'):
            pass
        os.symlink(source, link)
        # archive the file first, then the symlink pointing at it
        with tarfile.open(temparchive, 'w') as tar:
            tar.add(source, arcname=os.path.basename(source))
            tar.add(link, arcname=os.path.basename(link))
        # extract in stream mode on top of the existing file and symlink
        with open(temparchive, 'rb') as fileobj:
            with tarfile.open(fileobj=fileobj, mode='r|') as tar:
                tar.extractall(path=tmpdir)
    finally:
        shutil.rmtree(tmpdir)

if __name__ == '__main__':
    sys.exit(main())
On Python 3.3.2 I get the following results:
$ python3.3 issue12800.py
Traceback (most recent call last):
File "issue12800.py", line 32, in <module>
sys.exit(main())
File "issue12800.py", line 27, in main
tar.extractall(path=tmpdir)
File "/usr/lib64/python3.3/tarfile.py", line 1984, in extractall
self.extract(tarinfo, path, set_attrs=not tarinfo.isdir())
File "/usr/lib64/python3.3/tarfile.py", line 2023, in extract
set_attrs=set_attrs)
File "/usr/lib64/python3.3/tarfile.py", line 2100, in _extract_member
self.makelink(tarinfo, targetpath)
File "/usr/lib64/python3.3/tarfile.py", line 2181, in makelink
os.symlink(tarinfo.linkname, targetpath)
FileExistsError: [Errno 17] File exists: '/tmp/tmpt0u1pn/link'
On Python 3.4.1 I get the following results:
$ python3.4 issue12800.py
Traceback (most recent call last):
File "/usr/lib64/python3.4/tarfile.py", line 2176, in makelink
os.symlink(tarinfo.linkname, targetpath)
FileExistsError: [Errno 17] File exists: 'source' -> '/tmp/tmp3b96k5f0/link'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "issue12800.py", line 32, in <module>
sys.exit(main())
File "issue12800.py", line 27, in main
tar.extractall(path=tmpdir)
File "/usr/lib64/python3.4/tarfile.py", line 1979, in extractall
self.extract(tarinfo, path, set_attrs=not tarinfo.isdir())
File "/usr/lib64/python3.4/tarfile.py", line 2018, in extract
set_attrs=set_attrs)
File "/usr/lib64/python3.4/tarfile.py", line 2095, in _extract_member
self.makelink(tarinfo, targetpath)
File "/usr/lib64/python3.4/tarfile.py", line 2187, in makelink
targetpath)
File "/usr/lib64/python3.4/tarfile.py", line 2087, in _extract_member
self.makefile(tarinfo, targetpath)
File "/usr/lib64/python3.4/tarfile.py", line 2126, in makefile
source.seek(tarinfo.offset_data)
File "/usr/lib64/python3.4/tarfile.py", line 518, in seek
raise StreamError("seeking backwards is not allowed")
tarfile.StreamError: seeking backwards is not allowed |
The problem is in TarFile.makelink() in Lib/tarfile.py. It calls os.symlink() to create the link, which fails because the link already exists and triggers the exception handler. The exception handler then tries to create the linked file under the assumption (per source code comments) that the link creation failed because the system doesn't support symbolic links. The file creation then fails because it requires seeking backwards in the archive. |
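The failure mode described above can be illustrated outside of tarfile. The sketch below is not the code from Lib/tarfile.py; `make_symlink_replacing` and `demo` are hypothetical names used only here. It shows that a plain `os.symlink()` onto an existing path raises FileExistsError, and that removing the existing entry first (the `os.unlink(targetpath)` call that comes up later in this thread) avoids it. It assumes a platform where symlinks can be created.

```python
import os
import tempfile


def make_symlink_replacing(linkname, targetpath):
    """Create targetpath as a symlink to linkname, replacing an existing entry.

    Hypothetical helper for illustration; it does not handle the case where
    targetpath is an existing directory.
    """
    # lexists() is also true for a dangling symlink, unlike exists().
    if os.path.lexists(targetpath):
        os.unlink(targetpath)
    os.symlink(linkname, targetpath)


def demo():
    with tempfile.TemporaryDirectory() as tmpdir:
        targetpath = os.path.join(tmpdir, "link")
        os.symlink("source", targetpath)  # the link already exists on disk
        try:
            os.symlink("source", targetpath)  # same call, second time
            plain_call_failed = False
        except FileExistsError:
            plain_call_failed = True
        make_symlink_replacing("other", targetpath)
        return plain_call_failed, os.readlink(targetpath)


if __name__ == "__main__":
    print(demo())
```

With the existing entry removed up front, the exception handler that falls back to extracting the linked file is never entered, so no backwards seek is attempted.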
Is there anything I can do to help get this landed? The PR in github works for me. |
Hi Chris, which exception did you get exactly? Was it caused by the r| mode or by a symlink (or file) already existing? |
It's caused by the combination of the symlink existing and the tarfile being opened in r| mode. If I run the attached test file in a fresh directory, I get the following exception:
Traceback (most recent call last):
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "../test.py", line 12, in <module>
tf.extractall()
File "/home/catlee/.pyenv/versions/3.8.2/lib/python3.8/tarfile.py", line 2024, in extractall
self.extract(tarinfo, path, set_attrs=not tarinfo.isdir(),
File "/home/catlee/.pyenv/versions/3.8.2/lib/python3.8/tarfile.py", line 2065, in extract
self._extract_member(tarinfo, os.path.join(path, tarinfo.name),
File "/home/catlee/.pyenv/versions/3.8.2/lib/python3.8/tarfile.py", line 2145, in _extract_member
self.makelink(tarinfo, targetpath)
File "/home/catlee/.pyenv/versions/3.8.2/lib/python3.8/tarfile.py", line 2237, in makelink
self._extract_member(self._find_link_target(tarinfo),
File "/home/catlee/.pyenv/versions/3.8.2/lib/python3.8/tarfile.py", line 2137, in _extract_member
self.makefile(tarinfo, targetpath)
File "/home/catlee/.pyenv/versions/3.8.2/lib/python3.8/tarfile.py", line 2176, in makefile
source.seek(tarinfo.offset_data)
File "/home/catlee/.pyenv/versions/3.8.2/lib/python3.8/tarfile.py", line 513, in seek
raise StreamError("seeking backwards is not allowed")
tarfile.StreamError: seeking backwards is not allowed |
Strange fact: this was already fixed in 011525e (which closes bpo-10761, nice spot Andrew) but was lost during a merge in 0d28a61:
$ git show 0d28a61d23
commit 0d28a61d233c02c458c8b4a25613be2f4979331e
Merge: ed3a303548 d7c9d9cdcd
$ git show 0d28a61d23:Lib/tarfile.py | grep unlink # The merge commit no longer contains the fix
$ git show ed3a303548:Lib/tarfile.py | grep unlink # The "left" parent does not contain it either
$ git show d7c9d9cdcd:Lib/tarfile.py | grep unlink # The "right" one does contain it.
os.unlink(targetpath)
os.unlink(targetpath)
Stranger fact: the test was not lost during the merge and still lives today (test_extractall_symlinks). It happens that the current test passes because it is partly erroneous: instead of trying to create a symlink on top of an existing one, it creates a symlink far, far away:
(Pdb) p targetpath
Additionally, it passes anyway because tar.errorlevel equals 1, which means the error is logged but not raised. With the following small patch:
--- a/Lib/test/test_tarfile.py
+++ b/Lib/test/test_tarfile.py
@@ -1339,10 +1339,10 @@ class WriteTest(WriteTestBase, unittest.TestCase):
                 f.write('something\n')
             os.symlink(source_file, target_file)
             with tarfile.open(temparchive, 'w') as tar:
-                tar.add(source_file)
-                tar.add(target_file)
+                tar.add(source_file, arcname="source")
+                tar.add(target_file, arcname="symlink")
             # Let's extract it to the location which contains the symlink
-            with tarfile.open(temparchive) as tar:
+            with tarfile.open(temparchive, errorlevel=2) as tar:
                 # this should not raise OSError: [Errno 17] File exists
                 try:
                     tar.extractall(path=tempdir)
the error is raised as expected:
FileExistsError: [Errno 17] File exists: '/home/mdk/clones/python/cpython/@test_649794_tmpæ-tardir/testsymlinks/source' -> '/home/mdk/clones/python/cpython/@test_649794_tmpæ-tardir/testsymlinks/symlink'
I'm opening a PR to restore this as it was intended. |
See also another duplicate of this issue, bpo-40049. |