Executable bit in ZIP-archives get thrown away when reading from stdin. #1106
Comments
Zip archives contain two different ways to describe the content:
The short version is that this is an inherent issue with streaming Zip files and something that won't be fixed.
According to http://unix.stackexchange.com/questions/487338#487371, this also happens if
Is there any standard for ZIP that says which information (per-entry header or central directory) should be trusted more?
Because ISO files are not streamable in a meaningful way in most situations. File attributes, on the other hand, are often absent in Zip files.
As Joerg pointed out, there are basic limitations with some of the formats we deal with:
As a workaround, libarchive's Zip support includes an experimental extension (developed in conjunction with the Info-Zip maintainers) that puts more complete metadata with each entry. I hope to enable this by default at some point. In theory, the streaming Zip reader could read the full metadata when it does get to the end and update all the files. This would require some careful rework of the Zip reader and probably changes to the logic that writes files to disk. In essence, every file would get "written to disk" twice: once with full data and partial metadata, and again with full metadata and no data.
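The "written twice" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not libarchive's code: `stream_extract` and `make_demo_zip` are hypothetical names, the archive is assumed to contain only stored or deflated entries whose sizes are in the local headers (no data descriptors, no Zip64), and entry names are not sanitised.

```python
import io
import os
import stat
import struct
import tempfile
import zipfile
import zlib

LOCAL_SIG = b"PK\x03\x04"    # local file header
CENTRAL_SIG = b"PK\x01\x02"  # central directory file header


def stream_extract(stream, dest):
    """Extract a Zip from a forward-only stream, fixing modes at the end."""
    while True:
        sig = stream.read(4)
        if sig == LOCAL_SIG:
            # Pass 1: file data plus the partial metadata of the local header.
            (_ver, flags, method, _time, _date, _crc,
             csize, _usize, nlen, xlen) = struct.unpack(
                "<HHHHHIIIHH", stream.read(26))
            if flags & 0x08:
                raise NotImplementedError("data descriptors are out of scope")
            name = stream.read(nlen).decode("utf-8")
            stream.read(xlen)  # skip the extra field
            raw = stream.read(csize)
            path = os.path.join(dest, name)
            if name.endswith("/"):
                os.makedirs(path, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "wb") as out:
                # Method 8 is raw deflate; anything else is treated as stored.
                out.write(zlib.decompress(raw, -15) if method == 8 else raw)
        elif sig == CENTRAL_SIG:
            # Pass 2: full metadata, no data; update what is already on disk.
            fields = struct.unpack("<HHHHHHIIIHHHHHII", stream.read(42))
            made_by, nlen, xlen, clen, ext_attr = (
                fields[0], fields[9], fields[10], fields[11], fields[14])
            name = stream.read(nlen).decode("utf-8")
            stream.read(xlen + clen)  # skip extra field and comment
            mode = ext_attr >> 16
            if made_by >> 8 == 3 and mode:  # host system 3 means Unix
                os.chmod(os.path.join(dest, name), stat.S_IMODE(mode))
        else:
            break  # end-of-central-directory record or end of input


def make_demo_zip():
    """Build a small in-memory archive with one executable entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        info = zipfile.ZipInfo("demo/executable.sh")
        info.create_system = 3  # mark the entry as created on Unix
        info.external_attr = (stat.S_IFREG | 0o755) << 16
        zf.writestr(info, "#!/bin/sh\n", compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()
```

A quick way to try it is `stream_extract(io.BytesIO(make_demo_zip()), some_directory)`; the reader only ever calls `read()`, so a pipe such as `sys.stdin.buffer` would work the same way. The file first lands on disk with default permissions and only receives its executable bit once the central directory has been reached.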
The Zip standard is here: If you study this carefully, you'll notice that the file permissions are only stored in the central directory. All other metadata should be the same.
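To make the difference between the two headers concrete, here is a small Python sketch using the standard `zipfile` and `struct` modules. The helper names and the entry name `dir/executable.sh` are made up for illustration. It builds an archive in memory, decodes the fixed 30-byte local file header, and then reads the mode from the central directory entry.

```python
import io
import stat
import struct
import zipfile


def build_test_zip():
    """Create an in-memory Zip with one entry marked executable."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        info = zipfile.ZipInfo("dir/executable.sh")
        info.create_system = 3  # Unix, so the upper 16 bits hold st_mode
        info.external_attr = (stat.S_IFREG | 0o755) << 16
        zf.writestr(info, "#!/bin/sh\necho hi\n")
    return buf.getvalue()


def local_header_fields(data):
    """Decode the fixed part of the first local file header."""
    names = ("signature", "version_needed", "flags", "compression",
             "mod_time", "mod_date", "crc32", "compressed_size",
             "uncompressed_size", "name_length", "extra_length")
    return dict(zip(names, struct.unpack("<4sHHHHHIIIHH", data[:30])))


def central_directory_mode(data, name):
    """Read the Unix mode that zipfile finds in the central directory."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.getinfo(name).external_attr >> 16


archive = build_test_zip()
print(sorted(local_header_fields(archive)))
print(oct(stat.S_IMODE(central_directory_mode(archive, "dir/executable.sh"))))
```

The local header simply has no slot for external file attributes, which is why a reader that only sees local headers cannot know that the entry was executable.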
@kientzle quick question: when libarchive is streaming a .zip file and just using the partial metadata, how does it deal with the possibility mentioned here that some files might not actually be listed in the central directory and thus should not be extracted, as well as the possibility that there is extra data between file chunks or before the first file chunk? Does it just assume that the Zip file isn't one of these special cases, or does it try to read the central directory at the end to somehow correct what has already been extracted?
In theory, libarchive could stream Zip archives by extracting all the entries, then reading the central directory and using that information to edit the data on disk. It does not currently do this. As a result, it cannot fully handle some of the pathological cases you describe while performing a streaming extraction. Libarchive does have error-recovery logic that can, to a limited extent, deal with garbage data appearing in the archive (between entries or before the first entry). You can see the details starting around line 3146 of the
I found that the command `bsdtar` from the package `libarchive` (under Arch Linux, at least) throws away the executable bits of files in `.zip` archives when reading from `stdin`, but not when working directly on the file.

On `.tar` archives it preserves the executable bit even when reading from `stdin`.

`bsdtar --version`: `bsdtar 3.3.3 - libarchive 3.3.3 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.6 liblz4/1.8.2 libzstd/1.3.5`.

Test case:
I made a test archive http://felics.kettenbruch.de/files/archive_executable_bit_test/archive_exevutable_bit_test.zip which contains two files in a subdirectory. One file is executable, the other is not.

Extracting directly:
The executable bit for `executable.sh` is present here.

Reading from `stdin`:
The executable bit for `executable.sh` is thrown away here.

`.tar` archive:
As a comparison, for a `.tar` archive, the executable bit in the archive is also honoured when reading from `stdin`.

Expected behaviour: