"Zip archive inconsistent" error, although supported by other unpackers #235
Comments
As libzip and unzip say, this is a bug in the zip archive. If this is not an option, but you can change the source code reading the zip archive, you could load the file into a memory buffer and overwrite the wrong extra field data there. Example code for that is in https://github.com/nih-at/libzip/blob/master/examples/in-memory.c |
Ok, I think I have a better understanding of this now. I guess my real question was: there seem to be zip files out there in the wild with malformed extra fields (I found other issues like this in other projects too). Other tools seem to be able to recover from this, presumably by truncating or ignoring the extra field data. Couldn't the same be done with libzip? Perhaps with an opt-in zip_open flag for fault-tolerant parsing? Or a flag that just skips extra field data altogether if you're not interested in it? Fixing such files in memory would be possible, but to do this you'd have to implement a zip structure parser from scratch, right? |
I suggested fixing the zip archive only as a workaround for your immediate problem; I don't think it's a good solution in general. You would need a ZIP parser for that, yes. libzip uses some extra fields for basic features like zip64 or UTF-8 support. If extra fields were ignored completely, those features would suffer. I'm not convinced that we should support incorrectly created ZIP archives. |
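To make the "truncate the bad extra field data" idea from this exchange concrete, here is a minimal sketch of the check involved. It relies only on the record layout defined by the ZIP format: each extra field record is a 2-byte header ID, a 2-byte little-endian data size, then that many data bytes. The helper names `read_le16` and `ef_valid_prefix` are hypothetical and are not part of libzip; this is not how libzip or any other unpacker actually implements its parsing.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 16-bit value from two bytes. */
static uint16_t read_le16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Hypothetical helper (NOT a libzip function).
 * Walks the extra field records in buf and returns the length of the
 * longest prefix that consists only of complete, well-formed records.
 * A return value equal to len means the whole block is consistent;
 * a smaller value marks the offset where a tolerant reader could truncate.
 */
static size_t ef_valid_prefix(const uint8_t *buf, size_t len) {
    size_t pos = 0;

    /* Each record needs at least a 4-byte header (ID + size). */
    while (len - pos >= 4) {
        uint16_t size = read_le16(buf + pos + 2);

        /* The declared data size must fit in the remaining bytes. */
        if ((size_t)size > len - pos - 4) {
            break;
        }
        pos += 4 + (size_t)size;
    }
    return pos;
}
```

A strict reader would reject the entry whenever `ef_valid_prefix(buf, len) != len`; a tolerant one could keep the valid prefix (which may still contain zip64 or UTF-8 records) and drop the rest. Whether that is safe for a given archive is exactly the judgment call being debated here.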
Regarding how frequent this ZIP_ER_INCONS error could be, I have some data collected over the last few months, with the number of data points on the order of millions. Error percentages are fairly stable over time. ZIP_ER_INCONS occurs for 0.5% of all the ZIP files. This is a significant proportion, but it is less than ZIP_ER_NOZIP, which occurs for 2.2% of all the ZIP files. |
@pwuertz I know this issue is well over a year old, but do you remember where the The non-standard extra field looks deliberate because I see the identical invalid extra fields in both the local & central header records in the zip file. That seems too much of a coincidence to mark down as corruption. See the two
|
Yea sure. See that
True, not "corruption" in the sense of random errors affecting the transfer or storage of data. But most probably a fault in the program that was used to create the zip file, i.e. an algorithm that deterministically creates invalid or "corrupt" archives. |
Thanks @pwuertz. Interesting to note that the two-byte extra field ID used just happens to be ASCII "GC", which matches well with "GenCam". It may be a deliberate non-standard use of the extra field that breaks the zip spec, or just vestigial data that ended up getting released. |
Describe the Bug
Trying to open a specific zip file with zip_open_from_source results in error "Zip archive inconsistent". I'm using ZIP_RDONLY as the only flag, i.e. ZIP_CHECKCONS is not set. The Test.zip is attached to this bug report.
Opening this archive works with all the usual desktop tools like 7zip, Windows built in unzip, Gnome file roller.
The unzip -t command on Linux does complain about this file though.
I'm not an expert on the Zip file format and don't know if it is really corrupt, but given that most prominent tools are able to handle this archive I'm wondering if there is a way to read it gracefully with libzip too.
Personal dilemma: This file is hosted by a hardware device, so repacking or fixing the file is not in my scope :/.
libzip Version
libzip 1.7.3 from conan package manager
Operating System
Windows 10, Ubuntu 20.10
Test Files
Offending test file Test.zip