unable to unzip archives created not by JSZip #250

nikchuk · 2016-01-12T07:00:23Z

Hi. In my case, JSZip is unable to unpack zip archives that were not created by JSZip (e.g. by the native Windows zipper or by 7z with deflate). The exception is "End of data reached (data length = 4042, asked index = 4168). Corrupted zip ?" What could be wrong? Thanks in advance...

dduponchel · 2016-01-12T12:46:07Z

How do you get the content of the zip file ? An ajax request in a browser ?
An ajax request, if not prepared correctly, will try to decode the binary content as text and corrupt it (see this page).
If this doesn't solve your issue, we will need the code that gets the content and uses JSZip, and the zip file that fails.
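
To illustrate the kind of corruption described above, here is a minimal sketch. It assumes Node.js and uses only the built-in Buffer; the byte values are made up for illustration. It shows that decoding binary data as UTF-8 text does not round-trip, while a one-byte-per-character decoding such as latin1 does.

```javascript
// A zip local file header signature ("PK\x03\x04") followed by two
// arbitrary bytes >= 0x80, as found in any compressed payload.
var original = Buffer.from([0x50, 0x4b, 0x03, 0x04, 0x89, 0xff]);

// Decoding as UTF-8: 0x89 and 0xff are not valid UTF-8 on their own,
// so each one is replaced by U+FFFD and the original value is lost.
var asUtf8Text = original.toString("utf8");
var utf8RoundTrip = Buffer.from(asUtf8Text, "utf8");
var utf8RoundTripOk = original.equals(utf8RoundTrip);

// Decoding as latin1: every byte becomes exactly one character with the
// same code, which is the "binary string" shape JSZip can load.
var asBinaryString = original.toString("latin1");
var latin1RoundTrip = Buffer.from(asBinaryString, "latin1");
var latin1RoundTripOk = original.equals(latin1RoundTrip);

console.log("utf8 round trip ok:", utf8RoundTripOk);
console.log("latin1 round trip ok:", latin1RoundTripOk);
```

The data length also changes after the UTF-8 round trip, which is one way offsets inside the archive can end up pointing past the end of the data.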

nikchuk · 2016-01-12T13:46:02Z

No, it is not an ajax request in a browser. It is a non-browser script engine in a client application.

It looks like the mentioned exception happens if there are non-text files in the zip (e.g. png):
ZipFailed.zip

One more issue: if the zip contains only a text file and was packed by another tool, the zip is loaded but the list of files in the JSZip object is empty:
ZipEmpty.zip

If the same text file is packed by JSZip, it is loaded correctly and the JSZip files list is correct:
ZipOK.zip

Here is how I get the content of the zip:

var ZIP = new JSZip();
//...
var objBinaryFile = new ActiveXObject("ADODB.Stream");
objBinaryFile.type = 2;
objBinaryFile.charset = "utf-8";
objBinaryFile.Open();
objBinaryFile.LoadFromFile(zipPath);

var content = objBinaryFile.ReadText(); // content for JSZip
ZIP.load(content);
//...

dduponchel · 2016-01-12T19:35:07Z

I tested the three zip files on my machine with JSZip (loaded with nodejs REPL) and zipinfo (a command line utility):

ZipFailed.zip is opened by both, they show the Test.xml / TestImg.png files
ZipEmpty.zip is opened by both, they show the Test.xml file
ZipOK.zip is not opened by JSZip (End of data reached (data length = 328, asked index = 44482). Corrupted zip ?) and is not opened by zipinfo (error [ZipOK.zip]: missing 44231 bytes in zipfile)

Same result with other tools.

I don't know how to test this code on my side, but I strongly suspect that ADODB.Stream doesn't give you the exact binary content. The Read method looks promising, but according to this question on StackOverflow, setting the type parameter to 2 (adTypeText) is ok...

Could you test:

objBinaryFile.type = 1 and objBinaryFile.Read() ?
objBinaryFile.type = 2, objBinaryFile.charset = "windows-1251" and objBinaryFile.ReadText() ?

I never used ADODB.Stream so these are guesses based on what I saw on the internet.

nikchuk · 2016-01-13T13:00:45Z

Strange... Btw, in my case ZipOK.zip also cannot be opened by other tools. They do not recognize the file as a zip archive. But JSZip can still handle it on my side, unlike on yours...

Concerning the ADODB.Stream:

With objBinaryFile.type = 1 and objBinaryFile.Read(), it returns the content as a Variant (Array of Byte), which cannot be handled by JSZip.
With objBinaryFile.type = 2, objBinaryFile.charset = "windows-1251" and objBinaryFile.ReadText(), JSZip is able to load the content, but the content of the zipped files cannot be retrieved by any method (asText(), asBinary()...); the exception is "'zipComment.length' is null or not an object":
30524_HandlingTrace_0000000001.zip

nikchuk · 2016-01-13T13:58:37Z

So, I did some more experiments...

The following code produces a zip which can be opened by third-party tools (30524_HandlingTrace_0000000025.zip):

//...
// sourceName, sourceContent and zipPath declared and defined somewhere above

zip.file(sourceName, sourceContent); // create file
var content = zip.generate({type:"string", compression: "DEFLATE", compressionOptions : {level:6}}); // generate zip

var objBinaryFile = new ActiveXObject("ADODB.Stream"); // save zip as binary file
objBinaryFile.type = 2;
objBinaryFile.charset = "iso-8859-1"; 
objBinaryFile.open(); 
objBinaryFile.writeText(content);
objBinaryFile.position = 0;
objBinaryFile.saveToFile(zipPath, 2); 
objBinaryFile.close(); 
objBinaryFile = null;

Reading such a zip with JSZip:

var objBinaryFile = new ActiveXObject("ADODB.Stream"); 
objBinaryFile.type = 2;
objBinaryFile.charset = "iso-8859-1"; 
objBinaryFile.open(); 
objBinaryFile.loadFromFile(zipPath);
objBinaryFile.Position = 0;
var content = objBinaryFile.ReadText();
objBinaryFile.close(); 
objBinaryFile = null;

zip.load(content);

zip.files has the list of files now. But accessing the content of an individual file via asText() throws an exception in function inflate(input, options):

// That will never happens, if you don't cheat with options :)
if (inflator.err) { throw inflator.msg; }
// inflator.msg is "invalid code lengths set"

It looks to me like a problem with charsets...
If I create and read the zip with objBinaryFile.charset = "utf-8", JSZip is able to get the file content via asText(), but other tools cannot handle this zip. An example is ZipOK.zip from the previous comments...
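
One plausible reading of the utf-8 behaviour above, sketched in Node.js with made-up sample data: a binary string (one character per byte, which is the assumed shape of the zip.generate({type:"string"}) output) grows when it is written through a UTF-8 encoder, so the file on disk is no longer a valid zip, yet reading it back through the same UTF-8 decoder restores the original string. This uses Node's Buffer as a stand-in for ADODB.Stream and is not a statement about ADODB.Stream internals.

```javascript
// Six characters, each with a code <= 0xFF, standing for six zip bytes.
var binaryString = "PK\u0003\u0004\u0089\u00ff";

// Written with a one-byte-per-character charset: 6 bytes, unchanged.
var writtenAsLatin1 = Buffer.from(binaryString, "latin1");

// Written as UTF-8: characters >= 0x80 become two bytes each
// (0x89 -> C2 89, 0xFF -> C3 BF), so the file holds 8 bytes.
var writtenAsUtf8 = Buffer.from(binaryString, "utf8");

// Reading the UTF-8 file back as UTF-8 undoes the expansion, which would
// explain why the same script can still load an archive that other tools
// reject.
var readBackAsUtf8 = writtenAsUtf8.toString("utf8");

console.log(writtenAsLatin1.length, writtenAsUtf8.length);
console.log(readBackAsUtf8 === binaryString);
```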

dduponchel · 2016-01-13T21:25:30Z

Ideally, you shouldn't have any charset issue as you handle binary data. A charset is only useful when you transform text to and from binary data, that's why the ReadText method looks suspicious. If this method corrupts the data, we should try Read.

JSZip doesn't know how to handle a Variant, but you could try to convert it to an array of integers (<= 255). That should be a simple loop to iterate and copy all values in an array (but I can't find any example actually using a Variant).
Then, you can give this array to zip.load(array, {checkCRC32: true}).
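
As a rough sketch of the "simple loop" idea, assuming the source behaves like an array-like object with a length and indexed access (which, as later comments in this thread show, is not how a Variant is actually read from script), the copy could look like the following. The helper name toByteArray is hypothetical and is not part of JSZip.

```javascript
// Copies an array-like source into a plain JS array and checks that
// every item is an integer between 0 and 255.
function toByteArray(source) {
  var result = new Array(source.length);
  for (var i = 0; i < source.length; i++) {
    var value = source[i];
    var isByte = typeof value === "number" &&
      value % 1 === 0 && value >= 0 && value <= 255;
    if (!isByte) {
      throw new RangeError("Item " + i + " is not a byte: " + value);
    }
    result[i] = value;
  }
  return result;
}

// Example with the four bytes of a zip local file header signature.
var bytes = toByteArray([0x50, 0x4b, 0x03, 0x04]);
console.log(bytes.length);
```

The resulting array is what would be passed to zip.load(array, {checkCRC32: true}) as suggested above.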

nikchuk · 2016-01-14T09:37:17Z

I found a tricky way to convert this Variant to a normal JS array where each item is an integer <= 255... But when I try to load it with zip.load(array, {checkCRC32: true}), JSZip recognizes it as a uint8array, which is not supported in my case (not a browser). As a result, the exception "uint8array is not supported by this browser" occurs.

nikchuk · 2016-01-14T17:21:57Z

Since pako supports regular JS arrays, maybe it makes sense to support them in JSZip too. It would add one more use case for such specific applications.

In the Stuk#250 case, we don't have a fancy Uint8Array but we have an unsupported binary format. An array of bytes (numbers between 0 and 255) is the lowest common denominator. A binary string would be awkward to build here, and building it reliably can be tricky (without filling the stack or taking too much time/memory). This commit adds the missing `arrayReader` needed here.
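
To give an idea of what a reader over a plain array of bytes involves, here is a minimal sketch. SimpleArrayReader and its methods are hypothetical names chosen for illustration; this is not the `arrayReader` from the commit, only an assumption about the general shape such a reader can take (a cursor, a bounds check, and little-endian integer reads, since zip stores its integers in little-endian order).

```javascript
// Hypothetical reader over a plain array of integers in [0, 255].
function SimpleArrayReader(bytes) {
  this.bytes = bytes;
  this.index = 0;
}

// Throws when fewer than `size` bytes remain, with a message modelled
// on the exception quoted earlier in this thread.
SimpleArrayReader.prototype.checkOffset = function (size) {
  var asked = this.index + size;
  if (asked > this.bytes.length) {
    throw new Error("End of data reached (data length = " +
      this.bytes.length + ", asked index = " + asked + "). Corrupted zip ?");
  }
};

// Reads `size` bytes as a little-endian unsigned integer and advances.
SimpleArrayReader.prototype.readInt = function (size) {
  this.checkOffset(size);
  var result = 0;
  for (var i = this.index + size - 1; i >= this.index; i--) {
    // Multiplication instead of << keeps 4-byte values unsigned.
    result = result * 256 + this.bytes[i];
  }
  this.index += size;
  return result;
};

// The local file header signature followed by a 2-byte field.
var reader = new SimpleArrayReader([0x50, 0x4b, 0x03, 0x04, 0x14, 0x00]);
var signature = reader.readInt(4);
var nextField = reader.readInt(2);
console.log(signature.toString(16), nextField);
```

A corrupted or truncated input shows up as a read past the end of the array, which is the situation the "End of data reached" message reports.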

dduponchel · 2016-01-14T20:02:57Z

> As a result, the exception "uint8array is not supported by this browser" occurs.

Sorry, I forgot that the fallback was the Uint8ArrayReader. I tested the code in nodejs REPL... which supports Uint8Array.

> Since pako supports regular JS arrays, maybe it makes sense to support them in JSZip too. It would add one more use case for such specific applications.

We already use pako's support of arrays on platforms that don't support Uint8Arrays. We don't support arrays in the load method because we never needed to :)

Could you check if it works with this branch (I built the dist files here) ?

Out of curiosity, how do you read a Variant object ?

nikchuk · 2016-01-14T20:24:37Z

Thanks, I will check it tomorrow and give you feedback then.
Concerning the reading of the Variant: I found an idea and a general implementation on a forum and extended it a bit. It works like this:

var bogusWindows1252chars = "\u20AC\u201A\u0192\u201E\u2026\u2020\u2021" +
    "\u02C6\u2030\u0160\u2039\u0152\u017D" +
    "\u2018\u2019\u201C\u201D\u2022\u2013\u2014" +
    "\u02DC\u2122\u0161\u203A\u0153\u017E\u0178";
var correctLatin1chars = "\u0080\u0082\u0083\u0084\u0085\u0086\u0087" +
    "\u0088\u0089\u008A\u008B\u008C\u008E" +
    "\u0091\u0092\u0093\u0094\u0095\u0096\u0097" +
    "\u0098\u0099\u009A\u009B\u009C\u009E\u009F";

// This turns a string read as codepage 1252 into a boxed string with a
// byteAt method.  We also modify the slice method to return a similar object.
function binaryString(str)
{
    var r = str ? new String(str) : new String();     // always return an object with a .length
    r.byteAt = function(index)
    {
        // translate character back to originating Windows-1252 byte value
        if (this.charCodeAt(index) <= 255)
            return this.charCodeAt(index);

        var p = bogusWindows1252chars.indexOf(this.charAt(index));
        return correctLatin1chars.charCodeAt(p);
    };
    r.slice  = function(start, end)
    {
        return binaryString(this.substring(start, end));
    };
    return r;
}

// Does reverse translation from bytes back to Windows-1252 characters.  You can
// build up a string to write back to disk by concatenating a bunch of these.
function fromByte(num)
{
    var c = String.fromCharCode(num);
    var p = correctLatin1chars.indexOf(c);
    return p >= 0 ? bogusWindows1252chars.charAt(p) : c;
}

var binstream = new ActiveXObject("ADODB.Stream");
binstream.Type = 2 /*adTypeText*/;
binstream.Charset = "iso-8859-1";   // actually Windows codepage 1252
binstream.Open();
binstream.LoadFromFile(zipPath);

var content = binaryString(binstream.ReadText());
binstream.Close();
binstream = null;

var arr = [];
for(var i = 0; i < content.length; i++)
{
    arr.push(content.byteAt(i));
}

// arr repeats content of Variant which can be read as
// var binstream = new ActiveXObject("ADODB.Stream");
// binstream.Type = 1;
// binstream.Open();
// binstream.LoadFromFile(zipPath);
//var variant = binstream.Read();

So, I assume that if an Array of Byte Variant is received from somewhere, it can be written into a binary Stream and then converted into an array in a similar way.
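
The translation step above can be tried outside the ActiveX environment. The following is a compact sketch, runnable in Node.js, that reuses the two character tables from the code above and turns a string decoded as Windows-1252 back into an array of bytes. The function name binaryStringToBytes and the lookup object are illustrative assumptions, and the sample input is made up.

```javascript
// Characters that Windows-1252 produces for bytes 0x80..0x9F, and the
// byte values they came from (same tables as in the code above).
var cp1252Chars = "\u20AC\u201A\u0192\u201E\u2026\u2020\u2021" +
  "\u02C6\u2030\u0160\u2039\u0152\u017D" +
  "\u2018\u2019\u201C\u201D\u2022\u2013\u2014" +
  "\u02DC\u2122\u0161\u203A\u0153\u017E\u0178";
var originalBytes = "\u0080\u0082\u0083\u0084\u0085\u0086\u0087" +
  "\u0088\u0089\u008A\u008B\u008C\u008E" +
  "\u0091\u0092\u0093\u0094\u0095\u0096\u0097" +
  "\u0098\u0099\u009A\u009B\u009C\u009E\u009F";

// Build a character -> byte lookup once.
var charToByte = {};
for (var k = 0; k < cp1252Chars.length; k++) {
  charToByte[cp1252Chars.charAt(k)] = originalBytes.charCodeAt(k);
}

// Converts the decoded string into a plain array of bytes.
function binaryStringToBytes(str) {
  var bytes = new Array(str.length);
  for (var i = 0; i < str.length; i++) {
    var code = str.charCodeAt(i);
    if (code <= 255) {
      bytes[i] = code;
    } else if (Object.prototype.hasOwnProperty.call(charToByte, str.charAt(i))) {
      bytes[i] = charToByte[str.charAt(i)];
    } else {
      throw new Error("Unexpected character at index " + i);
    }
  }
  return bytes;
}

// "PK\x03\x04" followed by bytes 0x80, 0x9F and 0xFF as decoded text.
var sample = binaryStringToBytes("PK\u0003\u0004\u20AC\u0178\u00FF");
console.log(sample);
```

One deliberate difference from the code above: a character that is neither <= 255 nor in the table raises an error here, instead of yielding NaN from charCodeAt(-1).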

nikchuk · 2016-01-15T08:33:54Z

Thank you for the support! It works fine now.

> ZipFailed.zip is opened by both, they show the Test.xml / TestImg.png files
> ZipEmpty.zip is opened by both, they show the Test.xml file
> ZipOK.zip is not opened by JSZip (End of data reached (data length = 328, asked index = 44482). Corrupted zip ?) and is not opened by zipinfo (error [ZipOK.zip]: missing 44231 bytes in zipfile)

I have the same result now.

dduponchel · 2016-01-15T13:44:47Z

Cool ! I'll create a pull request with this fix.

For the variant transformation, I thought it would be something like

var variant = binstream.Read();
var result = new Array(variant.Length);
for (var i = 0; i < variant.Length; i++) {
  result[i] = variant[i]; // or getByte(i) ? byteAt(i) ? I don't know how to get values
}
new JSZip(result);

I was wrong :)

dduponchel · 2016-04-11T19:51:59Z

Released in v2.6.0.

dduponchel mentioned this issue Jan 19, 2016

Add support of Array in JSZip#load. #252

Merged

dduponchel closed this as completed Apr 11, 2016

mbarnig mentioned this issue Aug 3, 2017

Error: End of data reached. Corrupted zip ? mbarnig/dumpZIPwithDICOMfiles#1

Closed

alexisrolland mentioned this issue Aug 11, 2019

Cannot unzip archive not created by JSZip #613

Closed

deanshub mentioned this issue Dec 12, 2020

Using jszip on files sent from the server #729

Closed
