Parsing of archive structure failed #27

Closed
rahulmutt opened this issue Jan 18, 2017 · 35 comments · Fixed by #28

@rahulmutt
Contributor

Parsing of archive structure failed:
Cannot locate end of central directory

This error is coming up for Eta users and moreover it's not very consistent - it happens for only a handful of people. Initially, it was happening only on Windows, but we recently got a repro case on Linux.

The last time I filed an issue, I changed the algorithm to select the largest jar file among a set of jar files to merge as the base and merged the rest into that one. After this change, I noticed that the Windows build failure for Eta (due to the error above) progressed to one more step (the step it was able to pass was the case of merging a single zip file). This implies usage of the zip API is somehow causing corrupt zip files on some specific platform configurations. Because it's inconsistent, I'm wondering whether it is a problem with some of the transitive C dependencies that zip uses. What do you think may be causing it?

In the meantime, I'll try to make a minimal test case of the zip library where I create an empty zip file and add a single entry and have the people with the affected systems try compiling that program and opening up the zip file.

@mrkkrp
Owner

mrkkrp commented Jan 18, 2017

Thanks for this report. According to the exception you quote, it should be a problem with the zip library itself, not its transitive C dependencies (we use zlib under the hood via conduit, but it comes into play only after we have located all the entries). I don't want to speculate about why this might be happening; I'll wait for a reproducing program or archive.

@rahulmutt
Contributor Author

rahulmutt commented Jan 18, 2017

For the full debugging data, you can take a look here: typelead/eta#184

The most important part I'll reproduce here:

john@Fetisov ~/Projects/eta/testZip/extractTest $ xxd -p Tuple.jar
504b06062c000000000000002e002d000000000000000000000000000000
0000000000000000000000000000000000000000000000000000504b0607
00000000000000000000000001000000504b05060000000000000000ffff
ffff000000000000

This is 98 bytes. An empty zip file will be 22 bytes.

Tuple.jar is the one that failed to parse. Note that Tuple.jar was generated using zip library invoking the following APIs:

createArchive p (return ())
withArchive p $ do
  forM_ files $ \(path, contents) -> do
    entrySel <- mkEntrySelector path
    addEntry compress contents entrySel

Let me know if it is at all possible for the code above to generate the output above for suitable values of the free variables.

EDIT: compress = Deflate in the case above.

@mrkkrp
Owner

mrkkrp commented Jan 19, 2017

Sorry, I don't understand. You claim that zip generated the file and at the same time ask whether it's possible that it generated it? Can you please share the Tuple.jar file with me? Do you have a complete program that creates such an archive that cannot be parsed back? From the hex dump I see that it clearly has an end of central directory record; 504b0506 looks like its signature (little-endian form).

I haven't read the entire thread you link to, but is it possible that the file is incomplete? You say it should be much bigger.
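To make the signature observation concrete, here is a minimal sketch that decodes the quoted hex dump and reports where the three record signatures occur. The helper names (`decodeHex`, `findAll`) are made up for this illustration and are not part of the zip library; the signatures themselves are the standard ZIP ones.

```haskell
import Data.Char (digitToInt)
import Data.List (isPrefixOf, tails)
import Data.Word (Word8)

-- Hex dump of Tuple.jar exactly as quoted in the thread.
tupleJarHex :: String
tupleJarHex = concat
  [ "504b06062c000000000000002e002d000000000000000000000000000000"
  , "0000000000000000000000000000000000000000000000000000504b0607"
  , "00000000000000000000000001000000504b05060000000000000000ffff"
  , "ffff000000000000"
  ]

-- Decode pairs of hex digits into bytes (assumes valid digits, even length).
decodeHex :: String -> [Word8]
decodeHex (a:b:rest) =
  fromIntegral (digitToInt a * 16 + digitToInt b) : decodeHex rest
decodeHex _ = []

-- All byte offsets at which the pattern occurs (hypothetical helper).
findAll :: [Word8] -> [Word8] -> [Int]
findAll pat bytes =
  [i | (i, t) <- zip [0 ..] (tails bytes), pat `isPrefixOf` t]

-- Standard ZIP signatures, in the byte order they appear on disk.
zip64EcdSig, zip64LocatorSig, ecdSig :: [Word8]
zip64EcdSig     = [0x50, 0x4b, 0x06, 0x06]
zip64LocatorSig = [0x50, 0x4b, 0x06, 0x07]
ecdSig          = [0x50, 0x4b, 0x05, 0x06]
```

Evaluating `findAll ecdSig (decodeHex tupleJarHex)` in GHCi shows where the classic ECD starts; the same call with the other two signatures shows that a Zip64 ECD record and a Zip64 locator precede it in the file.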

@mrkkrp
Owner

mrkkrp commented Jan 19, 2017

I have my doubts, though, that zip could generate something it can't parse back. There are quite a few generative tests that check this, for example: https://github.com/mrkkrp/zip/blob/master/tests/Main.hs#L629-L639. I'd be really surprised if you could prove that something like this happens.

@rahulmutt
Contributor Author

I used xxd -r -p to regenerate the hexdump above to Tuple.jar and stashed it in this repo.

Please clone that repo, and run stack build && stack exec zip-bench to get the error.

I'm starting to have a feeling it's from my side, but it seems strange that it only happens on certain platforms. I'm waiting on more information from the guy to trace the bug to a certain execution point.

@mrkkrp
Owner

mrkkrp commented Jan 19, 2017

Looks like it fails to detect the ECD because it's placed at an offset that is not end - n * 4, where n is a natural number. To me it looks like the archive is corrupted or somehow truncated.

@mrkkrp
Owner

mrkkrp commented Jan 22, 2017

@rahulmutt Any progress with your investigation?

@rahulmutt
Contributor Author

rahulmutt commented Jan 28, 2017

@mrkkrp Some very interesting stuff turned up. First, I changed the code I had above from

createArchive p (return ())
withArchive p $ do
  forM_ files $ \(path, contents) -> do
    entrySel <- mkEntrySelector path
    addEntry compress contents entrySel

to

createArchive p $ do
  forM_ files $ \(path, contents) -> do
    entrySel <- mkEntrySelector path
    addEntry compress contents entrySel

and it ended up creating almost the correct jar with a bit of junk. In particular, there's a file called Magic.jar. On my laptop where the jar is not corrupted, the size is 2,133 bytes. On the system where the jar is corrupted, the size is 2,209. You can see the difference in the hex dumps in this Gist.

One interesting observation: In the previous info I gave you, an empty jar is 22 bytes and an empty corrupted jar was 98 bytes. Meaning 98 - 22 = 76 bytes somehow appeared. If you take a look at Magic.jar, note that 2,209 - 2,133 = 76! So it certainly looks like an issue with how the CEN record is generated.

Do you have any idea why those extra bytes may be generated on some systems and not others?

Let me know if you need any more info from my side.

@mrkkrp
Owner

mrkkrp commented Jan 28, 2017

This is interesting. I'm not sure why this could happen, the code that generates central directory records is pure Haskell, there should be nothing platform-dependent. I need to take a closer look though.

@mrkkrp
Owner

mrkkrp commented Feb 4, 2017

Can we try to run zip's test suite on one of the machines that produce invalid output?

@rahulmutt
Contributor Author

Yeah, of course. You can even add debugging logs inside the zip library, and I can create a branch on the eta repo that uses it via a git dependency and forward it to the people who are facing the problem. I'm also able to repro it on Eta's Appveyor build.

@mrkkrp
Owner

mrkkrp commented Feb 5, 2017

Great, can you then ask the people who are facing the issue to run the test suite on their machines?

@jneira

jneira commented Feb 6, 2017

Hi, i reported the original issue with eta and running the zip test suite i've got this output: https://gist.github.com/jneira/2d2a1c64895b480919710d10f353efcf

@mrkkrp
Owner

mrkkrp commented Feb 6, 2017

OK, I consider it a bug now. The next thing I'll try is to reproduce it on a Windows machine I have access to. Thanks for reporting this.

@mrkkrp mrkkrp added the bug label Feb 6, 2017
@mrkkrp
Owner

mrkkrp commented Feb 6, 2017

Actually I'll probably add Appveyor testing, it should catch this.

@mrkkrp mrkkrp self-assigned this Feb 6, 2017
@mrkkrp mrkkrp added this to the 0.1.6 milestone Feb 6, 2017
@mrkkrp
Owner

mrkkrp commented Feb 7, 2017

I was able to reproduce the issue. The fix is coming.

@mrkkrp
Owner

mrkkrp commented Feb 7, 2017

FTR, what is happening is that the central directory for some reason (not sure why yet) is considered too long to be stored without the Zip64 extension (longer than 0xffffffff, apparently). It's visible in the first dump here as well: it's the ffffffff sequence of bytes near the end.

So we write Zip64 end of central directory and Zip64 end of central directory locator before writing the vanilla ECD. With these though, the algorithm for ECD detection chokes (not sure exactly why).

I have modified the Travis script to make a build when we unconditionally use Zip64 (because it's kinda difficult to catch this with normal generative tests, we need something big), this is tested along with normal tests now. It fails as it should:

https://travis-ci.org/mrkkrp/zip/builds/199397711

the set of failures is similar to https://gist.github.com/jneira/2d2a1c64895b480919710d10f353efcf

Now I need to answer the questions:

  1. Why ECD detection fails, when it certainly should not. (Then fix this.)
  2. Why on earth we get such a big central directory size; even the whole
    file as provided in the dumps is not that big.

@mrkkrp
Owner

mrkkrp commented Feb 7, 2017

@jneira In the meantime, could you add print cdSize just above https://github.com/mrkkrp/zip/blob/master/Codec/Archive/Zip/Internal.hs#L487 and run the test suite again? It's curious why this value seems to be so high on Windows. I wonder if it may be a bug in the cereal library.

@mrkkrp
Owner

mrkkrp commented Feb 8, 2017

@rahulmutt I believe I have fixed at least part of the issue in 39ffb9a. The 76 bytes is the combined size of the Zip64 end of central directory record and the Zip64 ECD locator. For some reason the lib decides to use the Zip64 extension, and there is indeed a bug preventing it from reading empty Zip64 archives (this is also the cause of most of the test failures @jneira showed us). I fixed this bug and it allowed me to load Tuple.jar without issues. I wonder why the lib would decide to use Zip64? Here is the code that makes the decision:

https://github.com/mrkkrp/zip/blob/master/Codec/Archive/Zip/Internal.hs#L484-L486

We need either the number of files to be greater than 0xffff, or the size of the central directory to be greater than 0xffffffff, or its offset to be greater than 0xffffffff. The dump https://gist.github.com/rahulmutt/1ef23bb08552348cdb3c7e8492a4d65d suggests that the size of the central directory is greater than 0xffffffff, but I don't know why; the entire file is too short to contain such a long central directory.
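As a quick check of the 76-byte figure, the arithmetic can be written out as follows. This is only a sketch: the record sizes are the fixed sizes from the ZIP format (a Zip64 ECD record with no extensible data, and a classic ECD with no comment), and the constant names are invented for this example rather than taken from the library.

```haskell
-- Fixed record sizes in bytes, per the ZIP format.
ecdSize, zip64EcdSize, zip64LocatorSize :: Int
ecdSize          = 22  -- classic end of central directory, empty comment
zip64EcdSize     = 56  -- Zip64 end of central directory, no extensible data
zip64LocatorSize = 20  -- Zip64 end of central directory locator

-- Extra bytes written when the Zip64 records are emitted before the ECD.
zip64Overhead :: Int
zip64Overhead = zip64EcdSize + zip64LocatorSize

-- File sizes reported earlier in the thread.
emptyGood, emptyBad, magicGood, magicBad :: Int
emptyGood = 22
emptyBad  = 98
magicGood = 2133
magicBad  = 2209

-- True when both reported size differences equal the Zip64 overhead.
overheadExplainsBoth :: Bool
overheadExplainsBoth =
  emptyBad - emptyGood == zip64Overhead
    && magicBad - magicGood == zip64Overhead
```

Both size differences reported in the thread match the overhead of the two Zip64 records, which is consistent with the files being otherwise intact.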

@mrkkrp
Owner

mrkkrp commented Feb 8, 2017

@rahulmutt Please provide the first file from https://gist.github.com/rahulmutt/1ef23bb08552348cdb3c7e8492a4d65d as a normal file. I tried to restore it from the dump, but xxd -r -p restored something different:

00000000: 0000 0000 504b 0304 1400 0000 0800 4072  ....PK........@r
00000010: 3c4a f0b8 0000 0010 e8a0 7901 0000 8a02  <J........y.....
00000020: 0000 2000 1400 6768 0000 0020 637a 6d70  .. ...gh... czmp
00000030: 7269 6d2f 6768 632f 4d61 6769 0000 0030  rim/ghc/Magi...0
00000040: 6324 696e 6c69 6e65 2e63 6c61 7373 0100  c$inline.class..
00000050: 0000 0040 1000 8a02 0000 0000 0000 7901  ...@..........y.
00000060: 0000 0000 0000 0050 0000 7d91 4b4b c350  .......P..}.KK.P
00000070: 1085 cfd8 b469 637c 0000 0060 b6b6 bedf  .....ic|...`....
00000080: 8fd6 8a51 5c2a 8214 8482 dda8 0000 0070  ...Q\*.........p
00000090: 745b aef5 1a83 691a 925b b0fe 2637 7551  t[....i..[..&7uQ
000000a0: 0000 0080 c185 3fc0 df24 e2a4 5610 acde  ......?..$..V...
000000b0: c599 0c73 0000 0090 e6cb e5dc b78f 9757  ...s...........W
000000c0: 0007 c813 128e e73a 0000 00a0 9ed4 1123  .......:.......#
000000d0: a4ed dbfa 43c3 0f9c 86c5 5f56 0000 00b0  ....C....._V....
000000e0: 45d8 4e5d 479c 303b 60b0 febd a813 a6a5  E.N]G.0;`.......
000000f0: 0000 00c0 1256 d0f2 94d3 9096 f07d b76d  .....V.......}.m
00000100: 5d28 fbb4 0000 00d0 e531 ffc8 f11c 754c  ](.......1....uL
00000110: 88e5 0b55 0309 0c9b 0000 00e0 4822 4588  ...U........H"E.
00000120: 4b4f c980 b094 3ffb b91d 2a3b 0000 00f0  KO....?...*;....
00000130: da2d 3579 7aaf 0e0b 551d 637c 81bf 2d06  .-5yz...U.c|..-.
00000140: 0000 0100 4631 6162 1269 029d 47bc 7261  ....F1ab.i..G.ra
00000150: 20d1 6d86 f1ab 0000 0110 ad40 1eea 98fa   .m........@....
00000160: 83f7 6588 78b9 8837 4d58 0000 0120 ce97  ..e.x..7MX... ..
00000170: ff43 15aa 29cc 625e c71c 21f7 3b82 0000  .C..).b^..!.;...
00000180: 0130 9348 4d2c 6091 9012 7e6d af76 2342  .0.HM,`...~m.v#B
00000190: c569 0000 0140 fd62 9eab 90c3 e2ab 2d13  .i...@.b......-.
000001a0: b283 8706 96b0 0000 0150 6ac2 c408 2169  .........Pj...!i
000001b0: 4b75 1238 aadd 4bb5 4cd0 0000 0160 4acd  Ku.8..K.L....`J.
000001c0: 6b49 18b9 50a2 7e57 11fe a5b8 72b9 0000  kI..P.~W....r...
000001d0: 0170 37cb 9e27 8392 2bc2 5086 d887 c6f1  .p7..'..+.P.....
000001e0: 4727 0000 0180 068a de80 7583 bb09 aec4  G'........u.....
000001f0: 35be fd0c a3d3 0000 0190 3350 f49f fe78  5.........3P...x
00000200: 97ed 31ae 9962 bc8b f14a 0000 01a0 51db  ..1..b...J....Q.
00000210: e922 fb84 9962 172b 9d9e 7793 3509 0000  ."...b.+..w.5...
00000220: 01b0 7a47 4247 86fb 35ac f777 47fb e821  ..zGBG..5..wG..!
00000230: edb1 0000 01c0 cfdd 6235 b86a 1862 4d7d  ........b5.j.bM}
00000240: 0250 4b03 0414 0000 01d0 0000 0008 0040  .PK............@
00000250: 723c 4a7e 32e4 b778 0100 0000 01e0 0088  r<J~2..x........
00000260: 0200 001f 0014 0067 6863 7a6d 7072 0000  .......ghczmpr..
00000270: 01f0 696d 2f67 6863 2f4d 6167 6963 246c  ..im/ghc/Magic$l
00000280: 617a 0000 0200 7a79 2e63 6c61 7373 0100  az....zy.class..
00000290: 1000 8802 0000 0000 0210 0000 0000 7801  ..............x.
000002a0: 0000 0000 0000 7d91 4b4b 0000 0220 c350  ......}.KK... .P
000002b0: 1085 cfd8 b469 637c d7d6 f7fb d15a 0000  .....ic|.....Z..
000002c0: 0230 318a 4b45 9082 50b0 1b95 6ee5 5aaf  .01.KE..P...n.Z.
000002d0: 3198 0000 0240 a621 b905 dbdf e4a6 2e2a  1....@.!.......*
000002e0: b8f0 07f8 9b44 0000 0250 9cd4 2c04 ab77  .....D...P..,..w
000002f0: 7126 c39c f96e 38f7 fdf3 0000 0260 f50d  q&...n8......`..
00000300: c021 0a84 a42b 3a9d b68e 0461 cabe 0000  .!...+:....a....
00000310: 0270 af77 1a7e e034 2cfe b2aa c276 ea3a  .p.w.~.4,....v.:
00000320: 9284 0000 0280 d901 838d 784f 27cc 4825  ..........xO'.H%
00000330: aca0 e529 a721 0000 0290 2de1 fb6e dbba  ...).!....-..n..
00000340: 54f6 59cb 23a4 8e1d cf51 0000 02a0 2784  T.Y.#....Q....'.
00000350: 44a1 5833 90c2 b089 3432 7cad f494 0000  D.X3....42|.....
00000360: 02b0 0c08 cb85 f39f dba1 b2a3 dd72 93a7  .............r..
00000370: 8fea 0000 02c0 a858 d331 4698 fbdb 6260  .......X.1F...b`
00000380: 1413 2626 3145 0000 02d0 a08b 8857 290e  ..&&1E.......W).
00000390: 24ba cdb0 15c8 231d d37f 0000 02e0 f0be  $.....#.........
000003a0: 0d11 2f1f f166 082b 85ca 7fa8 622d 0000  ../..f.+....b-..
000003b0: 02f0 8339 2ce8 9827 e47f 4770 1aa9 8945  ...9,..'..Gp...E
000003c0: 2c11 0000 0300 32c2 bfde bfbe 13a1 e2b4  ,.....2.........
000003d0: 7e31 2f54 c861 0000 0310 f1af ad10 7283  ~1/T.a........r.
000003e0: 8706 96b1 66c2 c408 216d 0000 0320 4b75  ....f...!m... Ku
000003f0: 1a38 aadd 4fb5 42d0 cacd 5b49 18b9 0000  .8..O.B...[I....
00000400: 0330 54a2 fe50 15fe 95b8 71b9 372b 9e27  .0T..P....q.7+.'
00000410: 83b2 0000 0340 2bc2 5086 3880 c6f1 4727  .....@+.P.8...G'
00000420: 018a de80 7593 0000 0350 bb09 aec4 35b9  ....u....P....5.
00000430: f302 a3db 3750 744f 3cde 0000 0360 637b  ....7PtO<....`c{
00000440: 826b b694 ec61 bc5a d276 7bc8 3d63 0000  .k...a.Z.v{.=c..
00000450: 0370 b6d4 c36a b7ef dd62 4d83 3e90 d291  .p...j...bM.>...
00000460: e57e 0000 0380 1d1b f1ee 688c 1ed2 9e62  .~........h....b
00000470: ee36 abc1 55c3 0000 0390 106b e60b 504b  .6..U......k..PK
00000480: 0304 1400 0000 0800 4172 0000 03a0 3c4a  ........Ar....<J
00000490: 95a7 1acb 7b01 0000 8c02 0000 2100 0000  ....{.......!...
000004a0: 03b0 1400 6768 637a 6d70 7269 6d2f 6768  ....ghczmprim/gh
000004b0: 632f 0000 03c0 4d61 6769 6324 6f6e 6553  c/....Magic$oneS
000004c0: 686f 742e 636c 0000 03d0 6173 7301 0010  hot.cl....ass...
000004d0: 008c 0200 0000 0000 007b 0000 03e0 0100  .........{......
000004e0: 0000 0000 007d 914b 4bc3 5010 85cf 0000  .....}.KK.P.....
000004f0: 03f0 d8b4 6963 7c54 6beb fb59 b5b5 6214  ....ic|Tk..Y..b.
00000500: 978a 0000 0400 2005 a160 3756 ba95 6bbd  ...... ..`7V..k.
00000510: c660 9b84 e416 0000 0410 d4df e4a6 2e2a  .`.............*
00000520: b8f0 07f8 9b44 9cd4 2282 0000 0420 d5bb  .....D..".... ..
00000530: 3873 8739 f325 9cfb f6f1 f20a 601f 0000  8s.9.%......`...
00000540: 0430 0582 eeb9 b276 e329 1d31 c2a4 7dd3  .0.....v.).1..}.
00000550: 7868 0000 0440 f981 d3b2 f866 5585 ed34  xh...@.....fU..4
00000560: 74c4 0973 0306 0000 0450 f9ef 4d9d 302d  t..s.....P..M.0-
00000570: 95b0 82b6 ab9c 96b4 84ef 0000 0460 37ef  .............`7.
00000580: ad9a b24f da2e 2171 e8b8 8e3a 22c4 0000  ...O..!q...:"...
00000590: 0470 0ac5 ba81 0486 4d24 9122 c4a5 ab64  .p......M$."...d
000005a0: 4058 0000 0480 2a9c fedc 0e95 1ded 963d  @X....*........=
000005b0: 9ede a983 625d 0000 0490 c718 61f6 6f8b  ....b]......a.o.
000005c0: 8151 a44d 4c60 9240 6711 0000 04a0 af52  .Q.ML`.@g......R
000005d0: 1c48 6c7a 613b 9007 3aa6 fee0 7d19 0000  .Hlza;..:...}...
000005e0: 04b0 225e 2ee2 4d13 960b 95ff 50c5 7a0a  .."^..M.....P.z.
000005f0: b398 0000 04c0 d731 47c8 fd8e e038 5213  .......1G....8R.
00000600: 0b58 24a4 847f 0000 04d0 b17b 712d 42c5  .X$........{q-B.
00000610: 69fd 629e a990 c3e2 5f5b 0000 04e0 2664  i.b....._[....&d
00000620: 070f 0d2c 61d5 8489 1142 d296 ea38 0000  ...,a....B...8..
00000630: 04f0 70d4 7d2f d50a 412b 7b57 9230 5253  ..p.}/..A+{W.0RS
00000640: a271 0000 0500 5b15 feb9 b86c 726f 565c  .q....[....lroV\
00000650: 5706 e5a6 0843 0000 0510 1962 0f1a c71f  W....C.....b....
00000660: 9d18 287a 03d6 75ee d25c 0000 0520 896b  ..(z..u..\... .k
00000670: 7ceb 1946 a767 a0e8 3bfd f10e db63 0000  |..F.g..;....c..
00000680: 0530 5c33 a578 17e3 d592 b6dd 45f6 0933  .0\3.x......E..3
00000690: a52e 0000 0540 563a 3def 066b 12f4 8e84  .....@V:=..k....
000006a0: 8e0c f76b c8f7 0000 0550 7747 fbe8 21ed  ...k.....PwG..!.
000006b0: b1cf dd64 35b8 6a18 624d 0000 0560 7d02  ...d5.j.bM...`}.
000006c0: 504b 0304 1400 0000 0800 4072 3c4a 0000  PK........@r<J..
000006d0: 0570 38f4 710e 5101 0000 1702 0000 1900  .p8.q.Q.........
000006e0: 1400 0000 0580 6768 637a 6d70 7269 6d2f  ......ghczmprim/
000006f0: 6768 632f 4d61 0000 0590 6769 632e 636c  ghc/Ma....gic.cl
00000700: 6173 7301 0010 0017 0200 0000 05a0 0000  ass.............
00000710: 0000 0051 0100 0000 0000 0075 514d 0000  ...Q.......uQM..
00000720: 05b0 4fc2 4010 7d4b 292d a57c 087e a480  ..O.@.}K)-.|.~..
00000730: 0286 0000 05c0 83c6 c31e 3c4a bc70 3291  ..........<J.p2.
00000740: 78c0 9878 324b 0000 05d0 dd94 6269 495b  x..x2K......biI[
00000750: 4ce4 5f19 0e26 92f8 03fc 0000 05e0 51c6  L._..&........Q.
00000760: 6981 78c1 dd6c de6c de9b 3733 99ef 0000  i.x..l.l..73....
00000770: 05f0 9fcf 2f00 97e8 3298 37be 2fc3 be27  ..../...2.7./..'
00000780: a248 0000 0600 460c aa27 168b 370d 5986  .H....F..'..7.Y.
00000790: 9a33 b617 d359 0000 0610 e84e 3945 7c20  .3...Y.....N9E|
000007a0: 1cd7 d690 63b0 7610 dd34 0000 0620 8f41  ....c.v..4... .A
000007b0: 0b7c 391c 07b1 863c 4363 9770 2360 0000  .|9....<Cc.p#`..
000007c0: 0630 c8b9 bee7 fa52 83c9 50df a5dc f225  .0.....R..P....%
000007d0: 86ca 0000 0640 44bc 0aee 09df e177 a389  .....@D......w..
000007e0: b429 5def d944 0000 0650 bbf1 3583 7276  .)]..D...P..5.rv
000007f0: fe60 4047 d544 0d15 72ee 0000 0660 a544  .`@G.D..r....`.D
00000800: 1e0a 0e4c 1ce2 88a1 bc29 fb64 7b41 0000  ...L.....).d{A..
00000810: 0670 340f 2543 f356 c682 8773 3f76 a792  .p4.%C.V...s?v..
00000820: 47b1 0000 0680 c387 b1d3 5fb3 571a 2c6a  G........._.W.,j
00000830: ea7f 8101 15d5 0000 0690 c4bf 69e2 38f1  ............i.8.
00000840: 2fa6 f36f dd0d 14d6 74db 0000 06a0 4427  /..o....t.....D'
00000850: a14b eb59 feaa 67fb c133 4171 180b 0000  .K.Y..g..3Aq....
00000860: 06b0 fb65 2066 f762 e449 d22a 28d2 6614  ...e f.b.I.*(.f.
00000870: 5889 ef00 0006 c02d 4556 d27e 8a9d 1419  X......-EV.~....
00000880: f228 a342 784a bf00 0006 d00b 6490 9cd6  .(.BxJ......d...
00000890: 0afa e307 f696 d85f 41a5 a800 0006 e0be  ......._A.......
000008a0: 4463 8502 4527 4bb4 de53 11a3 0bd4 c9dc  Dc..E'K..S......
000008b0: 0000 06f0 bc40 cf20 2395 3043 a813 d2c2  .....@. #.0C....
000008c0: 7e01 504b 0000 0700 0102 2e00 1400 0000  ~.PK............
000008d0: 0800 4072 3c4a f0b8 0000 0710 e8a0 7901  ..@r<J........y.
000008e0: 0000 8a02 0000 2000 0400 0000 0000 0720  ...... ........
000008f0: 0000 0000 0000 0000 0000 0000 6768 637a  ............ghcz
00000900: 0000 0730 6d70 7269 6d2f 6768 632f 4d61  ...0mprim/ghc/Ma
00000910: 6769 6324 0000 0740 696e 6c69 6e65 2e63  gic$...@inline.c
00000920: 6c61 7373 0100 0000 0000 0750 504b 0102  lass.......PPK..
00000930: 2e00 1400 0000 0800 4072 3c4a 0000 0760  ........@r<J...`
00000940: 7e32 e4b7 7801 0000 8802 0000 1f00 0400  ~2..x...........
00000950: 0000 0770 0000 0000 0000 0000 0000 cb01  ...p............
00000960: 0000 6768 0000 0780 637a 6d70 7269 6d2f  ..gh....czmprim/
00000970: 6768 632f 4d61 6769 0000 0790 6324 6c61  ghc/Magi....c$la
00000980: 7a7a 792e 636c 6173 7301 0000 0000 07a0  zzy.class.......
00000990: 0050 4b01 022e 0014 0000 0008 0041 723c  .PK..........Ar<
000009a0: 0000 07b0 4a95 a71a cb7b 0100 008c 0200  ....J....{......
000009b0: 0021 0004 0000 07c0 0000 0000 0000 0000  .!..............
000009c0: 0000 0094 0300 0067 0000 07d0 6863 7a6d  .......g....hczm
000009d0: 7072 696d 2f67 6863 2f4d 6167 0000 07e0  prim/ghc/Mag....
000009e0: 6963 246f 6e65 5368 6f74 2e63 6c61 7373  ic$oneShot.class
000009f0: 0000 07f0 0100 0000 504b 0102 2e00 1400  ........PK......
00000a00: 0000 0800 0000 0800 4072 3c4a 38f4 710e  ........@r<J8.q.
00000a10: 5101 0000 1702 0000 0000 0810 1900 0400  Q...............
00000a20: 0000 0000 0000 0000 0000 6205 0000 0820  ..........b....
00000a30: 0000 6768 637a 6d70 7269 6d2f 6768 632f  ..ghczmprim/ghc/
00000a40: 0000 0830 4d61 6769 632e 636c 6173 7301  ...0Magic.class.
00000a50: 0000 0050 0000 0840 4b06 062c 0000 0000  ...P...@K..,....
00000a60: 0000 002e 002d 0000 0000 0850 0000 0000  .....-.....P....
00000a70: 0000 0004 0000 0000 0000 0004 0000 0860  ...............`
00000a80: 0000 0000 0000 0041 0100 0000 0000 00fe  .......A........
00000a90: 0000 0870 0600 0000 0000 0050 4b06 0700  ...p.......PK...
00000aa0: 0000 003f 0000 0880 0800 0000 0000 0001  ...?............
00000ab0: 0000 0050 4b05 0600 0000 0890 0000 0004  ...PK...........
00000ac0: 0004 00ff ffff fffe 0600 0000 0000 08a0  ................
00000ad0: 00

I have no time or intention to fight this tool, I just need the file, desirably in its “proper” form and in its “corrupted” form.

@mrkkrp mrkkrp closed this as completed in #28 Feb 8, 2017
@mrkkrp
Owner

mrkkrp commented Feb 8, 2017

Going to release version 0.1.6 with the fix, if the problem persists, feel free to reopen.

@mrkkrp
Owner

mrkkrp commented Feb 8, 2017

@jneira FTR, I can't reproduce your failures on my Windows machine, but they should be fixed in version 0.1.6. It's still very interesting why cdSize contains such big values for you (if it does; you didn't provide the printout I requested, so I have no way to check the theory).

In any case, without further input, I can't really help much, one bug I've found and fixed, but there are still strange things I don't understand.

@jneira

jneira commented Feb 9, 2017

@mrkkrp hi, i've rerun the test suite after pulling (but without printing cdSize). The result is in https://gist.github.com/jneira/2d2a1c64895b480919710d10f353efcf (11 failures)
fyi i am on windows 10.0.14393 64 bits
I've added the print cdSize here (not exactly above line 487)

writeCD h comment m = do
  let cd = runPut (putCD m)
  cdOffset <- hTell h
  B.hPut h cd -- write central directory
  let totalCount = M.size m
      cdSize     = B.length cd
      needZip64  =
#ifdef USE_ZIP64_ECD
        True
#else
        totalCount  >= 0xffff
        || cdSize   >= 0xffffffff
        || cdOffset >= 0xffffffff
#endif
  print $ "cdSize:" ++ show cdSize
  when needZip64 $ do

i'll post the test suite output with the print line asap

@rahulmutt
Contributor Author

@jneira Also, give the following a try? We can see which condition is causing the Zip64 record to be used.

import Debug.Trace(traceShow)
...

writeCD h comment m = do
  let cd = runPut (putCD m)
  cdOffset <- hTell h
  B.hPut h cd -- write central directory
  let totalCount = M.size m
      cdSize     = B.length cd
      needZip64  =
#ifdef USE_ZIP64_ECD
        True
#else
        totalCount  >= 0xffff
        || cdSize   >= 0xffffffff
        || cdOffset >= 0xffffffff
#endif
  print $ "cdSize:" ++ show cdSize
  when (traceShow (totalCount, cdSize, cdOffset) needZip64) $ do

@jneira

jneira commented Feb 9, 2017

I've tried to run the test suite with the trace in win xp 32 bits but the test target fails:

[1 of 1] Compiling Main             ( tests\Main.hs, .stack-work\dist\2fae85dd\build\tests\tests-tmp\Main.o )
Warning: If linking fails, consider installing KB2533623.
ghc.EXE: unable to load package `time-1.6.0.1'
ghc.EXE: D:\stack\rs\programs\i386-windows\ghc-8.0.1\lib\time-1.6.0.1\HStime-1.6.0.1.o: unknown symbol `__localtime32'


--  While building package zip-0.1.6 using:
      D:\stack\rs\setup-exe-cache\i386-windows\Cabal-simple_Z6RU0evB_1.24.0.0_ghc-8.0.1.exe --builddir=.stack-work\dist\2fae85dd build
lib:zip test:tests --ghc-options " -ddump-hi -ddump-to-file"
    Process exited with code: ExitFailure 1

@mrkkrp
Owner

mrkkrp commented Feb 9, 2017

I had a different issue: snoyberg/bzlib-conduit#3, but also about unknown symbols. Not sure what's going on, Haskell on Windows seems to be still fragile :-|

@mrkkrp
Owner

mrkkrp commented Feb 9, 2017

@jneira What you see (in printout of test suite) proves that even without the new zip64-ecd flag zip library in your environment decides to use zip 64 even with empty archives. It's very interesting, we should investigate why.

@mrkkrp mrkkrp reopened this Feb 9, 2017
@rahulmutt
Contributor Author

@mrkkrp Even on my Windows machine I was unable to reproduce what @jneira and a couple others have reported. It's not a windows-specific issue, because it was reproduced on a 32-bit LinuxMint system.

@mrkkrp
Owner

mrkkrp commented Feb 9, 2017

Good, still it's quite strange. @jneira is this a 32 bit machine? Maybe it has something to do with that?

@rahulmutt
Contributor Author

rahulmutt commented Feb 9, 2017

I forked this repo, added some trace messages, and created a new branch on the eta repo called corrupt-jar.

Local working installation:

Configuring ghc-prim-0.4.0.0...
Building ghc-prim-0.4.0.0...
Preprocessing library ghc-prim-0.4.0.0...
[1 of 5] Compiling GHC.Types        ( GHC/Types.hs, dist/build/GHC/Types.jar )
(43,3490,16526,False,False,False)
[2 of 5] Compiling GHC.Tuple        ( GHC/Tuple.hs, dist/build/GHC/Tuple.jar )
(186,15034,131748,False,False,False)
[3 of 5] Compiling GHC.Magic        ( GHC/Magic.hs, dist/build/GHC/Magic.jar )
(4,321,1790,False,False,False)
[4 of 5] Compiling GHC.Classes      ( GHC/Classes.hs, dist/build/GHC/Classes.jar )
(550,55138,465428,False,False,False)
[5 of 5] Compiling GHC.CString      ( GHC/CString.hs, dist/build/GHC/CString.jar )
(27,2427,15620,False,False,False)
Linking dist/build/HSghc-prim-0.4.0.0.jar ...
(811,76480,631228,False,False,False)
In-place registering ghc-prim-0.4.0.0...

The Appveyor build:

Configuring ghc-prim-0.4.0.0...
Building ghc-prim-0.4.0.0...
Preprocessing library ghc-prim-0.4.0.0...
[1 of 5] Compiling GHC.Types        ( GHC\Types.hs, dist\build\GHC\Types.jar )
(43,3490,16526,False,True,False)
[2 of 5] Compiling GHC.Tuple        ( GHC\Tuple.hs, dist\build\GHC\Tuple.jar )
(186,15034,131748,False,True,False)
[3 of 5] Compiling GHC.Magic        ( GHC\Magic.hs, dist\build\GHC\Magic.jar )
(4,321,1790,False,True,False)
[4 of 5] Compiling GHC.Classes      ( GHC\Classes.hs, dist\build\GHC\Classes.jar )
(550,55138,465428,False,True,False)
[5 of 5] Compiling GHC.CString      ( GHC\CString.hs, dist\build\GHC\CString.jar )
(27,2427,15620,False,True,False)
Linking dist\build\HSghc-prim-0.4.0.0.jar ...
(811,76480,631228,False,True,False)
In-place registering ghc-prim-0.4.0.0...

The interesting observation here is that:

cdSize   >= 0xffffffff

returns true on the failing build even if cdSize is the same on both platforms!

@mrkkrp
Owner

mrkkrp commented Feb 9, 2017

OK, it means that 0xffffffff is inferred as Int and probably overflows on 32-bit systems (i.e. the literal is too big, so the real value that is used is probably -1).
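The overflow can be reproduced on any machine by using Int32 as a stand-in for the 32-bit Int. This is a minimal sketch, not the library's code; `limitAs` and `exceedsLimit` are hypothetical names introduced only for this illustration.

```haskell
import Data.Int (Int32, Int64)

-- The Zip64 threshold converted into whatever numeric type is requested.
-- For fixed-width types, fromInteger wraps around instead of failing.
limitAs :: Num a => a
limitAs = fromInteger 0xffffffff

-- The same comparison as in the thread, at a caller-chosen integer width.
exceedsLimit :: (Num a, Ord a) => a -> Bool
exceedsLimit n = n >= limitAs
```

With these definitions, `limitAs :: Int32` evaluates to -1, so `exceedsLimit (321 :: Int32)` is True, while `exceedsLimit (321 :: Int64)` is False. That matches the trace above, where the cdSize condition flips to True only on the failing build although cdSize itself is identical.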

@mrkkrp
Owner

mrkkrp commented Feb 9, 2017

This is easy to fix though, I'll probably replace all the literals with a constant of type Integer or Natural and then the type system will force me to do all the necessary casts.
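One way the idea could look, as a sketch only: the thresholds become Natural constants, and every operand has to be converted explicitly before it can be compared. The names `ffff`, `ffffffff` and `needsZip64` are assumptions for this example, and the actual change released in 0.1.7 may differ.

```haskell
import Numeric.Natural (Natural)

-- Thresholds as arbitrary-precision constants, so the literals cannot wrap.
ffff, ffffffff :: Natural
ffff     = 0xffff
ffffffff = 0xffffffff

-- Hypothetical decision function. The argument types mirror the values in
-- the thread: entry count and directory size as Int, offset as Integer.
-- Note: fromIntegral to Natural throws on negative input, which is
-- acceptable here because counts, sizes and offsets are never negative.
needsZip64 :: Int -> Int -> Integer -> Bool
needsZip64 totalCount cdSize cdOffset =
     fromIntegral totalCount >= ffff
  || fromIntegral cdSize     >= ffffffff
  || fromIntegral cdOffset   >= ffffffff
```

Because the comparison now happens at type Natural, forgetting a conversion is a compile-time type error rather than a silent wrap-around, which is the point of the proposed change.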

I'll push the fix very soon and release 0.1.7. Thanks a lot for reporting this and helping to investigate, @rahulmutt, @jneira.

@mrkkrp
Owner

mrkkrp commented Feb 9, 2017

@rahulmutt I believe we won't hear about this issue again. Make sure Eta requires version 0.1.7 of zip (just released) or newer.

@rahulmutt
Contributor Author

Just ran the Appveyor build on a new branch with the updated dependency and it's working. Will update the dependency on master now, thanks!

@jneira

jneira commented Feb 9, 2017

great! thanks for the support

@mrkkrp mrkkrp added the 32-bit label Mar 4, 2017