zip: Fix incorrect time/date, add extended timestamp and refactor
The MS-DOS time/date fields were read in the wrong order, and decoding did not take
into account that the bit ranges lie within little-endian shorts.

Remodel modification_time/date as one struct with fat_time and fat_date LE shorts,
synthetic values for day, hour, minute, etc., and a unix field holding the
timestamp as Unix time.

Also refactor and clean up the extra fields/extended code a bit.

Fixes #792
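
As an illustration of the decoding described in this message, here is a minimal Python sketch (not the actual fq implementation, which is written in Go). It assumes the standard MS-DOS bit layout: seconds/2, minutes and hours in the time short, and day, month and years since 1980 in the date short. The function name `decode_dos_datetime` and the returned dictionary keys are hypothetical, chosen to mirror the field names in the test output.

```python
import calendar
import struct


def decode_dos_datetime(raw: bytes) -> dict:
    """Decode 4 bytes: a little-endian FAT time short followed by a FAT date short."""
    # Read both shorts as little-endian first; the bit ranges apply to the
    # resulting 16-bit values, not to the bytes in file order.
    fat_time, fat_date = struct.unpack("<HH", raw)

    second = (fat_time & 0x1F) * 2        # bits 0-4, stored in 2-second units
    minute = (fat_time >> 5) & 0x3F       # bits 5-10
    hour = (fat_time >> 11) & 0x1F        # bits 11-15

    day = fat_date & 0x1F                 # bits 0-4
    month = (fat_date >> 5) & 0x0F        # bits 5-8
    year = 1980 + ((fat_date >> 9) & 0x7F)  # bits 9-15, years since 1980

    # Assumption: treat the local time as UTC, as the unix_guess field does.
    unix_guess = calendar.timegm((year, month, day, hour, minute, second, 0, 0, 0))

    return {
        "fat_time": fat_time,
        "fat_date": fat_date,
        "second": second,
        "minute": minute,
        "hour": hour,
        "day": day,
        "month": month,
        "year": year,
        "unix_guess": unix_guess,
    }
```

Applied to the bytes `c8 78 84 45` from the bigzero-zip.zip test, this yields fat_time 0x78c8 (15:06:16), fat_date 0x4584 (2014-12-04) and a UTC-based guess of 1417705576.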
wader committed Oct 23, 2023
1 parent 1a3823f commit a148f76
Showing 10 changed files with 1,342 additions and 825 deletions.
6 changes: 6 additions & 0 deletions doc/formats.md
@@ -1395,9 +1395,15 @@ Decode value as zip

Supports ZIP64.

## Timestamp and time zones

The timestamp accessed via `.local_files[].last_modification` is encoded in ZIP files using [MS-DOS representation](https://learn.microsoft.com/en-us/windows/win32/api/oleauto/nf-oleauto-dosdatetimetovarianttime) which lacks a known time zone. Probably the local time/date was used at creation. The `unix_guess` field in `last_modification` is a guess assuming the local time zone was UTC at creation.

### References
- https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT
- https://opensource.apple.com/source/zip/zip-6/unzip/unzip/proginfo/extra.fld
- https://formats.kaitai.io/dos_datetime/
- https://learn.microsoft.com/en-us/windows/win32/api/oleauto/nf-oleauto-dosdatetimetovarianttime


[#]: sh-end
69 changes: 42 additions & 27 deletions format/zip/testdata/bigzero-zip.zip.fqtest
@@ -20,14 +20,16 @@ $ fq -o uncompress=false dv bigzero-zip.zip
0x0000| 00 | . | language_encoding: false 0x7.4-0x7.5 (0.1)
0x0000| 00 | . | unused1: 0 0x7.5-0x8 (0.3)
0x0000| 08 00 | .. | compression_method: "deflated" (8) 0x8-0xa (2)
| | | last_modification_date{}: 0xa-0xc (2)
0x0000| c8 | . | hours: 25 0xa-0xa.5 (0.5)
0x0000| c8 78 | .x | minutes: 3 0xa.5-0xb.3 (0.6)
0x0000| 78 | x | seconds: 24 0xb.3-0xc (0.5)
| | | last_modification_time{}: 0xc-0xe (2)
0x0000| 84 | . | year: 66 0xc-0xc.7 (0.7)
0x0000| 84 45 | .E | month: 2 0xc.7-0xd.3 (0.4)
0x0000| 45 | E | day: 5 0xd.3-0xe (0.5)
| | | last_modification{}: 0xa-0xe (4)
0x0000| c8 78 | .x | fat_time: 0x78c8 0xa-0xc (2)
| | | second: 16 (8)
| | | minute: 6
| | | hour: 15
0x0000| 84 45 | .E | fat_date: 0x4584 0xc-0xe (2)
| | | day: 4
| | | month: 12
| | | year: 2014 (34)
| | | unix_guess: 1417705576 (Timestamp guess based on UTC)
0x0000| 54 81| T.| crc32_uncompressed: 0xae158154 0xe-0x12 (4)
0x0010|15 ae |.. |
0x0010| 4e 28 00 00 | N(.. | compressed_size: 10318 0x12-0x16 (4)
@@ -38,13 +40,19 @@ $ fq -o uncompress=false dv bigzero-zip.zip
0x0020|67 7a 65 72 6f 2e 7a 69 70 |gzero.zip |
| | | extra_fields[0:2]: 0x29-0x45 (28)
| | | [0]{}: extra_field 0x29-0x36 (13)
0x0020| 55 54 | UT | header_id: 0x5455 (extended timestamp) 0x29-0x2b (2)
0x0020| 09 00 | .. | data_size: 9 0x2b-0x2d (2)
0x0020| 03 57 6a| .Wj| data: raw bits 0x2d-0x36 (9)
0x0030|80 54 7e 6a 80 54 |.T~j.T |
0x0020| 55 54 | UT | tag: 0x5455 (extended timestamp) 0x29-0x2b (2)
0x0020| 09 00 | .. | size: 9 0x2b-0x2d (2)
| | | flags{}: 0x2d-0x2e (1)
0x0020| 03 | . | unused: 0 0x2d-0x2d.5 (0.5)
0x0020| 03 | . | creation_time_present: false 0x2d.5-0x2d.6 (0.1)
0x0020| 03 | . | access_time_present: true 0x2d.6-0x2d.7 (0.1)
0x0020| 03 | . | modification_time_present: true 0x2d.7-0x2e (0.1)
0x0020| 57 6a| Wj| modification_time: 1417701975 (2014-12-04T14:06:15Z) 0x2e-0x32 (4)
0x0030|80 54 |.T |
0x0030| 7e 6a 80 54 | ~j.T | access_time: 1417702014 (2014-12-04T14:06:54Z) 0x32-0x36 (4)
| | | [1]{}: extra_field 0x36-0x45 (15)
0x0030| 75 78 | ux | header_id: 0x7875 (UNIX UID/GID) 0x36-0x38 (2)
0x0030| 0b 00 | .. | data_size: 11 0x38-0x3a (2)
0x0030| 75 78 | ux | tag: 0x7875 (UNIX UID/GID) 0x36-0x38 (2)
0x0030| 0b 00 | .. | size: 11 0x38-0x3a (2)
0x0030| 01 04 74 00 00 00| ..t...| data: raw bits 0x3a-0x45 (11)
0x0040|04 14 00 00 00 |..... |
0x0040| ed dd bf aa 03 df bf df e7 ef 9c| ...........| compressed: raw bits 0x45-0x2893 (10318)
@@ -70,15 +78,17 @@ $ fq -o uncompress=false dv bigzero-zip.zip
0x2890| 00 | . | language_encoding: false 0x289c.4-0x289c.5 (0.1)
0x2890| 00 | . | unused1: 0 0x289c.5-0x289d (0.3)
0x2890| 08 00 | .. | compression_method: "deflated" (8) 0x289d-0x289f (2)
| | | last_modification_date{}: 0x289f-0x28a1 (2)
0x2890| c8| .| hours: 25 0x289f-0x289f.5 (0.5)
0x2890| c8| .| minutes: 3 0x289f.5-0x28a0.3 (0.6)
| | | last_modification{}: 0x289f-0x28a3 (4)
0x2890| c8| .| fat_time: 0x78c8 0x289f-0x28a1 (2)
0x28a0|78 |x |
0x28a0|78 |x | seconds: 24 0x28a0.3-0x28a1 (0.5)
| | | last_modification_time{}: 0x28a1-0x28a3 (2)
0x28a0| 84 | . | year: 66 0x28a1-0x28a1.7 (0.7)
0x28a0| 84 45 | .E | month: 2 0x28a1.7-0x28a2.3 (0.4)
0x28a0| 45 | E | day: 5 0x28a2.3-0x28a3 (0.5)
| | | second: 16 (8)
| | | minute: 6
| | | hour: 15
0x28a0| 84 45 | .E | fat_date: 0x4584 0x28a1-0x28a3 (2)
| | | day: 4
| | | month: 12
| | | year: 2014 (34)
| | | unix_guess: 1417705576 (Timestamp guess based on UTC)
0x28a0| 54 81 15 ae | T... | crc32_uncompressed: 0xae158154 0x28a3-0x28a7 (4)
0x28a0| 4e 28 00 00 | N(.. | compressed_size: 10318 0x28a7-0x28ab (4)
0x28a0| b9 9a 3f 00 | ..?. | uncompressed_size: 4168377 0x28ab-0x28af (4)
@@ -94,12 +104,17 @@ $ fq -o uncompress=false dv bigzero-zip.zip
0x28c0| 62 69 67 7a 65 72 6f 2e 7a 69 70 | bigzero.zip | file_name: "bigzero.zip" 0x28c1-0x28cc (11)
| | | extra_fields[0:2]: 0x28cc-0x28e4 (24)
| | | [0]{}: extra_field 0x28cc-0x28d5 (9)
0x28c0| 55 54 | UT | header_id: 0x5455 (extended timestamp) 0x28cc-0x28ce (2)
0x28c0| 05 00| ..| data_size: 5 0x28ce-0x28d0 (2)
0x28d0|03 57 6a 80 54 |.Wj.T | data: raw bits 0x28d0-0x28d5 (5)
0x28c0| 55 54 | UT | tag: 0x5455 (extended timestamp) 0x28cc-0x28ce (2)
0x28c0| 05 00| ..| size: 5 0x28ce-0x28d0 (2)
| | | flags{}: 0x28d0-0x28d1 (1)
0x28d0|03 |. | unused: 0 0x28d0-0x28d0.5 (0.5)
0x28d0|03 |. | creation_time_present: false 0x28d0.5-0x28d0.6 (0.1)
0x28d0|03 |. | access_time_present: true 0x28d0.6-0x28d0.7 (0.1)
0x28d0|03 |. | modification_time_present: true 0x28d0.7-0x28d1 (0.1)
0x28d0| 57 6a 80 54 | Wj.T | modification_time: 1417701975 (2014-12-04T14:06:15Z) 0x28d1-0x28d5 (4)
| | | [1]{}: extra_field 0x28d5-0x28e4 (15)
0x28d0| 75 78 | ux | header_id: 0x7875 (UNIX UID/GID) 0x28d5-0x28d7 (2)
0x28d0| 0b 00 | .. | data_size: 11 0x28d7-0x28d9 (2)
0x28d0| 75 78 | ux | tag: 0x7875 (UNIX UID/GID) 0x28d5-0x28d7 (2)
0x28d0| 0b 00 | .. | size: 11 0x28d7-0x28d9 (2)
0x28d0| 01 04 74 00 00 00 04| ..t....| data: raw bits 0x28d9-0x28e4 (11)
0x28e0|14 00 00 00 |.... |
| | | file_comment: "" 0x28e4-0x28e4 (0)
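
The extended timestamp extra field shown in the dump above (tag 0x5455, a size, a flags byte, then 32-bit Unix timestamps) could be parsed roughly as in the following Python sketch. This is not the fq code. The name `parse_extended_timestamp` is hypothetical, the timestamps are read as unsigned 32-bit values (an assumption, since the dump does not show signedness), and the parser stops when the data runs out because the central directory copy above has size 5 with both flag bits set yet carries only the modification time.

```python
import struct

EXTENDED_TIMESTAMP_TAG = 0x5455  # "UT"


def parse_extended_timestamp(field: bytes) -> dict:
    """Parse one extra field: tag (LE u16), size (LE u16), flags, timestamps."""
    tag, size = struct.unpack_from("<HH", field, 0)
    if tag != EXTENDED_TIMESTAMP_TAG:
        raise ValueError(f"unexpected tag 0x{tag:04x}")
    data = field[4:4 + size]
    if len(data) < 1:
        raise ValueError("extended timestamp field has no flags byte")

    flags = data[0]
    result = {"tag": tag, "size": size, "flags": flags}

    # Flag bits, lowest first: modification, access, creation time present.
    names = ("modification_time", "access_time", "creation_time")
    offset = 1
    for bit, name in enumerate(names):
        # A set flag does not guarantee the value is stored, so also check
        # that four more bytes are actually available.
        if flags & (1 << bit) and offset + 4 <= len(data):
            (result[name],) = struct.unpack_from("<I", data, offset)
            offset += 4
    return result
```

For the local file header bytes `55 54 09 00 03 57 6a 80 54 7e 6a 80 54` this gives modification_time 1417701975 and access_time 1417702014. For the central directory bytes `55 54 05 00 03 57 6a 80 54` it gives only modification_time.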