archive/zip: add FileHeader.NonUTF8 field
The NonUTF8 field provides users with a way to explicitly tell the ZIP writer to avoid setting the UTF-8 flag. This is necessary because many readers:
	1) (Still) do not support UTF-8
	2) And use the local system encoding instead

Thus, even though character encodings other than CP-437 and UTF-8 are not officially supported by the ZIP specification, pragmatically the world has permitted use of them.

When a non-standard encoding is used, it is the user's responsibility to ensure that the target system is expecting the encoding used (e.g., producing a ZIP file you know is used on a Chinese version of Windows).

We adjust the detectUTF8 function to account for Shift-JIS and EUC-KR not being identical to ASCII for two characters (0x5c and 0x7e).

We don't need an API for users to explicitly specify that they are encoding with UTF-8 since all single byte characters are compatible with all other common encodings (Windows-1256, Windows-1252, Windows-1251, Windows-1250, ISO-8859, EUC-KR, KOI8-R, Latin-1, Shift-JIS, GB-2312, GBK) except for the non-printable characters and the backslash character (all of which are invalid characters in a path name anyway).

Fixes #10741

Change-Id: I9004542d1d522c9137973f1b6e2b623fa54dfd66
Reviewed-on: https://go-review.googlesource.com/75592
Run-TryBot: Joe Tsai <email@example.com>
Reviewed-by: Ian Lance Taylor <firstname.lastname@example.org>
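As a rough illustration of the behavior described in the message above, the sketch below writes a one-entry archive twice, once with NonUTF8 left false and once with it set to true, then reads the entry back and inspects bit 11 (0x800) of its flags. The helpers `detectUTF8Sketch` and `writeEntry` and the constant `utf8Flag` are hypothetical names introduced for this example. `detectUTF8Sketch` only approximates the detection the message describes and is not a copy of the package's internal function.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"unicode/utf8"
)

// utf8Flag is bit 11 of the ZIP general-purpose flags (the UTF-8 flag).
// The constant name is an assumption made for this example.
const utf8Flag = 0x800

// detectUTF8Sketch is a hypothetical approximation of the detection described
// in the commit message. valid reports whether s is well-formed UTF-8; require
// reports whether s contains characters that are not safely shared with other
// common encodings.
func detectUTF8Sketch(s string) (valid, require bool) {
	for i := 0; i < len(s); {
		r, size := utf8.DecodeRuneInString(s[i:])
		i += size
		// Treat 0x5c and 0x7e as unsafe, since Shift-JIS and EUC-KR do not
		// map those two bytes to the same characters as ASCII.
		if r < 0x20 || r > 0x7d || r == 0x5c {
			if r == utf8.RuneError && size == 1 {
				return false, false // invalid UTF-8 byte sequence
			}
			require = true
		}
	}
	return true, require
}

// writeEntry writes an archive with a single empty entry, reads it back, and
// reports whether the UTF-8 flag is set on that entry.
func writeEntry(name string, nonUTF8 bool) (bool, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	hdr := &zip.FileHeader{Name: name, NonUTF8: nonUTF8}
	if _, err := w.CreateHeader(hdr); err != nil {
		return false, err
	}
	if err := w.Close(); err != nil {
		return false, err
	}
	r, err := zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
	if err != nil {
		return false, err
	}
	return r.File[0].Flags&utf8Flag != 0, nil
}

func main() {
	for _, name := range []string{"plain.txt", "résumé.txt"} {
		valid, require := detectUTF8Sketch(name)
		auto, err := writeEntry(name, false)
		if err != nil {
			fmt.Println("write failed:", err)
			return
		}
		forced, err := writeEntry(name, true)
		if err != nil {
			fmt.Println("write failed:", err)
			return
		}
		fmt.Printf("%q valid=%v require=%v flag(default)=%v flag(NonUTF8)=%v\n",
			name, valid, require, auto, forced)
	}
}
```

Setting NonUTF8 only suppresses the flag. It does not transcode the name, so the caller still has to supply bytes in whatever encoding the target reader expects.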
Showing with 119 additions and 13 deletions.
- +18 −0 src/archive/zip/reader.go
- +64 −1 src/archive/zip/reader_test.go
- +16 −4 src/archive/zip/struct.go
- BIN src/archive/zip/testdata/utf8-7zip.zip
- BIN src/archive/zip/testdata/utf8-infozip.zip
- BIN src/archive/zip/testdata/utf8-osx.zip
- BIN src/archive/zip/testdata/utf8-winrar.zip
- BIN src/archive/zip/testdata/utf8-winzip.zip
- +13 −8 src/archive/zip/writer.go
- +8 −0 src/archive/zip/writer_test.go