Fix #3796, #3786: Implement UTF-8 support in java.util.zip classes #3814

LeeTibbert · 2024-03-06T21:44:49Z

javalib java.util.zip classes now support writing and reading UTF-8 ("Unicode Transformation Format – 8-bit")
entry names and archive and entry comments.
java.util.zip.ZipOutputStream now follows the JVM practice of not throwing an Exception if zero entries
are written. The former behavior was sensible, but not the JVM way.
Reading and writing both now use standard java.lang.String methods to do Charset conversions. In particular, this
should now handle code points that take 4 bytes in UTF-8.
Scala Native java.util.zip methods are still limited to the original zip format. No zip64.
Java 8 and above support zip64, so there is room for improvement here.
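
As a rough illustration of the behavior described above, here is a minimal sketch written against the JVM java.util.zip API. It is not code from this PR: the class name `ZipUtf8Sketch` and its helper methods are hypothetical, and whether every constructor used here is available in Scala Native javalib is an assumption. It writes one entry whose name contains a code point that takes 4 bytes in UTF-8, reads the name back, and closes a stream with zero entries.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, for illustration only.
final class ZipUtf8Sketch {

    // Write an in-memory archive with one entry; name and comment are encoded as UTF-8.
    static byte[] writeArchive(String entryName, String archiveComment, byte[] payload) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(sink, StandardCharsets.UTF_8)) {
            zos.setComment(archiveComment);
            zos.putNextEntry(new ZipEntry(entryName));
            zos.write(payload);
            zos.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Close a ZipOutputStream without writing any entry; on a current JVM this does not throw.
    static byte[] writeEmptyArchive() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(sink, StandardCharsets.UTF_8)) {
            // Intentionally no entries.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Return the name of the first entry, or null if the archive has no entries.
    static String firstEntryName(byte[] archive) {
        try (ZipInputStream zis =
                 new ZipInputStream(new ByteArrayInputStream(archive), StandardCharsets.UTF_8)) {
            ZipEntry entry = zis.getNextEntry();
            return entry == null ? null : entry.getName();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // U+1F600 is written as a surrogate pair in Java source and takes 4 bytes in UTF-8.
        String name = "gr\u00fc\u00dfe-\uD83D\uDE00.txt";
        byte[] archive = writeArchive(name, "archive comment: \u00e9t\u00e9",
                "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(name.equals(firstEntryName(archive)));
        System.out.println(firstEntryName(writeEmptyArchive()) == null);
    }
}
```

Note that ZipInputStream only sees local file headers, so the archive comment written here is not read back by this sketch; checking comments would need something like java.util.zip.ZipFile on a real file.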

Note that .zip files carry a lot of time-honored complexity, both at the zip level and, especially,
with .zip files written on one operating system being read on another. The support implemented
in this PR is designed to match the JVM behavior. A file written by the JVM ought to be readable by this code,
and so on for the various 2x2 combinations.

Differing operating systems may or may not be able to display the UTF-8 file names and comments written by
this PR. The intention is that extracting files from archives created by the code of this PR should succeed
and yield the expected UTF-8 names, to the greatest extent feasible.

TL;DR - When using UTF-8 names, if it works for you, great! If not, sorry. It may be a Scala Native bug
or it may be the joys of zip and interoperability. The goal is to reduce the former without
diminishing the latter.

Scala Native java.lang currently supports Unicode 13.0. The September 2023 Unicode version is 15.1.
The very latest emojis, etc. may not be available. That is an open question.


LeeTibbert · 2024-03-06T23:45:12Z

Ready for review, when its turn comes around. Thank you.

The two failures are in macOS JVM compliance. One is a "signal 4". The other
is an error in "pipedOutput". Gratifying but strange that all the straight
macOS tests pass just fine. Go figure. Is there something different
in the environment of those two sets of tests?

WojciechMazur

LGTM, thank you!

Fix scala-native#3798, scala-native#3786: Implement UTF-8 support in …

99eecb5

…java.lang.zip classes

LeeTibbert changed the title ~~Fix #3798, #3786: Implement UTF-8 support in java.util.zip classes~~ Fix #3796, #3786: Implement UTF-8 support in java.util.zip classes Mar 6, 2024

LeeTibbert added the component:javalib label Mar 6, 2024

Supply the missing reference .zip

8dd4448

WojciechMazur approved these changes Mar 7, 2024


WojciechMazur merged commit 3c5c8d4 into scala-native:main Mar 7, 2024
61 checks passed
