Kala Compress


This project is based on Apache Commons Compress and adds several improvements, including modularization (JPMS support) and NIO2 Path API support.

Its API is mostly consistent with Apache Commons Compress, but there are a few incompatibilities. For that reason I renamed the package (and the module name) from org.apache.commons.compress to kala.compress, so it can coexist with Apache Commons Compress without conflict.

We assume that you already know about Commons Compress. If not, please refer to the User Guide first.

To add Kala Compress as a dependency, see section Modules.

Different from Apache Commons Compress

Modularization (JPMS Support)

Kala Compress has been fully modularized and fully supports the JPMS (Java Platform Module System).

Each compressor and archiver is split into a separate artifact with its own module name, so you can depend on only the ones you need without importing all of Kala Compress. (The Kala Compress core jar is smaller than 90 KB.)

The public API of ArchiveStreamFactory and CompressorStreamFactory has not changed, but their internals have been reworked to locate the optional components through ServiceLoader at runtime, so the factories no longer depend on those components at compile time.
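The runtime lookup described above relies on the standard java.util.ServiceLoader mechanism. The following is a minimal sketch of that mechanism only; the CodecProvider interface and the class name are hypothetical and are not part of Kala Compress.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class ServiceLookupSketch {
    /** Hypothetical service interface, used only for illustration. */
    public interface CodecProvider {
        String name();
    }

    /**
     * Collects the names of all CodecProvider implementations that are
     * registered on the class path or module path at runtime.
     * The caller needs no compile-time dependency on any implementation.
     */
    static List<String> discoverProviderNames() {
        List<String> names = new ArrayList<>();
        for (CodecProvider provider : ServiceLoader.load(CodecProvider.class)) {
            names.add(provider.name());
        }
        return names;
    }

    public static void main(String[] args) {
        // With no provider registered, the list is simply empty.
        System.out.println(discoverProviderNames());
    }
}
```

A provider would be registered either through a META-INF/services entry or through a `provides ... with ...` clause in its module descriptor; adding or removing a provider jar changes what is discovered without recompiling the caller.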

Each module ships its own module-info.class, so it works well with jlink.
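As an illustration, an application that needs only the zip archiver could declare a module descriptor like the sketch below. The application module name com.example.app is an assumption; kala.compress.archivers.zip is the zip module named later in this README. Whether further `requires` clauses are needed depends on which Kala Compress APIs the application calls directly.

```java
// module-info.java of a hypothetical application module
module com.example.app {
    // Pull in only the zip archiver instead of the whole library.
    requires kala.compress.archivers.zip;
}
```

Because every required module is an explicit named module, a runtime image built with jlink only needs to contain the modules that are actually requested.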

For more information about the Kala Compress modules, see Modules.

Charset

Kala Compress has been completely refactored internally to use java.nio.charset.Charset to represent encodings. All methods that previously accepted an encoding as a String now accept a Charset. If you have the encoding as a String, use kala.compress.utils.Charsets.toCharset(String) to convert it to a Charset.

ZipEncoding has been removed; use Charset instead.

CharsetNames has been removed; use StandardCharsets instead.

Kala Compress no longer uses Charset.defaultCharset(); it uses UTF-8 instead. Note that file.encoding defaults to UTF-8 since Java 18. If you want the platform's native encoding, request it explicitly with kala.compress.utils.Charsets.nativeCharset().

In addition, APIs that accept an encoding as a String no longer fall back to the default character set when the encoding is unsupported or invalid. They now throw exceptions, just like Charset.forName. (Passing null is not affected: it still falls back to UTF-8.)
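The behaviour described above can be sketched with the JDK alone. The resolve helper below is hypothetical; it mirrors the documented rules (null falls back to UTF-8, anything else behaves like Charset.forName) and is not the Kala Compress implementation.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetResolutionSketch {
    /**
     * Hypothetical helper that follows the rules described above:
     * null falls back to UTF-8, every other name is passed to Charset.forName.
     */
    static Charset resolve(String name) {
        if (name == null) {
            return StandardCharsets.UTF_8;
        }
        // Throws IllegalCharsetNameException for a malformed name and
        // UnsupportedCharsetException for an unknown one; both are
        // subclasses of IllegalArgumentException.
        return Charset.forName(name);
    }

    public static void main(String[] args) {
        System.out.println(resolve(null));
        System.out.println(resolve("ISO-8859-1"));
        try {
            resolve("no-such-charset");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e);
        }
    }
}
```

Failing fast on an unknown name avoids silently decoding entry names with the wrong character set.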

NIO2 Support (java.nio.file.Path and java.nio.file.FileSystem)

Building on Commons Compress 1.21, Kala Compress improves support for the NIO2 Path API.

Internally, Kala Compress has switched entirely to a Path-based representation, and every method that accepts a File has an overload that accepts a Path. The original File-based API now delegates to the Path-based API. Prefer the Path-based API.

Since File is no longer used internally, Kala Compress fully supports Paths on non-default file systems.

(TODO: Provide a FileSystem implementation for each archiver)
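To see why a Path-only implementation matters, the sketch below uses just the JDK: it creates a Path inside the JDK's built-in zip file system, shows that it cannot be converted to a File, and reads and writes it through java.nio.file.Files. The class and method names are illustrative, and no Kala Compress API is involved.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

public class NonDefaultFileSystemDemo {
    /** Writes text to a path inside a temporary zip file system and reads it back. */
    static String roundTripInZipFileSystem(String text) {
        try {
            Path zip = Files.createTempFile("demo", ".zip");
            Files.delete(zip); // let the zip file system create the archive itself
            URI uri = URI.create("jar:" + zip.toUri());
            try (FileSystem fs = FileSystems.newFileSystem(uri, Map.of("create", "true"))) {
                Path inner = fs.getPath("/hello.txt");
                try {
                    inner.toFile(); // only paths of the default file system support this
                    throw new IllegalStateException("expected UnsupportedOperationException");
                } catch (UnsupportedOperationException expected) {
                    // File-based code cannot handle this path; Path-based code can.
                }
                Files.writeString(inner, text, StandardCharsets.UTF_8);
                return Files.readString(inner, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(zip);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripInZipFileSystem("hello"));
    }
}
```

Any API that converts its argument to a File internally would fail on such a path, which is the limitation the Path-based internals remove.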

Rename

ZipFile has been renamed to ZipArchiveReader.

TarFile has been renamed to TarArchiveReader.

SevenZFile and SevenZOutputFile have been renamed to SevenZArchiveReader and SevenZArchiveWriter, respectively.

The reason for this is that I want to reserve names like [Archive]File for a more full-featured support class in the future. It should be able to support both reading and writing archives, adding or deleting entries, etc.

Deprecation and removal

The deprecated features in Apache Commons Compress 1.21 have all been removed.

Additional support for OSGi is no longer provided, but this shouldn't make a big difference. If a problem is caused by the caching of dependency checks, use the corresponding setCache[Library]Availablity method to turn that caching off.

ZipEncoding and CharsetNames have been removed; use Charset and StandardCharsets instead.

All methods that accept an encoding represented by a String have been removed; use Charset instead.

Since the Security Manager will be removed from the JDK in the future, Kala Compress no longer uses it. For more details, see JEP 411: Deprecate the Security Manager for Removal.

Since the finalize method will be removed from the JDK in the future, Kala Compress no longer uses it to clean up resources. For more details, see JEP 421: Deprecate Finalization for Removal. The archiveName parameter of the ZipFile constructor was only used for error reporting in finalize, so it has been removed as well.
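Without a finalizer as a safety net, resources have to be released explicitly, and try-with-resources is the usual way to do that. The sketch below uses a hypothetical DemoReader class as a stand-in for any closeable reader; it is not a Kala Compress class.

```java
public class ExplicitCloseSketch {
    /** Hypothetical stand-in for a closeable archive reader. */
    static final class DemoReader implements AutoCloseable {
        private boolean closed;

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            closed = true; // a real reader would release its file handle here
        }
    }

    /** Returns true if the reader was closed when the try block exited. */
    static boolean closesOnExit() {
        DemoReader reader = new DemoReader();
        try (DemoReader r = reader) {
            // work with r here
        }
        return reader.isClosed();
    }

    public static void main(String[] args) {
        System.out.println(closesOnExit());
    }
}
```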

The bundled pack200 implementation has been removed. kala.compress.compressors.pack200 now uses a more flexible, reflection-based strategy to select the underlying implementation:

  • By default, it looks for the following pack200 implementations:
    • java.util.jar.Pack200 (built into JDK 13 and earlier, so no external dependency is needed on those versions)
    • org.glavo.pack200.Pack200 (extracted from JDK 13 and published separately)
    • org.apache.commons.compress.java.util.jar.Pack200 (available in Apache Commons Compress 1.21)
    • io.pack200.Pack200
  • You can use Pack200Utils.setPack200Provider(String provider) to select a specific implementation, including one that is not in the list above.
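A reflection-based lookup of this kind can be sketched as follows. The candidate class names are taken from the list above; the helper class, its method, and the lookup order are assumptions made for illustration and do not reproduce the actual Kala Compress code.

```java
import java.util.List;
import java.util.Optional;

public class ProviderLookupSketch {
    /** Candidate class names, copied from the list above. */
    static final List<String> CANDIDATES = List.of(
            "java.util.jar.Pack200",
            "org.glavo.pack200.Pack200",
            "org.apache.commons.compress.java.util.jar.Pack200",
            "io.pack200.Pack200");

    /** Returns the first class that can be loaded, or empty if none is present. */
    static Optional<Class<?>> findFirstAvailable(List<String> classNames) {
        ClassLoader loader = ProviderLookupSketch.class.getClassLoader();
        for (String name : classNames) {
            try {
                // 'false' avoids running static initializers during the probe.
                return Optional.of(Class.forName(name, false, loader));
            } catch (ClassNotFoundException | LinkageError e) {
                // Not available in this runtime; try the next candidate.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(findFirstAvailable(CANDIDATES));
    }
}
```

Because the candidates are referenced only by name, none of them has to be present at compile time, and the result depends on what is on the class path of the running JVM.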

A small number of methods that accept a File have been removed; use the Path-based equivalents instead.

Modules

Note: Kala Compress is in beta phase. Although it is developed based on mature Apache Commons Compress and has passed all tests, it may still be unstable. I may need to make some adjustments to the API before releasing to production.

The latest Kala Compress version is 1.21.0.1-beta3.

You can add dependencies on Kala Compress modules as follows:

Maven:

<dependency>
  <groupId>org.glavo.kala</groupId>
  <artifactId>${kala-compress-module-name}</artifactId>
  <version>${kala-compress-version}</version>
</dependency>

Gradle:

dependencies {
  implementation("org.glavo.kala:${kala-compress-module-name}:${kala-compress-version}")
}

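Replace ${kala-compress-module-name} with the Maven artifact name of the Kala Compress module, which is the JPMS module name with . replaced by - (for example, the module kala.compress.archivers.zip becomes kala-compress-archivers-zip). Replace ${kala-compress-version} with the latest Kala Compress version.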

For example, to add the entire Kala Compress, you need to add the following:

Maven:

<dependency>
  <groupId>org.glavo.kala</groupId>
  <artifactId>kala-compress</artifactId>
  <version>1.21.0.1-beta3</version>
</dependency>

Gradle:

dependencies {
  implementation("org.glavo.kala:kala-compress:1.21.0.1-beta3")
}

All Kala Compress modules are listed below.

This is an empty module that declares transitive dependencies on all modules of Kala Compress. Adding a dependency on it alone gives you everything in Kala Compress.

It is the basic module of Kala Compress, and all other modules depend on it.

It contains the following packages:

  • (package) kala.compress
  • (package) kala.compress.archivers
  • (package) kala.compress.compressors
  • (package) kala.compress.compressors.lz77support
  • (package) kala.compress.compressors.lzw
  • (package) kala.compress.compressors.parallel
  • (package) kala.compress.compressors.utils

It is an empty module that contains transitive dependencies on all compressor modules. You can include all compressors by adding a dependency on it.

In addition, each compressor in Kala Compress has a separate module, and you can add dependencies on one or all of them separately. Here is a list of compressors:

Here are some notes:

  • Unlike in Apache Commons Compress, the brotli compressor has no external dependencies. It copies the Google Brotli code into the package kala.compress.compressors.brotli.dec, because Google Brotli does not support JPMS.
  • The lzma compressor and the xz compressor need XZ for Java to work.
  • The zstandard compressor needs Zstd JNI to work.

It is an empty module that contains transitive dependencies on all archiver modules. You can include all archivers by adding a dependency on it.

In addition, each archiver in Kala Compress has a separate module, and you can add dependencies on one or all of them separately. Here is a list of archivers:

Here are some notes:

  • The sevenz archiver needs XZ for Java to work.
  • The sevenz archiver and the zip archiver have optional dependencies on the bzip2 compressor and the deflate64 compressor. They can work without these compressors, but errors will occur when they are required.
  • Support for jar (in package kala.compress.archivers.jar) is in the module kala.compress.archivers.zip.

It contains the package kala.compress.changes.

Task list

  • Deprecate ZipEncoding and CharsetNames, replace them with Charset, StandardCharsets and Charsets;
  • Full support for Java Charset, allows users to specify encoding without using String at all;
  • Use UTF-8 by default;
  • Clean up all deprecated features;
  • In preparation for Valhalla, replace the constructor of a class that can become a value class with a factory method;
  • Flexible choice of pack200 implementation (when the JDK has built-in pack200 support, external dependencies are no longer required);
  • Enhanced NIO2 Path API support, migration from File to Path;
  • Provide FileSystem for each archiver;
  • Dynamically load compressors in CompressorStreamFactory;
  • Dynamically load archivers in ArchiveStreamFactory;
  • Split compressors and archivers into separate modules;
  • Full support for JPMS;
  • Rename the package;
  • Publish it to Maven Central

Bug Report

If you encounter problems using it, please open an issue.

If it's an issue upstream of Apache Commons Compress, it's best to give feedback here and I'll port the upstream fix here.