john-morales/concurrent-utils

concurrent-utils

A multi-threaded Java implementation of a GZIP output stream for high-performance compression.

High-level features:

  • Write to an OutputStream interface.
  • Configurable buffer sizes, compression level, and thread pooling.
  • Fails fast on any I/O error in the underlying stream or during compression.
  • Speeds up compression of larger content close to linearly with the number of threads.
  • Implements the pigz technique of priming each block's compression dictionary with the last 1/4 of the previous block for better compression.
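
The block-parallel compression and dictionary priming listed above can be sketched with nothing but java.util.zip. This is a minimal illustration under stated assumptions, not this library's actual implementation: the class PrimedBlockGzipSketch and its methods are hypothetical, the input is held fully in memory, and the fixed 128 kB block size simply mirrors the default mentioned in the quick start. Each block is compressed as raw DEFLATE on a pool thread, primed with the last quarter of the previous block, and the pieces are concatenated in order between a GZIP header and trailer.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.CRC32;
import java.util.zip.Deflater;

// Hypothetical illustration class; not part of concurrent-utils.
final class PrimedBlockGzipSketch {
    static final int BLOCK_SIZE = 128 * 1024; // assumption: mirrors the README's default buffer size

    /** Compresses one block as raw DEFLATE, primed with the last 1/4 of the previous block. */
    static byte[] deflateBlock(byte[] data, int off, int len, boolean last) {
        Deflater def = new Deflater(Deflater.DEFAULT_COMPRESSION, true); // true = raw DEFLATE
        try {
            if (off > 0) {
                // Every block before the last one is full-sized, so the previous
                // block's final quarter is exactly the BLOCK_SIZE / 4 bytes before off.
                int dictLen = BLOCK_SIZE / 4;
                def.setDictionary(data, off - dictLen, dictLen);
            }
            def.setInput(data, off, len);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            if (last) {
                // The final block terminates the DEFLATE stream.
                def.finish();
                while (!def.finished()) {
                    int n = def.deflate(buf);
                    out.write(buf, 0, n);
                }
            } else {
                // SYNC_FLUSH ends the block on a byte boundary so blocks can be concatenated.
                int n;
                do {
                    n = def.deflate(buf, 0, buf.length, Deflater.SYNC_FLUSH);
                    out.write(buf, 0, n);
                } while (n == buf.length);
            }
            return out.toByteArray();
        } finally {
            def.end();
        }
    }

    /** Compresses data into a single GZIP member using a fixed pool of worker threads. */
    static byte[] gzip(byte[] data, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int blocks = Math.max(1, (data.length + BLOCK_SIZE - 1) / BLOCK_SIZE);
            List<Future<byte[]>> parts = new ArrayList<>();
            for (int i = 0; i < blocks; i++) {
                final int off = i * BLOCK_SIZE;
                final int len = Math.min(BLOCK_SIZE, data.length - off);
                final boolean last = (i == blocks - 1);
                parts.add(pool.submit(() -> deflateBlock(data, off, len, last)));
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            // Minimal GZIP header: magic bytes, CM = 8 (deflate), no flags, no mtime.
            out.write(new byte[] {0x1f, (byte) 0x8b, 8, 0, 0, 0, 0, 0, 0, 0});
            for (Future<byte[]> part : parts) {
                out.write(part.get()); // blocks are written in submission order
            }
            // GZIP trailer: CRC-32 and length of the uncompressed data, little-endian.
            CRC32 crc = new CRC32();
            crc.update(data, 0, data.length);
            writeIntLE(out, (int) crc.getValue());
            writeIntLE(out, data.length);
            return out.toByteArray();
        } finally {
            pool.shutdown();
        }
    }

    private static void writeIntLE(ByteArrayOutputStream out, int v) {
        for (int shift = 0; shift < 32; shift += 8) {
            out.write((v >>> shift) & 0xff);
        }
    }
}
```

Because blocks only depend on the raw bytes of their predecessor (not on its compressed output), they can be compressed independently, which is what allows throughput to scale with the thread count. A streaming implementation would additionally need bounded buffering and ordered write-out, which this sketch leaves out.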

Quick Start Usage

Accepting all internal defaults:

// * Buffer sizes of 128 kB
// * Default `Deflater` compression level
// * Re-usable fixed thread pool of size = # processors.
final OutputStream out = new ConcurrentGZIPOutputStream(new ByteArrayOutputStream());
// write bytes to the stream however you like
out.write(someBytesToCompress);
out.close();
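
Assuming the stream emits a standard GZIP stream, as the description suggests, its output should be readable with the JDK's own GZIPInputStream. The helper below is a hypothetical sketch (GzipRoundTripCheck is not part of this library) for checking a round trip using only JDK classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;

// Hypothetical helper for illustration; not part of concurrent-utils.
final class GzipRoundTripCheck {
    /** Decompresses a complete GZIP stream held in memory. */
    static byte[] gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            for (int n; (n = in.read(buf)) != -1; ) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    /** Returns true if decompressing the compressed bytes yields the original bytes. */
    static boolean roundTrips(byte[] original, byte[] compressed) throws IOException {
        return Arrays.equals(original, gunzip(compressed));
    }
}
```

To use it with the quick start, keep a reference to the ByteArrayOutputStream passed to the constructor and, after close(), pass its toByteArray() result together with the original bytes to roundTrips.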

Benchmarks

Benchmarks were run on a 24-core KnownHost VPS-2 instance with JDK 7u21. See ConcurrentGZIPPerformanceTest for the test case, which can be run with:

$ mvn clean test -DincludePerfTests=true

Random input pattern designed for poor compression: [benchmark chart not reproduced]

Sequential input pattern designed to compress well: [benchmark chart not reproduced]

Compression ratios in both cases are within ~0.1% of each other, with the JRE producing the slightly better compression. Priming each block with the last 1/4 of the previous block offsets some of the disadvantage, but it is still not as good as using a single dictionary for the entire input.

See the ConcurrentGZIPOutputStreamTest unit test for more examples.

Limitations

  • Moderate overhead: when only a single thread is used, the JRE's own GZIP implementation performs better.
