A fork of java.util.zip.GZIPInputStream that emits the offsets of nested streams.

cldellow/gzip

DEPRECATED

Don't use this library!

It relied on accessing internals of java.util.zip.Inflater, which have changed in JDK 11.

Instead, use GzipCompressorInputStream from Apache Commons Compress. An example of how to use it is at https://github.com/cldellow/warc-service/blob/e95f8f5906c39efeb781a47b343a7cec179af7e3/src/main/scala/com/cldellow/warc/framework/WarcHandler.scala#L62

gzip

Emit offsets of nested GZIP streams.

GZIP has the interesting property that a sequence of concatenated GZIP streams can be read as though it were a single GZIP stream.
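As a small illustration of that property, the sketch below compresses two strings separately, concatenates the resulting bytes, and reads them back through a single stock java.util.zip.GZIPInputStream. The class and helper names (ConcatenatedGzipDemo, gzip, concat, gunzip) are made up for this example and are not part of this library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; uses only the stock JDK classes (Java 9+ for readAllBytes).
final class ConcatenatedGzipDemo {
    // Compress one string into a complete, standalone GZIP member.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }

    // Append the byte arrays back to back, with nothing in between.
    static byte[] concat(byte[]... parts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] part : parts) {
            out.write(part, 0, part.length);
        }
        return out.toByteArray();
    }

    // Decompress with a single GZIPInputStream, which continues into following members.
    static String gunzip(byte[] data) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] both = concat(gzip("hello, "), gzip("world"));
        System.out.println(gunzip(both));
    }
}
```

The reader sees one continuous decompressed stream; nothing in the output indicates where the first member ended and the second began.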

The Web Archive (WARC) format takes advantage of this to store tens of thousands of GZIP streams in a single file. When processing such a file, it can be useful to know the offset at which each underlying stream starts. The stock java.util.zip.GZIPInputStream class does not expose this.

This library patches GZIPInputStream to expose a callback that is invoked with the offset of each member stream.
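To make "offset of a member stream" concrete, here is a rough sketch that computes the offsets for an in-memory byte array using only the public java.util.zip.Inflater API. It is not how this library is implemented (the library patches GZIPInputStream), and it assumes every member has the plain 10-byte header with no optional fields, which is what java.util.zip.GZIPOutputStream writes. The class and method names (MemberOffsets, memberOffsets, gzip) are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.GZIPOutputStream;
import java.util.zip.Inflater;

// Hypothetical sketch; assumes members without optional header fields (FLG == 0).
final class MemberOffsets {
    private static final int HEADER_LEN = 10; // magic, CM, FLG, MTIME, XFL, OS
    private static final int TRAILER_LEN = 8; // CRC32 + ISIZE

    // Returns the byte offset at which each GZIP member starts.
    static List<Integer> memberOffsets(byte[] data) throws DataFormatException {
        List<Integer> offsets = new ArrayList<>();
        byte[] sink = new byte[4096]; // decompressed bytes are discarded
        int pos = 0;
        while (pos < data.length) {
            if (data.length - pos < HEADER_LEN + TRAILER_LEN
                    || (data[pos] & 0xff) != 0x1f || (data[pos + 1] & 0xff) != 0x8b) {
                throw new DataFormatException("no gzip member at offset " + pos);
            }
            if (data[pos + 3] != 0) {
                throw new DataFormatException("optional header fields are not handled here");
            }
            offsets.add(pos);
            Inflater inflater = new Inflater(true); // true = raw deflate, no zlib wrapper
            try {
                int start = pos + HEADER_LEN;
                inflater.setInput(data, start, data.length - start);
                while (!inflater.finished()) {
                    int n = inflater.inflate(sink);
                    if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                        throw new DataFormatException("truncated member at offset " + pos);
                    }
                }
                // getBytesRead() is the number of compressed bytes this member consumed.
                pos = start + (int) inflater.getBytesRead() + TRAILER_LEN;
            } finally {
                inflater.end();
            }
        }
        return offsets;
    }

    // Helper to build test data: one string compressed into one GZIP member.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }
}
```

For three members built with gzip(...) and concatenated in order, the result would be 0, the length of the first member, and the combined length of the first two. A callback on the stream, as this library provides, yields the same information while the data is being read rather than in a separate pass.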

Usage

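// Starting offset of each member stream, indexed by member number (sized for up to 100 members in this example).
int[] offsets = new int[100];

// `is` is the InputStream over the concatenated GZIP data; the callback receives each member's index and offset.
GZIPInputStream gzis = new GZIPInputStream(is, (member, offset) -> { offsets[member] = offset; });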

License

This library is a fork of java.util.zip.GZIPInputStream as implemented by Oracle.

This library's contents are subject to the GPL "Classpath" exception. You may link it into an executable without that executable itself having to be licensed under the GPL.
