
Automatically exported from code.google.com/p/jzran


jkff/jzran


What is it?

It's a Java library based on the zran.c sample from zlib.

You can preprocess a large gzip archive, producing an "index" that can be used for random read access.

You can trade index size against access speed: a denser index is larger but makes seeks faster.

What can I use it for?

You've got a very large, compressible file that needs random access: a database, DNA data, an image, a video, an XML document, etc.

You go through it and remember offsets of what is important to you.

You compress the file.

Now you can use these offsets, which refer to positions in the uncompressed data, to read from the compressed file.
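
The steps above can be sketched with plain java.util.zip, without jzran. This is only an illustration under assumptions: the class and method names (OffsetWorkflowSketch, compress, readAt) are hypothetical and are not part of this library. It records the uncompressed start offset of each record while gzipping, then reads a record back the slow way, by decompressing from the start and discarding everything before the offset. That sequential pass is the cost an index is meant to avoid.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; not part of jzran.
final class OffsetWorkflowSketch {
    /** Uncompressed start offset of each record, filled in by compress(). */
    final List<Long> offsets = new ArrayList<>();

    /** Gzips the records back to back and remembers where each starts in the uncompressed data. */
    byte[] compress(List<String> records) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long position = 0;
        try (GZIPOutputStream gz = new GZIPOutputStream(sink)) {
            for (String record : records) {
                byte[] bytes = record.getBytes(StandardCharsets.UTF_8);
                offsets.add(position);
                gz.write(bytes);
                position += bytes.length;
            }
        }
        return sink.toByteArray();
    }

    /** Sequential baseline: decompress from the beginning and discard bytes before the offset. */
    static String readAt(byte[] gzipped, long offset, int length) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            long toSkip = offset;
            while (toSkip > 0) {
                long skipped = in.skip(toSkip);
                if (skipped <= 0) {
                    throw new IOException("offset is past the end of the data");
                }
                toSkip -= skipped;
            }
            byte[] out = new byte[length];
            int filled = 0;
            while (filled < length) {
                int n = in.read(out, filled, length - filled);
                if (n < 0) {
                    break; // end of stream before 'length' bytes were available
                }
                filled += n;
            }
            return new String(out, 0, filled, StandardCharsets.UTF_8);
        }
    }
}
```

With an index, the same uncompressed offset is used for the seek, but decompression starts near the target instead of at byte zero.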

How to use it?

You give it a SeekableInputStream over the compressed data.

It gives you a SeekableInputStream over the decompressed data.

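// Wrap the compressed bytes (buf holds the gzip data).
SeekableInputStream sis = new ByteArraySeekableInputStream(buf);
// Build the index. The second argument (1048576) is presumably the span between index points, in bytes.
SeekableInputStream index = RandomAccessGZip.index(sis, 1048576);

...

index.open(sis);

...

// Jump to an offset in the decompressed data, then read from there.
index.seek(offset);

byte[] dest = new byte[100];
int n = index.read(dest, 0, dest.length);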

What else is there?

You can monitor indexing progress and cancel indexing.

The index is serializable.

You can provide as input (gzip source) a byte[], a ByteBuffer or a RandomAccessFile.

How does it work?

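zran just snapshots the decoder's internal state periodically while making one pass over the archive. In zlib's zran.c, each snapshot (an "access point") records the position in the compressed and uncompressed streams plus the 32 KB of uncompressed data preceding it, which is enough to resume decompression from that point.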
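
To make the snapshot idea concrete, here is a minimal sketch of how a seek could pick its starting point. The class and field names are hypothetical and the layout is modeled loosely on zran.c, not taken from jzran's source. It finds the last snapshot at or before the target offset; decompression would then resume there and discard the remaining bytes up to the target.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of an index; not jzran's actual classes.
final class AccessPointSketch {
    /** One snapshot: where to resume in the compressed stream and the history the decoder needs. */
    static final class AccessPoint {
        final long uncompressedOffset; // position in the decompressed data
        final long compressedOffset;   // position in the gzip file
        final byte[] window;           // preceding uncompressed bytes (up to 32 KB in zran.c)

        AccessPoint(long uncompressedOffset, long compressedOffset, byte[] window) {
            this.uncompressedOffset = uncompressedOffset;
            this.compressedOffset = compressedOffset;
            this.window = window;
        }
    }

    // Snapshots keyed by uncompressed offset, kept sorted for floor lookups.
    private final TreeMap<Long, AccessPoint> points = new TreeMap<>();

    void add(AccessPoint point) {
        points.put(point.uncompressedOffset, point);
    }

    /** Returns the last snapshot at or before the target uncompressed offset. */
    AccessPoint nearestBefore(long target) {
        Map.Entry<Long, AccessPoint> entry = points.floorEntry(target);
        if (entry == null) {
            throw new IllegalArgumentException("no access point at or before " + target);
        }
        return entry.getValue();
    }

    /** Bytes that must be decompressed and thrown away after resuming at the snapshot. */
    long bytesToDiscard(long target) {
        return target - nearestBefore(target).uncompressedOffset;
    }
}
```

The number of discarded bytes is bounded by the distance between neighboring snapshots, which is why a sparser index makes seeks slower.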

How fast is it?

I haven't done measurements yet, but the seek method is essentially O(span): the sparser your index, the smaller it is and the slower seeks are. After a seek, you read at the speed of zlib (modulo a couple of memory copies, maybe).
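
A rough back-of-the-envelope sketch of that trade-off follows. It assumes one snapshot per span of uncompressed data and about 32 KB of stored window per snapshot, as in zran.c. Whether this library stores windows raw or compressed is not stated here, so treat the size as an upper-bound estimate. The class and method names are hypothetical.

```java
// Hypothetical estimator for illustration; not part of jzran.
final class IndexTradeoffSketch {
    /** Maximum deflate history window, stored per snapshot in zran.c. */
    static final long WINDOW_BYTES = 32 * 1024;

    /** Number of snapshots if one is taken every 'span' uncompressed bytes (rounded up). */
    static long accessPoints(long uncompressedSize, long span) {
        return (uncompressedSize + span - 1) / span;
    }

    /** Approximate index size, assuming each snapshot stores a full raw window. */
    static long approxIndexBytes(long uncompressedSize, long span) {
        return accessPoints(uncompressedSize, span) * WINDOW_BYTES;
    }

    /** Worst case for a seek: almost a full span is decompressed and discarded. */
    static long worstCaseDiscardBytes(long span) {
        return span - 1;
    }
}
```

Under these assumptions, 10 GiB of uncompressed data with a 1 MiB span gives 10240 snapshots and roughly 320 MiB of index. Doubling the span halves the index and doubles the worst-case amount of data decompressed per seek.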
