snappycat

A command line tool to decompress snappy files produced by Hadoop.

Snappy files produced by Hadoop contain block headers written by BlockCompressorStream. Each block is prefixed with two 32-bit integers giving the size of the decompressed data and the size of the compressed data, respectively. Because of these headers, the snappy library cannot decompress such files directly. This command line utility parses the headers and decompresses the data without any dependency on the Hadoop libraries.
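The framing described above can be sketched in a few lines of Python. This is only an illustration of the layout, not snappycat's actual implementation. It assumes the integers are big-endian, and it assumes a block may carry more than one compressed chunk, each with its own compressed-size prefix, until the stated decompressed size is reached. The name `iter_blocks` and the `decompress` parameter are hypothetical; the caller supplies the function that decompresses a raw snappy payload.

```python
import io
import struct


def iter_blocks(stream, decompress):
    """Yield decompressed chunks from a Hadoop-style block-framed stream.

    Assumed layout (big-endian): a 4-byte decompressed size, followed by
    one or more chunks, each a 4-byte compressed size plus its payload.
    `decompress` is a caller-supplied function: bytes -> bytes.
    """
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of input; an empty file yields nothing
        if len(header) < 4:
            raise ValueError("truncated block header")
        (raw_len,) = struct.unpack(">I", header)

        produced = 0
        while produced < raw_len:
            size_bytes = stream.read(4)
            if len(size_bytes) < 4:
                raise ValueError("truncated chunk header")
            (comp_len,) = struct.unpack(">I", size_bytes)
            payload = stream.read(comp_len)
            if len(payload) < comp_len:
                raise ValueError("truncated chunk payload")
            chunk = decompress(payload)
            produced += len(chunk)
            yield chunk


# Demonstration with an identity "decompressor" so that only the
# framing logic is exercised; no snappy library is needed here.
framed = (struct.pack(">II", 5, 5) + b"hello"
          + struct.pack(">II", 3, 3) + b"abc")
result = b"".join(iter_blocks(io.BytesIO(framed), lambda b: b))
```

If the third-party python-snappy package is installed, its `snappy.decompress` function could be passed as `decompress` to decode real payloads.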

Build

snappycat requires the snappy library to build.

To install it under Ubuntu, run:

sudo apt-get install libsnappy-dev

Usage

List the files to decompress as arguments, and snappycat writes the concatenated results to standard output.

./snappycat DIRECTORY/*.snappy

When no arguments are given, standard input is used as the source of the compressed data.

cat DIRECTORY/*.snappy | snappycat

snappycat handles the input correctly even if some files contain no records.

To save the output as a file:

./snappycat DIRECTORY/*.snappy > output.txt
