A command line tool to decompress snappy files produced by Hadoop.
Snappy files produced by Hadoop contain block headers written by BlockCompressorStream. Each block is prefixed with two 32-bit integers giving the size of the decompressed data and the size of the compressed data, respectively. Because of these headers, the snappy library cannot decompress the files directly. This command line utility handles the headers and decompresses the data without any dependency on the Hadoop libraries.
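The block framing described above can be illustrated with a minimal Python sketch. This is not snappycat's actual code: the function name iter_blocks is hypothetical, the byte order of the two integers is assumed to be big-endian, and each block is assumed to hold exactly one compressed chunk. The decompress argument is any caller-supplied function that turns one raw snappy chunk into bytes, so the sketch itself does not depend on a particular snappy binding.

```python
import io
import struct


def iter_blocks(stream, decompress):
    """Yield the decompressed payload of each framed block in `stream`.

    Hypothetical helper: `stream` is a binary file-like object and
    `decompress` is a callable mapping compressed bytes to raw bytes.
    """
    while True:
        header = stream.read(8)
        if not header:
            return  # clean end of input, including a completely empty file
        if len(header) < 8:
            raise ValueError("truncated block header")
        # Assumption: both sizes are unsigned 32-bit big-endian integers,
        # decompressed size first, compressed size second.
        raw_size, compressed_size = struct.unpack(">II", header)
        payload = stream.read(compressed_size)
        if len(payload) < compressed_size:
            raise ValueError("truncated block payload")
        data = decompress(payload)
        if len(data) != raw_size:
            raise ValueError("decompressed size does not match header")
        yield data


# Demonstration with an identity "codec", so only the framing is exercised.
framed = struct.pack(">II", 5, 5) + b"hello" + struct.pack(">II", 3, 3) + b"abc"
result = b"".join(iter_blocks(io.BytesIO(framed), lambda chunk: chunk))
```

With a real snappy binding, the identity function would be replaced by that binding's raw (unframed) decompress function, and the stream would be an opened Hadoop output file or standard input in binary mode.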
snappycat requires the snappy library to build.
To install it under Ubuntu, run:
sudo apt-get install libsnappy-dev
List the files to decompress as arguments, and snappycat writes the concatenated results to standard output.
When no arguments are given, standard input is used as the source of the compressed data.
cat DIRECTORY/*.snappy | snappycat
snappycat handles the input correctly even if some of the files contain no records.
To save the output as a file:
./snappycat DIRECTORY/*.snappy > output.txt