Support for Streams 4.0 and BI 4.0 is now available!

Parquet is a columnar storage format for Hadoop. It is becoming increasingly popular because of its efficient compression and encoding schemes. For more details, see the Parquet home page:

The Parquet toolkit lets streaming applications write data in Parquet format. The toolkit is implemented in Java and, in its initial version, contains the ParquetSink operator.

Samples showing how to use the ParquetSink operator are available in the samples folder. Installation and configuration details will be published soon.

Toolkit documentation is available at: