Merge pull request #218 from TJC/docs-for-file-format-compat
Add README detail around file format compatibility
xerial committed Aug 7, 2018
2 parents 73c67c7 + b40dffc commit 820038c
Showing 1 changed file with 12 additions and 1 deletion.
13 changes: 12 additions & 1 deletion README.md
@@ -83,9 +83,20 @@ Stream-based compressor/decompressor `SnappyOutputStream`/`SnappyInputStream` ar
* See also [Javadoc API](https://oss.sonatype.org/service/local/repositories/releases/archive/org/xerial/snappy/snappy-java/1.1.3-M1/snappy-java-1.1.3-M1-javadoc.jar/!/index.html)

#### Compatibility Notes
* `SnappyOutputStream` and `SnappyInputStream` use `[magic header:16 bytes]([block size:int32][compressed data:byte array])*` format. You can read the result of `Snappy.compress` with `SnappyInputStream`, but you cannot read the compressed data generated by `SnappyOutputStream` with `Snappy.uncompress`. Here is the data format compatibility matrix:

The original Snappy format definition did not define a file format. A "framing"
format was added later to fill that gap, but by then major software was
already using an industry standard instead -- represented in this library by the
`SnappyOutputStream` and `SnappyInputStream` classes.

For interoperability with other libraries, check that compatible formats are used.
Note that not all libraries support all variants.

* `SnappyOutputStream` and `SnappyInputStream` use `[magic header:16 bytes]([block size:int32][compressed data:byte array])*` format. You can read the result of `Snappy.compress` with `SnappyInputStream`, but you cannot read the compressed data generated by `SnappyOutputStream` with `Snappy.uncompress`.
* `SnappyHadoopCompatibleOutputStream` does not emit a file header but writes out the current block size as a preamble to each block.
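
The stream layout described above can be sketched with a small reader that walks the block structure without decompressing anything. This is a minimal illustration using only the JDK, not code from this library: the class name `SnappyStreamLayout` and the method `blockSizes` are hypothetical, and the big-endian byte order of the block size field is an assumption that should be checked against the library source. It also suggests why `Snappy.uncompress` cannot read `SnappyOutputStream` output: the payload is preceded by a header and split into size-prefixed blocks rather than being a single raw compressed block.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of snappy-java.
final class SnappyStreamLayout {
    // Header length as stated in the compatibility note: 16 bytes.
    static final int HEADER_SIZE = 16;

    /** Returns the length of each compressed block in a header-plus-blocks stream. */
    static List<Integer> blockSizes(byte[] stream) {
        // ASSUMPTION: the int32 block size is stored big-endian.
        ByteBuffer buf = ByteBuffer.wrap(stream).order(ByteOrder.BIG_ENDIAN);
        if (buf.remaining() < HEADER_SIZE) {
            throw new IllegalArgumentException("stream is shorter than the header");
        }
        // Skip the magic header; a real reader would validate its contents.
        buf.position(HEADER_SIZE);

        List<Integer> sizes = new ArrayList<>();
        while (buf.hasRemaining()) {
            if (buf.remaining() < Integer.BYTES) {
                throw new IllegalArgumentException("truncated block size field");
            }
            int size = buf.getInt();
            if (size < 0 || size > buf.remaining()) {
                throw new IllegalArgumentException("invalid block size: " + size);
            }
            // Step over the compressed bytes; decompression is out of scope here.
            buf.position(buf.position() + size);
            sizes.add(size);
        }
        return sizes;
    }
}
```

Under the same assumptions, a reader for the `SnappyHadoopCompatibleOutputStream` variant would skip the header step, since that format emits no file header.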

#### Data format compatibility matrix:

| Write\Read | `Snappy.uncompress` | `SnappyInputStream` | `SnappyFramedInputStream` | `org.apache.hadoop.io.compress.SnappyCodec` |
| --------------- |:-------------------:|:------------------:|:-----------------------:|:-------------------------------------------:|
| `Snappy.compress` | ok | ok | x | x |