Notes on using alternative compression (JENA-2181)
Show how Jena can be used with alternative compression formats beyond
those it supports natively
Rob Vesse committed Oct 28, 2021
1 parent 5ae8d21 commit 9eb1cf505a291b9a3fc44f87cb80bd3a4db4458f
Showing 2 changed files with 16 additions and 0 deletions.
@@ -72,6 +72,13 @@ In addition, if the extension is `.gz` the file is assumed to be gzip
compressed. The file name is examined for an inner extension. For
example, `.nt.gz` is gzip compressed N-Triples.

Jena does not support every compression format itself; only GZip, BZip2
and Snappy are supported directly. To use an alternative compression
format, pipe the output of the relevant decompression utility into one
of Jena's commands, e.g.

    zstd -d < FILE.nq.zst | riot --syntax NQ ...

These scripts call Java programs in the `riotcmd` package. For example:

    java -cp ... riotcmd.riot file.ttl
@@ -15,6 +15,15 @@ Files ending in `.gz` are assumed to be gzip-compressed. Input and output
to such files takes this into account, including looking for the other file
extension. `data.nt.gz` is parsed as a gzip-compressed N-Triples file.

Jena does not support every compression format itself; only GZip, BZip2
and Snappy are supported directly. To use an alternative compression
format, add suitable dependencies to your project and pass a suitable
`InputStream`/`OutputStream` implementation to Jena code, e.g.

    InputStream input = new ZstdCompressorInputStream(....);
    RDFParser.source(input).lang(Lang.NQ).parse(graph);
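
The pattern above can be sketched with the JDK alone. This is an illustration under stated assumptions, not Jena's own code: zlib/deflate from `java.util.zip` stands in for a third-party codec such as Zstandard, and the class and method names (`AltCompressionSketch`, `compress`, `open`) are hypothetical. The hand-off to `RDFParser` appears only as a comment because it needs Jena on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical sketch: names are illustrative, not part of Jena.
final class AltCompressionSketch {

    // Compress text with zlib/deflate (a stand-in for an external codec).
    static byte[] compress(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (OutputStream out = new DeflaterOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    // Wrap compressed bytes in a stream that decompresses as it is read.
    static InputStream open(byte[] compressed) {
        return new InflaterInputStream(new ByteArrayInputStream(compressed));
    }

    public static void main(String[] args) throws IOException {
        String nquads =
            "<http://example/s> <http://example/p> <http://example/o> <http://example/g> .\n";
        byte[] compressed = compress(nquads);

        try (InputStream input = open(compressed)) {
            // With Jena on the classpath, this stream would be handed over
            // as in the text above: RDFParser.source(input).lang(Lang.NQ)...
            // Here the decompressed bytes are simply read back and printed.
            String decoded = new String(input.readAllBytes(), StandardCharsets.UTF_8);
            System.out.print(decoded);
        }
    }
}
```

The consumer only ever sees a plain `InputStream`, so the choice of codec stays entirely in the calling code. Because the stream carries no file name, the syntax cannot be inferred from an extension and has to be stated explicitly, as `lang(Lang.NQ)` does in the example above.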

## StreamRDF

The central abstraction is
