Java utilities for IO. Requires Java 8+.
Status: released to Maven Central
Maven site reports are here including javadoc.
- OutputStreams as InputStreams using
IOUtil.pipe
BoundedBufferedReader
to trim long lines and avoid OutOfMemoryError callingreadLine()
when line too longStream<byte[]>
to InputStream usingIOUtil.toInputStream
Use this maven dependency:
<dependency>
<groupId>com.github.davidmoten</groupId>
<artifactId>io-extras</artifactId>
<version>VERSION_HERE</version>
</dependency>
If you have a transformation that you can express with OutputStream
s then you can apply that transformation synchronously to an InputStream
using this library.
An example is you want to pass an InputStream
to a library but you want that InputStream
to be compressed with gzip as well:
InputStream is = new FileInputStream("myfile.txt");
// zip the input stream!
InputStream gz = IOUtil.pipe(is, o -> new GZIPOutputStream(o));
In fact for gzip in particular there is a dedicated method:
InputStream gz = IOUtil.gzip(is);
Internally, the IOUtil.pipe
method uses a buffer so that data is read into a fixed length byte array. You can specify the bufferSize
like this:
// set the buffer size (default is 8192)
InputStream gz = IOUtil.pipe(is, o -> new GZIPOutputStream(o), 4096);
Data is passed through the transformation synchronously and this is achieved by reading data from the source InputStream
and passing that data through a transformed QueuedOutputStream
. Once bytes are passed through the transformation the output arrays are placed on a queue and the first item on the queue is then used as the input for the next read for the resultant InputStream
. Given that the amount requested by the client may be less than the size of the first item on the queue, the remnant may be placed back on the first position on the queue ready for the next read. If no data is placed on the queue then more data is read by the source and we continue till something is ready for output.
Unfortunately the InputStream.read
method blocks till a read is available so the only way to handle this is by using another thread. I haven't knocked anything up for this because I think there are products out there already.
When you call BufferedReader.readLine()
you can run out of memory if the line is too long. In a situation where your code is reading data from an uncontrolled source (like an HTTP POST from a user)
this can crash your web server/program which is obviously undesirable. To trim each line without provoking high memory usage:
BoundedBufferedReader br = new BoundedBufferedReader(reader, bufferSize, maxLineLength);
...
String line = br.readLine();
...
Note that BoundedBufferedReader
does make efficient use of its buffer. The code for the class is a copy and modification of BufferedReader
from JDK 8.
To test BoundedBufferedReader
with a really long line call
mvn clean test -Dn=1000000000
n
is the number of characters in a single long line of input. As a guide n = 2x10^9 takes about 25 seconds to complete.
InputStream in = IOUtil.toInputStream(stream);
An example of where this is useful is if you wanted to filter lines from an InputStream:
InputStream in = ...;
Stream<byte[]> lines = new BufferedReader(new InputStreamReader(in))
.lines()
// filter out empty lines and comment lines
.filter(line -> !line.trim().isEmpty() && !line.startsWith("#"))
// not perfect, may add an extra \n to the source!
// to fix use kool Stream with buffer operator
.map(line -> (line + "\n").getBytes(StandardCharsets.UTF_8));
InputStream in2 = IOUtil.toInputStream(lines);