Reactive file readers based on NIO and Flow API that could be integrated with Flux, RxJava.
LineReader provides alternative for Stream<String>
from Files.lines(Path path) and reads lines by ByteBuffers
. Almost always LineReader
consumes only 32KB of heap memory and its consumption isn't depend on file's size (uses additional memory only for lines that are greater than 32768 characters).
Also there's FileReader for simple reactive reading.
Printing all lines
new LineReader(Paths.get("resource.txt")).subscribe(new Flow.Subscriber<>() {
@Override
public void onSubscribe(Flow.Subscription s) { s.request(Long.MAX_VALUE); }
@Override
public void onNext(ByteBuffer line) {
System.out.println(new String(line.array(), line.position(), line.remaining(), UTF_8));
}
@Override
public void onError(Throwable t) { t.printStackTrace(); }
@Override
public void onComplete() { }
});
Getting count of non-empty lines
new LineReader(Paths.get("resource.txt")).subscribe(new Flow.Subscriber<>() {
int count;
@Override
public void onSubscribe(Flow.Subscription s) { s.request(Long.MAX_VALUE); }
@Override
public void onNext(ByteBuffer line) { if (line.hasRemaining()) ++count; }
@Override
public void onError(Throwable t) { t.printStackTrace(); }
@Override
public void onComplete() { System.out.println("The file contains " + count + " non-empty lines."); }
});
When there's no need to process data via String
representation or it's required to know only existance of a line (e.g. getting line count), using of LineReader
gives significant speed up.
On my machine (Core i5-3317U, Ubuntu 14.04, Oracle JDK 9.0.4) getting line count from 8MB file took:
- 12.625 ± 0.351 ms (by
LineReader
) - 48.354 ± 1.890 ms (by
Stream<String>
) - 24.316 ± 0.696 ms (by parallel
Stream<String>
)
This project is licensed under Apache License, version 2.0