feature: Add streaming I/O methods to FileSystem #458
Conversation
Add cross-platform streaming I/O APIs for processing large files without loading them entirely into memory:

- readLinesLazy: lazy line-by-line Iterator (truly lazy on JVM/Native)
- readChunks: fixed-size byte chunk Iterator
- readStream/writeStream: InputStream/OutputStream access
- readLinesRx/readChunksRx: Rx-based reactive streaming

JVM/Native iterators implement AutoCloseable for resource safety. JS falls back to eager reads since Node.js sync APIs are inherently eager.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Code Review
This pull request adds streaming and reactive I/O support to the FileSystem trait for JVM, JS, and Native platforms, including lazy line reading and chunked access. The review feedback highlights that the reactive methods currently load entire files into memory, defeating the purpose of streaming. Other identified issues include an inefficient append implementation in JavaScript, a lack of explicit UTF-8 encoding in the Native platform, and resource management concerns in the custom iterators regarding eager initialization and error handling.
```scala
/**
 * Reads the file line by line as a reactive stream. Each line is emitted as an OnNext event,
 * followed by OnCompletion.
 */
def readLinesRx(path: IOPath): Rx[String] = Rx.fromSeq(readLinesLazy(path).toSeq)
```
The implementation of readLinesRx is not truly streaming because it calls .toSeq on the lazy iterator. This exhausts the iterator and loads the entire file content into memory before creating the reactive stream, which defeats the purpose of using a lazy iterator for large files. If Rx supports it, consider using a method that creates a stream directly from an Iterator or a producer.
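The eagerness is easy to demonstrate without Rx at all, using a plain counting iterator (a minimal sketch; `Rx` is not involved, only the `.toSeq` behavior the comment objects to):

```scala
// Sketch: how many elements does .toSeq pull from a lazy iterator?
// This mirrors why Rx.fromSeq(readLinesLazy(path).toSeq) is not streaming:
// .toSeq drains the whole iterator up front.
var pulled = 0
val lazyLines: Iterator[String] =
  Iterator.tabulate(1000) { i => pulled += 1; s"line $i" }

// Consuming only the first 3 lines lazily produces only 3 elements.
val firstThree = lazyLines.take(3).toList
println(pulled) // 3 — lazy consumption

// .toSeq on a fresh iterator forces all 1000 elements immediately,
// even if the downstream consumer never needs most of them.
var pulledAll = 0
val eager = Iterator.tabulate(1000) { i => pulledAll += 1; s"line $i" }.toSeq
println(pulledAll) // 1000 — the whole file-equivalent is now in memory
```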
```scala
def readChunksRx(path: IOPath, chunkSize: Int = 8192): Rx[Array[Byte]] = Rx.fromSeq(
  readChunks(path, chunkSize).toSeq
)
```
```scala
case WriteMode.Append =>
  val existing = FileSystemJS.readBytes(path)
  FileSystemJS.writeBytes(path, existing ++ bytes, WriteMode.Create)
```
The Append mode implementation in FlushToFileOutputStream is highly inefficient as it reads the entire existing file into memory (readBytes), appends the new data, and then writes it all back. Since FileSystemJS.writeBytes already handles WriteMode.Append (which uses Node's appendFileSync or a browser-specific append), you should delegate to it directly.
```scala
case WriteMode.Append =>
  FileSystemJS.writeBytes(path, bytes, WriteMode.Append)
```

```scala
override def readLines(path: IOPath): Seq[String] = readString(path).split("\n").toSeq
```
```scala
override def readLinesLazy(path: IOPath): Iterator[String] =
  val reader = BufferedReader(FileReader(toJavaFile(path)))
```
```scala
private[io] class CloseableLineIterator(reader: BufferedReader)
    extends Iterator[String]
    with AutoCloseable:
  private var nextLine: String | Null = reader.readLine()
```
```scala
val line = nextLine
if line == null then
  throw java.util.NoSuchElementException("No more lines")
nextLine = reader.readLine()
```
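Pieced together from the fragments above, the look-ahead iterator might read as follows (a sketch, not the exact committed code): it buffers one line ahead so `hasNext` can answer without consuming input, and closes the reader once the last line has been handed out.

```scala
import java.io.BufferedReader

// Sketch of the one-line-look-ahead iterator from the diff fragments above.
class CloseableLineIterator(reader: BufferedReader)
    extends Iterator[String]
    with AutoCloseable:
  // Pre-read one line so hasNext never touches the stream.
  private var nextLine: String | Null = reader.readLine()

  override def hasNext: Boolean = nextLine != null

  override def next(): String =
    val line = nextLine
    if line == null then
      throw java.util.NoSuchElementException("No more lines")
    nextLine = reader.readLine()
    if nextLine == null then close() // release the reader at end of input
    line

  override def close(): Unit = reader.close()
```

For example, wrapping an in-memory reader yields the lines in order:

```scala
val it = CloseableLineIterator(java.io.BufferedReader(java.io.StringReader("a\nb\nc")))
val lines = it.toList // List("a", "b", "c"); reader is closed after the last line
```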
- Move CloseableLineIterator/CloseableChunkIterator to shared sources (uni-core/src/main/scala) to eliminate JVM/Native duplication
- Fix JS FlushToFileOutputStream append to use appendFileSync directly instead of read-concatenate-rewrite (avoids loading entire file)
- Fix Rx tests to use run() instead of toSeq (unsupported on Scala.js)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
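The shared CloseableChunkIterator mentioned in this commit is not shown in the diff; assuming it mirrors the line iterator's look-ahead and auto-close pattern, it could be sketched like this:

```scala
import java.io.InputStream

// Hypothetical sketch of a chunked byte iterator over an InputStream,
// following the same look-ahead/auto-close pattern as CloseableLineIterator.
class CloseableChunkIterator(in: InputStream, chunkSize: Int)
    extends Iterator[Array[Byte]]
    with AutoCloseable:
  private var nextChunk: Array[Byte] | Null = readChunk()

  private def readChunk(): Array[Byte] | Null =
    // Note: read may return fewer bytes than requested even mid-stream;
    // a production version would loop to fill the buffer.
    val buf = new Array[Byte](chunkSize)
    val n = in.read(buf)
    if n < 0 then null
    else if n == chunkSize then buf
    else buf.take(n) // short final chunk

  override def hasNext: Boolean = nextChunk != null

  override def next(): Array[Byte] =
    val chunk = nextChunk
    if chunk == null then throw java.util.NoSuchElementException("No more chunks")
    nextChunk = readChunk()
    if nextChunk == null then close()
    chunk

  override def close(): Unit = in.close()
```

With a 10-byte input and chunkSize = 4, this yields chunks of 4, 4, and 2 bytes:

```scala
val data = (0 until 10).map(_.toByte).toArray
val chunks = CloseableChunkIterator(java.io.ByteArrayInputStream(data), 4).toList
```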
- Use InputStreamReader with UTF-8 in Native readLinesLazy instead of FileReader, which uses the platform default encoding
- Simplify JS FlushToFileOutputStream to delegate directly to writeBytes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
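The encoding fix amounts to constructing the reader over an explicit charset. A minimal sketch of the pattern (file name and contents are illustrative): `FileReader(file)` picks up the platform default charset, whereas wrapping a `FileInputStream` in an `InputStreamReader` pins UTF-8 regardless of platform.

```scala
import java.io.{BufferedReader, FileInputStream, InputStreamReader}
import java.nio.charset.StandardCharsets
import java.nio.file.Files

// Write UTF-8 bytes containing non-ASCII characters to a temp file.
val tmp = Files.createTempFile("utf8-demo", ".txt")
Files.write(tmp, "héllo\nwörld".getBytes(StandardCharsets.UTF_8))

// Pin the charset explicitly instead of relying on the platform default.
val reader = BufferedReader(
  InputStreamReader(FileInputStream(tmp.toFile), StandardCharsets.UTF_8)
)
val lines =
  try Iterator.continually(reader.readLine()).takeWhile(_ != null).toList
  finally reader.close()

println(lines) // List(héllo, wörld) on any platform, whatever the default charset
Files.delete(tmp)
```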
readLinesRx/readChunksRx called .toSeq on the iterator, loading everything into memory before creating the Rx stream. This defeats the purpose of streaming, so remove them.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary

- Adds streaming I/O methods to FileSystem for processing large files without loading them entirely into memory
- readLinesLazy (lazy Iterator[String]), readChunks (lazy Iterator[Array[Byte]]), readStream (InputStream), writeStream (OutputStream)
- readLinesRx and readChunksRx for reactive streaming
- JVM/Native iterators implement AutoCloseable; JS uses an eager fallback since Node.js sync APIs are inherently eager

Test plan
🤖 Generated with Claude Code