Added
RowReader can now be transformed into an asynchronous reactive stream of incoming rows by calling the appropriately named rows() method. With a few simple operators, you can use this stream to add map/reduce, transforms, and parallelism to your data processing. These operators are provided by RxJava, which is now a dependency of Tabitha.
Readers now take a ReaderOptions, which makes it easier to customize runtime options for reading.
The page-related methods have been removed from RowReader and files are treated as a continuous stream of rows across all pages. To get data for specific pages, you can emulate the old behavior easily with rows() and either grouping by or filtering on the page number.
All Rows from a reader now "remember" their position in the source file. Check the page index and row index of the row using the page() and index() methods, respectively.
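As a rough illustration of the grouping and filtering described above, here is a minimal sketch of emulating per-page access from a flat sequence of rows. It deliberately uses plain java.util.stream instead of RxJava so that it runs without dependencies; RxJava's Observable offers comparable groupBy and filter operators. The Row class below is a hypothetical stand-in, not Tabitha's Row: only the page() and index() method names come from these notes, and their int return types, the value field, and the helper names groupByPage and rowsOnPage are assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class PageGroupingSketch {
    // Hypothetical stand-in for a row that remembers its position.
    // Only the method names page() and index() come from the release notes;
    // the int types and the value field are assumptions for illustration.
    public static final class Row {
        private final int page;
        private final int index;
        private final String value;

        public Row(int page, int index, String value) {
            this.page = page;
            this.index = index;
            this.value = value;
        }

        public int page() { return page; }
        public int index() { return index; }
        public String value() { return value; }
    }

    // Group rows by page index, keeping the original row order within each page.
    public static Map<Integer, List<Row>> groupByPage(List<Row> rows) {
        return rows.stream()
                .collect(Collectors.groupingBy(Row::page, TreeMap::new, Collectors.toList()));
    }

    // Keep only the rows that belong to a single page.
    public static List<Row> rowsOnPage(List<Row> rows, int page) {
        return rows.stream()
                .filter(row -> row.page() == page)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // A continuous sequence of rows spanning two pages.
        List<Row> rows = List.of(
                new Row(0, 0, "a"),
                new Row(0, 1, "b"),
                new Row(1, 0, "c"));

        Map<Integer, List<Row>> byPage = groupByPage(rows);
        for (Map.Entry<Integer, List<Row>> entry : byPage.entrySet()) {
            System.out.println("page " + entry.getKey() + ": " + entry.getValue().size() + " rows");
        }

        for (Row row : rowsOnPage(rows, 1)) {
            System.out.println("page 1, row " + row.index() + ": " + row.value());
        }
    }
}
```

Grouping suits the case where every page is processed in turn, while filtering is cheaper when only one page is of interest.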
Changed
Quite a few classes have been renamed or moved between packages. The "entrypoint" classes RowReaderFactory and RowWriterFactory have been shortened to RowReaders and RowWriters.
Row writers no longer work in terms of Rows, but instead write List<Variant> as rows. This makes it much easier to generate data in the right format for writing.
Creating a writer with an ambiguous format no longer assumes CSV; the format must be explicit.
Fixed
Fixed a bug in the XLSX reader affecting text cells that store their data inline instead of using the string table.
Removed
DataFrame has been removed.
Parallel processing utilities have been removed. Parallel processing can now be done using rows(), which exposes RxJava's much more powerful parallel processing abilities.