v2.16.0
This release introduces a feature that can significantly improve the performance of reading Parquet files. Parquet storage, such as a data lake, usually consists of a huge number of files. How can we speed up reading from such storage? Simply by reading multiple files in parallel!
By default, Parquet4s reads files one by one, in sequence. Now, when using Akka, Pekko or FS2, you can choose a parallelism level and read multiple files at the same time while still controlling resource utilization. Simply use the option parallelism(n = ???) when defining your reader.
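As a rough illustration, here is a minimal sketch of what a parallel read might look like with the FS2 module. The `User` case class, the path `file:///data/users` and the parallelism level of 4 are hypothetical, and the exact position of `parallelism` in the builder chain is an assumption based on the option named above. Please check the documentation of your Parquet4s version before relying on it.

```scala
// Assumes the parquet4s-fs2 module (v2.16.0 or later) and Cats Effect 3 on the classpath.
import cats.effect.{IO, IOApp}
import com.github.mjakubowski84.parquet4s.Path
import com.github.mjakubowski84.parquet4s.parquet.fromParquet

// Hypothetical schema of the records stored in the Parquet files.
case class User(id: Long, name: String)

object ParallelReadExample extends IOApp.Simple {

  def run: IO[Unit] =
    fromParquet[IO]
      .as[User]
      // Assumed usage: read up to 4 files at the same time.
      .parallelism(n = 4)
      // Hypothetical directory containing many Parquet files.
      .read(Path("file:///data/users"))
      .evalMap(user => IO.println(user))
      .compile
      .drain
}
```

A higher value of `n` allows more files to be open at once, so it is probably best tuned against the memory and I/O capacity available to the application.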
Besides that, there were numerous minor and bugfix dependency updates, e.g. in Pekko, Cats Effect, FS2 and Slf4j.
Big thanks to @calvinlfer for his contribution.