0.16.1
New Features
- Apache Arrow is now the default read format. Based on our benchmarking, Arrow provides read performance that is 40% faster than Avro. (PR #180)
- Apache Avro has been added as a write intermediate format. Based on our testing, it improves performance when the DataFrame is larger than 50 GB. (PR #163)
- Usage simplification: instead of setting the mandatory `table` option, users can now pass the table name through the built-in `path` parameter of `load()` and `save()`, so that a read becomes `df = spark.read.format("bigquery").load("source_table")` and a write becomes `df.write.format("bigquery").save("target_table")`. (PR #176)
- An experimental implementation of the DataSource v2 API has been added. It is not ready for production use.
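The usage simplification can be illustrated with a minimal sketch. The helper names `read_table`, `read_table_legacy`, and `write_table` are hypothetical and exist only for this example; `spark` is assumed to be an active SparkSession with the connector on its classpath, and the table names are placeholders. A real write job may need further configuration that these notes do not cover.

```python
# Minimal sketch of the old and new call styles described above.
# The helper names are hypothetical; only the chained calls come
# from the release notes.

def read_table_legacy(spark, table):
    """Earlier style: the table is passed through the `table` option."""
    return spark.read.format("bigquery").option("table", table).load()


def read_table(spark, table):
    """New style: the table is passed as the `path` argument of load()."""
    return spark.read.format("bigquery").load(table)


def write_table(df, table):
    """New style: the table is passed as the `path` argument of save()."""
    df.write.format("bigquery").save(table)
```

With a live session, `df = read_table(spark, "source_table")` followed by `write_table(df, "target_table")` mirrors the two one-liners quoted in the notes.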
Dependency Updates
- BigQuery API has been upgraded to version 1.116.1
- BigQuery Storage API has been upgraded to version 0.133.2-beta
- gRPC has been upgraded to version 1.29.0
- Guava has been upgraded to version 29.0-jre