Releases · GoogleCloudDataproc/spark-bigquery-connector
0.19.0
New Features
- Issue #247: The connector can now load the results of an arbitrary SELECT query from BigQuery.
- Issue #310: The expiration time of materialized data is now configurable.
- PR #283: Implemented Datasource v2 write support.
- Improved Spark 3 compatibility.
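A minimal PySpark sketch of how the two new read features might be combined. The dataset, table, and query below are placeholders; the option names (`viewsEnabled`, `materializationDataset`, `materializationExpirationTimeInMinutes`, `query`) follow the connector's documented read options, and query-based reads assume views are enabled:

```python
# Sketch: reading the result of an arbitrary SELECT query (connector >= 0.19.0).
# All names are placeholders; option names follow the connector's read options.
read_options = {
    "viewsEnabled": "true",                          # prerequisite for query-based reads
    "materializationDataset": "tmp_dataset",         # where the query result is materialized
    "materializationExpirationTimeInMinutes": "30",  # Issue #310: expire temp tables after 30 min
    "query": "SELECT name, SUM(number) AS total FROM `dataset.table` GROUP BY name",
}

# With an active SparkSession the read would look like:
# df = spark.read.format("bigquery").options(**read_options).load()
```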
Dependency Updates
- BigQuery API has been upgraded to version 1.127.4
- BigQuery Storage API has been upgraded to version 1.10.0
- Guava has been upgraded to version 30.1-jre
- Netty has been upgraded to version 4.1.52.Final
0.18.1
New Features
- PR #276: Added the option to enable `useAvroLogicalTypes` when writing data to BigQuery.
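A sketch of what enabling this option on a write might look like. The table and bucket names are placeholders, and the surrounding write options are assumptions based on the connector's documented write path; `useAvroLogicalTypes` tells the BigQuery load job to honor Avro logical types:

```python
# Sketch: enabling Avro logical types on write (connector >= 0.18.1, PR #276).
# Destination table and staging bucket are placeholders.
write_options = {
    "table": "dataset.target_table",
    "temporaryGcsBucket": "my-bucket",
    "useAvroLogicalTypes": "true",   # honor Avro logical types in the load job
}

# df.write.format("bigquery").options(**write_options).mode("append").save()
```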
Bug Fixes
0.18.0
New Features
- Issue #226: Added support for HOUR, MONTH, and DAY time partitions
- Issue #260: Increased the connection timeout to the BigQuery service and made the request retry settings configurable.
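A sketch of writing to an hourly-partitioned table using the new partition types. Table, bucket, and column names are placeholders; the option names (`partitionField`, `partitionType`) are taken from the connector's documented write options:

```python
# Sketch: writing to a time-partitioned table (connector >= 0.18.0, Issue #226).
# All names are placeholders.
write_options = {
    "table": "dataset.events",
    "temporaryGcsBucket": "my-bucket",
    "partitionField": "event_ts",   # TIMESTAMP column to partition on
    "partitionType": "HOUR",        # newly supported: HOUR, DAY, MONTH
}

# df.write.format("bigquery").options(**write_options).save()
```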
Bug Fixes
- Issue #263: Fixed a `select *` error when ColumnarBatch is used (DataSource v2)
- Issue #266: Fixed a regression, introduced in version 0.17.2, in which external configuration was not applied
- PR #262: Filters on BigQuery DATE and TIMESTAMP now use the right type.
Dependency Updates
- BigQuery API has been upgraded to version 1.123.2
- BigQuery Storage API has been upgraded to version 1.6.0
- Guava has been upgraded to version 30.0-jre
- Netty has been upgraded to version 4.1.51.Final
- netty-tcnative has been upgraded to version 4.1.34.Final
0.17.3
0.17.2
0.17.1
New Features
- PR #229: Added support for Spark ML Vector and Matrix data types
Bug Fixes
- Issue #216: Removed a redundant ALPN dependency
- Issue #219: Fixed the LessThanOrEqual filter SQL compilation in the DataSource v2 implementation
- Issue #221: Fixed ProtobufUtilsTest.java with newer BigQuery dependencies
Dependency Updates
- BigQuery API has been upgraded to version 1.116.8
- BigQuery Storage API has been upgraded to version 1.3.1
0.17.0
New Features
- Structured streaming write is now supported (PR #201, thanks @varundhussa)
- Users now have the option to keep the data on GCS after writing to BigQuery (PR #202, thanks @leoneuwald)
- Data in a single date partition can now be overwritten (PR #211)
- `MATERIALIZED_VIEW` is now supported as a table type (PR #192)
- Columnar batch reads from Spark are supported in the DataSource v2 implementation (PR #198). It is not ready for production use.
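A sketch of overwriting a single date partition, as introduced in PR #211. The partition date, table, and bucket are placeholders; the `datePartition` option name is taken from the connector's documented write options:

```python
# Sketch: overwriting one date partition (connector >= 0.17.0, PR #211).
# All names and the partition date are placeholders.
overwrite_options = {
    "datePartition": "20200315",        # the single partition to replace
    "temporaryGcsBucket": "my-bucket",  # staging bucket for the write
}

# df.write.format("bigquery").options(**overwrite_options) \
#     .mode("overwrite").save("dataset.events")
```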
Bug Fixes
- Conditions on StructType fields are now handled by Spark rather than the connector, fixing Issue #197
Dependency Updates
- BigQuery API has been upgraded to version 1.116.3
- BigQuery Storage API has been upgraded to version 1.0.0
- Netty has been upgraded to version 4.1.48.Final (Fixing issue #200)
0.16.1
New Features
- Apache Arrow is now the default read format. Based on our benchmarking, Arrow provides read performance roughly 40% faster than Avro. (PR #180)
- Apache Avro has been added as an intermediate write format. Based on our testing, it shows performance improvements when the DataFrame is larger than 50GB (PR #163)
- Usage simplification: instead of the mandatory `table` option, users can now use the built-in `path` parameter of `load()` and `save()`, so that reading becomes `df = spark.read.format("bigquery").load("source_table")` and writing becomes `df.write.format("bigquery").save("target_table")` (PR #176)
- An experimental implementation of the DataSource v2 API has been added. It is not ready for production use.
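The format changes in this release can be sketched as follows. Arrow is already the default read format, so setting it explicitly is only for illustration; table and bucket names are placeholders, and the option names (`readDataFormat`, `intermediateFormat`) follow the connector's documentation:

```python
# Sketch: format options introduced or changed in 0.16.1. Names are placeholders.
read_options = {"readDataFormat": "ARROW"}  # PR #180: Arrow is now the default
write_options = {
    "intermediateFormat": "avro",           # PR #163: can help for DataFrames > 50GB
    "temporaryGcsBucket": "my-bucket",
}

# Using the built-in path parameter (PR #176):
# df = spark.read.format("bigquery").options(**read_options).load("source_table")
# df.write.format("bigquery").options(**write_options).save("target_table")
```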
Dependency Updates
- BigQuery API has been upgraded to version 1.116.1
- BigQuery Storage API has been upgraded to version 0.133.2-beta
- gRPC has been upgraded to version 1.29.0
- Guava has been upgraded to version 29.0-jre
0.15.1-beta
0.15.0-beta
- PR #150: Reading DataFrames should be quicker, especially in interactive usage such as in notebooks
- PR #154: Upgraded to the BigQuery Storage v1 API
- PR #146: Authentication can now be done with an access token, in addition to a credentials file, credentials, and the `GOOGLE_APPLICATION_CREDENTIALS` environment variable.
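A sketch of access-token authentication as added in PR #146. The `gcpAccessToken` option name is an assumption based on the connector's documentation, and the token value and table name are placeholders:

```python
# Sketch: authenticating with an OAuth2 access token (PR #146).
# The token value is a placeholder, e.g. from `gcloud auth print-access-token`.
auth_options = {
    "gcpAccessToken": "<oauth2-access-token>",  # assumed option name from connector docs
}

# df = spark.read.format("bigquery").options(**auth_options).load("dataset.table")
```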