Releases · GoogleCloudDataproc/spark-bigquery-connector

20 Dec 16:06

dataproc-robot

0.35.0

847abe0

PR #1115: Added a new connector, spark-3.5-bigquery, intended for use with Spark 3.5. This connector implements new APIs and capabilities provided by the Spark Data Source V2 API.
PR #1117: Make read session caching duration configurable
PR #1118: Improve read session caching key
PR #1122: Set traceId on write
PR #1124: Added SparkListenerEvents for Query and Load jobs running on BigQuery
PR #1127: Fix job labeling for mixed case Dataproc job names
PR #1136: Consider projections for BigLake stats
PR #1143: Enable async write for default stream
BigQuery API has been upgraded to version 2.35.0
BigQuery Storage API has been upgraded to version 2.47.0
GAX has been upgraded to version 2.38.0
gRPC has been upgraded to version 1.60.0
Netty has been upgraded to version 4.1.101.Final
Protocol Buffers has been upgraded to version 3.25.1
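
The configurable caching duration from PR #1117 is set through reader options. The following is a minimal sketch, assuming the option names `enableReadSessionCaching` and `readSessionCacheDurationMins` as listed in the connector README (they do not appear in these notes). The helper `read_session_cache_options` is hypothetical and only builds the option map.

```python
def read_session_cache_options(duration_mins: int) -> dict[str, str]:
    """Build reader options that keep read session caching on for a given duration.

    Option names are assumed from the connector README, not from these release notes.
    """
    if duration_mins <= 0:
        raise ValueError("duration_mins must be positive")
    return {
        "enableReadSessionCaching": "true",
        "readSessionCacheDurationMins": str(duration_mins),
    }


if __name__ == "__main__":
    # With a SparkSession `spark` and the connector on the classpath, usage
    # would look roughly like:
    #   spark.read.format("bigquery").options(**opts).load("project.dataset.table")
    opts = read_session_cache_options(10)
    print(opts)
```

Keeping the options in a plain dictionary makes them easy to inspect or log before they are passed to the reader.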

31 Oct 21:26

dataproc-robot

0.34.0

155470c

PR #1057: Enable async writes for greater throughput
PR #1094: CVE-2023-5072: Upgrading the org.json:json dependency
PR #1095: CVE-2023-4586: Upgrading the netty dependencies
PR #1104: Fixed nested field predicate pushdown
PR #1109: Enable read session caching by default for faster Spark planning
PR #1111: Enable retry of failed messages
Issue #103: Support for dynamic partition overwrite for time and range partitioned tables
Issue #1099: Fixing the usage of ExternalAccountCredentials
BigQuery API has been upgraded to version 2.33.2
BigQuery Storage API has been upgraded to version 2.44.0
GAX has been upgraded to version 2.35.0
gRPC has been upgraded to version 1.58.0
Protocol Buffers has been upgraded to version 3.24.4
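
Dynamic partition overwrite (Issue #103) is driven by Spark's own `spark.sql.sources.partitionOverwriteMode` setting together with an overwrite-mode write. The sketch below assumes the connector's `partitionField` write option and uses hypothetical helper and table names. Other write options, such as a temporary bucket for indirect writes, are left out.

```python
# Spark setting that switches overwrite mode from "replace the whole table"
# to "replace only the partitions present in the DataFrame".
DYNAMIC_OVERWRITE_CONF = ("spark.sql.sources.partitionOverwriteMode", "DYNAMIC")


def dynamic_overwrite_write_options(table: str, partition_field: str) -> dict[str, str]:
    """Build writer options for a partitioned destination table.

    `partitionField` is assumed from the connector README; the helper is illustrative.
    """
    if not table or not partition_field:
        raise ValueError("table and partition_field are required")
    return {"table": table, "partitionField": partition_field}


if __name__ == "__main__":
    # With a SparkSession `spark` and a DataFrame `df`, usage would look roughly like:
    #   spark.conf.set(*DYNAMIC_OVERWRITE_CONF)
    #   df.write.format("bigquery").options(**opts).mode("overwrite").save()
    opts = dynamic_overwrite_write_options("dataset.events", "event_date")
    print(DYNAMIC_OVERWRITE_CONF, opts)
```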

17 Oct 23:24

dataproc-robot

0.33.0

352cf0e

Added a new connector, spark-3.4-bigquery, intended for use with Spark 3.4 and above. This connector implements new APIs and capabilities provided by the Spark Data Source V2 API.
PR #1008: Adding support to expose BigQuery metrics using Spark custom metrics API.
PR #1038: Logical plan now shows the BigQuery table of DirectBigQueryRelation. Thanks @idc101!
PR #1058: View names will appear in query plan instead of the materialized table
PR #1061: Handle NPE case when reading BQ table with NUMERIC fields. Thanks @hayssams!
PR #1069: Support TimestampNTZ datatype in Spark 3.4
Issue #453: fix comment handling in query
Issue #144: allow writing Spark String to BQ TIME type
Issue #867: Support writing with RangePartitioning
Issue #1046: Add a way to disable map type support
Issue #1062: Adding dataproc job ID and UUID labels to BigQuery jobs

Contributors

hayssams and idc101

07 Aug 18:35

dataproc-robot

0.32.2

4e50c95

CVE-2023-34462: Upgrading Netty to version 4.1.96.Final

04 Aug 02:16

dataproc-robot

0.32.1

9851337

PR #1025: Handle Java 8 types for dates and timestamps when compiling filters. Thanks @tom-s-powell!
Issue #1026: Fixing Numeric conversion
Issue #1028: Fixing PolicyTags removal on overwrite

Contributors

tom-s-powell

18 Jul 16:28

dataproc-robot

0.32.0

ba3b310

Issue #748: _PARTITIONDATE pseudo column is provided only for ingestion time daily partitioned tables
Issue #990: Fix to support allowFieldAddition for columns with nested fields.
Issue #993: Spark ML vector read and write fails
PR #1007: Implement at-least-once option that utilizes default stream
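
The at-least-once option from PR #1007 is enabled through write options. The following is a minimal sketch, assuming the option names `writeMethod` and `writeAtLeastOnce` from the connector README and assuming, as that README states, that the option applies to the direct write method and not to overwrite mode. The helper name and table name are hypothetical.

```python
def at_least_once_write_options(table: str) -> dict[str, str]:
    """Build writer options for an at-least-once write on the default stream.

    Option names are assumed from the connector README, not from these release notes.
    """
    if not table:
        raise ValueError("table is required")
    return {
        "table": table,
        "writeMethod": "direct",      # at-least-once is assumed to require direct writes
        "writeAtLeastOnce": "true",   # trades exactly-once delivery for lower overhead
    }


if __name__ == "__main__":
    # With a DataFrame `df` and the connector on the classpath, usage would
    # look roughly like:
    #   df.write.format("bigquery").options(**opts).mode("append").save()
    opts = at_least_once_write_options("dataset.events")
    print(opts)
```

Because rows may be delivered more than once, this mode suits append workloads where downstream consumers tolerate or deduplicate repeats.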

06 Jun 20:31

dataproc-robot

0.31.1

f14e8a5

Issue #988: Read statistics are logged at TRACE level. Update the log4j configuration accordingly in order to log them.

02 Jun 15:23

dataproc-robot

0.31.0

2810182

⚠️ Breaking Change: BigNumeric conversion has changed, and it is now converted to Spark's Decimal data type. Note that BigNumeric can have a wider precision than Decimal, so additional settings may be needed. See the connector documentation for additional details.
Issue #945: Fixed inability to add a new column even with the allowFieldAddition option
PR #965: Fix to reuse the same BigQueryClient for the same BigQueryConfig, rather than creating a new one
PR #950: Added support for service account impersonation
PR #960: Added support for basic configuration of the gRPC channel pool size in the BigQueryReadClient.
PR #973: Added support for writing to CMEK managed tables.
PR #971: Fixing wrong results or schema error when Spark nested schema pruning is on for datasource v2
PR #974: Applying DPP to Hive partitioned BigLake tables (spark-3.2-bigquery and spark-3.3-bigquery only)
PR #986: CVE-2020-8908, CVE-2023-2976: Upgrading Guava to version 32.0-jre
BigQuery API has been upgraded to version 2.26.0
BigQuery Storage API has been upgraded to version 2.36.1
GAX has been upgraded to version 2.26.0
gRPC has been upgraded to version 1.55.1
Netty has been upgraded to version 4.1.92.Final
Protocol Buffers has been upgraded to version 3.23.0
PR #957: Support direct write with a subset of the field list.
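
The BigNumeric breaking change in this release comes down to a precision limit: Spark's Decimal type holds at most 38 digits, while a parameterized BigQuery BIGNUMERIC(P, S) allows P up to S + 38 with S up to 38. The sketch below checks whether a given BIGNUMERIC fits and builds fallback options, assuming the option names `bigNumericDefaultPrecision` and `bigNumericDefaultScale` from the connector README. Both helpers are hypothetical.

```python
SPARK_MAX_DECIMAL_PRECISION = 38  # upper bound of Spark's DecimalType precision


def fits_spark_decimal(precision: int, scale: int) -> bool:
    """Return True if BIGNUMERIC(precision, scale) can be held by a Spark Decimal."""
    return 0 <= scale <= precision <= SPARK_MAX_DECIMAL_PRECISION


def bignumeric_default_options(precision: int, scale: int) -> dict[str, str]:
    """Build reader options giving unparameterized BigNumeric fields a Spark-safe type.

    Option names are assumed from the connector README, not from these release notes.
    """
    if not fits_spark_decimal(precision, scale):
        raise ValueError("precision/scale exceed what a Spark Decimal can hold")
    return {
        "bigNumericDefaultPrecision": str(precision),
        "bigNumericDefaultScale": str(scale),
    }


if __name__ == "__main__":
    print(fits_spark_decimal(38, 9))   # within Spark's limit
    print(fits_spark_decimal(76, 38))  # valid in BigQuery, too wide for Spark
    print(bignumeric_default_options(38, 9))
```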

11 Apr 16:12

dataproc-robot

0.30.0

1f9ed24

New connectors are out of preview and are now generally available! This includes all the new connectors: spark-2.4-bigquery, spark-3.1-bigquery, spark-3.2-bigquery and spark-3.3-bigquery are GA and ready to be used in all workloads. Please refer to the compatibility matrix when using them.
Direct write method is out of preview and is now generally available!
spark-bigquery-with-dependencies_2.11 is no longer published. If a recent version of the Scala 2.11 connector is needed, it can be built by checking out the code and running ./mvnw install -Pdsv1_2.11.
Issue #522: Supporting Spark's Map type. Note that there are a few restrictions, as this is not a BigQuery native type.
Added support for reading BigQuery table snapshots.
BigQuery API has been upgraded to version 2.24.4
BigQuery Storage API has been upgraded to version 2.34.2
GAX has been upgraded to version 2.24.0
gRPC has been upgraded to version 1.54.0
Netty has been upgraded to version 4.1.90.Final
PR #944: Added support to set query job priority

03 Mar 19:54

dataproc-robot

0.29.0

21558d5

Added two new connectors, spark-3.2-bigquery and spark-3.3-bigquery, intended for use with Spark 3.2 and 3.3 respectively. These connectors implement new APIs and capabilities provided by the Spark Data Source V2 API. Both connectors are in preview mode.
Dynamic partition pruning is supported in preview mode by spark-3.2-bigquery and spark-3.3-bigquery.
This is the last version of the Spark BigQuery connector for Scala 2.11. The code will remain in the repository and can be compiled into a connector if needed.
PR #857: Fixing autovalue shaded classes repackaging
BigQuery API has been upgraded to version 2.22.0
BigQuery Storage API has been upgraded to version 2.31.0
GAX has been upgraded to version 2.23.0
gRPC has been upgraded to version 1.53.0
Netty has been upgraded to version 4.1.89.Final
