Releases: GoogleCloudDataproc/spark-bigquery-connector

0.27.1

18 Oct 22:09
  • PR #792: Added ability to set table labels while writing to a BigQuery table
  • PR #796: Allowing custom BigQuery API endpoints
  • PR #803: Removed grpc-netty-shaded from the connector jar
  • Protocol Buffers has been upgraded to version 3.21.7, addressing CVE-2022-3171
  • BigQuery API has been upgraded to version 2.16.1
  • BigQuery Storage API has been upgraded to version 2.21.0
  • gRPC has been upgraded to version 1.49.1
  • Netty has been upgraded to version 4.1.82.Final

0.27.0

21 Sep 20:08
  • Added a new Scala 2.13 connector, aimed at Spark versions 3.2 and above
  • PR #750: Adding support for custom access token creation. See more here.
  • PR #745: Supporting load from query in spark-3.1-bigquery.
  • PR #767: Adding the option createReadSessionTimeoutInSeconds, to override the timeout for CreateReadSession.
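
As a rough illustration of the createReadSessionTimeoutInSeconds option from PR #767, the sketch below builds the read options as a plain Python dictionary. The helper name build_read_options and the 600-second value are hypothetical; only the option name comes from the notes above. The PySpark call is left as a comment because it needs a running Spark session with the connector available.

```python
def build_read_options(timeout_seconds):
    """Return read options that override the CreateReadSession timeout.

    build_read_options is a hypothetical helper, not part of the connector.
    """
    # Data source options are passed to Spark as strings, so convert explicitly.
    return {"createReadSessionTimeoutInSeconds": str(int(timeout_seconds))}


opts = build_read_options(600)
print(opts)

# Assumed usage inside a PySpark job (table name is a placeholder):
# df = spark.read.format("bigquery").options(**opts).load("dataset.table")
```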

0.26.0

18 Jul 17:44
  • All connectors support the DIRECT write method, using the BigQuery Storage Write API,
    without first writing the data to GCS. DIRECT write method is in preview mode.
  • spark-3.1-bigquery has been released in preview mode. This is a Java only library,
    implementing the Spark 3.1 DataSource v2 APIs.
  • BigQuery API has been upgraded to version 2.13.8
  • BigQuery Storage API has been upgraded to version 2.16.0
  • gRPC has been upgraded to version 1.47.0
  • Netty has been upgraded to version 4.1.79.Final
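
A minimal sketch of selecting the DIRECT write method mentioned in the 0.26.0 notes, assuming it is chosen through the writeMethod option (the name these release notes use elsewhere). The helper direct_write_options and the table name in the comment are hypothetical.

```python
def direct_write_options(extra=None):
    """Return write options that select the DIRECT write method.

    direct_write_options is a hypothetical helper; only the writeMethod
    option and its DIRECT value are taken from the release notes.
    """
    opts = dict(extra or {})  # copy so the caller's dictionary is not mutated
    opts["writeMethod"] = "DIRECT"
    return opts


opts = direct_write_options()
print(opts)

# Assumed usage inside a PySpark job (table name is a placeholder):
# df.write.format("bigquery").options(**opts).mode("append").save("dataset.table")
```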

0.25.2

23 Jun 00:55
  • PR #673: Added integration tests for BigLake external tables.
  • PR #674: Increasing default maxParallelism to 10K for BigLake external tables

0.25.1

13 Jun 21:33
  • Issue #651: Fixing the write back to BigQuery.
  • PR #664: Add support for BigLake external tables.
  • PR #667: Allowing clustering on unpartitioned tables.
  • PR #668: Using spark default parallelism as default.

0.25.0

31 May 21:57
  • Issue #593: Allow users to disable cache when loading data via SQL query,
    by setting cacheExpirationTimeInMinutes=0
  • PR #613: Added field level schema checks. This can be disabled by setting
    enableModeCheckForSchemaFields=false
  • PR #618: Added support for the enableListInterface option. This allows
    Parquet to be used as the intermediate format for arrays as well, without
    adding the list element to the resulting schema, as described here
  • PR #641: Removed Conscrypt from the shaded artifact in order to improve
    compatibility with Dataproc Serverless and with clusters where Conscrypt is
    disabled.
  • BigQuery API has been upgraded to version 2.10.6
  • BigQuery Storage API has been upgraded to version 2.12.0
  • gRPC has been upgraded to version 1.46.0
  • Netty has been upgraded to version 4.1.75.Final
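
The three options named in the 0.25.0 notes can be collected in one place, as in the sketch below. The helper build_0_25_options and its parameters are hypothetical. The values 0 and false come from the notes; passing enableListInterface as a true/false string is an assumption, since the notes do not state its value format.

```python
def build_0_25_options(disable_query_cache=False, mode_check=True, list_interface=None):
    """Collect the options introduced in 0.25.0 into a dictionary.

    build_0_25_options is a hypothetical helper, not part of the connector.
    """
    opts = {}
    if disable_query_cache:
        # Issue #593: 0 disables the cache when loading data via SQL query.
        opts["cacheExpirationTimeInMinutes"] = "0"
    if not mode_check:
        # PR #613: switches off the field-level schema checks.
        opts["enableModeCheckForSchemaFields"] = "false"
    if list_interface is not None:
        # PR #618: assumed here to be a boolean flag.
        opts["enableListInterface"] = "true" if list_interface else "false"
    return opts


opts = build_0_25_options(disable_query_cache=True, mode_check=False)
print(opts)
```

Leaving every argument at its default returns an empty dictionary, so the connector's own defaults apply.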

0.24.2

05 Apr 21:16

Bug Fixes

  • PR #580: Fixed version flattening of the shaded artifacts; the version now
    appears correctly in the released POM
  • PR #583: netty-tcnative is taken from the Netty BOM
  • PR #584: Upgraded Jackson to address CVE-2020-36518

0.24.1

05 Apr 21:13

Bug Fixes

  • PR #576: Fixed an error when running on Dataproc clusters where Conscrypt is
    disabled (the property dataproc.conscrypt.provider.enable is set to false)

0.24.0

25 Mar 00:58

New Features

  • PR #518: Cache expiration time can be configured now.
  • PR #561: Added support for adding a trace ID to the BigQuery reads and
    writes. The trace ID has the format Spark:ApplicationName:JobID. The
    application name must be set by the user; the job ID defaults to the
    Dataproc job ID if one exists, otherwise it is set to spark.app.id.
  • PR #568: Added support for BigQuery job labels
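
The trace ID rule from PR #561 can be sketched as a small function. This mirrors only the description in the notes and is not the connector's implementation; build_trace_id, its parameters, and the sample values are hypothetical.

```python
def build_trace_id(application_name, spark_app_id, dataproc_job_id=None):
    """Compose a trace ID of the form Spark:<application name>:<job ID>.

    build_trace_id is a hypothetical helper that follows the release note.
    """
    if not application_name:
        # The notes say the application name must be set by the user.
        raise ValueError("application name must be set")
    # Prefer the Dataproc job ID when present, otherwise use spark.app.id.
    job_id = dataproc_job_id if dataproc_job_id else spark_app_id
    return f"Spark:{application_name}:{job_id}"


print(build_trace_id("nightly-etl", "app-0001"))
# prints Spark:nightly-etl:app-0001
print(build_trace_id("nightly-etl", "app-0001", dataproc_job_id="job-42"))
# prints Spark:nightly-etl:job-42
```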

Bug Fixes

  • PR #563: Fixed a bug where, when using writeMethod=DIRECT and
    SaveMode=Append, the destination table could be deleted if abort() was
    called.
  • Issue #530: Treating Field.mode==null as Nullable

Dependency Updates

  • BigQuery API has been upgraded to version 2.9.4
  • BigQuery Storage API has been upgraded to version 2.11.0
  • gRPC has been upgraded to version 1.44.1
  • Netty has been upgraded to version 4.1.73.Final

0.23.2

20 Jan 18:32

New Features

  • PR #521: Added Arrow compression options to the spark-bigquery-with-dependencies_2.* connectors
  • PR #526: Added the option to use the parent project for the metadata/jobs API as well

Dependency Updates

  • BigQuery API has been upgraded to version 2.3.3
  • BigQuery Storage API has been upgraded to version 2.4.2
  • gRPC has been upgraded to version 1.42.1
  • Netty has been upgraded to version 4.1.70.Final