
Commit: prepare release 0.16.1
davidrabinowitz committed Jun 11, 2020
1 parent d8bf8db commit 3572f91
Showing 3 changed files with 21 additions and 13 deletions.
15 changes: 11 additions & 4 deletions CHANGES.md
@@ -1,12 +1,19 @@
# Release Notes

+## 0.16.1 - 2020-06-11
+* PR #186: Fixed SparkBigQueryConnectorUserAgentProvider initialization bug
+
## 0.16.0 - 2020-06-09
-* Apache Arrow is not the default read format. Based on our benchmarking, Arrow provides read
-  performance faster by 40% then Avro. (PR #180)
-* Usage simplification: Now instead of using the `table` mandatory option, user can use the built
+**Please don't use this version, use 0.16.1 instead**
+
+* PR #180: Apache Arrow is now the default read format. Based on our benchmarking, Arrow provides read
+  performance 40% faster than Avro.
+* PR #163: Apache Avro was added as a write intermediate format. It shows better performance than Parquet
+  on large (>50GB) datasets. The spark-avro package must be added at runtime in order to use this format.
+* PR #176: Usage simplification: now instead of using the mandatory `table` option, users can use the built
  in `path` parameter of `load()` and `save()`, so that read becomes
  `df = spark.read.format("bigquery").load("source_table")` and write becomes
-  `df.write.format("bigquery").save("target_table")` (PR #176)
+  `df.write.format("bigquery").save("target_table")`
* An experimental implementation of the DataSource v2 API has been added. **It is not ready for
production use.**
* BigQuery API has been upgraded to version 1.116.1
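
For reference, the simplified read/write path introduced by PR #176 looks like this in PySpark (a sketch, not part of the diff; the dataset/table names and the `temporaryGcsBucket` value are placeholders):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Read: the source table is passed as the load() path instead of the
# `table` option ("dataset.source_table" is a placeholder).
df = spark.read.format("bigquery").load("dataset.source_table")

# Write: the target table is passed as the save() path. A GCS bucket for
# the intermediate files is still required ("some-bucket" is a placeholder).
df.write.format("bigquery") \
    .option("temporaryGcsBucket", "some-bucket") \
    .save("dataset.target_table")
```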
17 changes: 9 additions & 8 deletions README.md
@@ -76,8 +76,8 @@ repository. It can be used with the `--packages` option or the

| Scala version | Connector Artifact |
| --- | --- |
-| Scala 2.11 | `com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.16.0` |
-| Scala 2.12 | `com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.16.0` |
+| Scala 2.11 | `com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.16.1` |
+| Scala 2.12 | `com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.16.1` |

## Hello World Example

@@ -278,7 +278,8 @@ The API supports a number of options to configure the read
<td><code>intermediateFormat</code>
</td>
<td>The format of the data before it is loaded to BigQuery, values can be
-  either "parquet" or "orc".
+  either "parquet", "orc" or "avro". In order to use the Avro format, the
+  spark-avro package must be added at runtime.
<br/>(Optional. Defaults to <code>parquet</code>). On write only.
</td>
<td>Write</td>
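
As a concrete illustration of the new option, a hedged PySpark sketch of a write that opts into the Avro intermediate format (the spark-avro coordinates and version below are assumptions for a Scala 2.11, Spark 2.4 cluster; `df` and the bucket name are placeholders):

```python
# Spark must be launched with spark-avro on the classpath, e.g.
#   spark-submit --packages org.apache.spark:spark-avro_2.11:2.4.5 app.py
# (match the spark-avro version to your Spark version).
# `df` is assumed to be an existing DataFrame.
df.write.format("bigquery") \
    .option("intermediateFormat", "avro") \
    .option("temporaryGcsBucket", "some-bucket") \
    .save("dataset.target_table")
```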
@@ -536,7 +537,7 @@ using the following code:
```python
from pyspark.sql import SparkSession
spark = SparkSession.builder\
-  .config("spark.jars.packages", "com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.16.0")\
+  .config("spark.jars.packages", "com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.16.1")\
.getOrCreate()
df = spark.read.format("bigquery")\
.load("dataset.table")
```
@@ -545,15 +546,15 @@ df = spark.read.format("bigquery")\
**Scala:**
```scala
val spark = SparkSession.builder
-  .config("spark.jars.packages", "com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.16.0")
+  .config("spark.jars.packages", "com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.16.1")
.getOrCreate()
val df = spark.read.format("bigquery")
.load("dataset.table")
```

If the Spark cluster uses Scala 2.12 (optional for Spark 2.4.x,
mandatory in 3.0.x), then the relevant package is
-com.google.cloud.spark:spark-bigquery-with-dependencies_**2.12**:0.16.0. In
+com.google.cloud.spark:spark-bigquery-with-dependencies_**2.12**:0.16.1. In
order to know which Scala version is used, please run the following code:

**Python:**
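The snippet itself is collapsed in this view; a minimal sketch of such a check (assuming an existing SparkSession named `spark` and the standard py4j JVM gateway) is:

```python
# Prints the Scala version of the Spark JVM, e.g. "version 2.11.12".
# `spark` is assumed to be an existing SparkSession.
print(spark.sparkContext._jvm.scala.util.Properties.versionString())
```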
@@ -577,14 +578,14 @@ To include the connector in your project:
<dependency>
<groupId>com.google.cloud.spark</groupId>
<artifactId>spark-bigquery-with-dependencies_${scala.version}</artifactId>
-  <version>0.16.0</version>
+  <version>0.16.1</version>
</dependency>
```

### SBT

```sbt
-libraryDependencies += "com.google.cloud.spark" %% "spark-bigquery-with-dependencies" % "0.16.0"
+libraryDependencies += "com.google.cloud.spark" %% "spark-bigquery-with-dependencies" % "0.16.1"
```

## Building the Connector
2 changes: 1 addition & 1 deletion build.sbt
@@ -4,7 +4,7 @@ lazy val sparkVersion = "2.4.0"

lazy val commonSettings = Seq(
organization := "com.google.cloud.spark",
-  version := "0.16.1-SNAPSHOT",
+  version := "0.16.1",
scalaVersion := scala211Version,
crossScalaVersions := Seq(scala211Version, scala212Version)
)
