Release 0.25.0.

GoogleCloudDataproc · May 31, 2022 · 3ad1259
1 parent 92bc202
commit 3ad1259
Showing 2 changed files with 39 additions and 23 deletions.
diff --git a/CHANGES.md b/CHANGES.md
@@ -1,6 +1,6 @@
 # Release Notes
 
-## Next
+## 0.25.0 - 2022-05-31
 * Issue #593: Allow users to disable cache when loading data via SQL query,
   by setting `cacheExpirationTimeInMinutes=0`
 * PR #613: Added field level schema checks. This can be disabled by setting

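The Issue #593 note in the CHANGES.md hunk names a single option, `cacheExpirationTimeInMinutes`, and says a value of `0` disables the cache for SQL-query loads. The snippet below is a minimal sketch of how that option could be passed, assuming the usual PySpark pattern of chaining `.option(key, value)` on a reader. The helper `with_options` and the constant `NO_QUERY_CACHE` are hypothetical names introduced for illustration and are not part of the connector.

```python
from typing import Any, Mapping


def with_options(reader: Any, options: Mapping[str, Any]) -> Any:
    """Chain .option(key, value) calls on a Spark reader or writer.

    Hypothetical helper: it only assumes the object exposes an
    .option(key, value) method that returns an object of the same kind,
    as PySpark's DataFrameReader and DataFrameWriter do.
    """
    for key, value in options.items():
        reader = reader.option(key, value)
    return reader


# Option name and value taken from the Issue #593 release note.
NO_QUERY_CACHE = {"cacheExpirationTimeInMinutes": 0}

# Intended use (not executed here, needs a live SparkSession named `spark`):
#   reader = with_options(spark.read.format("bigquery"), NO_QUERY_CACHE)
```

Keeping the options in a plain dictionary makes it easy to switch the cache setting per job without touching the read code itself.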
diff --git a/README.md b/README.md
@@ -57,9 +57,9 @@ The latest version of the connector is publicly available in the following links
 
 | version | Link |
 | --- | --- |
-| Scala 2.11 | `gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.24.2.jar` ([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.24.2.jar)) |
-| Scala 2.12 | `gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.24.2.jar` ([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.24.2.jar)) |
-| Spark 2.4  | `gs://spark-lib/bigquery/spark-2.4-bigquery-0.24.2-preview.jar`([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-2.4-bigquery-0.24.2-preview.jar)) |
+| Scala 2.11 | `gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.25.0.jar` ([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.25.0.jar)) |
+| Scala 2.12 | `gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.25.0.jar` ([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.25.0.jar)) |
+| Spark 2.4  | `gs://spark-lib/bigquery/spark-2.4-bigquery-0.25.0-preview.jar`([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-2.4-bigquery-0.25.0-preview.jar)) |
 
 The only difference between first two connectors is that the former is a Scala 2.11 based connector, targeting Spark 2.3
 and 2.4 using Scala 2.11 whereas the latter is a Scala 2.12 based connector, targeting Spark 2.4 and 3.x using Scala 2.12.
@@ -78,9 +78,9 @@ repository. It can be used using the `--packages` option or the
 
 | version | Connector Artifact |
 | --- | --- |
-| Scala 2.11 | `com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.24.2` |
-| Scala 2.12 | `com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.24.2` |
-| Spark 2.4  | `com.google.cloud.spark:spark-2.4-bigquery:0.24.2-preview` |
+| Scala 2.11 | `com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.25.0` |
+| Scala 2.12 | `com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.25.0` |
+| Spark 2.4  | `com.google.cloud.spark:spark-2.4-bigquery:0.25.0-preview` |
 
 ## Hello World Example
 
@@ -90,15 +90,15 @@ You can run a simple PySpark wordcount against the API without compilation by ru
 
 ```
 gcloud dataproc jobs submit pyspark --cluster "$MY_CLUSTER" \
-  --jars gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.24.2.jar \
+  --jars gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.25.0.jar \
   examples/python/shakespeare.py
 ```
 
 **Dataproc image 1.4 and below**
 
 ```
 gcloud dataproc jobs submit pyspark --cluster "$MY_CLUSTER" \
-  --jars gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.24.2.jar \
+  --jars gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.25.0.jar \
   examples/python/shakespeare.py
 ```
 
@@ -672,6 +672,22 @@ The API Supports a number of options to configure the read
      </td>
      <td>Read</td>
    </tr>
+   <tr valign="top">
+     <td><code>enableModeCheckForSchemaFields</code>
+     </td>
+     <td>  Checks the mode of every field in destination schema to be equal to the mode in corresponding source field schema, while writing from one big query table to another.
+          <br/> Default value is true i.e., the check is done by default. If set to false the mode check is ignored.
+     </td>
+     <td>Write</td>
+  </tr>
+     <td><code>enableListInference</code>
+     </td>
+     <td>  Indicates whether to use schema inference specifically when the mode is Parquet (https://cloud.google.com/bigquery/docs/reference/rest/v2/tables#parquetoptions).
+        <br/> Defaults to false.
+        <br/>
+     </td>
+     <td>Write</td>
+   </tr>
 </table>
 
 Options can also be set outside of the code, using the `--conf` parameter of `spark-submit` or `--properties` parameter
@@ -854,9 +870,9 @@ creating the job or added during runtime. See examples below:
 1) Adding python files while launching pyspark
 ```
 # use appropriate version for jar depending on the scala version
-pyspark --jars gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.24.2.jar
-  --py-files gs://spark-lib/bigquery/spark-bigquery-support-0.24.2.zip
-  --files gs://spark-lib/bigquery/spark-bigquery-support-0.24.2.zip
+pyspark --jars gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.25.0.jar
+  --py-files gs://spark-lib/bigquery/spark-bigquery-support-0.25.0.zip
+  --files gs://spark-lib/bigquery/spark-bigquery-support-0.25.0.zip
 ```
 
 2) Adding python files in Jupyter Notebook
@@ -866,14 +882,14 @@ from pyspark import SparkFiles
 # use appropriate version for jar depending on the scala version
 spark = SparkSession.builder\
   .appName('BigNumeric')\
-  .config('spark.jars', 'gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.24.2.jar')\
-  .config('spark.submit.pyFiles', 'gs://spark-lib/bigquery/spark-bigquery-support-0.24.2.zip')\
-  .config('spark.files', 'gs://spark-lib/bigquery/spark-bigquery-support-0.24.2.zip')\
+  .config('spark.jars', 'gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.25.0.jar')\
+  .config('spark.submit.pyFiles', 'gs://spark-lib/bigquery/spark-bigquery-support-0.25.0.zip')\
+  .config('spark.files', 'gs://spark-lib/bigquery/spark-bigquery-support-0.25.0.zip')\
   .getOrCreate()
 
 # extract the spark-bigquery-support zip file
 import zipfile
-with zipfile.ZipFile(SparkFiles.get("spark-bigquery-support-0.24.2.zip")) as zf:
+with zipfile.ZipFile(SparkFiles.get("spark-bigquery-support-0.25.0.zip")) as zf:
   zf.extractall()
 ```
 
@@ -882,10 +898,10 @@ with zipfile.ZipFile(SparkFiles.get("spark-bigquery-support-0.24.2.zip")) as zf:
 # use appropriate version for jar depending on the scala version
 spark = SparkSession.builder\
   .appName('BigNumeric')\
-  .config('spark.jars', 'gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.24.2.jar')\
+  .config('spark.jars', 'gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.25.0.jar')\
   .getOrCreate()
 
-spark.sparkContext.addPyFile("gs://spark-lib/bigquery/spark-bigquery-support-0.24.2.zip")
+spark.sparkContext.addPyFile("gs://spark-lib/bigquery/spark-bigquery-support-0.25.0.zip")
 ```
 
 Usage Example:
@@ -993,7 +1009,7 @@ using the following code:
 ```python
 from pyspark.sql import SparkSession
 spark = SparkSession.builder
-  .config("spark.jars.packages", "com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.24.2")
+  .config("spark.jars.packages", "com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.25.0")
   .getOrCreate()
 df = spark.read.format("bigquery")
   .load("dataset.table")
@@ -1002,15 +1018,15 @@ df = spark.read.format("bigquery")
 **Scala:**
 ```python
 val spark = SparkSession.builder
-.config("spark.jars.packages", "com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.24.2")
+.config("spark.jars.packages", "com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.25.0")
 .getOrCreate()
 val df = spark.read.format("bigquery")
 .load("dataset.table")
 ```
 
 In case Spark cluster is using Scala 2.12 (it's optional for Spark 2.4.x,
 mandatory in 3.0.x), then the relevant package is
-com.google.cloud.spark:spark-bigquery-with-dependencies_**2.12**:0.24.2. In
+com.google.cloud.spark:spark-bigquery-with-dependencies_**2.12**:0.25.0. In
 order to know which Scala version is used, please run the following code:
 
 **Python:**
@@ -1034,14 +1050,14 @@ To include the connector in your project:
 <dependency>
   <groupId>com.google.cloud.spark</groupId>
   <artifactId>spark-bigquery-with-dependencies_${scala.version}</artifactId>
-  <version>0.24.2</version>
+  <version>0.25.0</version>
 </dependency>
 ```
 
 ### SBT
 
 ```sbt
-libraryDependencies += "com.google.cloud.spark" %% "spark-bigquery-with-dependencies" % "0.24.2"
+libraryDependencies += "com.google.cloud.spark" %% "spark-bigquery-with-dependencies" % "0.25.0"
 ```
 
 ## FAQ
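
Every hunk in this commit bumps the same version string inside two naming patterns: the GCS jar path and the Maven coordinate, each carrying the Scala binary version as a suffix (SBT's `%%` operator appends that suffix automatically, which is why the SBT line omits it). As a rough sketch of those two patterns, the helpers below rebuild the names from their parts. The function names `gcs_jar` and `maven_coordinate` are hypothetical and exist only for this illustration; the string templates are copied from the README tables in the diff.

```python
CONNECTOR_VERSION = "0.25.0"  # version introduced by this release commit


def gcs_jar(scala_binary: str, version: str = CONNECTOR_VERSION) -> str:
    """Return the GCS path of the with-dependencies jar for a Scala version."""
    return (
        "gs://spark-lib/bigquery/"
        f"spark-bigquery-with-dependencies_{scala_binary}-{version}.jar"
    )


def maven_coordinate(scala_binary: str, version: str = CONNECTOR_VERSION) -> str:
    """Return the Maven coordinate usable with --packages or spark.jars.packages."""
    return (
        "com.google.cloud.spark:"
        f"spark-bigquery-with-dependencies_{scala_binary}:{version}"
    )


if __name__ == "__main__":
    for scala in ("2.11", "2.12"):
        print(gcs_jar(scala))
        print(maven_coordinate(scala))
```

Deriving both names from one version constant avoids the kind of scattered manual edits this diff had to make across the README.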