Commit b5588ab: Changes for 3.2.1 release

Summary:
Changes for 3.2.1 release

**Design doc/spec**:
**Docs impact**: none

Test Plan: None

Reviewers: pmishchenko-ua, carl

Reviewed By: pmishchenko-ua

Subscribers: engineering-list

Differential Revision: https://grizzly.internal.memcompute.com/D53538
AdalbertMemSQL committed Dec 14, 2021
1 parent 2cb80b7 commit b5588ab
Showing 6 changed files with 14 additions and 10 deletions.
6 changes: 5 additions & 1 deletion CHANGELOG
@@ -1,4 +1,8 @@
-2021-12-29 Version 3.2.0
+2021-12-14 Version 3.2.1
+* Added support for Spark 3.2
+* Fixed links in the README
+
+2021-11-29 Version 3.2.0
 * Added support for reading in parallel from aggregator nodes instead of leaf nodes
 
 2021-09-16 Version 3.1.3
10 changes: 5 additions & 5 deletions README.md
@@ -1,5 +1,5 @@
 # SingleStore Spark Connector
-## Version: 3.2.0 [![License](http://img.shields.io/:license-Apache%202-brightgreen.svg)](http://www.apache.org/licenses/LICENSE-2.0.txt)
+## Version: 3.2.1 [![License](http://img.shields.io/:license-Apache%202-brightgreen.svg)](http://www.apache.org/licenses/LICENSE-2.0.txt)
 
 ## Getting Started
 
@@ -13,11 +13,11 @@ spark-packages.org. The group is `com.singlestore` and the artifact is
 
 You can add the connector to your Spark application using: spark-shell, pyspark, or spark-submit
 ```
-$SPARK_HOME/bin/spark-shell --packages com.singlestore:singlestore-spark-connector_2.12:3.2.0-spark-3.1.0
+$SPARK_HOME/bin/spark-shell --packages com.singlestore:singlestore-spark-connector_2.12:3.2.1-spark-3.1.0
 ```
 
 We release two versions of the `singlestore-spark-connector`, one per Spark version.
-An example version number is: `3.2.0-spark-3.1.0` which is the 3.2.0
+An example version number is: `3.2.1-spark-3.1.0` which is the 3.2.1
 version of the connector, compiled and tested against Spark 3.1.0. Make sure
 you are using the most recent version of the connector.
@@ -569,13 +569,13 @@ Happy querying!
 
 ## Major changes from the 2.0.0 connector
 
-The SingleStore Spark Connector 3.2.0 has a number of key features and enhancements:
+The SingleStore Spark Connector 3.2.1 has a number of key features and enhancements:
 
 * Introduces SQL Optimization & Rewrite for most query shapes and compatible expressions
 * Implemented as a native Spark SQL plugin
 * Supports both the DataSource and DataSourceV2 API for maximum support of current and future functionality
 * Contains deep integrations with the Catalyst query optimizer
-* Is compatible with Spark 3.0 and 3.1
+* Is compatible with Spark 3.0, 3.1 and 3.2
 * Leverages SingleStore LOAD DATA to accelerate ingest from Spark via compression, vectorized cpu instructions, and optimized segment sizes
 * Takes advantage of all the latest and greatest features in SingleStore 7.x
 
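
The feature list above is descriptive; as a concrete illustration of the SQL pushdown it names, here is a hedged sketch of a read whose filter and aggregate are eligible for rewrite into a single SingleStore query. The endpoint, credentials, and table are hypothetical placeholders; `format("singlestore")` and the `spark.datasource.singlestore.*` keys follow the notebook configuration shown later in this diff:

```scala
import org.apache.spark.sql.SparkSession

// Illustrative sketch, not part of this commit.
val spark = SparkSession.builder()
  .appName("pushdown-sketch")
  // Placeholder endpoint and credentials; see the notebook config below.
  .config("spark.datasource.singlestore.ddlEndpoint", "singlestore-host:3306")
  .config("spark.datasource.singlestore.user", "root")
  .config("spark.datasource.singlestore.password", "my_password")
  .getOrCreate()

// With pushdown enabled (disablePushdown defaults to false), the filter and
// aggregate below can be compiled into SQL and executed inside SingleStore.
val clicksPerUser = spark.read
  .format("singlestore")
  .load("demoDB.events") // hypothetical database.table
  .filter("event_type = 'click'")
  .groupBy("user_id")
  .count()

clicksPerUser.show()
```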
2 changes: 1 addition & 1 deletion build.sbt
@@ -20,7 +20,7 @@ lazy val root = project
       case "3.1.0" => "scala-sparkv3.1"
       case "3.2.0" => "scala-sparkv3.2"
     }),
-    version := s"3.2.0-spark-${sparkVersion}",
+    version := s"3.2.1-spark-${sparkVersion}",
     licenses += "Apache-2.0" -> url(
       "http://opensource.org/licenses/Apache-2.0"
     ),
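
The hunk above shows only the tail of the build definition. For context, a hedged reconstruction of how the per-Spark-version source selection plausibly fits together; the `sparkVersion` definition and the `unmanagedSourceDirectories` wiring are assumptions, not shown in this diff:

```scala
// Illustrative sketch reconstructing the shape of the setting above; the
// sparkVersion definition and directory wiring are assumptions.
lazy val sparkVersion = sys.props.getOrElse("spark.version", "3.2.0")

lazy val root = project
  .in(file("."))
  .settings(
    // Compile version-specific sources from a per-Spark-line directory, so one
    // codebase yields one artifact per supported Spark release (3.0/3.1/3.2).
    Compile / unmanagedSourceDirectories += baseDirectory.value / (sparkVersion match {
      case "3.0.0" => "scala-sparkv3.0"
      case "3.1.0" => "scala-sparkv3.1"
      case "3.2.0" => "scala-sparkv3.2"
    }),
    version := s"3.2.1-spark-${sparkVersion}"
  )
```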
2 changes: 1 addition & 1 deletion demo/notebook/pyspark-singlestore-demo_2F8XQUKFG.zpln
@@ -45,7 +45,7 @@
 },
 {
 "title": "Configure Spark",
-"text": "%spark.conf\n\n// Comma-separated list of Maven coordinates of jars to include on the driver and executor classpaths\nspark.jars.packages com.singlestore:singlestore-spark-connector_2.12:3.2.0-spark-3.0.0\n\n// Hostname or IP address of the SingleStore Master Aggregator in the format host[:port] (port is optional). \n// singlestore-ciab-for-zeppelin - hostname of the docker created by https://hub.docker.com/r/singlestore/cluster-in-a-box\n// 3306 - port on which SingleStore Master Aggregator is started\nspark.datasource.singlestore.ddlEndpoint singlestore-ciab-for-zeppelin:3306\n\n// Hostname or IP address of SingleStore Aggregator nodes to run queries against in the format host[:port],host[:port],...\n// (port is optional, multiple hosts separated by comma) (default: ddlEndpoint)\n// Example\n// spark.datasource.singlestore.dmlEndpoints child-agg:3308,child-agg2\nspark.datasource.singlestore.dmlEndpoints singlestore-ciab-for-zeppelin:3306\n\n// SingleStore username (default: root)\nspark.datasource.singlestore.user root\n\n// SingleStore password (default: no password)\nspark.datasource.singlestore.password my_password\n\n// If set, all connections will default to using this database (default: empty)\n// Example\n// spark.datasource.singlestore.database demoDB\nspark.datasource.singlestore.database\n\n// Disable SQL Pushdown when running queries (default: false)\nspark.datasource.singlestore.disablePushdown false\n\n// Enable reading data in parallel for some query shapes (default: false)\nspark.datasource.singlestore.enableParallelRead false\n\n// Truncate instead of drop an existing table during Overwrite (default: false)\nspark.datasource.singlestore.truncate false\n\n// Compress data on load; one of (GZip, LZ4, Skip) (default: GZip)\nspark.datasource.singlestore.loadDataCompression GZip\n\n// Specify additional keys to add to tables created by the connector\n// Examples\n// * A primary key on the id column\n// spark.datasource.singlestore.tableKey.primary id\n// * A regular key on the columns created, firstname with the key name created_firstname\n// spark.datasource.singlestore.tableKey.key.created_firstname created, firstName\n// * A unique key on the username column\n// spark.datasource.singlestore.tableKey.unique username\nspark.datasource.singlestore.tableKey",
+"text": "%spark.conf\n\n// Comma-separated list of Maven coordinates of jars to include on the driver and executor classpaths\nspark.jars.packages com.singlestore:singlestore-spark-connector_2.12:3.2.1-spark-3.0.0\n\n// Hostname or IP address of the SingleStore Master Aggregator in the format host[:port] (port is optional). \n// singlestore-ciab-for-zeppelin - hostname of the docker created by https://hub.docker.com/r/singlestore/cluster-in-a-box\n// 3306 - port on which SingleStore Master Aggregator is started\nspark.datasource.singlestore.ddlEndpoint singlestore-ciab-for-zeppelin:3306\n\n// Hostname or IP address of SingleStore Aggregator nodes to run queries against in the format host[:port],host[:port],...\n// (port is optional, multiple hosts separated by comma) (default: ddlEndpoint)\n// Example\n// spark.datasource.singlestore.dmlEndpoints child-agg:3308,child-agg2\nspark.datasource.singlestore.dmlEndpoints singlestore-ciab-for-zeppelin:3306\n\n// SingleStore username (default: root)\nspark.datasource.singlestore.user root\n\n// SingleStore password (default: no password)\nspark.datasource.singlestore.password my_password\n\n// If set, all connections will default to using this database (default: empty)\n// Example\n// spark.datasource.singlestore.database demoDB\nspark.datasource.singlestore.database\n\n// Disable SQL Pushdown when running queries (default: false)\nspark.datasource.singlestore.disablePushdown false\n\n// Enable reading data in parallel for some query shapes (default: false)\nspark.datasource.singlestore.enableParallelRead false\n\n// Truncate instead of drop an existing table during Overwrite (default: false)\nspark.datasource.singlestore.truncate false\n\n// Compress data on load; one of (GZip, LZ4, Skip) (default: GZip)\nspark.datasource.singlestore.loadDataCompression GZip\n\n// Specify additional keys to add to tables created by the connector\n// Examples\n// * A primary key on the id column\n// spark.datasource.singlestore.tableKey.primary id\n// * A regular key on the columns created, firstname with the key name created_firstname\n// spark.datasource.singlestore.tableKey.key.created_firstname created, firstName\n// * A unique key on the username column\n// spark.datasource.singlestore.tableKey.unique username\nspark.datasource.singlestore.tableKey",
 "user": "anonymous",
 "dateUpdated": "2021-09-23 10:50:07.416",
 "progress": 0,
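
The `%spark.conf` paragraph above sets connector options globally for the Zeppelin session. The same keys can also be passed per operation through the DataFrame reader and writer, dropping the `spark.datasource.singlestore.` prefix. A hedged sketch with hypothetical table names (this usage follows the connector's documented option handling and is not part of this commit):

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

// Illustrative sketch; the option keys mirror the spark.datasource.singlestore.*
// settings in the notebook above, supplied per operation instead of globally.
val spark = SparkSession.builder().appName("options-sketch").getOrCreate()

val people = spark.read
  .format("singlestore")
  .option("ddlEndpoint", "singlestore-ciab-for-zeppelin:3306")
  .option("user", "root")
  .option("password", "my_password")
  .option("enableParallelRead", "false")
  .load("demoDB.people") // hypothetical database.table

people.write
  .format("singlestore")
  .option("truncate", "false")           // drop-and-recreate on Overwrite
  .option("loadDataCompression", "GZip") // one of GZip, LZ4, Skip
  .mode(SaveMode.Overwrite)
  .save("demoDB.people_copy") // hypothetical target table
```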
2 changes: 1 addition & 1 deletion demo/notebook/scala-singlestore-demo_2F6Y3APTX.zpln
@@ -45,7 +45,7 @@
 },
 {
 "title": "Configure Spark",
-"text": "%spark.conf\n\n// Comma-separated list of Maven coordinates of jars to include on the driver and executor classpaths\nspark.jars.packages com.singlestore:singlestore-spark-connector_2.12:3.2.0-spark-3.0.0\n\n// Hostname or IP address of the SingleStore Master Aggregator in the format host[:port] (port is optional). \n// singlestore-ciab-for-zeppelin - hostname of the docker created by https://hub.docker.com/r/singlestore/cluster-in-a-box\n// 3306 - port on which SingleStore Master Aggregator is started\nspark.datasource.singlestore.ddlEndpoint singlestore-ciab-for-zeppelin:3306\n\n// Hostname or IP address of SingleStore Aggregator nodes to run queries against in the format host[:port],host[:port],...\n// (port is optional, multiple hosts separated by comma) (default: ddlEndpoint)\n// Example\n// spark.datasource.singlestore.dmlEndpoints child-agg:3308,child-agg2\nspark.datasource.singlestore.dmlEndpoints singlestore-ciab-for-zeppelin:3306\n\n// SingleStore username (default: root)\nspark.datasource.singlestore.user root\n\n// SingleStore password (default: no password)\nspark.datasource.singlestore.password my_password\n\n// If set, all connections will default to using this database (default: empty)\n// Example\n// spark.datasource.singlestore.database demoDB\nspark.datasource.singlestore.database\n\n// Disable SQL Pushdown when running queries (default: false)\nspark.datasource.singlestore.disablePushdown false\n\n// Enable reading data in parallel for some query shapes (default: false)\nspark.datasource.singlestore.enableParallelRead false\n\n// Truncate instead of drop an existing table during Overwrite (default: false)\nspark.datasource.singlestore.truncate false\n\n// Compress data on load; one of (GZip, LZ4, Skip) (default: GZip)\nspark.datasource.singlestore.loadDataCompression GZip\n\n// Specify additional keys to add to tables created by the connector\n// Examples\n// * A primary key on the id column\n// spark.datasource.singlestore.tableKey.primary id\n// * A regular key on the columns created, firstname with the key name created_firstname\n// spark.datasource.singlestore.tableKey.key.created_firstname created, firstName\n// * A unique key on the username column\n// spark.datasource.singlestore.tableKey.unique username\nspark.datasource.singlestore.tableKey",
+"text": "%spark.conf\n\n// Comma-separated list of Maven coordinates of jars to include on the driver and executor classpaths\nspark.jars.packages com.singlestore:singlestore-spark-connector_2.12:3.2.1-spark-3.0.0\n\n// Hostname or IP address of the SingleStore Master Aggregator in the format host[:port] (port is optional). \n// singlestore-ciab-for-zeppelin - hostname of the docker created by https://hub.docker.com/r/singlestore/cluster-in-a-box\n// 3306 - port on which SingleStore Master Aggregator is started\nspark.datasource.singlestore.ddlEndpoint singlestore-ciab-for-zeppelin:3306\n\n// Hostname or IP address of SingleStore Aggregator nodes to run queries against in the format host[:port],host[:port],...\n// (port is optional, multiple hosts separated by comma) (default: ddlEndpoint)\n// Example\n// spark.datasource.singlestore.dmlEndpoints child-agg:3308,child-agg2\nspark.datasource.singlestore.dmlEndpoints singlestore-ciab-for-zeppelin:3306\n\n// SingleStore username (default: root)\nspark.datasource.singlestore.user root\n\n// SingleStore password (default: no password)\nspark.datasource.singlestore.password my_password\n\n// If set, all connections will default to using this database (default: empty)\n// Example\n// spark.datasource.singlestore.database demoDB\nspark.datasource.singlestore.database\n\n// Disable SQL Pushdown when running queries (default: false)\nspark.datasource.singlestore.disablePushdown false\n\n// Enable reading data in parallel for some query shapes (default: false)\nspark.datasource.singlestore.enableParallelRead false\n\n// Truncate instead of drop an existing table during Overwrite (default: false)\nspark.datasource.singlestore.truncate false\n\n// Compress data on load; one of (GZip, LZ4, Skip) (default: GZip)\nspark.datasource.singlestore.loadDataCompression GZip\n\n// Specify additional keys to add to tables created by the connector\n// Examples\n// * A primary key on the id column\n// spark.datasource.singlestore.tableKey.primary id\n// * A regular key on the columns created, firstname with the key name created_firstname\n// spark.datasource.singlestore.tableKey.key.created_firstname created, firstName\n// * A unique key on the username column\n// spark.datasource.singlestore.tableKey.unique username\nspark.datasource.singlestore.tableKey",
 "user": "anonymous",
 "dateUpdated": "2021-09-23 10:55:20.533",
 "progress": 0,
2 changes: 1 addition & 1 deletion demo/notebook/spark-sql-singlestore-demo_2F7PZ81H6.zpln
@@ -45,7 +45,7 @@
 },
 {
 "title": "Configure Spark",
-"text": "%spark.conf\n\n// Comma-separated list of Maven coordinates of jars to include on the driver and executor classpaths\nspark.jars.packages com.singlestore:singlestore-spark-connector_2.12:3.2.0-spark-3.0.0\n\n// Hostname or IP address of the SingleStore Master Aggregator in the format host[:port] (port is optional). \n// singlestore-ciab-for-zeppelin - hostname of the docker created by https://hub.docker.com/r/singlestore/cluster-in-a-box\n// 3306 - port on which SingleStore Master Aggregator is started\nspark.datasource.singlestore.ddlEndpoint singlestore-ciab-for-zeppelin:3306\n\n// Hostname or IP address of SingleStore Aggregator nodes to run queries against in the format host[:port],host[:port],...\n// (port is optional, multiple hosts separated by comma) (default: ddlEndpoint)\n// Example\n// spark.datasource.singlestore.dmlEndpoints child-agg:3308,child-agg2\nspark.datasource.singlestore.dmlEndpoints singlestore-ciab-for-zeppelin:3306\n\n// SingleStore username (default: root)\nspark.datasource.singlestore.user root\n\n// SingleStore password (default: no password)\nspark.datasource.singlestore.password my_password\n\n// If set, all connections will default to using this database (default: empty)\n// Example\n// spark.datasource.singlestore.database demoDB\nspark.datasource.singlestore.database\n\n// Disable SQL Pushdown when running queries (default: false)\nspark.datasource.singlestore.disablePushdown false\n\n// Enable reading data in parallel for some query shapes (default: false)\nspark.datasource.singlestore.enableParallelRead false\n\n// Truncate instead of drop an existing table during Overwrite (default: false)\nspark.datasource.singlestore.truncate false\n\n// Compress data on load; one of (GZip, LZ4, Skip) (default: GZip)\nspark.datasource.singlestore.loadDataCompression GZip\n\n// Specify additional keys to add to tables created by the connector\n// Examples\n// * A primary key on the id column\n// spark.datasource.singlestore.tableKey.primary id\n// * A regular key on the columns created, firstname with the key name created_firstname\n// spark.datasource.singlestore.tableKey.key.created_firstname created, firstName\n// * A unique key on the username column\n// spark.datasource.singlestore.tableKey.unique username\nspark.datasource.singlestore.tableKey",
+"text": "%spark.conf\n\n// Comma-separated list of Maven coordinates of jars to include on the driver and executor classpaths\nspark.jars.packages com.singlestore:singlestore-spark-connector_2.12:3.2.1-spark-3.0.0\n\n// Hostname or IP address of the SingleStore Master Aggregator in the format host[:port] (port is optional). \n// singlestore-ciab-for-zeppelin - hostname of the docker created by https://hub.docker.com/r/singlestore/cluster-in-a-box\n// 3306 - port on which SingleStore Master Aggregator is started\nspark.datasource.singlestore.ddlEndpoint singlestore-ciab-for-zeppelin:3306\n\n// Hostname or IP address of SingleStore Aggregator nodes to run queries against in the format host[:port],host[:port],...\n// (port is optional, multiple hosts separated by comma) (default: ddlEndpoint)\n// Example\n// spark.datasource.singlestore.dmlEndpoints child-agg:3308,child-agg2\nspark.datasource.singlestore.dmlEndpoints singlestore-ciab-for-zeppelin:3306\n\n// SingleStore username (default: root)\nspark.datasource.singlestore.user root\n\n// SingleStore password (default: no password)\nspark.datasource.singlestore.password my_password\n\n// If set, all connections will default to using this database (default: empty)\n// Example\n// spark.datasource.singlestore.database demoDB\nspark.datasource.singlestore.database\n\n// Disable SQL Pushdown when running queries (default: false)\nspark.datasource.singlestore.disablePushdown false\n\n// Enable reading data in parallel for some query shapes (default: false)\nspark.datasource.singlestore.enableParallelRead false\n\n// Truncate instead of drop an existing table during Overwrite (default: false)\nspark.datasource.singlestore.truncate false\n\n// Compress data on load; one of (GZip, LZ4, Skip) (default: GZip)\nspark.datasource.singlestore.loadDataCompression GZip\n\n// Specify additional keys to add to tables created by the connector\n// Examples\n// * A primary key on the id column\n// spark.datasource.singlestore.tableKey.primary id\n// * A regular key on the columns created, firstname with the key name created_firstname\n// spark.datasource.singlestore.tableKey.key.created_firstname created, firstName\n// * A unique key on the username column\n// spark.datasource.singlestore.tableKey.unique username\nspark.datasource.singlestore.tableKey",
 "user": "anonymous",
 "dateUpdated": "2021-09-23 11:00:50.193",
 "progress": 0,
