fix version validation (#88)
* fix the version validation info

* update version match

* fix assert info

* update Nebula to NebulaGraph
Nicole00 committed Mar 7, 2023
1 parent f904dee commit 06e8900
Showing 4 changed files with 97 additions and 52 deletions.
92 changes: 57 additions & 35 deletions README.md
@@ -1,15 +1,15 @@
# Nebula Spark Connector
# NebulaGraph Spark Connector
[中文版](https://github.com/vesoft-inc/nebula-spark-connector/blob/master/README_CN.md)

## Introduction

Nebula Spark Connector 2.0/3.0 only supports Nebula Graph 2.x/3.x. If you are using Nebula Graph v1.x, please use [Nebula Spark Connector v1.0](https://github.com/vesoft-inc/nebula-java/tree/v1.0/tools/nebula-spark) .
NebulaGraph Spark Connector 2.0/3.0 only supports NebulaGraph 2.x/3.x. If you are using NebulaGraph v1.x, please use [NebulaGraph Spark Connector v1.0](https://github.com/vesoft-inc/nebula-java/tree/v1.0/tools/nebula-spark).

Nebula Spark Connector support spark 2.2 and 2.4.
NebulaGraph Spark Connector supports Spark 2.2 and 2.4.

## How to Compile

1. Package Nebula Spark Connector.
1. Package NebulaGraph Spark Connector.

```bash
$ git clone https://github.com/vesoft-inc/nebula-spark-connector.git
```

@@ -24,27 +24,43 @@

After packaging, you can find the newly generated nebula-spark-connector-3.0-SNAPSHOT.jar under the nebula-spark-connector/nebula-spark-connector/target/ directory.

## New Features (Compared to Nebula Spark Connector 1.0)
## New Features (Compared to NebulaGraph Spark Connector 1.0)
* Supports more connection configurations, such as timeout, connectionRetry, and executionRetry.
* Supports more data configurations, such as whether vertexId can be written as vertex's property, whether srcId, dstId and rank can be written as edge's properties.
* The Spark Reader supports non-property, all-property, and specific-property reads.
* Spark Reader Supports reading data from Nebula Graph to Graphx as VertexRD and EdgeRDD, it also supports String type vertexId.
* Nebula Spark Connector 2.0 uniformly uses SparkSQL's DataSourceV2 for data source expansion.
* Nebula Spark Connector 2.1.0 support UPDATE write mode to NebulaGraph, see [Update Vertex](https://docs.nebula-graph.io/2.0.1/3.ngql-guide/12.vertex-statements/2.update-vertex/) .
* Nebula Spark Connector 2.5.0 support DELETE write mode to NebulaGraph, see [Delete Vertex](https://docs.nebula-graph.io/master/3.ngql-guide/12.vertex-statements/4.delete-vertex/)
* The Spark Reader supports reading data from NebulaGraph into GraphX as VertexRDD and EdgeRDD, including String-typed vertex IDs.
* NebulaGraph Spark Connector 2.0 uniformly uses Spark SQL's DataSourceV2 for data source extension.
* NebulaGraph Spark Connector 2.1.0 supports the UPDATE write mode to NebulaGraph, see [Update Vertex](https://docs.nebula-graph.io/2.0.1/3.ngql-guide/12.vertex-statements/2.update-vertex/).
* NebulaGraph Spark Connector 2.5.0 supports the DELETE write mode to NebulaGraph, see [Delete Vertex](https://docs.nebula-graph.io/master/3.ngql-guide/12.vertex-statements/4.delete-vertex/).

## How to Use

If you use Maven to manage your project, add the following dependency to your pom.xml:
If you use Maven to manage your project, add one of the following dependencies to your pom.xml:

```
<!-- connector for spark 2.4 -->
<dependency>
<groupId>com.vesoft</groupId>
<artifactId>nebula-spark-connector</artifactId>
<version>3.0-SNAPSHOT</version>
</dependency>
<!-- connector for spark 2.2 -->
<dependency>
<groupId>com.vesoft</groupId>
<artifactId>nebula-spark-connector_2.2</artifactId>
<version>3.0-SNAPSHOT</version>
</dependency>
<!-- connector for spark 3.0 -->
<dependency>
<groupId>com.vesoft</groupId>
<artifactId>nebula-spark-connector_3.0</artifactId>
<version>3.0-SNAPSHOT</version>
</dependency>
```
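
For sbt-based projects, the same coordinates can be declared as below. This is a sketch mirroring the Maven entries above; the snapshot version additionally assumes the Sonatype snapshot repository is configured in `resolvers`.

```
// Pick the artifact that matches your Spark version (same coordinates as the Maven block above).
libraryDependencies += "com.vesoft" % "nebula-spark-connector" % "3.0-SNAPSHOT"        // Spark 2.4
// libraryDependencies += "com.vesoft" % "nebula-spark-connector_2.2" % "3.0-SNAPSHOT" // Spark 2.2
// libraryDependencies += "com.vesoft" % "nebula-spark-connector_3.0" % "3.0-SNAPSHOT" // Spark 3.x
```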

Write DataFrame `INSERT` into Nebula Graph as Vertices:
Write DataFrame `INSERT` into NebulaGraph as Vertices:
```
val config = NebulaConnectionConfig
.builder()
  // ... (collapsed in diff: @@ -61,7 +77,7 @@)
.build()
df.write.nebula(config, nebulaWriteVertexConfig).writeVertices()
```
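
The block above is collapsed in this diff, so the connection and write settings are not visible. Below is a minimal self-contained sketch of the same INSERT flow; the builder method names (`withMetaAddress`, `withGraphAddress`, `withSpace`, `withTag`, `withVidField`, `withBatch`) and the implicit-writer import follow the connector's example code and should be verified against the version you use, and all addresses, space, and tag names are placeholders.

```
import org.apache.spark.sql.SparkSession
import com.vesoft.nebula.connector.{NebulaConnectionConfig, WriteNebulaVertexConfig}
import com.vesoft.nebula.connector.connector.NebulaDataFrameWriter // enables df.write.nebula(...)

object WriteVertexSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("write-vertex-sketch").getOrCreate()
    import spark.implicits._

    // Placeholder DataFrame: one column is the vertex id, the rest become tag properties.
    val df = Seq(("1", "Tom", 23), ("2", "Jina", 21)).toDF("id", "name", "age")

    val config = NebulaConnectionConfig
      .builder()
      .withMetaAddress("127.0.0.1:9559")  // metad address, host:port (placeholder)
      .withGraphAddress("127.0.0.1:9669") // graphd address, host:port (placeholder)
      .build()

    val writeVertexConfig = WriteNebulaVertexConfig
      .builder()
      .withSpace("test")   // target space (placeholder)
      .withTag("person")   // tag to write into (placeholder)
      .withVidField("id")  // DataFrame column holding the vertex id
      .withBatch(1000)     // rows per write batch
      .build()

    df.write.nebula(config, writeVertexConfig).writeVertices()
    spark.stop()
  }
}
```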
Write DataFrame `UPDATE` into Nebula Graph as Vertices:
Write DataFrame `UPDATE` into NebulaGraph as Vertices:
```
val config = NebulaConnectionConfig
.builder()
  // ... (collapsed in diff: @@ -79,7 +95,7 @@)
.build()
df.write.nebula(config, nebulaWriteVertexConfig).writeVertices()
```
Write DataFrame `DELETE` into Nebula Graph as Vertices:
Write DataFrame `DELETE` into NebulaGraph as Vertices:
```
val config = NebulaConnectionConfig
.builder()
  // ... (collapsed in diff: @@ -96,7 +112,7 @@)
.build()
df.write.nebula(config, nebulaWriteVertexConfig).writeVertices()
```
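
The UPDATE and DELETE examples above differ from the INSERT flow only in the write mode set on the vertex config. A minimal sketch, assuming a `WriteMode` enumeration and a `withWriteMode` builder method as used in the connector's examples, and reusing `config` and `df` from the INSERT sketch:

```
import com.vesoft.nebula.connector.{WriteMode, WriteNebulaVertexConfig}

// Same connection config and DataFrame as in the INSERT sketch; only the write mode changes.
val updateVertexConfig = WriteNebulaVertexConfig
  .builder()
  .withSpace("test")
  .withTag("person")
  .withVidField("id")
  .withWriteMode(WriteMode.UPDATE) // or WriteMode.DELETE to delete the matching vertices
  .withBatch(1000)
  .build()

df.write.nebula(config, updateVertexConfig).writeVertices()
```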
Read vertices from Nebula Graph:
Read vertices from NebulaGraph:
```
val config = NebulaConnectionConfig
.builder()
  // ... (collapsed in diff: @@ -115,7 +131,7 @@)
val vertex = spark.read.nebula(config, nebulaReadVertexConfig).loadVerticesToDF()
```
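
The read example is also collapsed in this diff. A fuller sketch follows; the config fields match those validated in `ReadNebulaConfig` further down in this commit (`space`, `label`, `limit`, `partitionNum`, `noColumn`, `returnCols`), while the exact builder method spellings and the implicit-reader import are assumptions to verify.

```
import org.apache.spark.sql.SparkSession
import com.vesoft.nebula.connector.{NebulaConnectionConfig, ReadNebulaConfig}
import com.vesoft.nebula.connector.connector.NebulaDataFrameReader // enables spark.read.nebula(...)

val spark = SparkSession.builder().appName("read-vertex-sketch").getOrCreate()

val config = NebulaConnectionConfig
  .builder()
  .withMetaAddress("127.0.0.1:9559") // metad address, host:port (placeholder)
  .build()

val nebulaReadVertexConfig = ReadNebulaConfig
  .builder()
  .withSpace("test")                   // space to read from (placeholder)
  .withLabel("person")                 // tag name (placeholder)
  .withNoColumn(false)                 // false: return properties as columns
  .withReturnCols(List("name", "age")) // properties to return (placeholder column names)
  .withLimit(1000)                     // rows fetched per scan request
  .withPartitionNum(10)                // Spark partitions for the read
  .build()

val vertexDF = spark.read.nebula(config, nebulaReadVertexConfig).loadVerticesToDF()
vertexDF.show()
```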

Read vertices and edges from Nebula Graph to construct Graphx's graph:
Read vertices and edges from NebulaGraph to construct a GraphX graph:
```
val config = NebulaConnectionConfig
.builder()
  // ... (collapsed in diff: @@ -148,7 +164,7 @@)
```
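
The GraphX example above is largely collapsed, so here is a minimal sketch of the idea: load the vertices and edges as RDDs and stitch them into a GraphX `Graph`. The `loadVerticesToGraphx`/`loadEdgesToGraphx` method names follow the connector's examples and are assumptions to verify; `spark`, `config`, and `nebulaReadVertexConfig` are reused from the read sketch above.

```
import org.apache.spark.graphx.Graph

// The edge read config mirrors the vertex one; "friend" and "degree" are placeholders.
val nebulaReadEdgeConfig = ReadNebulaConfig
  .builder()
  .withSpace("test")
  .withLabel("friend")
  .withNoColumn(false)
  .withReturnCols(List("degree"))
  .withLimit(1000)
  .withPartitionNum(10)
  .build()

val vertexRDD = spark.read.nebula(config, nebulaReadVertexConfig).loadVerticesToGraphx()
val edgeRDD   = spark.read.nebula(config, nebulaReadEdgeConfig).loadEdgesToGraphx()
val graph     = Graph(vertexRDD, edgeRDD) // GraphX graph built from NebulaGraph data
println(graph.degrees.count())            // run any GraphX operator over it
```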

For more information on usage, please refer to [Example](https://github.com/vesoft-inc/nebula-spark-connector/tree/master/example/src/main/scala/com/vesoft/nebula/examples/connector).

## PySpark with Nebula Spark Connector
## PySpark with NebulaGraph Spark Connector

Below is an example of calling the nebula-spark-connector jar package from PySpark.

@@ -276,7 +292,7 @@ For more options, i.e. delete edge with vertex being deleted, refer to [nebula/c

```
val DELETE_EDGE: String = "deleteEdge"
```

### Call Nebula Spark Connector in PySpark shell and .py file
### Call NebulaGraph Spark Connector in PySpark shell and .py file

Also, below are examples of how to run the above code in the PySpark shell or in Python code files:

@@ -307,21 +323,27 @@

```
df = spark.read.format(
    # ... (options collapsed in diff)
"partitionNumber", 1).load()
```

## Version match

There are the version correspondence between Nebula Spark Connector and Nebula:

| Nebula Spark Connector Version | Nebula Version |
|:------------------------------:|:--------------:|
| 2.0.0 | 2.0.0, 2.0.1 |
| 2.0.1 | 2.0.0, 2.0.1 |
| 2.1.0 | 2.0.0, 2.0.1 |
| 2.5.0 | 2.5.0, 2.5.1 |
| 2.5.1 | 2.5.0, 2.5.1 |
| 2.6.0 | 2.6.0, 2.6.1 |
| 2.6.1 | 2.6.0, 2.6.1 |
| 3.0.0 | 3.0.x, 3.1.x |
| 3.0-SNAPSHOT | nightly |
## Compatibility matrix

The version correspondence between NebulaGraph Spark Connector, NebulaGraph, and Spark is as follows:

| NebulaGraph Spark Connector Version | NebulaGraph Version | Spark Version |
|:-----------------------------------------:|:--------------:|:-------------:|
|nebula-spark-connector-2.0.0.jar | 2.0.0, 2.0.1 | 2.4.* |
|nebula-spark-connector-2.0.1.jar | 2.0.0, 2.0.1 | 2.4.* |
|nebula-spark-connector-2.1.0.jar | 2.0.0, 2.0.1 | 2.4.* |
|nebula-spark-connector-2.5.0.jar | 2.5.0, 2.5.1 | 2.4.* |
|nebula-spark-connector-2.5.1.jar | 2.5.0, 2.5.1 | 2.4.* |
|nebula-spark-connector-2.6.0.jar | 2.6.0, 2.6.1 | 2.4.* |
|nebula-spark-connector-2.6.1.jar | 2.6.0, 2.6.1 | 2.4.* |
|nebula-spark-connector-3.0.0.jar | 3.x | 2.4.* |
|nebula-spark-connector-3.3.0.jar | 3.x | 2.4.* |
|nebula-spark-connector_2.2-3.3.0.jar | 3.x | 2.2.* |
|nebula-spark-connector-3.4.0.jar | 3.x | 2.4.* |
|nebula-spark-connector_2.2-3.4.0.jar | 3.x | 2.2.* |
|nebula-spark-connector-3.0-SNAPSHOT.jar | nightly | 2.4.* |
|nebula-spark-connector_2.2-3.0-SNAPSHOT.jar| nightly | 2.2.* |
|nebula-spark-connector_3.0-3.0-SNAPSHOT.jar| nightly | 3.* |

## Performance
We use the LDBC dataset to test nebula-spark-connector's performance; here are the results.
@@ -332,7 +354,7 @@ We choose tag Comment and edge REPLY_OF for space sf30 and sf100 to test the con
And the application's resources are: standalone mode with three workers, 2G driver-memory,
3 num-executors, 30G executor-memory and 20 executor-cores.
The ReadNebulaConfig has a limit of 2000 and a partitionNum of 100,
the same partition number with nebula space parts.
the same partition number as the NebulaGraph space parts.


|data type|ldbc 67.12million with No Property| ldbc 220 million with No Property|ldbc 67.12million with All Property|ldbc 220 million with All Property|
@@ -360,8 +382,8 @@
The writeConfig has a batch size of 2000, and the DataFrame has 60 partitions.

## How to Contribute

Nebula Spark Connector is a completely opensource project, opensource enthusiasts are welcome to participate in the following ways:
NebulaGraph Spark Connector is a completely open source project. Open source enthusiasts are welcome to participate in the following ways:

- Go to [Nebula Graph Forum](https://discuss.nebula-graph.com.cn/ "go to“Nebula Graph Forum") to discuss with other users. You can raise your own questions, help others' problems, share your thoughts.
- Go to the [NebulaGraph Forum](https://discuss.nebula-graph.com.cn/ "go to NebulaGraph Forum") to discuss with other users. You can raise your own questions, help answer others' questions, and share your thoughts.
- Write or improve documents.
- Submit code to add new features or fix bugs.
50 changes: 36 additions & 14 deletions README_CN.md
@@ -29,13 +29,28 @@ Nebula Spark Connector supports Spark 2.2 and 2.4.
* Nebula Spark Connector 2.5.0 adds the DELETE write mode; see [Delete Vertex](https://docs.nebula-graph.com.cn/2.5.1/3.ngql-guide/12.vertex-statements/4.delete-vertex/)

## How to Use
If you use Maven to manage your project, add the dependency to your pom.xml:
If you use Maven to manage your project, add one of the following dependencies to your pom.xml:
```
<!-- connector for spark 2.4 -->
<dependency>
<groupId>com.vesoft</groupId>
<artifactId>nebula-spark-connector</artifactId>
<version>3.0-SNAPSHOT</version>
</dependency>
<!-- connector for spark 2.2 -->
<dependency>
<groupId>com.vesoft</groupId>
<artifactId>nebula-spark-connector_2.2</artifactId>
<version>3.0-SNAPSHOT</version>
</dependency>
<!-- connector for spark 3.0 -->
<dependency>
<groupId>com.vesoft</groupId>
<artifactId>nebula-spark-connector_3.0</artifactId>
<version>3.0-SNAPSHOT</version>
</dependency>
```

Write a DataFrame into Nebula Graph as vertices with `INSERT`:
@@ -305,19 +320,26 @@

```
df = spark.read.format(
    # ... (options collapsed in diff)
```

## Version match
The version correspondence between Nebula Spark Connector and Nebula is as follows:

| Nebula Spark Connector Version | Nebula Version |
|:------------------------------:|:--------------:|
| 2.0.0 | 2.0.0, 2.0.1 |
| 2.0.1 | 2.0.0, 2.0.1 |
| 2.1.0 | 2.0.0, 2.0.1 |
| 2.5.0 | 2.5.0, 2.5.1 |
| 2.5.1 | 2.5.0, 2.5.1 |
| 2.6.0 | 2.6.0, 2.6.1 |
| 2.6.1 | 2.6.0, 2.6.1 |
| 3.0.0 | 3.0.0 |
| 3.0-SNAPSHOT | nightly |
The version correspondence between Nebula Spark Connector, Nebula, and Spark is as follows:

| Nebula Spark Connector Version | Nebula Version | Spark Version |
|:-----------------------------------------:|:--------------:|:-------------:|
|nebula-spark-connector-2.0.0.jar | 2.0.0, 2.0.1 | 2.4.* |
|nebula-spark-connector-2.0.1.jar | 2.0.0, 2.0.1 | 2.4.* |
|nebula-spark-connector-2.1.0.jar | 2.0.0, 2.0.1 | 2.4.* |
|nebula-spark-connector-2.5.0.jar | 2.5.0, 2.5.1 | 2.4.* |
|nebula-spark-connector-2.5.1.jar | 2.5.0, 2.5.1 | 2.4.* |
|nebula-spark-connector-2.6.0.jar | 2.6.0, 2.6.1 | 2.4.* |
|nebula-spark-connector-2.6.1.jar | 2.6.0, 2.6.1 | 2.4.* |
|nebula-spark-connector-3.0.0.jar | 3.x | 2.4.* |
|nebula-spark-connector-3.3.0.jar | 3.x | 2.4.* |
|nebula-spark-connector_2.2-3.3.0.jar | 3.x | 2.2.* |
|nebula-spark-connector-3.4.0.jar | 3.x | 2.4.* |
|nebula-spark-connector_2.2-3.4.0.jar | 3.x | 2.2.* |
|nebula-spark-connector-3.0-SNAPSHOT.jar | nightly | 2.4.* |
|nebula-spark-connector_2.2-3.0-SNAPSHOT.jar| nightly | 2.2.* |
|nebula-spark-connector_3.0-3.0-SNAPSHOT.jar| nightly | 3.* |


## Performance
We use the LDBC dataset to test Nebula-Spark-Connector's performance; the results are as follows:
@@ -784,7 +784,8 @@ object ReadNebulaConfig {
assert(space != null && !space.isEmpty, s"config space is empty.")
assert(label != null && !label.isEmpty, s"config label is empty.")
assert(limit > 0, s"config limit must be positive, your limit is $limit")
assert(partitionNum > 0, s"config partitionNum must be positive, your partitionNum is $limit")
assert(partitionNum > 0,
s"config partitionNum must be positive, your partitionNum is $partitionNum")
if (noColumn && returnCols.nonEmpty) {
LOG.warn(
s"noColumn is true, returnCols will be invalidate "
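
The assertion fix above only changes which value the error message interpolates. A standalone illustration of the corrected behavior (plain Scala, not the connector code itself):

```
val limit = 1000
val partitionNum = 0
// Before the fix the message interpolated $limit, so a failure would read
// "... your partitionNum is 1000"; after the fix it reports the offending value:
assert(partitionNum > 0,
  s"config partitionNum must be positive, your partitionNum is $partitionNum")
// throws an AssertionError whose message ends with:
//   config partitionNum must be positive, your partitionNum is 0
```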
@@ -12,8 +12,8 @@ object SparkValidate {
val sparkVersion = SparkSession.getActiveSession.map(_.version).getOrElse("UNKNOWN")
if (sparkVersion != "UNKNOWN" && !supportedVersions.exists(sparkVersion.matches)) {
throw new RuntimeException(
s"""Your current spark version ${sparkVersion} is not supported by the current NebulaGraph Exchange.
| please visit https://github.com/vesoft-inc/nebula-exchange#version-match to know which Exchange you need.
s"""Your current spark version ${sparkVersion} is not supported by the current NebulaGraph Spark Connector.
| please visit https://github.com/vesoft-inc/nebula-spark-connector#version-match to know which Connector you need.
| """.stripMargin)
}
}
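
The check above is a plain regex match of the runtime Spark version against the supported patterns. A standalone illustration (ordinary Scala, independent of the connector):

```
// Each supported pattern (e.g. "2.4.*") is used as a regex against the full Spark version string.
val supportedVersions = Seq("2.4.*")
val sparkVersion      = "2.4.8" // what SparkSession.getActiveSession.map(_.version) would return
val supported = supportedVersions.exists(sparkVersion.matches)
assert(supported) // "2.4.*" matches "2.4.8"; a version like "3.0.1" would not match and would fail
```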
