Commit
update version in docs to 2.3.1
Signed-off-by: chenxu <chenxu@dmetasoul.com>
dmetasoul01 committed Aug 22, 2023
1 parent afdef16 commit 43e8c0c
Showing 16 changed files with 36 additions and 40 deletions.
2 changes: 1 addition & 1 deletion website/docs/01-Getting Started/01-setup-local-env.md
@@ -41,7 +41,7 @@ After unpacking spark package, you could find LakeSoul distribution jar from htt
wget https://dmetasoul-bucket.obs.cn-southwest-2.myhuaweicloud.com/releases/spark/spark-3.3.2-bin-hadoop-3.tgz
tar xf spark-3.3.2-bin-hadoop-3.tgz
export SPARK_HOME=${PWD}/spark-3.3.2-bin-hadoop3
wget https://github.com/lakesoul-io/LakeSoul/releases/download/v2.3.0/lakesoul-spark-2.3.0-spark-3.3.jar -P $SPARK_HOME/jars
wget https://github.com/lakesoul-io/LakeSoul/releases/download/v2.3.1/lakesoul-spark-2.3.1-spark-3.3.jar -P $SPARK_HOME/jars
```
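The hunk above swaps the 2.3.0 jar for the 2.3.1 build in the local-env setup. A quick sanity check that the right jar landed on Spark's classpath (a sketch, not part of the docs; assumes `$SPARK_HOME` is set as in the snippet above and the wget step was followed):

```shell
# List the LakeSoul jar(s) Spark will load and confirm the 2.3.1 build is present.
ls "$SPARK_HOME"/jars/lakesoul-spark-*.jar
test -f "$SPARK_HOME/jars/lakesoul-spark-2.3.1-spark-3.3.jar" \
  && echo "lakesoul-spark 2.3.1 in place"
```

A stale 2.3.0 jar left beside the new one would also show up in the listing; only one LakeSoul jar should be on the classpath.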

:::tip
2 changes: 1 addition & 1 deletion website/docs/01-Getting Started/02-docker-compose.mdx
@@ -34,7 +34,7 @@ docker run --net lakesoul-docker-compose-env_default --rm -ti \
-v $(pwd)/lakesoul.properties:/opt/spark/work-dir/lakesoul.properties \
--env lakesoul_home=/opt/spark/work-dir/lakesoul.properties bitnami/spark:3.3.1 \
spark-shell \
--packages com.dmetasoul:lakesoul-spark:2.3.0-spark-3.3 \
--packages com.dmetasoul:lakesoul-spark:2.3.1-spark-3.3 \
--conf spark.sql.extensions=com.dmetasoul.lakesoul.sql.LakeSoulSparkSessionExtension \
--conf spark.sql.catalog.lakesoul=org.apache.spark.sql.lakesoul.catalog.LakeSoulCatalog \
--conf spark.sql.defaultCatalog=lakesoul \
2 changes: 1 addition & 1 deletion website/docs/02-Tutorials/02-flink-cdc-sink/index.md
@@ -84,7 +84,7 @@ Submit a LakeSoul Flink CDC Sink job to the Flink cluster started above:
```bash
./bin/flink run -ys 1 -yjm 1G -ytm 2G \
-c org.apache.flink.lakesoul.entry.MysqlCdc \
lakesoul-flink-2.3.0-flink-1.14.jar \
lakesoul-flink-2.3.1-flink-1.14.jar \
--source_db.host localhost \
--source_db.port 3306 \
--source_db.db_name test_cdc \
4 changes: 2 additions & 2 deletions website/docs/02-Tutorials/07-kafka-topics-data-to-lakesoul.md
@@ -74,7 +74,7 @@ export lakesoul_home=./pg.properties && ./bin/spark-submit \
--driver-memory 4g \
--executor-memory 4g \
--master local[4] \
./jars/lakesoul-spark-2.3.0-spark-3.3.jar \
./jars/lakesoul-spark-2.3.1-spark-3.3.jar \
localhost:9092 test.* /tmp/kafka/data /tmp/kafka/checkpoint/ kafka earliest false
```

@@ -151,6 +151,6 @@ export lakesoul_home=./pg.properties && ./bin/spark-submit \
--driver-memory 4g \
--executor-memory 4g \
--master local[4] \
./jars/lakesoul-spark-2.3.0-spark-3.3.jar \
./jars/lakesoul-spark-2.3.1-spark-3.3.jar \
localhost:9092 test.* /tmp/kafka/data /tmp/kafka/checkpoint/ kafka earliest false http://localhost:8081
```
10 changes: 5 additions & 5 deletions website/docs/03-Usage Docs/02-setup-spark.md
@@ -8,14 +8,14 @@ To use `spark-shell`, `pyspark` or `spark-sql` shells, you should include LakeSo

### Use Maven Coordinates via --packages
```bash
spark-shell --packages com.dmetasoul:lakesoul-spark:2.3.0-spark-3.3
spark-shell --packages com.dmetasoul:lakesoul-spark:2.3.1-spark-3.3
```

### Use Local Packages
You can find the LakeSoul packages on our release page: [Releases](https://github.com/lakesoul-io/LakeSoul/releases).
Download the jar file and pass it to `spark-submit`.
```bash
spark-submit --jars "lakesoul-spark-2.3.0-spark-3.3.jar"
spark-submit --jars "lakesoul-spark-2.3.1-spark-3.3.jar"
```

Or you could directly put the jar into `$SPARK_HOME/jars`
@@ -26,7 +26,7 @@ Include maven dependencies in your project:
<dependency>
<groupId>com.dmetasoul</groupId>
<artifactId>lakesoul-spark</artifactId>
<version>2.3.0-spark-3.3</version>
<version>2.3.1-spark-3.3</version>
</dependency>
```

@@ -92,7 +92,7 @@ taskmanager.memory.task.off-heap.size: 3000m
:::

## Add LakeSoul Jar to Flink's directory
Download LakeSoul Flink Jar from: https://github.com/lakesoul-io/LakeSoul/releases/download/v2.3.0/lakesoul-flink-2.3.0-flink-1.14.jar
Download LakeSoul Flink Jar from: https://github.com/lakesoul-io/LakeSoul/releases/download/v2.3.1/lakesoul-flink-2.3.1-flink-1.14.jar

And put the jar file under `$FLINK_HOME/lib`. After this, you can start a Flink session cluster or application as usual.

@@ -103,6 +103,6 @@ Add the following to your project's pom.xml
<dependency>
<groupId>com.dmetasoul</groupId>
<artifactId>lakesoul-flink</artifactId>
<version>2.3.0-flink-1.14</version>
<version>2.3.1-flink-1.14</version>
</dependency>
```
6 changes: 3 additions & 3 deletions website/docs/03-Usage Docs/05-flink-cdc-sync.md
@@ -15,7 +15,7 @@ In the Stream API, the main functions of LakeSoul Sink are:

## How to use the command line
### 1. Download LakeSoul Flink Jar
It can be downloaded from the LakeSoul Release page: https://github.com/lakesoul-io/LakeSoul/releases/download/v2.3.0/lakesoul-flink-2.3.0-flink-1.14.jar.
It can be downloaded from the LakeSoul Release page: https://github.com/lakesoul-io/LakeSoul/releases/download/v2.3.1/lakesoul-flink-2.3.1-flink-1.14.jar.

The currently supported Flink version is 1.14.

@@ -54,7 +54,7 @@ export LAKESOUL_PG_PASSWORD=root
#### 2.2 Start sync job
```bash
bin/flink run -c org.apache.flink.lakesoul.entry.MysqlCdc \
lakesoul-flink-2.3.0-flink-1.14.jar \
lakesoul-flink-2.3.1-flink-1.14.jar \
--source_db.host localhost \
--source_db.port 3306 \
--source_db.db_name default \
@@ -73,7 +73,7 @@ Description of required parameters:
| Parameter | Meaning | Value Description |
|----------------|------------------------------------|-------------------------------------------- |
| -c | The task runs the main function entry class | org.apache.flink.lakesoul.entry.MysqlCdc |
| Main package | Task running jar | lakesoul-flink-2.3.0-flink-1.14.jar |
| Main package | Task running jar | lakesoul-flink-2.3.1-flink-1.14.jar |
| --source_db.host | The address of the MySQL database | |
| --source_db.port | MySQL database port | |
| --source_db.user | MySQL database username | |
4 changes: 1 addition & 3 deletions website/docs/03-Usage Docs/06-flink-lakesoul-connector.md
@@ -10,14 +10,12 @@ LakeSoul provides Flink Connector which implements the Dynamic Table interface,

To set up the Flink environment, please refer to [Setup Spark/Flink Job/Project](../03-Usage%20Docs/02-setup-spark.md)

Introduce LakeSoul dependency: package and compile the lakesoul-flink folder to get lakesoul-flink-2.3.0-flink-1.14.jar.

In order to use Flink to create LakeSoul tables, it is recommended to use the Flink SQL Client, which supports operating on LakeSoul tables directly with Flink SQL commands. In this document, Flink SQL statements are entered directly in the Flink SQL Client CLI, whereas the Table API must be used from a Java project.

Switch to the flink folder and execute the command to start the SQL Client.
```bash
# Start Flink SQL Client
bin/sql-client.sh embedded -j lakesoul-flink-2.3.0-flink-1.14.jar
bin/sql-client.sh embedded -j lakesoul-flink-2.3.1-flink-1.14.jar
```

## 2. DDL
2 changes: 1 addition & 1 deletion website/docs/03-Usage Docs/08-auto-compaction-task.md
@@ -41,7 +41,7 @@ Then use the following command to start the compaction service job:
--conf "spark.executor.extraJavaOptions=-XX:MaxDirectMemorySize=4G" \
--conf "spark.executor.memoryOverhead=3g" \
--class com.dmetasoul.lakesoul.spark.compaction.CompactionTask \
jars/lakesoul-spark-2.3.0-spark-3.3.jar
jars/lakesoul-spark-2.3.1-spark-3.3.jar
--threadpool.size=10
--database=test
```
@@ -31,10 +31,10 @@ https://dlcdn.apache.org/spark/spark-3.3.2/spark-3.3.2-bin-without-hadoop.tgz

The LakeSoul release jar can be downloaded from the GitHub Releases page: https://github.com/lakesoul-io/LakeSoul/releases. After downloading, put the jar into the jars directory under your Spark installation:
```bash
wget https://github.com/lakesoul-io/LakeSoul/releases/download/v2.3.0/lakesoul-spark-2.3.0-spark-3.3.jar -P $SPARK_HOME/jars
wget https://github.com/lakesoul-io/LakeSoul/releases/download/v2.3.1/lakesoul-spark-2.3.1-spark-3.3.jar -P $SPARK_HOME/jars
```

If you have trouble accessing GitHub, it can also be downloaded from: https://dmetasoul-bucket.obs.cn-southwest-2.myhuaweicloud.com/releases/lakesoul/lakesoul-spark-2.3.0-spark-3.3.jar
If you have trouble accessing GitHub, it can also be downloaded from: https://dmetasoul-bucket.obs.cn-southwest-2.myhuaweicloud.com/releases/lakesoul/lakesoul-spark-2.3.1-spark-3.3.jar

:::tip
Since version 2.1.0, LakeSoul's own dependencies have been shaded into a single jar. Earlier versions were released as multiple jars in a tar.gz archive.
@@ -34,7 +34,7 @@ docker run --net lakesoul-docker-compose-env_default --rm -ti \
-v $(pwd)/lakesoul.properties:/opt/spark/work-dir/lakesoul.properties \
--env lakesoul_home=/opt/spark/work-dir/lakesoul.properties bitnami/spark:3.3.1 \
spark-shell \
--packages com.dmetasoul:lakesoul-spark:2.3.0-spark-3.3 \
--packages com.dmetasoul:lakesoul-spark:2.3.1-spark-3.3 \
--conf spark.sql.extensions=com.dmetasoul.lakesoul.sql.LakeSoulSparkSessionExtension \
--conf spark.sql.catalog.lakesoul=org.apache.spark.sql.lakesoul.catalog.LakeSoulCatalog \
--conf spark.sql.defaultCatalog=lakesoul \
@@ -84,7 +84,7 @@ $FLINK_HOME/bin/start-cluster.sh
```bash
./bin/flink run -ys 1 -yjm 1G -ytm 2G \
-c org.apache.flink.lakesoul.entry.MysqlCdc \
lakesoul-flink-2.3.0-flink-1.14.jar \
lakesoul-flink-2.3.1-flink-1.14.jar \
--source_db.host localhost \
--source_db.port 3306 \
--source_db.db_name test_cdc \
@@ -99,7 +99,7 @@ $FLINK_HOME/bin/start-cluster.sh
--server_time_zone UTC
```

The lakesoul-flink jar can be downloaded from the [Github Release](https://github.com/lakesoul-io/LakeSoul/releases/) page. If you have trouble accessing GitHub, it can also be downloaded from: https://dmetasoul-bucket.obs.cn-southwest-2.myhuaweicloud.com/releases/lakesoul/lakesoul-flink-2.3.0-flink-1.14.jar
The lakesoul-flink jar can be downloaded from the [Github Release](https://github.com/lakesoul-io/LakeSoul/releases/) page. If you have trouble accessing GitHub, it can also be downloaded from: https://dmetasoul-bucket.obs.cn-southwest-2.myhuaweicloud.com/releases/lakesoul/lakesoul-flink-2.3.1-flink-1.14.jar

On the Flink job page at http://localhost:8081, click Running Jobs to check whether the LakeSoul job is in the `Running` state.

@@ -72,7 +72,7 @@ export lakesoul_home=./pg.properties && ./bin/spark-submit \
--driver-memory 4g \
--executor-memory 4g \
--master local[4] \
./jars/lakesoul-spark-2.3.0-spark-3.3.jar \
./jars/lakesoul-spark-2.3.1-spark-3.3.jar \
localhost:9092 test.* /tmp/kafka/data /tmp/kafka/checkpoint/ kafka earliest false
```

@@ -149,6 +149,6 @@ export lakesoul_home=./pg.properties && ./bin/spark-submit \
--driver-memory 4g \
--executor-memory 4g \
--master local[4] \
./jars/lakesoul-spark-2.3.0-spark-3.3.jar \
./jars/lakesoul-spark-2.3.1-spark-3.3.jar \
localhost:9092 test.* /tmp/kafka/data /tmp/kafka/checkpoint/ kafka earliest false http://localhost:8081
```
@@ -8,30 +8,30 @@ LakeSoul currently supports Spark 3.3 + Scala 2.12.

### Pass the Maven repository and package name via `--packages`
```bash
spark-shell --packages com.dmetasoul:lakesoul-spark:2.3.0-spark-3.3
spark-shell --packages com.dmetasoul:lakesoul-spark:2.3.1-spark-3.3
```

### Use the prebuilt LakeSoul package
The prebuilt LakeSoul jar can be downloaded from the [Releases](https://github.com/lakesoul-io/LakeSoul/releases) page.
Download the jar and pass it to the `spark-submit` command:
```bash
spark-submit --jars "lakesoul-spark-2.3.0-spark-3.3.jar"
spark-submit --jars "lakesoul-spark-2.3.1-spark-3.3.jar"
```

### Put the jar directly into the Spark environment
You can download the jar and place it in $SPARK_HOME/jars.

The jar can be downloaded from the Github Release page: https://github.com/lakesoul-io/LakeSoul/releases/download/v2.3.0/lakesoul-spark-2.3.0-spark-3.3.jar
The jar can be downloaded from the Github Release page: https://github.com/lakesoul-io/LakeSoul/releases/download/v2.3.1/lakesoul-spark-2.3.1-spark-3.3.jar

Or download it from the domestic mirror: https://dmetasoul-bucket.obs.cn-southwest-2.myhuaweicloud.com/releases/lakesoul/lakesoul-spark-2.3.0-spark-3.3.jar
Or download it from the domestic mirror: https://dmetasoul-bucket.obs.cn-southwest-2.myhuaweicloud.com/releases/lakesoul/lakesoul-spark-2.3.1-spark-3.3.jar

## Set up a Java/Scala project
Add the following Maven dependency:
```xml
<dependency>
<groupId>com.dmetasoul</groupId>
<artifactId>lakesoul-spark</artifactId>
<version>2.3.0-spark-3.3</version>
<version>2.3.1-spark-3.3</version>
</dependency>
```

@@ -116,7 +116,7 @@ taskmanager.memory.task.off-heap.size: 3000m


## Add the LakeSoul jar to Flink's deployment directory
Download the LakeSoul Flink jar from: https://github.com/lakesoul-io/LakeSoul/releases/download/v2.3.0/lakesoul-flink-2.3.0-flink-1.14.jar
Download the LakeSoul Flink jar from: https://github.com/lakesoul-io/LakeSoul/releases/download/v2.3.1/lakesoul-flink-2.3.1-flink-1.14.jar

Put the jar file under `$FLINK_HOME/lib`. After that, you can start a Flink session cluster or application as usual.

Expand All @@ -136,6 +136,6 @@ export HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath`
<dependency>
<groupId>com.dmetasoul</groupId>
<artifactId>lakesoul-flink</artifactId>
<version>2.3.0-flink-1.14</version>
<version>2.3.1-flink-1.14</version>
</dependency>
```
@@ -13,9 +13,9 @@ Since version 2.1.0, LakeSoul has implemented a Flink CDC Sink that supports the Table API

## Command-line usage
### 1. Download the LakeSoul Flink Jar
It can be downloaded from the LakeSoul Release page: https://github.com/lakesoul-io/LakeSoul/releases/download/v2.3.0/lakesoul-flink-2.3.0-flink-1.14.jar.
It can be downloaded from the LakeSoul Release page: https://github.com/lakesoul-io/LakeSoul/releases/download/v2.3.1/lakesoul-flink-2.3.1-flink-1.14.jar.

If you have trouble accessing GitHub, it can also be downloaded from: https://dmetasoul-bucket.obs.cn-southwest-2.myhuaweicloud.com/releases/lakesoul/lakesoul-flink-2.3.0-flink-1.14.jar.
If you have trouble accessing GitHub, it can also be downloaded from: https://dmetasoul-bucket.obs.cn-southwest-2.myhuaweicloud.com/releases/lakesoul/lakesoul-flink-2.3.1-flink-1.14.jar.

The currently supported Flink version is 1.14.

@@ -52,7 +52,7 @@ export LAKESOUL_PG_PASSWORD=root
#### 2.2 Start the sync job
```bash
bin/flink run -c org.apache.flink.lakesoul.entry.MysqlCdc \
lakesoul-flink-2.3.0-flink-1.14.jar \
lakesoul-flink-2.3.1-flink-1.14.jar \
--source_db.host localhost \
--source_db.port 3306 \
--source_db.db_name default \
@@ -71,7 +71,7 @@ bin/flink run -c org.apache.flink.lakesoul.entry.MysqlCdc \
| Parameter | Meaning | Value Description |
|----------------|--------------------------------------------------------------------------------------|---------------------------------------------|
| -c | Main-function entry class of the job | org.apache.flink.lakesoul.entry.MysqlCdc |
| Main package | Jar the job runs | lakesoul-flink-2.3.0-flink-1.14.jar |
| Main package | Jar the job runs | lakesoul-flink-2.3.1-flink-1.14.jar |
| --source_db.host | Address of the MySQL database | |
| --source_db.port | MySQL database port | |
| --source_db.user | MySQL database username | |
@@ -8,16 +8,14 @@ LakeSoul provides a Flink Connector implementing the Flink Dynamic Table interface,

## 1. Environment setup

To set up LakeSoul metadata, refer to [Setup Spark/Flink Job/Project](../03-Usage%20Docs/02-setup-spark.md)

To introduce the LakeSoul dependency into Flink: package and compile the lakesoul-flink folder to get lakesoul-flink-2.3.0-flink-1.14.jar.
To set up LakeSoul metadata and its dependency package, refer to [Setup Spark/Flink Job/Project](../03-Usage%20Docs/02-setup-spark.md)

To create LakeSoul tables with Flink, the Flink SQL Client is recommended, as it supports operating on LakeSoul tables directly with Flink SQL commands. In this document, Flink SQL statements are entered directly in the Flink SQL Client interface, whereas the Table API must be used from a Java project.

Switch to the Flink folder and run the command to start the SQL Client.
```bash
# Start Flink SQL Client
bin/sql-client.sh embedded -j lakesoul-flink-2.3.0-flink-1.14.jar
bin/sql-client.sh embedded -j lakesoul-flink-2.3.1-flink-1.14.jar
```

## 2. DDL
@@ -40,7 +40,7 @@ The trigger and pg function are already configured at database initialization; the default compaction
--conf "spark.executor.extraJavaOptions=-XX:MaxDirectMemorySize=4G" \
--conf "spark.executor.memoryOverhead=3g" \
--class com.dmetasoul.lakesoul.spark.compaction.CompactionTask \
jars/lakesoul-spark-2.3.0-spark-3.3.jar
jars/lakesoul-spark-2.3.1-spark-3.3.jar
--threadpool.size=10
--database=test
```
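A bump like this one touches 16 files with the identical `2.3.0` to `2.3.1` substitution, which is easy to script so that no reference is missed. A sketch, not taken from this repository: it is demonstrated on a scratch file, the real target would be the `website/docs` tree, and `sed -i` assumes GNU sed:

```shell
# Create a scratch doc containing an old-version reference.
demo_dir=$(mktemp -d)
printf 'spark-shell --packages com.dmetasoul:lakesoul-spark:2.3.0-spark-3.3\n' \
  > "$demo_dir/setup.md"

# Rewrite every 2.3.0 reference to 2.3.1 across the tree...
grep -rl '2\.3\.0' "$demo_dir" | xargs sed -i 's/2\.3\.0/2\.3\.1/g'

# ...then report whether any stale reference survived.
if grep -rn '2\.3\.0' "$demo_dir"; then
  echo 'stale 2.3.0 references remain' >&2
else
  echo 'all references bumped to 2.3.1'
fi
```

Running the same `grep -rn '2.3.0'` check in CI would catch any doc page the next bump misses.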
