diff --git a/.github/workflows/backend.yml b/.github/workflows/backend.yml
index 34a173cd984..113235daa9d 100644
--- a/.github/workflows/backend.yml
+++ b/.github/workflows/backend.yml
@@ -362,8 +362,6 @@ jobs:
java-version: ${{ matrix.java }}
distribution: 'temurin'
cache: 'maven'
- - name: free disk space
- run: tools/github/free_disk_space.sh
- name: run updated modules integration test (part-3)
if: needs.changes.outputs.api == 'false' && needs.changes.outputs.it-modules != ''
run: |
diff --git a/config/plugin_config b/config/plugin_config
index 42fc280a65a..76a7254b378 100644
--- a/config/plugin_config
+++ b/config/plugin_config
@@ -76,5 +76,4 @@ connector-tablestore
connector-selectdb-cloud
connector-hbase
connector-amazonsqs
-connector-easysearch
--end--
\ No newline at end of file
diff --git a/docs/en/connector-v2/sink/CosFile.md b/docs/en/connector-v2/sink/CosFile.md
index 6c88e922947..f0d6517a055 100644
--- a/docs/en/connector-v2/sink/CosFile.md
+++ b/docs/en/connector-v2/sink/CosFile.md
@@ -29,7 +29,6 @@ By default, we use 2PC commit to ensure `exactly-once`
- [x] orc
- [x] json
- [x] excel
- - [x] xml
## Options
@@ -58,9 +57,6 @@ By default, we use 2PC commit to ensure `exactly-once`
| common-options | object | no | - | |
| max_rows_in_memory | int | no | - | Only used when file_format is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format is excel. |
-| xml_root_tag | string | no | RECORDS | Only used when file_format is xml. |
-| xml_row_tag | string | no | RECORD | Only used when file_format is xml. |
-| xml_use_attr_format | boolean | no | - | Only used when file_format is xml. |
### path [string]
@@ -114,7 +110,7 @@ When the format in the `file_name_expression` parameter is `xxxx-${now}` , `file
We support the following file types:
-`text` `json` `csv` `orc` `parquet` `excel` `xml`
+`text` `json` `csv` `orc` `parquet` `excel`
Please note that the final file name will end with the file_format's suffix; the suffix of a text file is `txt`.
@@ -193,18 +189,6 @@ When File Format is Excel,The maximum number of data items that can be cached in
The sheet of the workbook to write.
-### xml_root_tag [string]
-
-Specifies the tag name of the root element within the XML file.
-
-### xml_row_tag [string]
-
-Specifies the tag name of the data rows within the XML file.
-
-### xml_use_attr_format [boolean]
-
-Specifies Whether to process data using the tag attribute format.
-
## Example
For text file format with `have_partition` and `custom_filename` and `sink_columns`
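
As a minimal sketch of such a config (all values are placeholders; `bucket`, `secret_id`, `secret_key`, and `region` are assumed from the connector's full option table, which is not shown in this excerpt):

```hocon
CosFile {
  path = "/seatunnel/sink"
  bucket = "cosn://seatunnel-test"         # placeholder bucket
  secret_id = "your-secret-id"             # assumed credential option
  secret_key = "your-secret-key"           # assumed credential option
  region = "ap-guangzhou"                  # assumed region option
  file_format_type = "text"                # xml is no longer a supported format
  have_partition = true
  partition_by = ["age"]
  custom_filename = true
  file_name_expression = "${transactionId}_${now}"
  filename_time_format = "yyyy.MM.dd"
  sink_columns = ["name", "age"]
}
```
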
diff --git a/docs/en/connector-v2/sink/Easysearch.md b/docs/en/connector-v2/sink/Easysearch.md
deleted file mode 100644
index f474735082d..00000000000
--- a/docs/en/connector-v2/sink/Easysearch.md
+++ /dev/null
@@ -1,202 +0,0 @@
-# INFINI Easysearch
-
-## Support Those Engines
-
-> Spark
-> Flink
-> SeaTunnel Zeta
-
-## Description
-
-A sink plugin that sends data to `INFINI Easysearch`.
-
-## Using Dependency
-
-> Dependency [easysearch-client](https://central.sonatype.com/artifact/com.infinilabs/easysearch-client)
->
-## Key features
-
-- [ ] [exactly-once](../../concept/connector-v2-features.md)
-- [x] [cdc](../../concept/connector-v2-features.md)
-
-:::tip
-
-Engine Supported
-
-* All versions released by [INFINI Easysearch](https://www.infini.com/download/?product=easysearch) are supported.
-
-:::
-
-## Data Type Mapping
-
-| Easysearch Data Type | SeaTunnel Data Type |
-|-----------------------------|----------------------|
-| STRING KEYWORD TEXT | STRING |
-| BOOLEAN | BOOLEAN |
-| BYTE | BYTE |
-| SHORT | SHORT |
-| INTEGER | INT |
-| LONG | LONG |
-| FLOAT HALF_FLOAT | FLOAT |
-| DOUBLE | DOUBLE |
-| Date | LOCAL_DATE_TIME_TYPE |
-
-## Sink Options
-
-| name | type | required | default value |
-|-------------------------|---------|----------|---------------|
-| hosts | array | yes | - |
-| index | string | yes | - |
-| primary_keys | list | no | |
-| key_delimiter | string | no | `_` |
-| username | string | no | |
-| password | string | no | |
-| max_retry_count | int | no | 3 |
-| max_batch_size | int | no | 10 |
-| tls_verify_certificate | boolean | no | true |
-| tls_verify_hostname     | boolean | no | true          |
-| tls_keystore_path | string | no | - |
-| tls_keystore_password | string | no | - |
-| tls_truststore_path | string | no | - |
-| tls_truststore_password | string | no | - |
-| common-options | | no | - |
-
-### hosts [array]
-
-`INFINI Easysearch` cluster http address, the format is `host:port` , allowing multiple hosts to be specified. Such as `["host1:9200", "host2:9200"]`.
-
-### index [string]
-
-`INFINI Easysearch` `index` name. The index name supports field-name variables, such as `seatunnel_${age}`; the field must appear in the SeaTunnel row.
-If not, we will treat it as a normal index.
-
-### primary_keys [list]
-
-Primary key fields used to generate the document `_id`; this option is required for CDC.
-
-### key_delimiter [string]
-
-Delimiter for composite keys ("_" by default), e.g., "$" would result in document `_id` "KEY1$KEY2$KEY3".
-
-### username [string]
-
-The security username.
-
-### password [string]
-
-The security password.
-
-### max_retry_count [int]
-
-The maximum number of retries for one bulk request.
-
-### max_batch_size [int]
-
-The maximum number of documents in one bulk request.
-
-### tls_verify_certificate [boolean]
-
-Enable certificates validation for HTTPS endpoints
-
-### tls_verify_hostname [boolean]
-
-Enable hostname validation for HTTPS endpoints
-
-### tls_keystore_path [string]
-
-The path to the PEM or JKS key store. This file must be readable by the operating system user running SeaTunnel.
-
-### tls_keystore_password [string]
-
-The key password for the key store specified
-
-### tls_truststore_path [string]
-
-The path to PEM or JKS trust store. This file must be readable by the operating system user running SeaTunnel.
-
-### tls_truststore_password [string]
-
-The key password for the trust store specified
-
-### common options
-
-Sink plugin common parameters, please refer to [Sink Common Options](common-options.md) for details
-
-## Examples
-
-Simple
-
-```hocon
-sink {
- Easysearch {
- hosts = ["localhost:9200"]
- index = "seatunnel-${age}"
- }
-}
-```
-
-CDC(Change data capture) event
-
-```hocon
-sink {
- Easysearch {
- hosts = ["localhost:9200"]
- index = "seatunnel-${age}"
-
- # cdc required options
- primary_keys = ["key1", "key2", ...]
- }
-}
-```
-
-SSL (Disable certificates validation)
-
-```hocon
-sink {
- Easysearch {
- hosts = ["https://localhost:9200"]
- username = "admin"
- password = "admin"
-
- tls_verify_certificate = false
- }
-}
-```
-
-SSL (Disable hostname validation)
-
-```hocon
-sink {
- Easysearch {
- hosts = ["https://localhost:9200"]
- username = "admin"
- password = "admin"
-
- tls_verify_hostname = false
- }
-}
-```
-
-SSL (Enable certificates validation)
-
-```hocon
-sink {
- Easysearch {
- hosts = ["https://localhost:9200"]
- username = "admin"
- password = "admin"
-
- tls_keystore_path = "${your Easysearch home}/config/certs/http.p12"
- tls_keystore_password = "${your password}"
- }
-}
-```
-
-## Changelog
-
-### 2.3.4 2023-11-16
-
-- Add Easysearch Sink Connector
-- Support http/https protocol
-- Support CDC write DELETE/UPDATE/INSERT events
-
diff --git a/docs/en/connector-v2/sink/FtpFile.md b/docs/en/connector-v2/sink/FtpFile.md
index 9a3af0e744c..cdc3512485e 100644
--- a/docs/en/connector-v2/sink/FtpFile.md
+++ b/docs/en/connector-v2/sink/FtpFile.md
@@ -27,7 +27,6 @@ By default, we use 2PC commit to ensure `exactly-once`
- [x] orc
- [x] json
- [x] excel
- - [x] xml
## Options
@@ -57,9 +56,6 @@ By default, we use 2PC commit to ensure `exactly-once`
| common-options | object | no | - | |
| max_rows_in_memory | int | no | - | Only used when file_format_type is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format_type is excel. |
-| xml_root_tag | string | no | RECORDS | Only used when file_format is xml. |
-| xml_row_tag | string | no | RECORD | Only used when file_format is xml. |
-| xml_use_attr_format | boolean | no | - | Only used when file_format is xml. |
### host [string]
@@ -119,7 +115,7 @@ When the format in the `file_name_expression` parameter is `xxxx-${now}` , `file
We support the following file types:
-`text` `json` `csv` `orc` `parquet` `excel` `xml`
+`text` `json` `csv` `orc` `parquet` `excel`
Please note that the final file name will end with the file_format_type's suffix; the suffix of a text file is `txt`.
@@ -198,18 +194,6 @@ When File Format is Excel,The maximum number of data items that can be cached in
The sheet of the workbook to write.
-### xml_root_tag [string]
-
-Specifies the tag name of the root element within the XML file.
-
-### xml_row_tag [string]
-
-Specifies the tag name of the data rows within the XML file.
-
-### xml_use_attr_format [boolean]
-
-Specifies Whether to process data using the tag attribute format.
-
## Example
For text file format simple config
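
A minimal sketch of such a simple text config, assuming the `host`/`port`/`user`/`password` connection options from the connector's full option table (all values are placeholders):

```hocon
FtpFile {
  host = "xxx.xxx.xxx.xxx"      # placeholder FTP host
  port = 21
  user = "username"             # assumed credential option
  password = "password"         # assumed credential option
  path = "/data/ftp"
  file_format_type = "text"
  field_delimiter = "\t"
  row_delimiter = "\n"
  sink_columns = ["name", "age"]
}
```
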
diff --git a/docs/en/connector-v2/sink/HdfsFile.md b/docs/en/connector-v2/sink/HdfsFile.md
index 4df905ff439..535b4fc6cda 100644
--- a/docs/en/connector-v2/sink/HdfsFile.md
+++ b/docs/en/connector-v2/sink/HdfsFile.md
@@ -21,7 +21,6 @@ By default, we use 2PC commit to ensure `exactly-once`
- [x] orc
- [x] json
- [x] excel
- - [x] xml
- [x] compress codec
- [x] lzo
@@ -46,7 +45,7 @@ Output data to hdfs file
| custom_filename | boolean | no | false | Whether you need custom the filename |
| file_name_expression | string | no | "${transactionId}" | Only used when `custom_filename` is `true`.`file_name_expression` describes the file expression which will be created into the `path`. We can add the variable `${now}` or `${uuid}` in the `file_name_expression`, like `test_${uuid}_${now}`,`${now}` represents the current time, and its format can be defined by specifying the option `filename_time_format`.Please note that, If `is_enable_transaction` is `true`, we will auto add `${transactionId}_` in the head of the file. |
| filename_time_format | string | no | "yyyy.MM.dd" | Only used when `custom_filename` is `true`.When the format in the `file_name_expression` parameter is `xxxx-${now}` , `filename_time_format` can specify the time format of the path, and the default value is `yyyy.MM.dd` . The commonly used time formats are listed as follows:[y:Year,M:Month,d:Day of month,H:Hour in day (0-23),m:Minute in hour,s:Second in minute] |
-| file_format_type | string | no | "csv" | We supported as the following file types:`text` `json` `csv` `orc` `parquet` `excel` `xml`.Please note that, The final file name will end with the file_format's suffix, the suffix of the text file is `txt`. |
+| file_format_type | string | no | "csv" | We supported as the following file types:`text` `json` `csv` `orc` `parquet` `excel`.Please note that, The final file name will end with the file_format's suffix, the suffix of the text file is `txt`. |
| field_delimiter | string | no | '\001' | Only used when file_format is text,The separator between columns in a row of data. Only needed by `text` file format. |
| row_delimiter | string | no | "\n" | Only used when file_format is text,The separator between rows in a file. Only needed by `text` file format. |
| have_partition | boolean | no | false | Whether you need processing partitions. |
@@ -64,9 +63,6 @@ Output data to hdfs file
| common-options | object | no | - | Sink plugin common parameters, please refer to [Sink Common Options](common-options.md) for details |
| max_rows_in_memory | int | no | - | Only used when file_format is excel.When File Format is Excel,The maximum number of data items that can be cached in the memory. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format is excel.Writer the sheet of the workbook |
-| xml_root_tag | string | no | RECORDS | Only used when file_format is xml, specifies the tag name of the root element within the XML file. |
-| xml_row_tag | string | no | RECORD | Only used when file_format is xml, specifies the tag name of the data rows within the XML file |
-| xml_use_attr_format | boolean | no | - | Only used when file_format is xml, specifies Whether to process data using the tag attribute format. |
### Tips
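
As a quick reference alongside these tips, a minimal HdfsFile sink sketch using the options above (the cluster address is a placeholder; `orc` is chosen because it carries its own schema):

```hocon
HdfsFile {
  fs.defaultFS = "hdfs://hadoopcluster"   # placeholder namenode address
  path = "/tmp/seatunnel/sink"
  file_format_type = "orc"
}
```
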
diff --git a/docs/en/connector-v2/sink/Hive.md b/docs/en/connector-v2/sink/Hive.md
index eec92b46b1b..2ede5d07893 100644
--- a/docs/en/connector-v2/sink/Hive.md
+++ b/docs/en/connector-v2/sink/Hive.md
@@ -30,18 +30,17 @@ By default, we use 2PC commit to ensure `exactly-once`
## Options
-| name | type | required | default value |
-|-------------------------------|---------|----------|----------------|
-| table_name | string | yes | - |
-| metastore_uri | string | yes | - |
-| compress_codec | string | no | none |
-| hdfs_site_path | string | no | - |
-| hive_site_path | string | no | - |
-| krb5_path | string | no | /etc/krb5.conf |
-| kerberos_principal | string | no | - |
-| kerberos_keytab_path | string | no | - |
-| abort_drop_partition_metadata | boolean | no | true |
-| common-options | | no | - |
+| name | type | required | default value |
+|----------------------|--------|----------|----------------|
+| table_name | string | yes | - |
+| metastore_uri | string | yes | - |
+| compress_codec | string | no | none |
+| hdfs_site_path | string | no | - |
+| hive_site_path | string | no | - |
+| krb5_path | string | no | /etc/krb5.conf |
+| kerberos_principal | string | no | - |
+| kerberos_keytab_path | string | no | - |
+| common-options | | no | - |
### table_name [string]
@@ -71,10 +70,6 @@ The principal of kerberos
The keytab path of kerberos
-### abort_drop_partition_metadata [list]
-
-Flag to decide whether to drop partition metadata from Hive Metastore during an abort operation. Note: this only affects the metadata in the metastore, the data in the partition will always be deleted(data generated during the synchronization process).
-
### common options
Sink plugin common parameters, please refer to [Sink Common Options](common-options.md) for details
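
A minimal sketch using only the two required options from the table above (the metastore address is a placeholder):

```hocon
Hive {
  table_name = "test_hive.test_hive_sink_table"
  metastore_uri = "thrift://metastore:9083"   # placeholder metastore address
}
```
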
diff --git a/docs/en/connector-v2/sink/LocalFile.md b/docs/en/connector-v2/sink/LocalFile.md
index e16c81c3f3a..2f88f0fe720 100644
--- a/docs/en/connector-v2/sink/LocalFile.md
+++ b/docs/en/connector-v2/sink/LocalFile.md
@@ -27,7 +27,6 @@ By default, we use 2PC commit to ensure `exactly-once`
- [x] orc
- [x] json
- [x] excel
- - [x] xml
## Options
@@ -52,9 +51,6 @@ By default, we use 2PC commit to ensure `exactly-once`
| common-options | object | no | - | |
| max_rows_in_memory | int | no | - | Only used when file_format_type is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format_type is excel. |
-| xml_root_tag | string | no | RECORDS | Only used when file_format is xml. |
-| xml_row_tag | string | no | RECORD | Only used when file_format is xml. |
-| xml_use_attr_format | boolean | no | - | Only used when file_format is xml. |
| enable_header_write | boolean | no | false | Only used when file_format_type is text or csv. false: don't write the header; true: write the header. |
### path [string]
@@ -93,7 +89,7 @@ When the format in the `file_name_expression` parameter is `xxxx-${now}` , `file
We support the following file types:
-`text` `json` `csv` `orc` `parquet` `excel` `xml`
+`text` `json` `csv` `orc` `parquet` `excel`
Please note that the final file name will end with the file_format_type's suffix; the suffix of a text file is `txt`.
@@ -172,18 +168,6 @@ When File Format is Excel,The maximum number of data items that can be cached in
The sheet of the workbook to write.
-### xml_root_tag [string]
-
-Specifies the tag name of the root element within the XML file.
-
-### xml_row_tag [string]
-
-Specifies the tag name of the data rows within the XML file.
-
-### xml_use_attr_format [boolean]
-
-Specifies Whether to process data using the tag attribute format.
-
### enable_header_write [boolean]
Only used when file_format_type is text or csv. false: don't write the header; true: write the header.
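
A short sketch of a csv sink with the header enabled (the path is a placeholder):

```hocon
LocalFile {
  path = "/tmp/seatunnel/csv"
  file_format_type = "csv"
  enable_header_write = true   # write a header row; only valid for text/csv
}
```
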
diff --git a/docs/en/connector-v2/sink/OssFile.md b/docs/en/connector-v2/sink/OssFile.md
index 4c85121c20c..7cbab4347de 100644
--- a/docs/en/connector-v2/sink/OssFile.md
+++ b/docs/en/connector-v2/sink/OssFile.md
@@ -32,7 +32,6 @@ By default, we use 2PC commit to ensure `exactly-once`
- [x] orc
- [x] json
- [x] excel
- - [x] xml
## Data Type Mapping
@@ -109,9 +108,6 @@ If write to `csv`, `text` file type, All column will be string.
| common-options | object | no | - | |
| max_rows_in_memory | int | no | - | Only used when file_format_type is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format_type is excel. |
-| xml_root_tag | string | no | RECORDS | Only used when file_format is xml. |
-| xml_row_tag | string | no | RECORD | Only used when file_format is xml. |
-| xml_use_attr_format | boolean | no | - | Only used when file_format is xml. |
### path [string]
@@ -165,7 +161,7 @@ When the format in the `file_name_expression` parameter is `xxxx-${Now}` , `file
We support the following file types:
-`text` `json` `csv` `orc` `parquet` `excel` `xml`
+`text` `json` `csv` `orc` `parquet` `excel`
Please note that the final file name will end with the file_format_type's suffix; the suffix of a text file is `txt`.
@@ -244,18 +240,6 @@ When File Format is Excel,The maximum number of data items that can be cached in
The sheet of the workbook to write.
-### xml_root_tag [string]
-
-Specifies the tag name of the root element within the XML file.
-
-### xml_row_tag [string]
-
-Specifies the tag name of the data rows within the XML file.
-
-### xml_use_attr_format [boolean]
-
-Specifies Whether to process data using the tag attribute format.
-
## How to Create an Oss Data Synchronization Jobs
The following example demonstrates how to create a data synchronization job that reads data from Fake Source and writes it to the Oss:
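
A minimal sketch of such a job; the FakeSource schema is illustrative, and the Oss connection values (`bucket`, `endpoint`, `access_key`, `access_secret`) are placeholders assumed from the connector's full option table:

```hocon
env {
  parallelism = 1
  job.mode = "BATCH"
}

source {
  FakeSource {
    schema = {
      fields {
        name = string
        age = int
      }
    }
  }
}

sink {
  OssFile {
    path = "/seatunnel/sink"
    bucket = "oss://seatunnel-test"           # placeholder bucket
    endpoint = "oss-cn-beijing.aliyuncs.com"  # placeholder endpoint
    access_key = "xxx"                        # placeholder credential
    access_secret = "xxx"                     # placeholder credential
    file_format_type = "text"
  }
}
```
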
diff --git a/docs/en/connector-v2/sink/OssJindoFile.md b/docs/en/connector-v2/sink/OssJindoFile.md
index 1a55c319704..40441ea83ec 100644
--- a/docs/en/connector-v2/sink/OssJindoFile.md
+++ b/docs/en/connector-v2/sink/OssJindoFile.md
@@ -33,7 +33,6 @@ By default, we use 2PC commit to ensure `exactly-once`
- [x] orc
- [x] json
- [x] excel
- - [x] xml
## Options
@@ -62,9 +61,6 @@ By default, we use 2PC commit to ensure `exactly-once`
| common-options | object | no | - | |
| max_rows_in_memory | int | no | - | Only used when file_format_type is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format_type is excel. |
-| xml_root_tag | string | no | RECORDS | Only used when file_format is xml. |
-| xml_row_tag | string | no | RECORD | Only used when file_format is xml. |
-| xml_use_attr_format | boolean | no | - | Only used when file_format is xml. |
### path [string]
@@ -118,7 +114,7 @@ When the format in the `file_name_expression` parameter is `xxxx-${now}` , `file
We support the following file types:
-`text` `json` `csv` `orc` `parquet` `excel` `xml`
+`text` `json` `csv` `orc` `parquet` `excel`
Please note that the final file name will end with the file_format_type's suffix; the suffix of a text file is `txt`.
@@ -197,18 +193,6 @@ When File Format is Excel,The maximum number of data items that can be cached in
The sheet of the workbook to write.
-### xml_root_tag [string]
-
-Specifies the tag name of the root element within the XML file.
-
-### xml_row_tag [string]
-
-Specifies the tag name of the data rows within the XML file.
-
-### xml_use_attr_format [boolean]
-
-Specifies Whether to process data using the tag attribute format.
-
## Example
For text file format with `have_partition` and `custom_filename` and `sink_columns`
diff --git a/docs/en/connector-v2/sink/S3File.md b/docs/en/connector-v2/sink/S3File.md
index a3811ea34ac..84bca3cb80c 100644
--- a/docs/en/connector-v2/sink/S3File.md
+++ b/docs/en/connector-v2/sink/S3File.md
@@ -22,7 +22,6 @@ By default, we use 2PC commit to ensure `exactly-once`
- [x] orc
- [x] json
- [x] excel
- - [x] xml
## Description
@@ -117,9 +116,6 @@ If write to `csv`, `text` file type, All column will be string.
| common-options | object | no | - | |
| max_rows_in_memory | int | no | - | Only used when file_format is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format is excel. |
-| xml_root_tag | string | no | RECORDS | Only used when file_format is xml, specifies the tag name of the root element within the XML file. |
-| xml_row_tag | string | no | RECORD | Only used when file_format is xml, specifies the tag name of the data rows within the XML file |
-| xml_use_attr_format | boolean | no | - | Only used when file_format is xml, specifies Whether to process data using the tag attribute format. |
| hadoop_s3_properties | map | no | | If you need to add a other option, you could add it here and refer to this [link](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html) |
| schema_save_mode | Enum | no | CREATE_SCHEMA_WHEN_NOT_EXIST | Before turning on the synchronous task, do different treatment of the target path |
| data_save_mode | Enum | no | APPEND_DATA | Before opening the synchronous task, the data file in the target path is differently processed |
@@ -171,7 +167,7 @@ When the format in the `file_name_expression` parameter is `xxxx-${now}` , `file
We support the following file types:
-`text` `json` `csv` `orc` `parquet` `excel` `xml`
+`text` `json` `csv` `orc` `parquet` `excel`
Please note that the final file name will end with the file_format_type's suffix; the suffix of a text file is `txt`.
@@ -250,18 +246,6 @@ When File Format is Excel,The maximum number of data items that can be cached in
The sheet of the workbook to write.
-### xml_root_tag [string]
-
-Specifies the tag name of the root element within the XML file.
-
-### xml_row_tag [string]
-
-Specifies the tag name of the data rows within the XML file.
-
-### xml_use_attr_format [boolean]
-
-Specifies Whether to process data using the tag attribute format.
-
### schema_save_mode[Enum]
Before the synchronous task is started, the target path is handled differently depending on this option.
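
A sketch showing both save-mode options from the table above; the bucket, endpoint, and credentials provider values are placeholders:

```hocon
S3File {
  bucket = "s3a://seatunnel-test"                      # placeholder bucket
  fs.s3a.endpoint = "s3.cn-north-1.amazonaws.com.cn"   # placeholder endpoint
  fs.s3a.aws.credentials.provider = "com.amazonaws.auth.InstanceProfileCredentialsProvider"
  path = "/seatunnel/sink"
  file_format_type = "parquet"
  schema_save_mode = "CREATE_SCHEMA_WHEN_NOT_EXIST"    # how to treat the target path before the task starts
  data_save_mode = "APPEND_DATA"                       # how to treat existing data files
}
```
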
diff --git a/docs/en/connector-v2/sink/SftpFile.md b/docs/en/connector-v2/sink/SftpFile.md
index 448d1dd050d..7bb3f12559b 100644
--- a/docs/en/connector-v2/sink/SftpFile.md
+++ b/docs/en/connector-v2/sink/SftpFile.md
@@ -27,7 +27,6 @@ By default, we use 2PC commit to ensure `exactly-once`
- [x] orc
- [x] json
- [x] excel
- - [x] xml
## Options
@@ -56,9 +55,6 @@ By default, we use 2PC commit to ensure `exactly-once`
| common-options | object | no | - | |
| max_rows_in_memory | int | no | - | Only used when file_format_type is excel. |
| sheet_name | string | no | Sheet${Random number} | Only used when file_format_type is excel. |
-| xml_root_tag | string | no | RECORDS | Only used when file_format is xml. |
-| xml_row_tag | string | no | RECORD | Only used when file_format is xml. |
-| xml_use_attr_format | boolean | no | - | Only used when file_format is xml. |
### host [string]
@@ -112,7 +108,7 @@ When the format in the `file_name_expression` parameter is `xxxx-${now}` , `file
We support the following file types:
-`text` `json` `csv` `orc` `parquet` `excel` `xml`
+`text` `json` `csv` `orc` `parquet` `excel`
Please note that the final file name will end with the file_format_type's suffix; the suffix of a text file is `txt`.
@@ -191,18 +187,6 @@ When File Format is Excel,The maximum number of data items that can be cached in
The sheet of the workbook to write.
-### xml_root_tag [string]
-
-Specifies the tag name of the root element within the XML file.
-
-### xml_row_tag [string]
-
-Specifies the tag name of the data rows within the XML file.
-
-### xml_use_attr_format [boolean]
-
-Specifies Whether to process data using the tag attribute format.
-
## Example
For text file format with `have_partition` and `custom_filename` and `sink_columns`
diff --git a/docs/en/connector-v2/source/CosFile.md b/docs/en/connector-v2/source/CosFile.md
index 7f0d6020800..406c86fab5b 100644
--- a/docs/en/connector-v2/source/CosFile.md
+++ b/docs/en/connector-v2/source/CosFile.md
@@ -26,7 +26,6 @@ Read all the data in a split in a pollNext call. What splits are read will be sa
- [x] orc
- [x] json
- [x] excel
- - [x] xml
## Description
@@ -61,8 +60,6 @@ To use this connector you need put hadoop-cos-{hadoop.version}-{version}.jar and
| time_format | string | no | HH:mm:ss |
| schema | config | no | - |
| sheet_name | string | no | - |
-| xml_row_tag | string | no | - |
-| xml_use_attr_format | boolean | no | - |
| file_filter_pattern | string | no | - |
| compress_codec | string | no | none |
| common-options | | no | - |
@@ -75,7 +72,7 @@ The source file path.
File type. The following file types are supported:
-`text` `csv` `parquet` `orc` `json` `excel` `xml`
+`text` `csv` `parquet` `orc` `json` `excel`
If you assign the file type to `json`, you should also assign the schema option to tell the connector how to parse data into the rows you want.
@@ -239,7 +236,7 @@ default `HH:mm:ss`
### schema [config]
-Only need to be configured when the file_format_type are text, json, excel, xml or csv ( Or other format we can't read the schema from metadata).
+Only need to be configured when the file_format_type is text, json, excel or csv (or other formats where the schema cannot be read from metadata).
#### fields [Config]
@@ -251,18 +248,6 @@ Only need to be configured when file_format is excel.
The sheet of the workbook to read.
-### xml_row_tag [string]
-
-Only need to be configured when file_format is xml.
-
-Specifies the tag name of the data rows within the XML file.
-
-### xml_use_attr_format [boolean]
-
-Only need to be configured when file_format is xml.
-
-Specifies Whether to process data using the tag attribute format.
-
### file_filter_pattern [string]
Filter pattern, which is used for filtering files.
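
For instance, a source sketch that only picks up `.txt` files and supplies the schema a text file needs; the Cos connection options are placeholders assumed from the full option table:

```hocon
CosFile {
  path = "/seatunnel/read"
  bucket = "cosn://seatunnel-test"    # placeholder bucket
  secret_id = "your-secret-id"        # assumed credential option
  secret_key = "your-secret-key"      # assumed credential option
  region = "ap-guangzhou"             # assumed region option
  file_format_type = "text"
  file_filter_pattern = "*.txt"       # only read files ending with .txt
  schema = {
    fields {
      name = string
      age = int
    }
  }
}
```
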
diff --git a/docs/en/connector-v2/source/Easysearch.md b/docs/en/connector-v2/source/Easysearch.md
deleted file mode 100644
index d94609c7723..00000000000
--- a/docs/en/connector-v2/source/Easysearch.md
+++ /dev/null
@@ -1,209 +0,0 @@
-# Easysearch
-
-> Easysearch source connector
-
-## Support Those Engines
-
-> Spark
-> Flink
-> SeaTunnel Zeta
-
-## Description
-
-Used to read data from INFINI Easysearch.
-
-## Using Dependency
-
-> Dependency [easysearch-client](https://central.sonatype.com/artifact/com.infinilabs/easysearch-client)
-
-## Key features
-
-- [x] [batch](../../concept/connector-v2-features.md)
-- [ ] [stream](../../concept/connector-v2-features.md)
-- [ ] [exactly-once](../../concept/connector-v2-features.md)
-- [x] [column projection](../../concept/connector-v2-features.md)
-- [ ] [parallelism](../../concept/connector-v2-features.md)
-- [ ] [support user-defined split](../../concept/connector-v2-features.md)
-
-:::tip
-
-Engine Supported
-
-* All versions released by [INFINI Easysearch](https://www.infini.com/download/?product=easysearch) are supported.
-
-:::
-
-## Data Type Mapping
-
-| Easysearch Data Type | SeaTunnel Data Type |
-|-----------------------------|----------------------|
-| STRING KEYWORD TEXT | STRING |
-| BOOLEAN | BOOLEAN |
-| BYTE | BYTE |
-| SHORT | SHORT |
-| INTEGER | INT |
-| LONG | LONG |
-| FLOAT HALF_FLOAT | FLOAT |
-| DOUBLE | DOUBLE |
-| Date | LOCAL_DATE_TIME_TYPE |
-
-### hosts [array]
-
-Easysearch cluster http address, the format is `host:port`, allowing multiple hosts to be specified. Such as `["host1:9200", "host2:9200"]`.
-
-### username [string]
-
-security username.
-
-### password [string]
-
-security password.
-
-### index [string]
-
-Easysearch index name; `*` fuzzy matching is supported.
-
-### source [array]
-
-The fields of the index.
-You can get the document id by specifying the field `_id`. If you sink `_id` to another index, you need to specify an alias for `_id` due to the Easysearch limit.
-If you don't configure `source`, you must configure `schema`.
-
-### query [json]
-
-Easysearch DSL.
-You can control the range of data read.
-
-### scroll_time [String]
-
-Amount of time Easysearch will keep the search context alive for scroll requests.
-
-### scroll_size [int]
-
-Maximum number of hits to be returned with each Easysearch scroll request.
-
-### schema
-
-The structure of the data, including field names and field types.
-If you don't configure `schema`, you must configure `source`.
-
-### tls_verify_certificate [boolean]
-
-Enable certificates validation for HTTPS endpoints
-
-### tls_verify_hostname [boolean]
-
-Enable hostname validation for HTTPS endpoints
-
-### tls_keystore_path [string]
-
-The path to the PEM or JKS key store. This file must be readable by the operating system user running SeaTunnel.
-
-### tls_keystore_password [string]
-
-The key password for the key store specified
-
-### tls_truststore_path [string]
-
-The path to PEM or JKS trust store. This file must be readable by the operating system user running SeaTunnel.
-
-### tls_truststore_password [string]
-
-The key password for the trust store specified
-
-### common options
-
-Source plugin common parameters, please refer to [Source Common Options](common-options.md) for details
-
-## Examples
-
-simple
-
-```hocon
-Easysearch {
- hosts = ["localhost:9200"]
- index = "seatunnel-*"
- source = ["_id","name","age"]
- query = {"range":{"firstPacket":{"gte":1700407367588,"lte":1700407367588}}}
-}
-```
-
-complex
-
-```hocon
-Easysearch {
- hosts = ["Easysearch:9200"]
- index = "st_index"
- schema = {
- fields {
- c_map = "map"
- c_array = "array"
- c_string = string
- c_boolean = boolean
- c_tinyint = tinyint
- c_smallint = smallint
- c_int = int
- c_bigint = bigint
- c_float = float
- c_double = double
- c_decimal = "decimal(2, 1)"
- c_bytes = bytes
- c_date = date
- c_timestamp = timestamp
- }
- }
- query = {"range":{"firstPacket":{"gte":1700407367588,"lte":1700407367588}}}
-}
-```
-
-SSL (Disable certificates validation)
-
-```hocon
-source {
- Easysearch {
- hosts = ["https://localhost:9200"]
- username = "admin"
- password = "admin"
-
- tls_verify_certificate = false
- }
-}
-```
-
-SSL (Disable hostname validation)
-
-```hocon
-source {
- Easysearch {
- hosts = ["https://localhost:9200"]
- username = "admin"
- password = "admin"
-
- tls_verify_hostname = false
- }
-}
-```
-
-SSL (Enable certificates validation)
-
-```hocon
-source {
- Easysearch {
- hosts = ["https://localhost:9200"]
- username = "admin"
- password = "admin"
-
- tls_keystore_path = "${your Easysearch home}/config/certs/http.p12"
- tls_keystore_password = "${your password}"
- }
-}
-```
-
-## Changelog
-
-### next version
-
-- Add Easysearch Source Connector
-- Support https protocol
-- Support DSL
-
diff --git a/docs/en/connector-v2/source/FtpFile.md b/docs/en/connector-v2/source/FtpFile.md
index e103c14a9ae..ee231bb087b 100644
--- a/docs/en/connector-v2/source/FtpFile.md
+++ b/docs/en/connector-v2/source/FtpFile.md
@@ -21,7 +21,6 @@
- [x] csv
- [x] json
- [x] excel
- - [x] xml
## Description
@@ -55,8 +54,6 @@ If you use SeaTunnel Engine, It automatically integrated the hadoop jar when you
| skip_header_row_number | long | no | 0 |
| schema | config | no | - |
| sheet_name | string | no | - |
-| xml_row_tag | string | no | - |
-| xml_use_attr_format | boolean | no | - |
| file_filter_pattern | string | no | - |
| compress_codec | string | no | none |
| common-options | | no | - |
@@ -85,7 +82,7 @@ The source file path.
File type. The following file types are supported:
-`text` `csv` `parquet` `orc` `json` `excel` `xml`
+`text` `csv` `parquet` `orc` `json` `excel`
If you assign the file type to `json`, you should also assign the schema option to tell the connector how to parse data into the rows you want.
@@ -224,7 +221,7 @@ then SeaTunnel will skip the first 2 lines from source files
### schema [config]
-Only need to be configured when the file_format_type are text, json, excel, xml or csv ( Or other format we can't read the schema from metadata).
+Only need to be configured when the file_format_type is text, json, excel or csv (or other formats where the schema cannot be read from metadata).
The schema information of upstream data.
@@ -236,18 +233,6 @@ The read column list of the data source, user can use it to implement field proj
The sheet of the workbook to read. Only used when file_format_type is excel.
-### xml_row_tag [string]
-
-Only need to be configured when file_format is xml.
-
-Specifies the tag name of the data rows within the XML file.
-
-### xml_use_attr_format [boolean]
-
-Only need to be configured when file_format is xml.
-
-Specifies Whether to process data using the tag attribute format.
-
### compress_codec [string]
The compress codec of files; the supported details are as follows:
diff --git a/docs/en/connector-v2/source/HdfsFile.md b/docs/en/connector-v2/source/HdfsFile.md
index 5534dcd9653..ffcb0b68678 100644
--- a/docs/en/connector-v2/source/HdfsFile.md
+++ b/docs/en/connector-v2/source/HdfsFile.md
@@ -26,7 +26,6 @@ Read all the data in a split in a pollNext call. What splits are read will be sa
- [x] orc
- [x] json
- [x] excel
- - [x] xml
## Description
@@ -43,9 +42,9 @@ Read data from hdfs file system.
| Name | Type | Required | Default | Description |
|---------------------------|---------|----------|---------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| path | string | yes | - | The source file path. |
-| file_format_type | string | yes | - | We supported as the following file types:`text` `json` `csv` `orc` `parquet` `excel` `xml`.Please note that, The final file name will end with the file_format's suffix, the suffix of the text file is `txt`. |
+| file_format_type | string | yes | - | We supported as the following file types:`text` `json` `csv` `orc` `parquet` `excel`.Please note that, The final file name will end with the file_format's suffix, the suffix of the text file is `txt`. |
| fs.defaultFS | string | yes | - | The hadoop cluster address that start with `hdfs://`, for example: `hdfs://hadoopcluster` |
-| read_columns | list | yes | - | The read column list of the data source, user can use it to implement field projection.The file type supported column projection as the following shown:[text,json,csv,orc,parquet,excel,xml].Tips: If the user wants to use this feature when reading `text` `json` `csv` files, the schema option must be configured. |
+| read_columns | list | yes | - | The read column list of the data source, user can use it to implement field projection.The file type supported column projection as the following shown:[text,json,csv,orc,parquet,excel].Tips: If the user wants to use this feature when reading `text` `json` `csv` files, the schema option must be configured. |
| hdfs_site_path | string | no | - | The path of `hdfs-site.xml`, used to load ha configuration of namenodes |
| delimiter/field_delimiter | string | no | \001 | Field delimiter, used to tell connector how to slice and dice fields when reading text files. default `\001`, the same as hive's default delimiter |
| parse_partition_from_path | boolean | no | true | Control whether parse the partition keys and values from file path. For example if you read a file from path `hdfs://hadoop-cluster/tmp/seatunnel/parquet/name=tyrantlucifer/age=26`. Every record data from file will be added these two fields:[name:tyrantlucifer,age:26].Tips:Do not define partition fields in schema option. |
@@ -59,8 +58,6 @@ Read data from hdfs file system.
| skip_header_row_number | long | no | 0 | Skip the first few lines, but only for the txt and csv.For example, set like following:`skip_header_row_number = 2`.then Seatunnel will skip the first 2 lines from source files |
| schema | config | no | - | the schema fields of upstream data |
| sheet_name | string | no | - | Reader the sheet of the workbook,Only used when file_format is excel. |
-| xml_row_tag | string | no | - | Specifies the tag name of the data rows within the XML file, only used when file_format is xml. |
-| xml_use_attr_format | boolean | no | - | Specifies whether to process data using the tag attribute format, only used when file_format is xml. |
| compress_codec | string | no | none | The compress codec of files |
| common-options | | no | - | Source plugin common parameters, please refer to [Source Common Options](common-options.md) for details. |
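
A minimal sketch of the options above; with `orc` the schema is read from the file itself, so only the path and the (placeholder) cluster address are needed:

```hocon
HdfsFile {
  fs.defaultFS = "hdfs://hadoopcluster"   # placeholder cluster address
  path = "/apps/hive/demo/student"
  file_format_type = "orc"
}
```
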
diff --git a/docs/en/connector-v2/source/Hive.md b/docs/en/connector-v2/source/Hive.md
index 5d51a19f89c..14306ef953d 100644
--- a/docs/en/connector-v2/source/Hive.md
+++ b/docs/en/connector-v2/source/Hive.md
@@ -33,19 +33,20 @@ Read all the data in a split in a pollNext call. What splits are read will be sa
## Options
-| name | type | required | default value |
-|----------------------|--------|----------|----------------|
-| table_name | string | yes | - |
-| metastore_uri | string | yes | - |
-| krb5_path | string | no | /etc/krb5.conf |
-| kerberos_principal | string | no | - |
-| kerberos_keytab_path | string | no | - |
-| hdfs_site_path | string | no | - |
-| hive_site_path | string | no | - |
-| read_partitions | list | no | - |
-| read_columns | list | no | - |
-| compress_codec | string | no | none |
-| common-options | | no | - |
+| name | type | required | default value |
+|-------------------------------|---------|----------|----------------|
+| table_name | string | yes | - |
+| metastore_uri | string | yes | - |
+| krb5_path | string | no | /etc/krb5.conf |
+| kerberos_principal | string | no | - |
+| kerberos_keytab_path | string | no | - |
+| hdfs_site_path | string | no | - |
+| hive_site_path | string | no | - |
+| read_partitions | list | no | - |
+| read_columns | list | no | - |
+| abort_drop_partition_metadata | boolean | no | true |
+| compress_codec | string | no | none |
+| common-options | | no | - |
### table_name [string]
@@ -86,6 +87,10 @@ The keytab file path of kerberos authentication
The read column list of the data source, user can use it to implement field projection.
+### abort_drop_partition_metadata [boolean]
+
+Flag to decide whether to drop partition metadata from the Hive Metastore during an abort operation. Note: this only affects the metadata in the metastore; the data in the partition will always be deleted (data generated during the synchronization process).
+
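
For example, a sketch that keeps partition metadata in the metastore on abort (the metastore address is a placeholder):

```hocon
Hive {
  table_name = "test_hive.test_hive_source_table"
  metastore_uri = "thrift://metastore:9083"     # placeholder metastore address
  abort_drop_partition_metadata = false         # keep partition metadata in the metastore on abort
}
```
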
### compress_codec [string]
The compress codec of files; the supported details are as follows:
diff --git a/docs/en/connector-v2/source/LocalFile.md b/docs/en/connector-v2/source/LocalFile.md
index 172049498cc..4d20ca532d1 100644
--- a/docs/en/connector-v2/source/LocalFile.md
+++ b/docs/en/connector-v2/source/LocalFile.md
@@ -26,7 +26,6 @@ Read all the data in a split in a pollNext call. What splits are read will be sa
- [x] orc
- [x] json
- [x] excel
- - [x] xml
## Description
@@ -55,8 +54,6 @@ If you use SeaTunnel Engine, It automatically integrated the hadoop jar when you
| skip_header_row_number | long | no | 0 |
| schema | config | no | - |
| sheet_name | string | no | - |
-| xml_row_tag | string | no | - |
-| xml_use_attr_format | boolean | no | - |
| file_filter_pattern | string | no | - |
| compress_codec | string | no | none |
| common-options | | no | - |
@@ -70,7 +67,7 @@ The source file path.
File type. The following file types are supported:
-`text` `csv` `parquet` `orc` `json` `excel` `xml`
+`text` `csv` `parquet` `orc` `json` `excel`
If you assign the file type to `json`, you should also assign the schema option to tell the connector how to parse data into the rows you want.
@@ -218,7 +215,7 @@ then SeaTunnel will skip the first 2 lines from source files
### schema [config]
-Only need to be configured when the file_format_type are text, json, excel, xml or csv ( Or other format we can't read the schema from metadata).
+Only need to be configured when the file_format_type is text, json, excel or csv (or other formats where the schema cannot be read from metadata).
#### fields [Config]
@@ -230,18 +227,6 @@ Only need to be configured when file_format is excel.
The sheet of the workbook to read.
-### xml_row_tag [string]
-
-Only need to be configured when file_format is xml.
-
-Specifies the tag name of the data rows within the XML file.
-
-### xml_use_attr_format [boolean]
-
-Only need to be configured when file_format is xml.
-
-Specifies Whether to process data using the tag attribute format.
-
### file_filter_pattern [string]
Filter pattern, which is used for filtering files.
diff --git a/docs/en/connector-v2/source/OssFile.md b/docs/en/connector-v2/source/OssFile.md
index 85d922644de..233eb76800f 100644
--- a/docs/en/connector-v2/source/OssFile.md
+++ b/docs/en/connector-v2/source/OssFile.md
@@ -37,13 +37,12 @@ Read all the data in a split in a pollNext call. What splits are read will be sa
- [x] orc
- [x] json
- [x] excel
- - [x] xml
## Data Type Mapping
Data type mapping is related to the type of file being read. We support the following file types:
-`text` `csv` `parquet` `orc` `json` `excel` `xml`
+`text` `csv` `parquet` `orc` `json` `excel`
### JSON File Type
@@ -189,28 +188,26 @@ If you assign file type to `parquet` `orc`, schema option not required, connecto
## Options
-| name | type | required | default value | Description |
-|---------------------------|---------|----------|---------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| path | string | yes | - | The Oss path that needs to be read can have sub paths, but the sub paths need to meet certain format requirements. Specific requirements can be referred to "parse_partition_from_path" option |
-| file_format_type | string | yes | - | File type, supported as the following file types: `text` `csv` `parquet` `orc` `json` `excel` `xml` |
-| bucket | string | yes | - | The bucket address of oss file system, for example: `oss://seatunnel-test`. |
-| endpoint | string | yes | - | fs oss endpoint |
-| read_columns | list | no | - | The read column list of the data source, user can use it to implement field projection. The file type supported column projection as the following shown: `text` `csv` `parquet` `orc` `json` `excel` `xml` . If the user wants to use this feature when reading `text` `json` `csv` files, the "schema" option must be configured. |
-| access_key | string | no | - | |
-| access_secret | string | no | - | |
-| delimiter | string | no | \001 | Field delimiter, used to tell connector how to slice and dice fields when reading text files. Default `\001`, the same as hive's default delimiter. |
-| parse_partition_from_path | boolean | no | true | Control whether parse the partition keys and values from file path. For example if you read a file from path `oss://hadoop-cluster/tmp/seatunnel/parquet/name=tyrantlucifer/age=26`. Every record data from file will be added these two fields: name="tyrantlucifer", age=16 |
-| date_format | string | no | yyyy-MM-dd | Date type format, used to tell connector how to convert string to date, supported as the following formats:`yyyy-MM-dd` `yyyy.MM.dd` `yyyy/MM/dd`. default `yyyy-MM-dd` |
-| datetime_format | string | no | yyyy-MM-dd HH:mm:ss | Datetime type format, used to tell connector how to convert string to datetime, supported as the following formats:`yyyy-MM-dd HH:mm:ss` `yyyy.MM.dd HH:mm:ss` `yyyy/MM/dd HH:mm:ss` `yyyyMMddHHmmss` |
-| time_format | string | no | HH:mm:ss | Time type format, used to tell connector how to convert string to time, supported as the following formats:`HH:mm:ss` `HH:mm:ss.SSS` |
-| skip_header_row_number | long | no | 0 | Skip the first few lines, but only for the txt and csv. For example, set like following:`skip_header_row_number = 2`. Then SeaTunnel will skip the first 2 lines from source files |
-| schema | config | no | - | The schema of upstream data. |
-| sheet_name | string | no | - | Reader the sheet of the workbook,Only used when file_format is excel. |
-| xml_row_tag | string | no | - | Specifies the tag name of the data rows within the XML file, only used when file_format is xml. |
-| xml_use_attr_format | boolean | no | - | Specifies whether to process data using the tag attribute format, only used when file_format is xml. |
-| compress_codec | string | no | none | Which compress codec the files used. |
-| file_filter_pattern | string | no | | `*.txt` means you only need read the files end with `.txt` |
-| common-options | config | no | - | Source plugin common parameters, please refer to [Source Common Options](common-options.md) for details. |
+| name | type | required | default value | Description |
+|---------------------------|---------|----------|---------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| path | string | yes | - | The Oss path that needs to be read can have sub paths, but the sub paths need to meet certain format requirements. Specific requirements can be referred to "parse_partition_from_path" option |
+| file_format_type | string | yes | - | File type, supported as the following file types: `text` `csv` `parquet` `orc` `json` `excel` |
+| bucket | string | yes | - | The bucket address of oss file system, for example: `oss://seatunnel-test`. |
+| endpoint | string | yes | - | fs oss endpoint |
+| read_columns | list | no | - | The read column list of the data source, user can use it to implement field projection. The file type supported column projection as the following shown: `text` `csv` `parquet` `orc` `json` `excel` . If the user wants to use this feature when reading `text` `json` `csv` files, the "schema" option must be configured. |
+| access_key | string | no | - | |
+| access_secret | string | no | - | |
+| delimiter | string | no | \001 | Field delimiter, used to tell connector how to slice and dice fields when reading text files. Default `\001`, the same as hive's default delimiter. |
+| parse_partition_from_path | boolean | no | true | Control whether parse the partition keys and values from file path. For example if you read a file from path `oss://hadoop-cluster/tmp/seatunnel/parquet/name=tyrantlucifer/age=26`. Every record data from file will be added these two fields: name="tyrantlucifer", age=16 |
+| date_format | string | no | yyyy-MM-dd | Date type format, used to tell connector how to convert string to date, supported as the following formats:`yyyy-MM-dd` `yyyy.MM.dd` `yyyy/MM/dd`. default `yyyy-MM-dd` |
+| datetime_format | string | no | yyyy-MM-dd HH:mm:ss | Datetime type format, used to tell connector how to convert string to datetime, supported as the following formats:`yyyy-MM-dd HH:mm:ss` `yyyy.MM.dd HH:mm:ss` `yyyy/MM/dd HH:mm:ss` `yyyyMMddHHmmss` |
+| time_format | string | no | HH:mm:ss | Time type format, used to tell connector how to convert string to time, supported as the following formats:`HH:mm:ss` `HH:mm:ss.SSS` |
+| skip_header_row_number | long | no | 0 | Skip the first few lines, but only for the txt and csv. For example, set like following:`skip_header_row_number = 2`. Then SeaTunnel will skip the first 2 lines from source files |
+| schema | config | no | - | The schema of upstream data. |
+| sheet_name | string | no | - | Reader the sheet of the workbook,Only used when file_format is excel. |
+| compress_codec | string | no | none | Which compress codec the files used. |
+| file_filter_pattern | string | no | | `*.txt` means you only need read the files end with `.txt` |
+| common-options | config | no | - | Source plugin common parameters, please refer to [Source Common Options](common-options.md) for details. |
### compress_codec [string]
@@ -228,7 +225,7 @@ Filter pattern, which used for filtering files.
### schema [config]
-Only need to be configured when the file_format_type are text, json, excel, xml or csv ( Or other format we can't read the schema from metadata).
+Only need to be configured when the file_format_type is text, json, excel or csv (or other formats where the schema cannot be read from metadata).
#### fields [Config]
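
For illustration, a sketch of such a fields block in the style used elsewhere in these docs (the field names are hypothetical):

```hocon
schema = {
  fields {
    name = string
    age = int
    score = "decimal(5, 2)"
  }
}
```
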
diff --git a/docs/en/connector-v2/source/OssJindoFile.md b/docs/en/connector-v2/source/OssJindoFile.md
index d1a28265539..27b710cfb8a 100644
--- a/docs/en/connector-v2/source/OssJindoFile.md
+++ b/docs/en/connector-v2/source/OssJindoFile.md
@@ -26,7 +26,6 @@ Read all the data in a split in a pollNext call. What splits are read will be sa
- [x] orc
- [x] json
- [x] excel
- - [x] xml
## Description
@@ -65,8 +64,6 @@ It only supports hadoop version **2.9.X+**.
| skip_header_row_number | long | no | 0 |
| schema | config | no | - |
| sheet_name | string | no | - |
-| xml_row_tag | string | no | - |
-| xml_use_attr_format | boolean | no | - |
| file_filter_pattern | string | no | - |
| compress_codec | string | no | none |
| common-options | | no | - |
@@ -79,7 +76,7 @@ The source file path.
File type. The following file types are supported:
-`text` `csv` `parquet` `orc` `json` `excel` `xml`
+`text` `csv` `parquet` `orc` `json` `excel`
If you assign the file type to `json`, you should also assign the schema option to tell the connector how to parse data into the rows you want.
@@ -243,7 +240,7 @@ then SeaTunnel will skip the first 2 lines from source files
### schema [config]
-Only need to be configured when the file_format_type are text, json, excel, xml or csv ( Or other format we can't read the schema from metadata).
+Only need to be configured when the file_format_type is text, json, excel or csv (or other formats where the schema cannot be read from metadata).
#### fields [Config]
diff --git a/docs/en/connector-v2/source/S3File.md b/docs/en/connector-v2/source/S3File.md
index 0387af044d6..7ad6f5735cc 100644
--- a/docs/en/connector-v2/source/S3File.md
+++ b/docs/en/connector-v2/source/S3File.md
@@ -26,7 +26,6 @@ Read all the data in a split in a pollNext call. What splits are read will be sa
- [x] orc
- [x] json
- [x] excel
- - [x] xml
## Description
@@ -49,7 +48,7 @@ Read data from aws s3 file system.
Data type mapping is related to the type of file being read. We support the following file types:
-`text` `csv` `parquet` `orc` `json` `excel` `xml`
+`text` `csv` `parquet` `orc` `json` `excel`
### JSON File Type
@@ -198,11 +197,11 @@ If you assign file type to `parquet` `orc`, schema option not required, connecto
| name | type | required | default value | Description |
|---------------------------------|---------|----------|-------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| path | string | yes | - | The s3 path that needs to be read can have sub paths, but the sub paths need to meet certain format requirements. Specific requirements can be referred to "parse_partition_from_path" option |
-| file_format_type | string | yes | - | File type, supported as the following file types: `text` `csv` `parquet` `orc` `json` `excel` `xml` |
+| file_format_type | string | yes | - | File type, supported as the following file types: `text` `csv` `parquet` `orc` `json` `excel` |
| bucket | string | yes | - | The bucket address of s3 file system, for example: `s3n://seatunnel-test`, if you use `s3a` protocol, this parameter should be `s3a://seatunnel-test`. |
| fs.s3a.endpoint | string | yes | - | fs s3a endpoint |
| fs.s3a.aws.credentials.provider | string | yes | com.amazonaws.auth.InstanceProfileCredentialsProvider | The way to authenticate s3a. We only support `org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider` and `com.amazonaws.auth.InstanceProfileCredentialsProvider` now. More information about the credential provider you can see [Hadoop AWS Document](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#Simple_name.2Fsecret_credentials_with_SimpleAWSCredentialsProvider.2A) |
-| read_columns | list | no | - | The read column list of the data source, user can use it to implement field projection. The file type supported column projection as the following shown: `text` `csv` `parquet` `orc` `json` `excel` `xml` . If the user wants to use this feature when reading `text` `json` `csv` files, the "schema" option must be configured. |
+| read_columns | list | no | - | The read column list of the data source, user can use it to implement field projection. The file type supported column projection as the following shown: `text` `csv` `parquet` `orc` `json` `excel` . If the user wants to use this feature when reading `text` `json` `csv` files, the "schema" option must be configured. |
| access_key | string | no | - | Only used when `fs.s3a.aws.credentials.provider = org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider ` |
| access_secret | string | no | - | Only used when `fs.s3a.aws.credentials.provider = org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider ` |
| hadoop_s3_properties | map | no | - | If you need to add other option, you could add it here and refer to this [link](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html) |
@@ -214,8 +213,6 @@ If you assign file type to `parquet` `orc`, schema option not required, connecto
| skip_header_row_number | long | no | 0 | Skip the first few lines, but only for the txt and csv. For example, set like following:`skip_header_row_number = 2`. Then SeaTunnel will skip the first 2 lines from source files |
| schema | config | no | - | The schema of upstream data. |
| sheet_name | string | no | - | Reader the sheet of the workbook,Only used when file_format is excel. |
-| xml_row_tag | string | no | - | Specifies the tag name of the data rows within the XML file, only valid for XML files. |
-| xml_use_attr_format | boolean | no | - | Specifies whether to process data using the tag attribute format, only valid for XML files. |
| compress_codec | string | no | none |
| common-options | | no | - | Source plugin common parameters, please refer to [Source Common Options](common-options.md) for details. |
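
A minimal sketch combining the required options above with simple credentials; the endpoint and keys are placeholders, and `access_key`/`access_secret` only apply with `SimpleAWSCredentialsProvider`:

```hocon
S3File {
  path = "/seatunnel/orc"
  bucket = "s3a://seatunnel-test"                      # placeholder bucket
  fs.s3a.endpoint = "s3.cn-north-1.amazonaws.com.cn"   # placeholder endpoint
  fs.s3a.aws.credentials.provider = "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"
  access_key = "xxx"                                   # placeholder credential
  access_secret = "xxx"                                # placeholder credential
  file_format_type = "orc"
}
```
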
diff --git a/docs/en/connector-v2/source/SftpFile.md b/docs/en/connector-v2/source/SftpFile.md
index 0f179749fbc..4f6e9af44bc 100644
--- a/docs/en/connector-v2/source/SftpFile.md
+++ b/docs/en/connector-v2/source/SftpFile.md
@@ -21,7 +21,6 @@
- [x] csv
- [x] json
- [x] excel
- - [x] xml
## Description
@@ -87,8 +86,6 @@ The File does not have a specific type list, and we can indicate which SeaTunnel
| skip_header_row_number | Long | No | 0 | Skip the first few lines, but only for the txt and csv. For example, set like following: `skip_header_row_number = 2` then SeaTunnel will skip the first 2 lines from source files |
| read_columns | list | no | - | The read column list of the data source, user can use it to implement field projection. |
| sheet_name | String | No | - | Reader the sheet of the workbook,Only used when file_format is excel. |
-| xml_row_tag | string | no | - | Specifies the tag name of the data rows within the XML file, only used when file_format is xml. |
-| xml_use_attr_format | boolean | no | - | Specifies whether to process data using the tag attribute format, only used when file_format is xml. |
| schema | Config | No | - | Please check #schema below |
| compress_codec | String | No | None | The compress codec of files and the details that supported as the following shown: - txt: `lzo` `None` - json: `lzo` `None` - csv: `lzo` `None` - orc: `lzo` `snappy` `lz4` `zlib` `None` - parquet: `lzo` `snappy` `lz4` `gzip` `brotli` `zstd` `None` Tips: excel type does Not support any compression format |
| common-options | | No | - | Source plugin common parameters, please refer to [Source Common Options](common-options.md) for details. |
@@ -96,7 +93,7 @@ The File does not have a specific type list, and we can indicate which SeaTunnel
### file_format_type [string]
File type. The following file types are supported:
-`text` `csv` `parquet` `orc` `json` `excel` `xml`
+`text` `csv` `parquet` `orc` `json` `excel`
If you assign the file type to `json`, you should also assign the schema option to tell the connector how to parse data into the rows you want.
For example:
upstream data is the following:
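
As an illustrative sketch (the field names are hypothetical), a JSON line such as the one in the comment below would need the matching schema block:

```hocon
# hypothetical upstream JSON line:
#   {"code": 200, "data": "get success", "success": true}
schema {
  fields {
    code = int
    data = string
    success = boolean
  }
}
```
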
diff --git a/docs/en/other-engine/flink.md b/docs/en/other-engine/flink.md
index 567bfb7ca10..f2d45383744 100644
--- a/docs/en/other-engine/flink.md
+++ b/docs/en/other-engine/flink.md
@@ -1,6 +1,6 @@
# Seatunnel runs on Flink
-Flink is a powerful high-performance distributed stream processing engine,More information about it you can,You can search for `Apache Flink`
+Flink is a powerful, high-performance distributed stream processing engine. For more information, you can search for `Apache Flink`.
### Set Flink configuration information in the job
diff --git a/docs/en/seatunnel-engine/rest-api.md b/docs/en/seatunnel-engine/rest-api.md
index 4a56c7da7e2..6c4a4064fcb 100644
--- a/docs/en/seatunnel-engine/rest-api.md
+++ b/docs/en/seatunnel-engine/rest-api.md
@@ -111,14 +111,6 @@ network:
}
```
-When we can't get the job info, the response will be:
-
-```json
-{
- "jobId" : ""
-}
-```
-
------------------------------------------------------------------------------------------
diff --git a/docs/en/start-v2/kubernetes/kubernetes.mdx b/docs/en/start-v2/kubernetes/kubernetes.mdx
index 15dd1f503a1..dc913478ab4 100644
--- a/docs/en/start-v2/kubernetes/kubernetes.mdx
+++ b/docs/en/start-v2/kubernetes/kubernetes.mdx
@@ -51,7 +51,7 @@ RUN wget https://dlcdn.apache.org/seatunnel/${SEATUNNEL_VERSION}/apache-seatunne
RUN tar -xzvf apache-seatunnel-${SEATUNNEL_VERSION}-bin.tar.gz
RUN mv apache-seatunnel-${SEATUNNEL_VERSION} ${SEATUNNEL_HOME}
-RUN cd ${SEATUNNEL_HOME} && sh bin/install-plugin.sh ${SEATUNNEL_VERSION}
+RUN cd ${SEATUNNEL_HOME} && sh bin/install-plugin.sh ${SEATUNNEL_VERSION}
```
Then run the following commands to build the image:
@@ -79,7 +79,7 @@ RUN wget https://dlcdn.apache.org/seatunnel/${SEATUNNEL_VERSION}/apache-seatunne
RUN tar -xzvf apache-seatunnel-${SEATUNNEL_VERSION}-bin.tar.gz
RUN mv apache-seatunnel-${SEATUNNEL_VERSION} ${SEATUNNEL_HOME}
-RUN cd ${SEATUNNEL_HOME} && sh bin/install-plugin.sh ${SEATUNNEL_VERSION}
+RUN cd ${SEATUNNEL_HOME} && sh bin/install-plugin.sh ${SEATUNNEL_VERSION}
```
Then run the following commands to build the image:
@@ -107,7 +107,7 @@ RUN wget https://dlcdn.apache.org/seatunnel/${SEATUNNEL_VERSION}/apache-seatunne
RUN tar -xzvf apache-seatunnel-${SEATUNNEL_VERSION}-bin.tar.gz
RUN mv apache-seatunnel-${SEATUNNEL_VERSION} ${SEATUNNEL_HOME}
RUN mkdir -p $SEATUNNEL_HOME/logs
-RUN cd ${SEATUNNEL_HOME} && sh bin/install-plugin.sh ${SEATUNNEL_VERSION}
+RUN cd ${SEATUNNEL_HOME} && sh bin/install-plugin.sh ${SEATUNNEL_VERSION}
```
Then run the following commands to build the image:
diff --git a/docs/zh/concept/JobEnvConfig.md b/docs/zh/concept/JobEnvConfig.md
deleted file mode 100644
index c9f3cd9fda6..00000000000
--- a/docs/zh/concept/JobEnvConfig.md
+++ /dev/null
@@ -1,52 +0,0 @@
-# JobEnvConfig
-
-本文档描述了env的配置信息,公共参数可以在所有引擎中使用。为了更好的区分引擎参数,其他引擎的附加参数需要携带前缀。
-在flink引擎中,我们使用`flink.`作为前缀。在spark引擎中,我们不使用任何前缀来修改参数,因为官方的spark参数本身就是以`spark.`开头。
-
-## 公共参数
-
-以下配置参数对所有引擎通用:
-
-### job.name
-
-该参数配置任务名称。
-
-### jars
-
-第三方包可以通过`jars`加载,例如:`jars="file://local/jar1.jar;file://local/jar2.jar"`
-
-### job.mode
-
-通过`job.mode`你可以配置任务是在批处理模式还是流处理模式。例如:`job.mode = "BATCH"` 或者 `job.mode = "STREAMING"`
-
-### checkpoint.interval
-
-获取定时调度检查点的时间间隔。
-
-在`STREAMING`模式下,检查点是必须的,如果不设置,将从应用程序配置文件`seatunnel.yaml`中获取。 在`BATCH`模式下,您可以通过不设置此参数来禁用检查点。
-
-### parallelism
-
-该参数配置source和sink的并行度。
-
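-作为示意,下面把上述公共参数组合成一个最小的 env 配置草例(取值均为假设):
-
-```hocon
-env {
-  # 任务名称
-  job.name = "SeaTunnel_Job"
-  # 批处理模式
-  job.mode = "BATCH"
-  # 定时检查点的时间间隔;BATCH 模式下可以不设置
-  checkpoint.interval = 10000
-  # source 和 sink 的并行度
-  parallelism = 2
-}
-```
-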
-### shade.identifier
-
-指定加密方式,如果您没有加密或解密配置文件的需求,此选项可以忽略。
-
-更多详细信息,您可以参考文档 [config-encryption-decryption](../../en/connector-v2/Config-Encryption-Decryption.md)
-
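-下面是一个假设的最小片段,演示如何在 env 中指定加密方式(`base64` 为内置的示例取值):
-
-```hocon
-env {
-  # 使用内置的 base64 加密方式(仅为示意)
-  shade.identifier = "base64"
-}
-```
-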
-## Flink 引擎参数
-
-这里列出了一些与 Flink 中名称相对应的 SeaTunnel 参数名称,并非全部,更多内容请参考官方 [flink documentation](https://flink.apache.org/)。
-
-| Flink 配置名称 | SeaTunnel 配置名称 |
-|---------------------------------|---------------------------------------|
-| pipeline.max-parallelism | flink.pipeline.max-parallelism |
-| execution.checkpointing.mode | flink.execution.checkpointing.mode |
-| execution.checkpointing.timeout | flink.execution.checkpointing.timeout |
-| ... | ... |
-
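-例如,下面这个示意性的 env 片段通过 `flink.` 前缀传递上表中的 Flink 参数(取值均为假设):
-
-```hocon
-env {
-  parallelism = 1
-  # 对应 Flink 的 execution.checkpointing.mode
-  flink.execution.checkpointing.mode = "EXACTLY_ONCE"
-  # 对应 Flink 的 execution.checkpointing.timeout
-  flink.execution.checkpointing.timeout = 600000
-}
-```
-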
-## Spark 引擎参数
-
-由于spark配置项并无调整,这里就不列出来了,请参考官方 [spark documentation](https://spark.apache.org/).
-
diff --git a/docs/zh/concept/config.md b/docs/zh/concept/config.md
deleted file mode 100644
index c00425ca030..00000000000
--- a/docs/zh/concept/config.md
+++ /dev/null
@@ -1,191 +0,0 @@
----
-
-sidebar_position: 2
--------------------
-
-# 配置文件简介
-
-In SeaTunnel, the most important thing is the Config file, through which users can customize their own data
-synchronization requirements to maximize the potential of SeaTunnel. So next, I will introduce you how to
-configure the Config file.
-
-在SeaTunnel中,最重要的事情就是配置文件,用户可以通过它自定义自己的数据同步需求,以发挥SeaTunnel最大的潜力。那么接下来,
-我将会向你介绍如何设置配置文件。
-
-The main format of the Config file is `hocon`, for more details of this format type you can refer to [HOCON-GUIDE](https://github.com/lightbend/config/blob/main/HOCON.md),
-BTW, we also support the `json` format, but you should know that the name of the config file should end with `.json`
-
-配置文件的主要格式是 `hocon`, 有关该格式类型的更多信息你可以参考[HOCON-GUIDE](https://github.com/lightbend/config/blob/main/HOCON.md),
-顺便提一下,我们也支持 `json`格式,但你应该知道配置文件的名称应该是以 `.json`结尾。
-
-## 例子
-
-在你阅读之前,你可以在发布包中的config目录[这里](https://github.com/apache/seatunnel/tree/dev/config)找到配置文件的例子。
-
-## 配置文件结构
-
-配置文件类似下面。
-
-### hocon
-
-```hocon
-env {
- job.mode = "BATCH"
-}
-
-source {
- FakeSource {
- result_table_name = "fake"
- row.num = 100
- schema = {
- fields {
- name = "string"
- age = "int"
- card = "int"
- }
- }
- }
-}
-
-transform {
- Filter {
- source_table_name = "fake"
- result_table_name = "fake1"
- fields = [name, card]
- }
-}
-
-sink {
- Clickhouse {
- host = "clickhouse:8123"
- database = "default"
- table = "seatunnel_console"
- fields = ["name", "card"]
- username = "default"
- password = ""
- source_table_name = "fake1"
- }
-}
-```
-
-### json
-
-```json
-
-{
- "env": {
- "job.mode": "batch"
- },
- "source": [
- {
- "plugin_name": "FakeSource",
- "result_table_name": "fake",
- "row.num": 100,
- "schema": {
- "fields": {
- "name": "string",
- "age": "int",
- "card": "int"
- }
- }
- }
- ],
- "transform": [
- {
- "plugin_name": "Filter",
- "source_table_name": "fake",
- "result_table_name": "fake1",
- "fields": ["name", "card"]
- }
- ],
- "sink": [
- {
- "plugin_name": "Clickhouse",
- "host": "clickhouse:8123",
- "database": "default",
- "table": "seatunnel_console",
- "fields": ["name", "card"],
- "username": "default",
- "password": "",
- "source_table_name": "fake1"
- }
- ]
-}
-
-```
-
-正如你看到的,配置文件包括几个部分:env, source, transform, sink。不同的模块有不同的功能。
-当你了解了这些模块后,你就会懂得SeaTunnel如何工作。
-
-### env
-
-用于添加引擎可选的参数,不管是什么引擎(Spark 或者 Flink),对应的可选参数应该在这里填写。
-
-注意,我们按照引擎分离了参数,对于公共参数,我们可以像以前一样配置。对于Flink和Spark引擎,其参数的具体配置规则可以参考[JobEnvConfig](./JobEnvConfig.md)。
-
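-例如,下面这个示意性的 env 片段同时配置了公共参数和带 `flink.` 前缀的 Flink 专有参数(取值均为假设):
-
-```hocon
-env {
-  # 公共参数
-  job.mode = "BATCH"
-  parallelism = 1
-  # Flink 专有参数需携带 flink. 前缀
-  flink.execution.checkpointing.timeout = 600000
-}
-```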
-
-
-### source
-
-source用于定义SeaTunnel在哪儿检索数据,并将检索的数据用于下一步。
-可以同时定义多个source。目前支持的source请看[Source of SeaTunnel](../../en/connector-v2/source)。每种source都有自己特定的参数用来
-定义如何检索数据,SeaTunnel也抽象了每种source所使用的参数,例如 `result_table_name` 参数,用于指定当前source生成的数据的名称,
-方便后续其他模块使用。
-
-### transform
-
-当我们有了数据源之后,我们可能需要对数据进行进一步的处理,所以我们就有了transform模块。当然,这里使用了“可能”这个词,
-这意味着我们也可以直接将transform视为不存在,直接从source到sink。像下面这样。
-
-```hocon
-env {
- job.mode = "BATCH"
-}
-
-source {
- FakeSource {
- result_table_name = "fake"
- row.num = 100
- schema = {
- fields {
- name = "string"
- age = "int"
- card = "int"
- }
- }
- }
-}
-
-sink {
- Clickhouse {
- host = "clickhouse:8123"
- database = "default"
- table = "seatunnel_console"
- fields = ["name", "age", "card"]
- username = "default"
- password = ""
- source_table_name = "fake1"
- }
-}
-```
-
-与source类似, transform也有属于每个模块的特定参数。目前支持的transform请看 [Transform V2 of SeaTunnel](../../en/transform-v2)
-
-
-
-### sink
-
-我们使用SeaTunnel的作用是将数据从一个地方同步到其它地方,所以定义数据如何写入,写入到哪里是至关重要的。通过SeaTunnel提供的
-sink模块,你可以快速高效地完成这个操作。Sink和source非常相似,区别在于读取和写入。所以去看看我们[支持的sink](../../en/connector-v2/sink)吧。
-
-### 其它
-
-你会疑惑当定义了多个source和多个sink时,每个sink读取哪些数据,每个transform读取哪些数据?我们使用`result_table_name` 和
-`source_table_name` 两个键配置。每个source模块都会配置一个`result_table_name`来指示数据源生成的数据源名称,其它transform和sink
-模块可以使用`source_table_name` 引用相应的数据源名称,表示要读取数据进行处理。然后transform,作为一个中间的处理模块,可以同时使用
-`result_table_name` 和 `source_table_name` 配置。但你会发现在上面的配置例子中,不是每个模块都配置了这些参数,因为在SeaTunnel中,
-有一个默认的约定,如果这两个参数没有配置,则使用上一个节点的最后一个模块生成的数据。当只有一个source时这是非常方便的。
-
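-下面是一个示意性的片段(表名均为假设),演示这种显式的串联方式:
-
-```hocon
-source {
-  FakeSource {
-    result_table_name = "source_a"
-  }
-  FakeSource {
-    result_table_name = "source_b"
-  }
-}
-
-sink {
-  Console {
-    # 显式声明只消费 source_a 生成的数据
-    source_table_name = "source_a"
-  }
-  Console {
-    source_table_name = "source_b"
-  }
-}
-```
-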
-## 此外
-
-如果你想了解更多关于格式配置的详细信息,请查看 [HOCON](https://github.com/lightbend/config/blob/main/HOCON.md)。
diff --git a/docs/zh/concept/connector-v2-features.md b/docs/zh/concept/connector-v2-features.md
deleted file mode 100644
index 9708eb373d1..00000000000
--- a/docs/zh/concept/connector-v2-features.md
+++ /dev/null
@@ -1,70 +0,0 @@
-# Connector V2 功能简介
-
-## Connector V2 和 Connector V1 之间的不同
-
-从 https://github.com/apache/seatunnel/issues/1608 我们添加了 Connector V2 特性。
-Connector V2 是基于SeaTunnel Connector API接口定义的连接器。不像Connector V1,Connector V2 支持如下特性:
-
-* **多引擎支持** SeaTunnel Connector API 是引擎独立的API。基于这个API开发的连接器可以在多个引擎上运行。目前支持Flink和Spark引擎,后续我们会支持其它的引擎。
-* **多引擎版本支持** 通过翻译层将连接器与引擎解耦,解决了大多数连接器需要修改代码才能支持新版本底层引擎的问题。
-* **流批一体** Connector V2 可以支持批处理和流处理。我们不需要为批和流分别开发连接器。
-* **多路复用JDBC/Log连接。** Connector V2支持JDBC资源复用和共享数据库日志解析。
-
-## Source Connector 特性
-
-Source connector有一些公共的核心特性,每个source connector在不同程度上支持它们。
-
-### 精确一次(exactly-once)
-
-如果数据源中的每条数据仅由源向下游发送一次,我们认为该source connector支持精确一次(exactly-once)。
-
-在SeaTunnel中, 我们可以保存读取的 **Split** 和 它的 **offset**(当时读取的数据被分割时的位置,例如行号, 字节大小, 偏移量等) 作为检查点时的 **StateSnapshot** 。 如果任务重新启动, 我们会得到最后的 **StateSnapshot**
-然后定位到上次读取的 **Split** 和 **offset**,继续向下游发送数据。
-
-例如 `File`, `Kafka`。
-
-### 列投影(column projection)
-
-如果连接器支持仅从数据源读取指定列,我们认为它支持列投影(请注意,如果先读取所有列,再通过元数据(schema)过滤掉不需要的列,则此方法不是真正的列投影)。
-
-例如 `JDBCSource` 可以使用sql定义读取列。
-
-`KafkaSource` 从主题中读取所有内容然后使用`schema`过滤不必要的列, 这不是真正的`列投影`。
-
-### 批(batch)
-
-批处理作业模式,读取的数据是有界的,当所有数据读取完成后作业将停止。
-
-### 流(stream)
-
-流式作业模式,数据读取无界,作业永不停止。
-
-### 并行性(parallelism)
-
-并行执行的Source Connector支持配置 `parallelism`,每个并发会创建一个任务来读取数据。
-在**Parallelism Source Connector**中,source会被分割成多个split,然后枚举器会将 split 分配给 SourceReader 进行处理。
-
-### 支持用户自定义split
-
-用户可以配置分割规则。
-
-### 支持多表读取
-
-支持在一个 SeaTunnel 作业中读取多个表
-
-## Sink Connector 的特性
-
-Sink connector有一些公共的核心特性,每个sink connector在不同程度上支持它们。
-
-### 精确一次(exactly-once)
-
-当任意一条数据流入分布式系统时,如果系统在整个处理过程中对这条数据仅准确处理一次,且处理结果正确,则认为系统满足精确一次一致性。
-
-对于sink connector,如果任何数据只写入目标一次,则sink connector支持精确一次。 通常有两种方法可以实现这一目标:
-
-* 目标数据库支持key去重。例如 `MySQL`, `Kudu`。
-* 目标支持 **XA 事务**(事务可以跨会话使用。即使创建事务的程序已经结束,新启动的程序也只需要知道最后一个事务的ID就可以重新提交或回滚事务)。 然后我们可以使用 **两阶段提交** 来确保 **精确一次**。 例如:`File`, `MySQL`.
-
-### cdc(更改数据捕获,change data capture)
-
-如果sink connector支持基于主键写入行类型(INSERT/UPDATE_BEFORE/UPDATE_AFTER/DELETE),我们认为它支持cdc(更改数据捕获,change data capture)。
diff --git a/docs/zh/concept/schema-feature.md b/docs/zh/concept/schema-feature.md
deleted file mode 100644
index cc69b6d83ea..00000000000
--- a/docs/zh/concept/schema-feature.md
+++ /dev/null
@@ -1,263 +0,0 @@
-# Schema 特性简介
-
-## 为什么我们需要Schema
-
-某些NoSQL数据库或消息队列没有严格限制schema,因此无法通过api获取schema。
-这时需要定义一个schema来转换为TableSchema并获取数据。
-
-## SchemaOptions
-
-我们可以使用SchemaOptions定义schema, SchemaOptions包含了一些定义schema的配置。 例如:columns, primaryKey, constraintKeys。
-
-```
-schema = {
- table = "database.schema.table"
- schema_first = false
- comment = "comment"
- columns = [
- ...
- ]
- primaryKey {
- ...
- }
-
- constraintKeys {
- ...
- }
-}
-```
-
-### table
-
-schema所属的表标识符的表全名,包含数据库、schema、表名。 例如 `database.schema.table`、`database.table`、`table`。
-
-### schema_first
-
-默认是false。
-
-如果schema_first是true, schema会被优先使用, 这意味着如果我们设置 `table = "a.b"`, `a` 会被解析为schema而不是数据库, 从而支持 `table = "schema.table"` 这样的写法。
-
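-一个假设的示意片段:
-
-```hocon
-schema = {
-  schema_first = true
-  # a 会被解析为 schema 而不是数据库
-  table = "a.b"
-  columns = [
-    # ...
-  ]
-}
-```
-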
-### comment
-
-schema所属的 CatalogTable 的注释。
-
-### Columns
-
-Columns 是用于定义模式中的列的配置列表,每列可以包含名称(name)、类型(type)、是否可空(nullable)、默认值(defaultValue)、注释(comment)字段。
-
-```
-columns = [
- {
- name = id
- type = bigint
- nullable = false
- columnLength = 20
- defaultValue = 0
- comment = "primary key id"
- }
-]
-```
-
-| 字段 | 是否必须 | 默认值 | 描述 |
-|:-------------|:-----|:-----|--------------------|
-| name | Yes | - | 列的名称 |
-| type | Yes | - | 列的数据类型 |
-| nullable | No | true | 列是否可空 |
-| columnLength | No | 0 | 列的长度,当您需要定义长度时将很有用 |
-| defaultValue | No | null | 列的默认值 |
-| comment | No | null | 列的注释 |
-
-#### 目前支持哪些类型
-
-| 数据类型 | Java中的值类型 | 描述 |
-|:----------|:---------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| string | `java.lang.String` | 字符串 |
-| boolean | `java.lang.Boolean` | 布尔 |
-| tinyint | `java.lang.Byte` | 常规-128 至 127 。 0 到 255 无符号*。 指定括号中的最大位数。 |
-| smallint | `java.lang.Short` | 常规-32768 至 32767。 0 到 65535 无符号*。 指定括号中的最大位数。 |
-| int | `java.lang.Integer` | 允许从 -2,147,483,648 到 2,147,483,647 的所有数字。 |
-| bigint | `java.lang.Long` | 允许 -9,223,372,036,854,775,808 和 9,223,372,036,854,775,807 之间的所有数字。 |
-| float | `java.lang.Float` | 从-1.79E+308 到 1.79E+308浮点精度数值数据。 |
-| double | `java.lang.Double` | 双精度浮点。 处理大多数小数。 |
-| decimal | `java.math.BigDecimal` | DOUBLE 类型存储为字符串,允许固定小数点。 |
-| null | `java.lang.Void` | null |
-| bytes | `byte[]` | 字节。 |
-| date | `java.time.LocalDate` | 仅存储日期。从0001年1月1日到9999 年 12 月 31 日。 |
-| time | `java.time.LocalTime` | 仅存储时间。精度为 100 纳秒。 |
-| timestamp | `java.time.LocalDateTime` | 存储一个唯一的编号,每当创建或修改行时都会更新该编号。 时间戳基于内部时钟,与实际时间不对应。 每个表只能有一个时间戳变量。 |
-| row | `org.apache.seatunnel.api.table.type.SeaTunnelRow` | 行类型,可以嵌套。 |
-| map | `java.util.Map` | Map 是将键映射到值的对象。 键类型包括: `int` `string` `boolean` `tinyint` `smallint` `bigint` `float` `double` `decimal` `date` `time` `timestamp` `null`, 值类型包括: `int` `string` `boolean` `tinyint` `smallint` `bigint` `float` `double` `decimal` `date` `time` `timestamp` `null` `array` `map` `row`。 |
-| array | `ValueType[]` | 数组是一种表示元素集合的数据类型。 元素类型包括: `int` `string` `boolean` `tinyint` `smallint` `bigint` `float` `double`. |
-
-#### 如何声明支持的类型
-
-SeaTunnel 提供了一种简单直接的方式来声明基本类型。基本类型的关键字包括:`string`, `boolean`, `tinyint`, `smallint`, `int`, `bigint`, `float`, `double`, `date`, `time`, `timestamp`, 和 `null`。基本类型的关键字名称可以直接用作类型声明,并且SeaTunnel对类型关键字不区分大小写。 例如,如果您需要声明一个整数类型的字段,您可以简单地将字段定义为`int`或`"int"`。
-
-> null 类型声明必须用双引号引起来, 例如:`"null"`。 这种方法有助于避免与 [HOCON](https://github.com/lightbend/config/blob/main/HOCON.md) 中表示未定义的对象的 `null` 类型混淆。
-
-声明复杂类型(例如 **decimal**、**array**、**map** 和 **row**)时,请注意具体注意事项。
-- 声明decimal类型时,需要设置精度(precision)和小数位数(scale),类型定义遵循“decimal(precision, scale)”格式。 需要强调的是,十进制类型的声明必须用 `"` 括起来;不能像基本类型一样直接使用类型名称。例如,当声明精度为 10、小数位数为 2 的十进制字段时,您可以指定字段类型为`"decimal(10,2)"`。
-- 声明array类型时,需要指定元素类型,类型定义遵循 `array<T>` 格式,其中 `T` 代表元素类型。元素类型包括`int`,`string`,`boolean`,`tinyint`,`smallint`,`bigint`,`float` 和 `double`。与十进制类型声明类似,它也用 `"` 括起来。例如,在声明具有整数数组的字段时,将字段类型指定为 `"array<int>"`。
-- 声明map类型时,需要指定键和值类型。map类型定义遵循`map<K, V>`格式,其中`K`表示键类型,`V`表示值类型。 `K`可以是任何基本类型和十进制类型,`V`可以是 SeaTunnel 支持的任何类型。 与之前的类型声明类似,map类型声明必须用双引号引起来。 例如,当声明一个map类型的字段时,键类型为字符串,值类型为整数,则可以将该字段声明为`"map<string, int>"`。
-- 声明row类型时,需要定义一个 [HOCON](https://github.com/lightbend/config/blob/main/HOCON.md) 对象来描述字段及其类型。 字段类型可以是 SeaTunnel 支持的任何类型。 例如,当声明包含整数字段“a”和字符串字段“b”的行类型时,可以将其声明为“{a = int, b = string}”。 将定义作为字符串括在 `"` 中也是可以接受的,因此 `"{a = int, b = string}"` 相当于 `{a = int, b = string}`。由于 HOCON 与 JSON 兼容, `"{\"a\":\"int\", \"b\":\"string\"}"` 等价于 `"{a = int, b = string}"`。
-
-以下是复杂类型声明的示例:
-
-```hocon
-schema {
- fields {
- c_decimal = "decimal(10, 2)"
-    c_array = "array<int>"
- c_row = {
- c_int = int
- c_string = string
- c_row = {
- c_int = int
- }
- }
-    # 在泛型中以 Hocon 风格声明行类型
-    map0 = "map<string, {c_int = int, c_string = string, c_row = {c_int = int}}>"
-    # 在泛型中以 Json 风格声明行类型
-    map1 = "map<string, {\"c_int\":\"int\", \"c_string\":\"string\", \"c_row\":{\"c_int\":\"int\"}}>"
- }
-}
-```
-
-### 主键(PrimaryKey)
-
-主键是用于定义模式中主键的配置,它包含name、columns字段。
-
-```
-primaryKey {
- name = id
- columns = [id]
-}
-```
-
-| 字段 | 是否必须 | 默认值 | 描述 |
-|:--------|:-----|:----|---------|
-| name | 是 | - | 主键名称 |
-| columns | 是 | - | 主键中的列列表 |
-
-### 约束键(constraintKeys)
-
-约束键是用于定义模式中约束键的配置列表,它包含constraintName,constraintType,constraintColumns字段。
-
-```
-constraintKeys = [
- {
- constraintName = "id_index"
- constraintType = KEY
- constraintColumns = [
- {
- columnName = "id"
- sortType = ASC
- }
- ]
- },
- ]
-```
-
-| 字段 | 是否必须 | 默认值 | 描述 |
-|:------------------|:-----|:----|------------------------------------------------------------------------|
-| constraintName | 是 | - | 约束键的名称 |
-| constraintType | 否 | KEY | 约束键的类型 |
-| constraintColumns | 是 | - | PrimaryKey中的列列表,每列应包含constraintType和sortType,sortType支持ASC和DESC,默认为ASC |
-
-#### 目前支持哪些约束类型
-
-| 约束类型 | 描述 |
-|:-----------|:----|
-| INDEX_KEY | 键 |
-| UNIQUE_KEY | 唯一键 |
-
-## 如何使用schema
-
-### 推荐
-
-```
-source {
- FakeSource {
- parallelism = 2
- result_table_name = "fake"
- row.num = 16
- schema {
- table = "FakeDatabase.FakeTable"
- columns = [
- {
- name = id
- type = bigint
- nullable = false
- defaultValue = 0
- comment = "primary key id"
- },
- {
- name = name
- type = "string"
- nullable = true
- comment = "name"
- },
- {
- name = age
- type = int
- nullable = true
- comment = "age"
- }
- ]
- primaryKey {
- name = "id"
- columnNames = [id]
- }
- constraintKeys = [
- {
- constraintName = "unique_name"
- constraintType = UNIQUE_KEY
- constraintColumns = [
- {
- columnName = "name"
- sortType = ASC
- }
- ]
- },
- ]
- }
- }
-}
-```
-
-### 已弃用
-
-如果你只需要定义列,可以使用 `fields` 来定义列,这是一种简单的方式,但该方式将来会被移除。
-
-```
-source {
- FakeSource {
- parallelism = 2
- result_table_name = "fake"
- row.num = 16
- schema = {
- fields {
- id = bigint
-        c_map = "map<string, string>"
-        c_array = "array<int>"
- c_string = string
- c_boolean = boolean
- c_tinyint = tinyint
- c_smallint = smallint
- c_int = int
- c_bigint = bigint
- c_float = float
- c_double = double
- c_decimal = "decimal(2, 1)"
- c_bytes = bytes
- c_date = date
- c_timestamp = timestamp
- }
- }
- }
-}
-```
-
-## 我们什么时候应该使用它,什么时候不应该使用它
-
-如果选项中有`schema`配置项目,则连接器可以自定义schema。 比如 `Fake` `Pulsar` `Http` 源连接器等。
diff --git a/docs/zh/concept/speed-limit.md b/docs/zh/concept/speed-limit.md
deleted file mode 100644
index cab8fc8bff8..00000000000
--- a/docs/zh/concept/speed-limit.md
+++ /dev/null
@@ -1,43 +0,0 @@
-# 速度控制
-
-## 介绍
-
-SeaTunnel提供了强大的速度控制功能,允许你管理数据同步的速率。当你需要确保系统之间的数据传输高效且可控时,这个功能至关重要。
-速度控制主要由两个关键参数控制:`read_limit.rows_per_second` 和 `read_limit.bytes_per_second`。
-本文档将指导您如何使用这些参数以及如何有效地利用它们。
-
-## 支持这些引擎
-
-> SeaTunnel Zeta
-> Flink
-> Spark
-
-## 配置
-
-要使用速度控制功能,你需要在job配置中设置`read_limit.rows_per_second` 或 `read_limit.bytes_per_second`参数。
-
-配置文件中env配置示例:
-
-```hocon
-env {
- job.mode=STREAMING
- job.name=SeaTunnel_Job
- read_limit.bytes_per_second=7000000
- read_limit.rows_per_second=400
-}
-source {
- MySQL-CDC {
- // ignore...
- }
-}
-transform {
-}
-sink {
- Console {
- }
-}
-```
-
-我们在`env`参数中放了`read_limit.bytes_per_second` 和 `read_limit.rows_per_second`来完成速度控制的配置。
-你可以同时配置这两个参数,或者只配置其中一个。每个`value`的值代表每个线程被限制的最大速率。
-因此,在配置各个值时,请考虑你任务的并行性。
diff --git a/docs/zh/contribution/coding-guide.md b/docs/zh/contribution/coding-guide.md
deleted file mode 100644
index f102eb68554..00000000000
--- a/docs/zh/contribution/coding-guide.md
+++ /dev/null
@@ -1,116 +0,0 @@
-# 编码指南
-
-本指南整体介绍了当前 Apache SeaTunnel 的模块和提交一个高质量 pull request 的最佳实践。
-
-## 模块概述
-
-| 模块名 | 介绍 |
-|----------------------------------------|---------------------------------------------------------------------------------------------------|
-| seatunnel-api | SeaTunnel connector V2 API 模块 |
-| seatunnel-apis | SeaTunnel connector V1 API 模块 |
-| seatunnel-common | SeaTunnel 通用模块 |
-| seatunnel-connectors | SeaTunnel connector V1 模块, 当前 connector V1 处在稳定状态, 社区会持续维护,但不会有大的特性更新 |
-| seatunnel-connectors-v2 | SeaTunnel connector V2 模块, connector V2 处于社区重点开发中 |
-| seatunnel-core/seatunnel-spark | SeaTunnel connector V1 的 spark 引擎核心启动模块 |
-| seatunnel-core/seatunnel-flink | SeaTunnel connector V1 的 flink 引擎核心启动模块 |
-| seatunnel-core/seatunnel-flink-sql | SeaTunnel connector V1 的 flink-sql 引擎核心启动模块 |
-| seatunnel-core/seatunnel-spark-starter | SeaTunnel connector V2 的 Spark 引擎核心启动模块 |
-| seatunnel-core/seatunnel-flink-starter | SeaTunnel connector V2 的 Flink 引擎核心启动模块 |
-| seatunnel-core/seatunnel-starter | SeaTunnel connector V2 的 SeaTunnel 引擎核心启动模块 |
-| seatunnel-e2e | SeaTunnel 端到端测试模块 |
-| seatunnel-examples | SeaTunnel 本地案例模块, 开发者可以用来单元测试和集成测试 |
-| seatunnel-engine | SeaTunnel 引擎模块, seatunnel-engine 是 SeaTunnel 社区新开发的计算引擎,用来实现数据同步 |
-| seatunnel-formats | SeaTunnel 格式化模块,用来提供格式化数据的能力 |
-| seatunnel-plugin-discovery | SeaTunnel 插件发现模块,用来加载类路径中的SPI插件 |
-| seatunnel-transforms-v2 | SeaTunnel transform V2 模块, transform V2 处于社区重点开发中 |
-| seatunnel-translation | SeaTunnel translation 模块, 用来适配Connector V2 和其他计算引擎, 例如Spark、Flink等 |
-
-## 如何提交一个高质量的 pull request
-
-1. 创建实体类的时候使用 `lombok` 插件的注解(`@Data` `@Getter` `@Setter` `@NonNull` 等)来减少代码量。在编码过程中优先使用 lombok 插件是一个很好的习惯。
-
-2. 如果你需要在类中使用 log4j 打印日志, 优先使用 `lombok` 中的 `@Slf4j` 注解。
-
-3. SeaTunnel 使用 Github issue 来跟踪代码问题,包括 bugs 和 改进, 并且使用 Github pull request 来管理代码的审查和合并。所以创建一个清晰的 issue 或者 pull request 能让社区更好的理解开发者的意图,最佳实践如下:
-
- > [目的] [模块名称] [子模块名称] 描述
-
- 1. Pull request 目的包含: `Hotfix`, `Feature`, `Improve`, `Docs`, `WIP`。 请注意如果选择 `WIP`, 你需要使用 github 的 draft pull request。
- 2. Issue 目的包含: `Feature`, `Bug`, `Docs`, `Discuss`。
- 3. 模块名称: 当前 pull request 或 issue 所涉及的模块名称, 例如: `Core`, `Connector-V2`, `Connector-V1`等。
- 4. 子模块名称: 当前 pull request 或 issue 所涉及的子模块名称, 例如:`File` `Redis` `Hbase`等。
- 5. 描述: 高度概括下当前 pull request 和 issue 要做的事情,尽量见名知意。
-
- 提示:**更多内容, 可以参考 [issue guide](https://seatunnel.apache.org/community/contribution_guide/contribute#issue) 和 [pull request guide](https://seatunnel.apache.org/community/contribution_guide/contribute#pull-request)**
-
-4. 代码片段不要重复。 如果一段代码被使用多次,定义多次不是好的选择,最佳实践是把它公共独立出来让其他模块使用。
-
-5. 当抛出一个异常时, 需要一起带上提示信息并且使异常的范围尽可能地小。抛出过于广泛的异常会让错误处理变得复杂并且容易包含安全问题。例如,如果你的 connector 在读数据的时候遇到 `IOException`, 合理的做法如下:
-
- ```java
- try {
- // read logic
- } catch (IOException e) {
-    throw new SeaTunnelORCFormatException("This orc file is corrupted, please check it", e);
- }
- ```
-
-6. Apache 项目的 license 要求很严格, 每个 Apache 项目文件都应该包含一个 license 声明。 在提交 pull request 之前请检查每个新文件都包含 `Apache License Header`。
-
- ```java
- /*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements. See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License. You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
- ```
-
-7. Apache SeaTunnel 使用 `Spotless` 管理代码风格和格式检查。你可以使用下面的命令来自动修复代码风格问题和格式。
-
- ```shell
- ./mvnw spotless:apply
- ```
-
-8. 提交 pull request 之前,确保修改后项目编译正常,使用下面命令打包整个项目:
-
- ```shell
- # 多线程编译
- ./mvnw -T 1C clean package
- ```
-
- ```shell
- # 单线程编译
- ./mvnw clean package
- ```
-
-9. 提交 pull request 之前,在本地用完整的单元测试和集成测试来检查你的功能性是否正确,最佳实践是用 `seatunnel-examples` 模块的例子去检查多引擎是否正确运行并且结果正确。
-
-10. 如果提交的 pull request 是一个新的特性, 请记得更新文档。
-
-11. 提交 connector 相关的 pull request, 可以通过写 e2e 测试保证鲁棒性,e2e 测试需要包含所有的数据类型,并且初始化尽可能小的 docker 镜像,sink 和 source 的测试用例可以写在一起减少资源的损耗。 可以参考这个不错的例子: [MongodbIT.java](https://github.com/apache/seatunnel/blob/dev/seatunnel-e2e/seatunnel-connector-v2-e2e/connector-mongodb-e2e/src/test/java/org/apache/seatunnel/e2e/connector/v2/mongodb/MongodbIT.java)
-
-12. 类中默认的权限需要使用 `private`, 不可修改的需要设置 `final`, 特殊场景除外。
-
-13. 类中的属性和方法参数倾向于使用基本数据类型(int boolean double float...), 而不是包装类型(Integer Boolean Double Float...), 特殊情况除外。
-
-14. 开发一个 sink connector 的时候你需要知道 sink 需要被序列化,如果有不能被序列化的属性, 需要包装到一个类中,并且使用单例模式。
-
-15. 如果代码中有多个 `if` 流程判断, 尽量简化为多个 if 而不是 if-else-if。
-
-16. Pull request 具有单一职责的特点, 不允许在 pull request 包含与该功能无关的代码, 如果有这种情况, 需要在提交 pull request 之前单独处理好, 否则 Apache SeaTunnel 社区会主动关闭 pull request。
-
-17. 贡献者需要对自己的 pull request 负责。 如果 pull request 包含新的特性, 或者修改了老的特性,增加测试用例或者 e2e 用例来证明合理性和保护完整性是一个很好的做法。
-
-18. 如果你认为社区当前某部分代码不合理(尤其是核心的 `core` 和 `api` 模块),有函数需要更新修改,优先使用 `discuss issue` 和 `email` 与社区讨论是否有必要修改,社区同意后再提交 pull request, 请不要不经讨论直接提交 pull request, 社区会认为无效并且关闭。
-
diff --git a/docs/zh/contribution/contribute-plugin.md b/docs/zh/contribution/contribute-plugin.md
deleted file mode 100644
index 514355840d0..00000000000
--- a/docs/zh/contribution/contribute-plugin.md
+++ /dev/null
@@ -1,5 +0,0 @@
-# 贡献 Connector-v2 插件
-
-如果你想要贡献 Connector-V2, 可以参考下面的 Connector-V2 贡献指南。 可以帮你快速进入开发。
-
-[Connector-v2 贡献指南](https://github.com/apache/seatunnel/blob/dev/seatunnel-connectors-v2/README.md)
diff --git a/docs/zh/contribution/contribute-transform-v2-guide.md b/docs/zh/contribution/contribute-transform-v2-guide.md
deleted file mode 100644
index b9abe5da492..00000000000
--- a/docs/zh/contribution/contribute-transform-v2-guide.md
+++ /dev/null
@@ -1,321 +0,0 @@
-# 贡献 Transform 指南
-
-本文描述了如何理解、开发和贡献一个 transform。
-
-我们也提供了 [transform e2e test](../../../seatunnel-e2e/seatunnel-transforms-v2-e2e)
-来验证 transform 的数据输入和输出。
-
-## 概念
-
-在 SeaTunnel 中你可以通过 connector 读写数据, 但如果你需要在读取数据后或者写入数据前处理数据, 你需要使用 transform。
-
-使用 transform 可以简单修改数据行和字段, 例如拆分字段、修改字段的值或者删除字段。
-
-### 类型转换
-
-Transform 从上游(source 或者 transform)获取类型输入,然后给下游(sink 或者 transform)输出新的类型,这个过程就是类型转换。
-
-案例 1:删除字段
-
-```shell
-| A | B | C |
-|-----------|-----------|-----------|
-| STRING | INT | BOOLEAN |
-
-| A | B |
-|-----------|-----------|
-| STRING | INT |
-```
-
-案例 2:字段排序
-
-```shell
-| B | C | A |
-|-----------|-----------|-----------|
-| INT | BOOLEAN | STRING |
-
-| A | B | C |
-|-----------|-----------|-----------|
-| STRING | INT | BOOLEAN |
-```
-
-案例 3:修改字段类型
-
-```shell
-| A | B | C |
-|-----------|-----------|-----------|
-| STRING | INT | BOOLEAN |
-
-
-| A | B | C |
-|-----------|-----------|-----------|
-| STRING | STRING | STRING |
-```
-
-案例 4:添加新的字段
-
-```shell
-| A | B | C |
-|-----------|-----------|-----------|
-| STRING | INT | BOOLEAN |
-
-
-| A | B | C | D |
-|-----------|-----------|-----------|-----------|
-| STRING | INT | BOOLEAN | DOUBLE |
-```
-
-### 数据转换
-
-转换类型后,Transform 会从上游(source 或者 transform)获取数据行, 使用[新的数据类型](#类型转换)编辑数据后输出到下游(sink 或者 transform)。这个过程叫数据转换。
-
-### 翻译
-
-Transform 已经从 execution engine 中解耦, 任何 transform 实现无需修改代码和额外配置即可适用于所有引擎, 这就需要翻译层来完成 transform 和 execution engine 之间的适配。
-
-案例:翻译数据类型和数据
-
-```shell
-原始数据:
-
-| A | B | C |
-|-----------|-----------|-----------|
-| STRING | INT | BOOLEAN |
-
-类型转换:
-
-| A | B | C |
-|-------------------|-------------------|-------------------|
-| ENGINE<STRING>    | ENGINE<INT>       | ENGINE<BOOLEAN>   |
-
-数据转换:
-
-| A | B | C |
-|-------------------|-------------------|-------------------|
-| ENGINE<"test">    | ENGINE<1>         | ENGINE<false>     |
-```
-
-## 核心 APIs
-
-### SeaTunnelTransform
-
-`SeaTunnelTransform` 提供了所有主要的 API, 你可以继承它实现任何转换。
-
-1. 从上游获取数据类型。
-
-```java
-/**
- * Set the data type info of input data.
- *
- * @param inputDataType The data type info of upstream input.
- */
- void setTypeInfo(SeaTunnelDataType<T> inputDataType);
-```
-
-2. 输出新的数据类型给下游。
-
-```java
-/**
- * Get the data type of the records produced by this transform.
- *
- * @return Produced data type.
- */
-SeaTunnelDataType<T> getProducedType();
-```
-
-3. 修改输入数据并且输出新的数据到下游。
-
-```java
-/**
- * Transform input data to {@link this#getProducedType()} types data.
- *
- * @param row the data need be transform.
- * @return transformed data.
- */
-T map(T row);
-```
-
-### SingleFieldOutputTransform
-
-`SingleFieldOutputTransform` 抽象了一个单字段修改操作
-
-1. 定义输出字段
-
-```java
-/**
- * Outputs new field
- *
- * @return
- */
-protected abstract String getOutputFieldName();
-```
-
-2. 定义输出字段类型
-
-```java
-/**
- * Outputs new field datatype
- *
- * @return
- */
-protected abstract SeaTunnelDataType getOutputFieldDataType();
-```
-
-3. 定义输出字段值
-
-```java
-/**
- * Outputs new field value
- *
- * @param inputRow The inputRow of upstream input.
- * @return
- */
-protected abstract Object getOutputFieldValue(SeaTunnelRowAccessor inputRow);
-```
-
-### MultipleFieldOutputTransform
-
-`MultipleFieldOutputTransform` 抽象了多字段修改操作
-
-1. 定义多个输出的字段
-
-```java
-/**
- * Outputs new fields
- *
- * @return
- */
-protected abstract String[] getOutputFieldNames();
-```
-
-2. 定义输出字段的类型
-
-```java
-/**
- * Outputs new fields datatype
- *
- * @return
- */
-protected abstract SeaTunnelDataType[] getOutputFieldDataTypes();
-```
-
-3. 定义输出字段的值
-
-```java
-/**
- * Outputs new fields value
- *
- * @param inputRow The inputRow of upstream input.
- * @return
- */
-protected abstract Object[] getOutputFieldValues(SeaTunnelRowAccessor inputRow);
-```
-
-### AbstractSeaTunnelTransform
-
-`AbstractSeaTunnelTransform` 抽象了数据类型和字段的修改操作
-
-1. 转换输入的行类型到新的行类型
-
-```java
-/**
- * Outputs transformed row type.
- *
- * @param inputRowType upstream input row type
- * @return
- */
-protected abstract SeaTunnelRowType transformRowType(SeaTunnelRowType inputRowType);
-```
-
-2. 转换输入的行数据到新的行数据
-
-```java
-/**
- * Outputs transformed row data.
- *
- * @param inputRow upstream input row data
- * @return
- */
-protected abstract SeaTunnelRow transformRow(SeaTunnelRow inputRow);
-```
-
-## 开发一个 Transform
-
-Transform 必须实现下面其中一个 API:
-- SeaTunnelTransform
-- AbstractSeaTunnelTransform
-- SingleFieldOutputTransform
-- MultipleFieldOutputTransform
-
-将实现类放入模块 `seatunnel-transforms-v2`。
-
-### 案例: 拷贝字段到一个新的字段
-
-```java
-@AutoService(SeaTunnelTransform.class)
-public class CopyFieldTransform extends SingleFieldOutputTransform {
-
- private String srcField;
- private int srcFieldIndex;
- private SeaTunnelDataType srcFieldDataType;
- private String destField;
-
- @Override
- public String getPluginName() {
- return "Copy";
- }
-
- @Override
- protected void setConfig(Config pluginConfig) {
- this.srcField = pluginConfig.getString("src_field");
- this.destField = pluginConfig.getString("dest_fields");
- }
-
- @Override
- protected void setInputRowType(SeaTunnelRowType inputRowType) {
- srcFieldIndex = inputRowType.indexOf(srcField);
- srcFieldDataType = inputRowType.getFieldType(srcFieldIndex);
- }
-
- @Override
- protected String getOutputFieldName() {
- return destField;
- }
-
- @Override
- protected SeaTunnelDataType getOutputFieldDataType() {
- return srcFieldDataType;
- }
-
- @Override
- protected Object getOutputFieldValue(SeaTunnelRowAccessor inputRow) {
- return inputRow.getField(srcFieldIndex);
- }
-}
-```
-
-1. `getPluginName` 方法用来定义 transform 的名字。
-2. @AutoService 注解用来自动生成 `META-INF/services/org.apache.seatunnel.api.transform.SeaTunnelTransform` 文件
-3. `setConfig` 方法用来注入用户配置。
-
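-作为补充,下面是一个假设的作业配置片段,演示如何在作业中使用上面实现的 Copy transform(字段名仅为示意):
-
-```hocon
-transform {
-  Copy {
-    # 对应 setConfig 中读取的两个配置项
-    src_field = "name"
-    dest_fields = "name_copy"
-  }
-}
-```
-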
-## Transform 测试工具
-
-当你添加了一个新的插件, 推荐添加一个 e2e 测试用例来测试。
-我们有 `seatunnel-e2e/seatunnel-transforms-v2-e2e` 来帮助你实现。
-
-例如, 如果你想要添加一个 `CopyFieldTransform` 的测试用例, 你可以在 `seatunnel-e2e/seatunnel-transforms-v2-e2e`
-模块中添加一个新的测试用例, 并且在用例中继承 `TestSuiteBase` 类。
-
-```java
-public class TestCopyFieldTransformIT extends TestSuiteBase {
-
- @TestTemplate
- public void testCopyFieldTransform(TestContainer container) {
- Container.ExecResult execResult = container.executeJob("/copy_transform.conf");
- Assertions.assertEquals(0, execResult.getExitCode());
- }
-}
-```
-
-一旦你的测试用例继承了 `TestSuiteBase` 类, 并且添加了 `@TestTemplate` 注解,它就会在所有引擎上运行作业。你只需要用自己的 SeaTunnel 配置文件执行 executeJob 方法,
-它就会提交 SeaTunnel 作业。
diff --git a/docs/zh/contribution/how-to-create-your-connector.md b/docs/zh/contribution/how-to-create-your-connector.md
deleted file mode 100644
index 3aef1b140c2..00000000000
--- a/docs/zh/contribution/how-to-create-your-connector.md
+++ /dev/null
@@ -1,4 +0,0 @@
-## 开发自己的Connector
-
-如果你想针对SeaTunnel新的连接器API开发自己的连接器(Connector V2),请查看[这里](https://github.com/apache/seatunnel/blob/dev/seatunnel-connectors-v2/README.zh.md) 。
-
diff --git a/docs/zh/contribution/new-license.md b/docs/zh/contribution/new-license.md
deleted file mode 100644
index d39019f25b7..00000000000
--- a/docs/zh/contribution/new-license.md
+++ /dev/null
@@ -1,53 +0,0 @@
-# 如何添加新的 License
-
-### ASF 第三方许可政策
-
-如果您打算向SeaTunnel(或其他Apache项目)添加新功能,并且该功能涉及到其他开源软件引用的时候,请注意目前 Apache 项目支持遵从以下协议的开源软件。
-
-[ASF 第三方许可政策](https://apache.org/legal/resolved.html)
-
-如果您所使用的第三方软件并不在以上协议之中,那么很抱歉,您的代码将无法通过审核,建议您找寻其他替代方案。
-
-### 如何在 SeaTunnel 中合法使用第三方开源软件
-
-当我们想要引入一个新的第三方软件(包含但不限于第三方的 jar、文本、CSS、js、图片、图标、音视频等及在第三方基础上做的修改)至我们的项目中的时候,除了他们所遵从的协议是 Apache 允许的,另外一点很重要,就是合法的使用。您可以参考以下文章
-
-* [COMMUNITY-LED DEVELOPMENT "THE APACHE WAY"](https://apache.org/dev/licensing-howto.html)
-
-举个例子,当我们使用了 ZooKeeper,那么我们项目就必须包含 ZooKeeper 的 NOTICE 文件(每个开源项目都会有 NOTICE 文件,一般位于根目录),用Apache的话来讲,就是 "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work.
-
-关于具体的各个开源协议使用协议,在此不做过多篇幅一一介绍,有兴趣可以自行查询了解。
-
-### SeaTunnel-License 检测规则
-
-通常情况下, 我们会为项目添加 License-check 脚本。 跟其他开源项目略有不同,SeaTunnel 使用 [SkyWalking](https://github.com/apache/skywalking) 提供的 SeaTunnel-License-Check。 总之,我们试图第一时间避免 License 问题。
-
-当我们需要添加新的 jar 包或者使用外部资源时, 我们需要按照以下步骤进行操作:
-
-* 在 known-dependencies.txt 文件中添加 jar 的名称和版本
-* 在 'seatunnel-dist/release-docs/LICENSE' 目录下添加相关 maven 仓库地址
-* 在 'seatunnel-dist/release-docs/NOTICE' 目录下添加相关的 NOTICE 文件, 并确保他们跟原来的仓库中的文件没有区别
-* 在 'seatunnel-dist/release-docs/licenses' 目录下添加相关源码协议文件, 并且文件命令遵守 license-filename.txt 规则。 例:license-zk.txt
-* 检查依赖的 license 是否出错
-
-```
---- /dev/fd/63 2020-12-03 03:08:57.191579482 +0000
-+++ /dev/fd/62 2020-12-03 03:08:57.191579482 +0000
-@@ -1,0 +2 @@
-+HikariCP-java6-2.3.13.jar
-@@ -16,0 +18 @@
-+c3p0-0.9.5.2.jar
-@@ -149,0 +152 @@
-+mchange-commons-java-0.2.11.jar
-
-- commons-lang-2.1.3.jar
-Error: Process completed with exit code 1.
-```
-
-一般来说,添加一个 jar 的工作通常不是很容易,因为 jar 通常依赖其他各种 jar, 我们还需要为这些 jar 添加相应的许可证。 在这种情况下, 我们会收到检查 license 失败的错误信息。像上面的例子,我们缺少 `HikariCP-java6-2.3.13`, `c3p0` 等的 license 声明(`+` 表示新添加,`-` 表示需要删除), 按照步骤添加 jar。
-
-### 参考
-
-* [COMMUNITY-LED DEVELOPMENT "THE APACHE WAY"](https://apache.org/dev/licensing-howto.html)
-* [ASF 第三方许可政策](https://apache.org/legal/resolved.html)
-
diff --git a/docs/zh/contribution/setup.md b/docs/zh/contribution/setup.md
deleted file mode 100644
index b94c971d75e..00000000000
--- a/docs/zh/contribution/setup.md
+++ /dev/null
@@ -1,113 +0,0 @@
-# 搭建开发环境
-
-在这个章节, 我们会向你展示如何搭建 SeaTunnel 的开发环境, 然后用 JetBrains IntelliJ IDEA 跑一个简单的示例。
-
-> 你可以用任何你喜欢的开发环境进行开发和测试,我们只是用 [JetBrains IDEA](https://www.jetbrains.com/idea/)
-> 作为示例来展示如何一步步设置环境。
-
-## 准备
-
-在设置开发环境之前, 需要做一些准备工作, 确保你安装了以下软件:
-
-* 安装 [Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)。
-* 安装 [Java](https://www.java.com/en/download/) (目前只支持 JDK8/JDK11) 并且设置 `JAVA_HOME` 环境变量。
-* 安装 [Scala](https://www.scala-lang.org/download/2.11.12.html) (目前只支持 scala 2.11.12)。
-* 安装 [JetBrains IDEA](https://www.jetbrains.com/idea/)。
-
-## 设置
-
-### 克隆源码
-
-首先使用以下命令从 [GitHub](https://github.com/apache/seatunnel) 克隆 SeaTunnel 源代码。
-
-```shell
-git clone git@github.com:apache/seatunnel.git
-```
-
-### 本地安装子项目
-
-在克隆好源代码以后, 运行 `./mvnw` 命令安装子项目到 maven 本地仓库目录。 否则你的代码无法在 IDEA 中正常启动。
-
-```shell
-./mvnw install -Dmaven.test.skip
-```
-
-### 源码编译
-
-在安装 maven 以后, 可以使用下面命令进行编译和打包。
-
-```
-mvn clean package -pl seatunnel-dist -am -Dmaven.test.skip=true
-```
-
-### 编译子模块
-
-如果要单独编译子模块, 可以使用下面的命令进行编译和打包。
-
-```shell
-# 这是一个单独构建 redis connector 的示例
-
-mvn clean package -pl seatunnel-connectors-v2/connector-redis -am -DskipTests -T 1C
-```
-
-### 安装 JetBrains IDEA Scala 插件
-
-用 JetBrains IntelliJ IDEA 打开你的源码,如果有 Scala 的代码,则需要安装 JetBrains IntelliJ IDEA's [Scala plugin](https://plugins.jetbrains.com/plugin/1347-scala)。
-可以参考 [install plugins for IDEA](https://www.jetbrains.com/help/idea/managing-plugins.html#install-plugins) 。
-
-### 安装 JetBrains IDEA Lombok 插件
-
-在运行示例之前, 安装 JetBrains IntelliJ IDEA 的 [Lombok plugin](https://plugins.jetbrains.com/plugin/6317-lombok)。
-可以参考 [install plugins for IDEA](https://www.jetbrains.com/help/idea/managing-plugins.html#install-plugins) 。
-
-### 代码风格
-
-Apache SeaTunnel 使用 `Spotless` 来统一代码风格和格式检查。可以运行下面 `Spotless` 命令自动格式化。
-
-```shell
-./mvnw spotless:apply
-```
-
-拷贝 `pre-commit hook` 文件 `/tools/spotless_check/pre-commit.sh` 到你项目的 `.git/hooks/` 目录, 这样每次你使用 `git commit` 提交代码的时候会自动调用 `Spotless` 修复格式问题。
-
-## 运行一个简单的示例
-
-完成上面所有的工作后,环境搭建已经完成, 可以直接运行我们的示例了。 所有的示例在 `seatunnel-examples` 模块里, 你可以随意选择进行编译和调试,参考 [running or debugging
-it in IDEA](https://www.jetbrains.com/help/idea/run-debug-configuration.html)。
-
-我们使用 `seatunnel-examples/seatunnel-flink-connector-v2-example/src/main/java/org/apache/seatunnel/example/flink/v2/SeaTunnelApiExample.java`
-作为示例, 运行成功后的输出如下:
-
-```log
-+I[Ricky Huo, 71]
-+I[Gary, 12]
-+I[Ricky Huo, 93]
-...
-...
-+I[Ricky Huo, 83]
-```
-
-## 更多信息
-
-所有的实例都用了简单的 source 和 sink, 这样可以使得运行更独立和更简单。
-你可以修改 `resources/examples` 中的示例的配置。 例如下面的配置使用 PostgreSQL 作为源,并且输出到控制台。
-
-```conf
-env {
- parallelism = 1
-}
-
-source {
- JdbcSource {
- driver = org.postgresql.Driver
- url = "jdbc:postgresql://host:port/database"
- username = postgres
- query = "select * from test"
- }
-}
-
-sink {
- ConsoleSink {}
-}
-```
-
diff --git a/docs/zh/other-engine/flink.md b/docs/zh/other-engine/flink.md
deleted file mode 100644
index a9aa7055a2e..00000000000
--- a/docs/zh/other-engine/flink.md
+++ /dev/null
@@ -1,83 +0,0 @@
-# Seatunnel runs on Flink
-
-Flink是一个强大的高性能分布式流处理引擎,更多关于它的信息,你可以搜索 `Apache Flink`。
-
-### 在Job中设置Flink的配置信息
-
-从 `flink` 开始:
-
-例子: 我对这个项目设置一个精确的检查点
-
-```
-env {
- parallelism = 1
- flink.execution.checkpointing.unaligned.enabled=true
-}
-```
-
-枚举类型当前还不支持,你需要在Flink的配置文件中指定它们,暂时只有这些类型的设置受支持:
-Integer/Boolean/String/Duration
-
-### 如何设置一个简单的Flink job
-
-这是一个运行在Flink中随机生成数据打印到控制台的简单job
-
-```
-env {
- # 公共参数
- parallelism = 1
- checkpoint.interval = 5000
-
- # flink特殊参数
- flink.execution.checkpointing.mode = "EXACTLY_ONCE"
- flink.execution.checkpointing.timeout = 600000
-}
-
-source {
- FakeSource {
- row.num = 16
- result_table_name = "fake_table"
- schema = {
- fields {
-        c_map = "map<string, string>"
-        c_array = "array<int>"
- c_string = string
- c_boolean = boolean
- c_int = int
- c_bigint = bigint
- c_double = double
- c_bytes = bytes
- c_date = date
- c_decimal = "decimal(33, 18)"
- c_timestamp = timestamp
- c_row = {
-        c_map = "map<string, string>"
-        c_array = "array<int>"
- c_string = string
- c_boolean = boolean
- c_int = int
- c_bigint = bigint
- c_double = double
- c_bytes = bytes
- c_date = date
- c_decimal = "decimal(33, 18)"
- c_timestamp = timestamp
- }
- }
- }
- }
-}
-
-transform {
- # 如果你想知道更多关于如何配置seatunnel的信息和查看完整的transform插件,
- # 请访问:https://seatunnel.apache.org/docs/transform-v2/sql
-}
-
-sink{
- Console{}
-}
-```
-
-### 如何在项目中运行job
-
-当你将代码拉到本地后,转到 `seatunnel-examples/seatunnel-flink-connector-v2-example` 模块,找到并运行 `org.apache.seatunnel.example.flink.v2.SeaTunnelApiExample` 即可完成job的运行。
diff --git a/docs/zh/seatunnel-engine/checkpoint-storage.md b/docs/zh/seatunnel-engine/checkpoint-storage.md
deleted file mode 100644
index 795e7bf63b5..00000000000
--- a/docs/zh/seatunnel-engine/checkpoint-storage.md
+++ /dev/null
@@ -1,187 +0,0 @@
----
-
-sidebar_position: 7
--------------------
-
-# 检查点存储
-
-## 简介
-
-检查点是一种容错恢复机制。这种机制确保程序在运行时,即使突然遇到异常,也能自行恢复。
-
-### 检查点存储
-
-检查点存储是一种存储检查点数据的存储机制。
-
-SeaTunnel Engine支持以下检查点存储类型:
-
-- HDFS (OSS,S3,HDFS,LocalFile)
-- LocalFile (本地)(已弃用: 请使用Hdfs(LocalFile)替代)。
-
-我们使用微内核设计模式将检查点存储模块从引擎中分离出来。这允许用户实现他们自己的检查点存储模块。
-
-`checkpoint-storage-api`是检查点存储模块API,它定义了检查点存储模块的接口。
-
-如果你想实现你自己的检查点存储模块,你需要实现`CheckpointStorage`并提供相应的`CheckpointStorageFactory`实现。
-
-### 检查点存储配置
-
-`seatunnel-server`模块的配置在`seatunnel.yaml`文件中。
-
-```yaml
-
-seatunnel:
- engine:
- checkpoint:
- storage:
- type: hdfs #检查点存储的插件名称,支持hdfs(S3, local, hdfs), 默认为localfile (本地文件), 但这种方式已弃用
- # 插件配置
- plugin-config:
- namespace: #检查点存储父路径,默认值为/seatunnel/checkpoint/
- K1: V1 # 插件其它配置
- K2: V2 # 插件其它配置
-```
-
-注意: namespace必须以"/"结尾。
-
-#### OSS
-
-阿里云oss是基于hdfs-file,所以你可以参考[hadoop oss文档](https://hadoop.apache.org/docs/stable/hadoop-aliyun/tools/hadoop-aliyun/index.html)来配置oss.
-
-除了与oss buckets交互外,oss客户端需要与buckets交互所需的凭据。
-客户端支持多种身份验证机制,并且可以配置使用哪种机制及其使用顺序。也可以使用 org.apache.hadoop.fs.aliyun.oss.AliyunCredentialsProvider 的自定义实现。
-如果您使用AliyunCredentialsProvider(可以从阿里云访问密钥管理中获得),它们包括一个access key和一个secret key。
-你可以这样配置:
-
-```yaml
-seatunnel:
- engine:
- checkpoint:
- interval: 6000
- timeout: 7000
- storage:
- type: hdfs
- max-retained: 3
- plugin-config:
- storage.type: oss
- oss.bucket: your-bucket
- fs.oss.accessKeyId: your-access-key
- fs.oss.accessKeySecret: your-secret-key
- fs.oss.endpoint: endpoint address
- fs.oss.credentials.provider: org.apache.hadoop.fs.aliyun.oss.AliyunCredentialsProvider
-```
-
-有关Hadoop Credential Provider API的更多信息,请参见: [Credential Provider API](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html).
-
-阿里云oss凭证提供程序实现见: [验证凭证提供](https://github.com/aliyun/aliyun-oss-java-sdk/tree/master/src/main/java/com/aliyun/oss/common/auth)
-
-#### S3
-
-S3基于hdfs-file,所以你可以参考[hadoop s3文档](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html)来配置s3。
-
-除了与公共S3 buckets交互之外,S3A客户端需要与buckets交互所需的凭据。
-客户端支持多种身份验证机制,并且可以配置使用哪种机制及其使用顺序。也可以使用com.amazonaws.auth.AWSCredentialsProvider的自定义实现。
-如果您使用SimpleAWSCredentialsProvider(可以从Amazon Security Token服务中获得),它们包括一个access key和一个secret key。
-您可以这样配置:
-
-```yaml
-
-seatunnel:
- engine:
- checkpoint:
- interval: 6000
- timeout: 7000
- storage:
- type: hdfs
- max-retained: 3
- plugin-config:
- storage.type: s3
- s3.bucket: your-bucket
- fs.s3a.access.key: your-access-key
- fs.s3a.secret.key: your-secret-key
- fs.s3a.aws.credentials.provider: org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider
-
-
-```
-
-如果您使用`InstanceProfileCredentialsProvider`,它支持在EC2 VM中运行时使用实例配置文件凭据,您可以检查[iam-roles-for-amazon-ec2](https://docs.aws.amazon.com/zh_cn/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html).
-您可以这样配置:
-
-```yaml
-
-seatunnel:
- engine:
- checkpoint:
- interval: 6000
- timeout: 7000
- storage:
- type: hdfs
- max-retained: 3
- plugin-config:
- storage.type: s3
- s3.bucket: your-bucket
- fs.s3a.endpoint: your-endpoint
- fs.s3a.aws.credentials.provider: org.apache.hadoop.fs.s3a.InstanceProfileCredentialsProvider
-```
-
-有关Hadoop Credential Provider API的更多信息,请参见: [Credential Provider API](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html).
-
-#### HDFS
-
-如果您使用HDFS,您可以这样配置:
-
-```yaml
-seatunnel:
- engine:
- checkpoint:
- storage:
- type: hdfs
- max-retained: 3
- plugin-config:
- storage.type: hdfs
- fs.defaultFS: hdfs://localhost:9000
-      # 如果您使用kerberos,您可以这样配置:
- kerberosPrincipal: your-kerberos-principal
- kerberosKeytabFilePath: your-kerberos-keytab
-```
-
-如果HDFS是HA模式,您可以这样配置:
-
-```yaml
-seatunnel:
- engine:
- checkpoint:
- storage:
- type: hdfs
- max-retained: 3
- plugin-config:
- storage.type: hdfs
- fs.defaultFS: hdfs://usdp-bing
- seatunnel.hadoop.dfs.nameservices: usdp-bing
- seatunnel.hadoop.dfs.ha.namenodes.usdp-bing: nn1,nn2
- seatunnel.hadoop.dfs.namenode.rpc-address.usdp-bing.nn1: usdp-bing-nn1:8020
- seatunnel.hadoop.dfs.namenode.rpc-address.usdp-bing.nn2: usdp-bing-nn2:8020
- seatunnel.hadoop.dfs.client.failover.proxy.provider.usdp-bing: org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
-
-```
-
-如果HDFS在`hdfs-site.xml`或`core-site.xml`中有其他配置,只需使用`seatunnel.hadoop.`前缀设置HDFS配置即可。
-
-#### 本地文件
-
-```yaml
-seatunnel:
- engine:
- checkpoint:
- interval: 6000
- timeout: 7000
- storage:
- type: hdfs
- max-retained: 3
- plugin-config:
- storage.type: hdfs
- fs.defaultFS: file:/// # 请确保该目录具有写权限
-
-```
-
diff --git a/docs/zh/seatunnel-engine/cluster-mode.md b/docs/zh/seatunnel-engine/cluster-mode.md
deleted file mode 100644
index a0b11cd1dfa..00000000000
--- a/docs/zh/seatunnel-engine/cluster-mode.md
+++ /dev/null
@@ -1,21 +0,0 @@
----
-
-sidebar_position: 3
--------------------
-
-# 以集群模式运行作业
-
-这是最推荐的在生产环境中使用SeaTunnel Engine的方法。此模式支持SeaTunnel Engine的全部功能,集群模式将具有更好的性能和稳定性。
-
-在集群模式下,首先需要部署SeaTunnel Engine集群,然后客户端将作业提交给SeaTunnel Engine集群运行。
-
-## 部署SeaTunnel Engine集群
-
-部署SeaTunnel Engine集群参考[SeaTunnel Engine集群部署](../../en/seatunnel-engine/deployment.md)
-
-## 提交作业
-
-```shell
-$SEATUNNEL_HOME/bin/seatunnel.sh --config $SEATUNNEL_HOME/config/v2.batch.config.template
-```
-
diff --git a/docs/zh/seatunnel-engine/local-mode.md b/docs/zh/seatunnel-engine/local-mode.md
deleted file mode 100644
index 3738721fa79..00000000000
--- a/docs/zh/seatunnel-engine/local-mode.md
+++ /dev/null
@@ -1,25 +0,0 @@
----
-
-sidebar_position: 2
--------------------
-
-# 以本地模式运行作业
-
-仅用于测试。
-
-最推荐在生产环境中使用SeaTunnel Engine的方式为[集群模式](cluster-mode.md).
-
-## 本地模式部署SeaTunnel Engine
-
-[部署SeaTunnel Engine本地模式参考](../../en/start-v2/locally/deployment.md)
-
-## 修改SeaTunnel Engine配置
-
-将 $SEATUNNEL_HOME/config/hazelcast.yaml 中的 `port.auto-increment` 设置为 true
-
-## 提交作业
-
-```shell
-$SEATUNNEL_HOME/bin/seatunnel.sh --config $SEATUNNEL_HOME/config/v2.batch.config.template -e local
-```
-
diff --git a/docs/zh/seatunnel-engine/rest-api.md b/docs/zh/seatunnel-engine/rest-api.md
deleted file mode 100644
index a3f8d10d190..00000000000
--- a/docs/zh/seatunnel-engine/rest-api.md
+++ /dev/null
@@ -1,384 +0,0 @@
----
-
-sidebar_position: 7
--------------------
-
-# REST API
-
-SeaTunnel有一个用于监控的API,可用于查询运行作业的状态和统计信息,以及最近完成的作业。监控API是REST-ful风格的,它接受HTTP请求并使用JSON数据格式进行响应。
-
-## 概述
-
-监控API是由运行的web服务提供的,它是节点运行的一部分,每个节点成员都可以提供rest API功能。
-默认情况下,该服务监听端口为5801,该端口可以在hazelcast.yaml中配置,如下所示:
-
-```yaml
-network:
- rest-api:
- enabled: true
- endpoint-groups:
- CLUSTER_WRITE:
- enabled: true
- DATA:
- enabled: true
- join:
- tcp-ip:
- enabled: true
- member-list:
- - localhost
- port:
- auto-increment: true
- port-count: 100
- port: 5801
-```
-
-## API参考
-
-### 返回所有作业及其当前状态的概览。
-
-
- GET /hazelcast/rest/maps/running-jobs (返回所有作业及其当前状态的概览。)
-
-#### 参数
-
-#### 响应
-
-```json
-[
- {
- "jobId": "",
- "jobName": "",
- "jobStatus": "",
- "envOptions": {
- },
- "createTime": "",
- "jobDag": {
- "vertices": [
- ],
- "edges": [
- ]
- },
- "pluginJarsUrls": [
- ],
- "isStartWithSavePoint": false,
- "metrics": {
- "sourceReceivedCount": "",
- "sinkWriteCount": ""
- }
- }
-]
-```
-
-
-
-------------------------------------------------------------------------------------------
-
-### 返回作业的详细信息。
-
-
- GET /hazelcast/rest/maps/running-job/:jobId (返回作业的详细信息。)
-
-#### 参数
-
-> | name | type | data type | description |
-> |-------|----------|-----------|-------------|
-> | jobId | required | long | job id |
-
-#### 响应
-
-```json
-{
- "jobId": "",
- "jobName": "",
- "jobStatus": "",
- "envOptions": {
- },
- "createTime": "",
- "jobDag": {
- "vertices": [
- ],
- "edges": [
- ]
- },
- "pluginJarsUrls": [
- ],
- "isStartWithSavePoint": false,
- "metrics": {
- "sourceReceivedCount": "",
- "sinkWriteCount": ""
- }
-}
-```
-
-
-
-------------------------------------------------------------------------------------------
-
-### 返回所有已完成的作业信息。
-
-
- GET /hazelcast/rest/maps/finished-jobs/:state (返回所有已完成的作业信息。)
-
-#### 参数
-
-> | name | type | data type | description |
-> |-------|----------|-----------|------------------------------------------------------------------|
-> | state | optional | string | finished job status. `FINISHED`,`CANCELED`,`FAILED`,`UNKNOWABLE` |
-
-#### 响应
-
-```json
-[
- {
- "jobId": "",
- "jobName": "",
- "jobStatus": "",
- "errorMsg": null,
- "createTime": "",
- "finishTime": "",
- "jobDag": "",
- "metrics": ""
- }
-]
-```
-
-
-
-------------------------------------------------------------------------------------------
-
-### 返回系统监控信息。
-
-
- GET /hazelcast/rest/maps/system-monitoring-information (返回系统监控信息。)
-
-#### 参数
-
-#### 响应
-
-```json
-[
- {
- "processors":"8",
- "physical.memory.total":"16.0G",
- "physical.memory.free":"16.3M",
- "swap.space.total":"0",
- "swap.space.free":"0",
- "heap.memory.used":"135.7M",
- "heap.memory.free":"440.8M",
- "heap.memory.total":"576.5M",
- "heap.memory.max":"3.6G",
- "heap.memory.used/total":"23.54%",
- "heap.memory.used/max":"3.73%",
- "minor.gc.count":"6",
- "minor.gc.time":"110ms",
- "major.gc.count":"2",
- "major.gc.time":"73ms",
- "load.process":"24.78%",
- "load.system":"60.00%",
- "load.systemAverage":"2.07",
- "thread.count":"117",
- "thread.peakCount":"118",
- "cluster.timeDiff":"0",
- "event.q.size":"0",
- "executor.q.async.size":"0",
- "executor.q.client.size":"0",
- "executor.q.client.query.size":"0",
- "executor.q.client.blocking.size":"0",
- "executor.q.query.size":"0",
- "executor.q.scheduled.size":"0",
- "executor.q.io.size":"0",
- "executor.q.system.size":"0",
- "executor.q.operations.size":"0",
- "executor.q.priorityOperation.size":"0",
- "operations.completed.count":"10",
- "executor.q.mapLoad.size":"0",
- "executor.q.mapLoadAllKeys.size":"0",
- "executor.q.cluster.size":"0",
- "executor.q.response.size":"0",
- "operations.running.count":"0",
- "operations.pending.invocations.percentage":"0.00%",
- "operations.pending.invocations.count":"0",
- "proxy.count":"8",
- "clientEndpoint.count":"0",
- "connection.active.count":"2",
- "client.connection.count":"0",
- "connection.count":"0"
- }
-]
-```
-
-
-
-------------------------------------------------------------------------------------------
-
-### 提交作业。
-
-
-POST /hazelcast/rest/maps/submit-job (如果作业提交成功,返回jobId和jobName。)
-
-#### 参数
-
-> | name | type | data type | description |
-> |----------------------|----------|-----------|-----------------------------------|
-> | jobId | optional | string | job id |
-> | jobName | optional | string | job name |
-> | isStartWithSavePoint | optional | string | if job is started with save point |
-
-#### 请求体
-
-```json
-{
- "env": {
- "job.mode": "batch"
- },
- "source": [
- {
- "plugin_name": "FakeSource",
- "result_table_name": "fake",
- "row.num": 100,
- "schema": {
- "fields": {
- "name": "string",
- "age": "int",
- "card": "int"
- }
- }
- }
- ],
- "transform": [
- ],
- "sink": [
- {
- "plugin_name": "Console",
- "source_table_name": ["fake"]
- }
- ]
-}
-```
-
-#### 响应
-
-```json
-{
- "jobId": 733584788375666689,
- "jobName": "rest_api_test"
-}
-```
-
-
-
-------------------------------------------------------------------------------------------
-
-### 停止作业。
-
-
-POST /hazelcast/rest/maps/stop-job (如果作业成功停止,返回jobId。)
-
-#### 请求体
-
-```json
-{
- "jobId": 733584788375666689,
- "isStopWithSavePoint": false # if job is stopped with save point
-}
-```
-
-#### 响应
-
-```json
-{
-"jobId": 733584788375666689
-}
-```
-
-
-
-------------------------------------------------------------------------------------------
-
-### 加密配置。
-
-
-POST /hazelcast/rest/maps/encrypt-config (如果配置加密成功,则返回加密后的配置。)
-有关自定义加密的更多信息,请参阅文档[配置-加密-解密](../connector-v2/Config-Encryption-Decryption.md).
-
-#### 请求体
-
-```json
-{
- "env": {
- "parallelism": 1,
- "shade.identifier":"base64"
- },
- "source": [
- {
- "plugin_name": "MySQL-CDC",
- "schema" : {
- "fields": {
- "name": "string",
- "age": "int"
- }
- },
- "result_table_name": "fake",
- "parallelism": 1,
- "hostname": "127.0.0.1",
- "username": "seatunnel",
- "password": "seatunnel_password",
- "table-name": "inventory_vwyw0n"
- }
- ],
- "transform": [
- ],
- "sink": [
- {
- "plugin_name": "Clickhouse",
- "host": "localhost:8123",
- "database": "default",
- "table": "fake_all",
- "username": "seatunnel",
- "password": "seatunnel_password"
- }
- ]
-}
-```
-
-#### 响应
-
-```json
-{
- "env": {
- "parallelism": 1,
- "shade.identifier": "base64"
- },
- "source": [
- {
- "plugin_name": "MySQL-CDC",
- "schema": {
- "fields": {
- "name": "string",
- "age": "int"
- }
- },
- "result_table_name": "fake",
- "parallelism": 1,
- "hostname": "127.0.0.1",
- "username": "c2VhdHVubmVs",
- "password": "c2VhdHVubmVsX3Bhc3N3b3Jk",
- "table-name": "inventory_vwyw0n"
- }
- ],
- "transform": [],
- "sink": [
- {
- "plugin_name": "Clickhouse",
- "host": "localhost:8123",
- "database": "default",
- "table": "fake_all",
- "username": "c2VhdHVubmVs",
- "password": "c2VhdHVubmVsX3Bhc3N3b3Jk"
- }
- ]
-}
-```
-
-
-
diff --git a/docs/zh/transform-v2/common-options.md b/docs/zh/transform-v2/common-options.md
deleted file mode 100644
index 9a756760f2c..00000000000
--- a/docs/zh/transform-v2/common-options.md
+++ /dev/null
@@ -1,23 +0,0 @@
-# 转换常见选项
-
-> 转换插件的常见参数
-
-| 参数名称 | 参数类型 | 是否必须 | 默认值 |
-|-------------------|--------|------|-----|
-| result_table_name | string | no | - |
-| source_table_name | string | no | - |
-
-### source_table_name [string]
-
-当未指定 `source_table_name` 时,当前插件在配置文件中处理由前一个插件输出的数据集 `(dataset)` ;
-
-当指定了 `source_table_name` 时,当前插件正在处理与该参数对应的数据集
-
-### result_table_name [string]
-
-当未指定 `result_table_name` 时,此插件处理的数据不会被注册为其他插件可以直接访问的数据集,也不会被称为临时表 `(table)`;
-
-当指定了 `result_table_name` 时,此插件处理的数据将被注册为其他插件可以直接访问的数据集 `(dataset)`,或者被称为临时表 `(table)`。在这里注册的数据集可以通过指定 `source_table_name` 被其他插件直接访问。
-
-## 示例
-
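-一个假设的最小示例,演示这两个参数如何把插件串联起来:
-
-```hocon
-transform {
-  Filter {
-    # 读取名为 fake 的数据集
-    source_table_name = "fake"
-    # 将处理结果注册为 fake1,供后续插件使用
-    result_table_name = "fake1"
-    fields = [name, card]
-  }
-}
-```
-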
diff --git a/docs/zh/transform-v2/copy.md b/docs/zh/transform-v2/copy.md
deleted file mode 100644
index a4ca5c613a7..00000000000
--- a/docs/zh/transform-v2/copy.md
+++ /dev/null
@@ -1,65 +0,0 @@
-# 复制
-
-> 复制转换插件
-
-## 描述
-
-将字段复制到一个新字段。
-
-## 属性
-
-| 名称 | 类型 | 是否必须 | 默认值 |
-|--------|--------|------|-----|
-| fields | Object | yes | |
-
-### fields [config]
-
-指定输入和输出之间的字段复制关系
-
-### 常见选项 [string]
-
-转换插件的常见参数, 请参考 [Transform Plugin](common-options.md) 了解详情。
-
-## 示例
-
-从源读取的数据是这样的一个表:
-
-| name | age | card |
-|----------|-----|------|
-| Joy Ding | 20 | 123 |
-| May Ding | 20 | 123 |
-| Kin Dom | 20 | 123 |
-| Joy Dom | 20 | 123 |
-
-想要将字段 `name`、`age` 复制到新的字段 `name1`、`name2`、`age1`,我们可以像这样添加 `Copy` 转换:
-
-```
-transform {
- Copy {
- source_table_name = "fake"
- result_table_name = "fake1"
- fields {
- name1 = name
- name2 = name
- age1 = age
- }
- }
-}
-```
-
-那么结果表 `fake1` 中的数据将会像这样:
-
-| name | age | card | name1 | name2 | age1 |
-|----------|-----|------|----------|----------|------|
-| Joy Ding | 20 | 123 | Joy Ding | Joy Ding | 20 |
-| May Ding | 20 | 123 | May Ding | May Ding | 20 |
-| Kin Dom | 20 | 123 | Kin Dom | Kin Dom | 20 |
-| Joy Dom | 20 | 123 | Joy Dom | Joy Dom | 20 |
-
-## 更新日志
-
-### 新版本
-
-- 添加复制转换连接器
-- 支持将字段复制到新字段
-
diff --git a/docs/zh/transform-v2/field-mapper.md b/docs/zh/transform-v2/field-mapper.md
deleted file mode 100644
index 298d3fa72c9..00000000000
--- a/docs/zh/transform-v2/field-mapper.md
+++ /dev/null
@@ -1,64 +0,0 @@
-# 字段映射
-
-> 字段映射转换插件
-
-## 描述
-
-添加输入模式和输出模式映射
-
-## 属性
-
-| 名称 | 类型 | 是否必须 | 默认值 |
-|--------------|--------|------|-----|
-| field_mapper | Object | yes | |
-
-### field_mapper [config]
-
-指定输入和输出之间的字段映射关系
-
-### common options [config]
-
-转换插件的常见参数, 请参考 [Transform Plugin](common-options.md) 了解详情
-
-## 示例
-
-源端数据读取的表格如下:
-
-| id | name | age | card |
-|----|----------|-----|------|
-| 1 | Joy Ding | 20 | 123 |
-| 2 | May Ding | 20 | 123 |
-| 3 | Kin Dom | 20 | 123 |
-| 4 | Joy Dom | 20 | 123 |
-
-我们想要删除 `age` 字段,并更新字段顺序为 `id`、`card`、`name`,同时将 `name` 重命名为 `new_name`。我们可以像这样添加 `FieldMapper` 转换:
-
-```
-transform {
- FieldMapper {
- source_table_name = "fake"
- result_table_name = "fake1"
- field_mapper = {
- id = id
- card = card
- name = new_name
- }
- }
-}
-```
-
-那么结果表 `fake1` 中的数据将会像这样:
-
-| id | card | new_name |
-|----|------|----------|
-| 1 | 123 | Joy Ding |
-| 2 | 123 | May Ding |
-| 3 | 123 | Kin Dom |
-| 4 | 123 | Joy Dom |
-
-## 更新日志
-
-### 新版本
-
-- 添加复制转换连接器
-
diff --git a/docs/zh/transform-v2/filter-rowkind.md b/docs/zh/transform-v2/filter-rowkind.md
deleted file mode 100644
index 74d2b2d5b1e..00000000000
--- a/docs/zh/transform-v2/filter-rowkind.md
+++ /dev/null
@@ -1,68 +0,0 @@
-# 行类型过滤
-
-> 行类型转换插件
-
-## 描述
-
-按行类型过滤数据
-
-## 操作
-
-| 名称 | 类型 | 是否必须 | 默认值 |
-|---------------|-------|------|-----|
-| include_kinds | array | yes | |
-| exclude_kinds | array | yes | |
-
-### include_kinds [array]
-
-要包含的行类型
-
-### exclude_kinds [array]
-
-要排除的行类型。
-
-您只能配置 `include_kinds` 和 `exclude_kinds` 中的一个。
-
-### common options [string]
-
-转换插件的常见参数, 请参考 [Transform Plugin](common-options.md) 了解详情
-
-## 示例
-
-FakeSource 生成的数据的行类型是 `INSERT`。如果我们使用 `FilterRowKind` 转换并排除 `INSERT` 数据,我们将不会向接收器写入任何行。
-
-```yaml
-
-env {
- job.mode = "BATCH"
-}
-
-source {
- FakeSource {
- result_table_name = "fake"
- row.num = 100
- schema = {
- fields {
- id = "int"
- name = "string"
- age = "int"
- }
- }
- }
-}
-
-transform {
- FilterRowKind {
- source_table_name = "fake"
- result_table_name = "fake1"
- exclude_kinds = ["INSERT"]
- }
-}
-
-sink {
- Console {
- source_table_name = "fake1"
- }
-}
-```
-
diff --git a/docs/zh/transform-v2/filter.md b/docs/zh/transform-v2/filter.md
deleted file mode 100644
index 706a72ead12..00000000000
--- a/docs/zh/transform-v2/filter.md
+++ /dev/null
@@ -1,60 +0,0 @@
-# 过滤器
-
-> 过滤器转换插件
-
-## 描述
-
-过滤字段
-
-## 属性
-
-| 名称 | 类型 | 是否必须 | 默认值 |
-|--------|-------|------|-----|
-| fields | array | yes | |
-
-### fields [array]
-
-需要保留的字段列表。不在列表中的字段将被删除。
-
-### common options [string]
-
-转换插件的常见参数, 请参考 [Transform Plugin](common-options.md) 了解详情
-
-## 示例
-
-源端数据读取的表格如下:
-
-| name | age | card |
-|----------|-----|------|
-| Joy Ding | 20 | 123 |
-| May Ding | 20 | 123 |
-| Kin Dom | 20 | 123 |
-| Joy Dom | 20 | 123 |
-
-我们想要删除字段 `age`,我们可以像这样添加 `Filter` 转换
-
-```
-transform {
- Filter {
- source_table_name = "fake"
- result_table_name = "fake1"
- fields = [name, card]
- }
-}
-```
-
-那么结果表 `fake1` 中的数据将会像这样:
-
-| name | card |
-|----------|------|
-| Joy Ding | 123 |
-| May Ding | 123 |
-| Kin Dom | 123 |
-| Joy Dom | 123 |
-
-## 更新日志
-
-### 新版本
-
-- 添加过滤器转换连接器
-
diff --git a/docs/zh/transform-v2/jsonpath.md b/docs/zh/transform-v2/jsonpath.md
deleted file mode 100644
index 449f0f6a77f..00000000000
--- a/docs/zh/transform-v2/jsonpath.md
+++ /dev/null
@@ -1,190 +0,0 @@
-# JsonPath
-
-> JSONPath 转换插件
-
-## 描述
-
-> 支持使用 JSONPath 选择数据
-
-## 属性
-
-| 名称 | 类型 | 是否必须 | 默认值 |
-|---------|-------|------|-----|
-| columns | Array | Yes | |
-
-### common options [string]
-
-转换插件的常见参数, 请参考 [Transform Plugin](common-options.md) 了解详情
-
-### columns [array]
-
-#### 属性
-
-| 名称 | 类型 | 是否必须 | 默认值 |
-|------------|--------|------|--------|
-| src_field | String | Yes | |
-| dest_field | String | Yes | |
-| path | String | Yes | |
-| dest_type | String | No | String |
-
-#### src_field
-
-> 要解析的 JSON 源字段
-
-支持的 SeaTunnel 数据类型
-
-* STRING
-* BYTES
-* ARRAY
-* MAP
-* ROW
-
-#### dest_field
-
-> 使用 JSONPath 后的输出字段
-
-#### dest_type
-
-> 目标字段的类型
-
-#### path
-
-> Jsonpath
-
-## 读取 JSON 示例
-
-从源读取的数据是像这样的 JSON
-
-```json
-{
- "data": {
- "c_string": "this is a string",
- "c_boolean": true,
- "c_integer": 42,
- "c_float": 3.14,
- "c_double": 3.14,
- "c_decimal": 10.55,
- "c_date": "2023-10-29",
- "c_datetime": "16:12:43.459",
- "c_array":["item1", "item2", "item3"]
- }
-}
-```
-
-假设我们想要使用 JsonPath 提取属性。
-
-```hocon
-transform {
- JsonPath {
- source_table_name = "fake"
- result_table_name = "fake1"
- columns = [
- {
- "src_field" = "data"
- "path" = "$.data.c_string"
- "dest_field" = "c1_string"
- },
- {
- "src_field" = "data"
- "path" = "$.data.c_boolean"
- "dest_field" = "c1_boolean"
- "dest_type" = "boolean"
- },
- {
- "src_field" = "data"
- "path" = "$.data.c_integer"
- "dest_field" = "c1_integer"
- "dest_type" = "int"
- },
- {
- "src_field" = "data"
- "path" = "$.data.c_float"
- "dest_field" = "c1_float"
- "dest_type" = "float"
- },
- {
- "src_field" = "data"
- "path" = "$.data.c_double"
- "dest_field" = "c1_double"
- "dest_type" = "double"
- },
- {
- "src_field" = "data"
- "path" = "$.data.c_decimal"
- "dest_field" = "c1_decimal"
- "dest_type" = "decimal(4,2)"
- },
- {
- "src_field" = "data"
- "path" = "$.data.c_date"
- "dest_field" = "c1_date"
- "dest_type" = "date"
- },
- {
- "src_field" = "data"
- "path" = "$.data.c_datetime"
- "dest_field" = "c1_datetime"
- "dest_type" = "time"
- },
- {
- "src_field" = "data"
- "path" = "$.data.c_array"
- "dest_field" = "c1_array"
- "dest_type" = "array"
- }
- ]
- }
-}
-```
-
-那么数据结果表 `fake1` 将会像这样
-
-| data | c1_string | c1_boolean | c1_integer | c1_float | c1_double | c1_decimal | c1_date | c1_datetime | c1_array |
-|------------------------------|------------------|------------|------------|----------|-----------|------------|------------|--------------|-----------------------------|
-| too much content not to show | this is a string | true | 42 | 3.14 | 3.14 | 10.55 | 2023-10-29 | 16:12:43.459 | ["item1", "item2", "item3"] |
-
-## 读取 SeaTunnelRow 示例
-
-假设数据行中的一列的类型是 SeaTunnelRow,列的名称为 col
-
-