From 848e1a6e1dcab406bfcce6232da6b265cd7135f9 Mon Sep 17 00:00:00 2001 From: qqqdan Date: Thu, 23 Oct 2025 12:55:20 -0700 Subject: [PATCH 01/30] update-doc --- tidb-cloud/changefeed-overview-premium.md | 100 +++ ...changefeed-sink-to-apache-kafka-premium.md | 218 +++++ .../changefeed-sink-to-mysql-premium.md | 151 ++++ .../set-up-sink-private-endpoint-premium.md | 99 +++ ...sted-kafka-private-link-service-premium.md | 756 ++++++++++++++++++ tidb-cloud/tidb-cloud-billing-ticdc-rcu.md | 44 +- 6 files changed, 1366 insertions(+), 2 deletions(-) create mode 100644 tidb-cloud/changefeed-overview-premium.md create mode 100644 tidb-cloud/changefeed-sink-to-apache-kafka-premium.md create mode 100644 tidb-cloud/changefeed-sink-to-mysql-premium.md create mode 100644 tidb-cloud/set-up-sink-private-endpoint-premium.md create mode 100644 tidb-cloud/setup-aws-self-hosted-kafka-private-link-service-premium.md diff --git a/tidb-cloud/changefeed-overview-premium.md b/tidb-cloud/changefeed-overview-premium.md new file mode 100644 index 0000000000000..4f14bbf34039b --- /dev/null +++ b/tidb-cloud/changefeed-overview-premium.md @@ -0,0 +1,100 @@ +--- +title: Changefeed +summary: TiDB Cloud changefeed helps you stream data from TiDB Cloud to other data services. +--- + +# Changefeed + +TiDB Cloud changefeed helps you stream data from TiDB Cloud to other data services. Currently, TiDB Cloud supports streaming data to Apache Kafka, MySQL, TiDB Cloud and cloud storage. + +> **Note:** +> +> - Currently, TiDB Cloud only allows up to 100 changefeeds per instance. +> - Currently, TiDB Cloud only allows up to 100 table filter rules per changefeed. +> - For [{{{ .starter }}}](/tidb-cloud/select-cluster-tier.md#starter) and [{{{ .essential }}}](/tidb-cloud/select-cluster-tier.md#essential) instances, the changefeed feature is unavailable. + +## View the Changefeed page + +To access the changefeed feature, take the following steps: + +1. In the [TiDB Cloud console](https://tidbcloud.com), navigate to the **TiDB Instance** page. + +2. Click the name of your target instance to go to its overview page, and then click **Data** > **Changefeed** in the left navigation pane. The changefeed page is displayed. + +On the **Changefeed** page, you can create a changefeed, view a list of existing changefeeds, and operate the existing changefeeds (such as scaling, pausing, resuming, editing, and deleting a changefeed). + +## Create a changefeed + +To create a changefeed, refer to the tutorials: + +- [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md) +- [Sink to MySQL](/tidb-cloud/changefeed-sink-to-mysql.md) +- [Sink to TiDB Cloud](/tidb-cloud/changefeed-sink-to-tidb-cloud.md) +- [Sink to cloud storage](/tidb-cloud/changefeed-sink-to-cloud-storage.md) + +## Query Changefeed Capacity Units + +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB instance. +2. Locate the corresponding changefeed you want to query, and click **...** > **View** in the **Action** column. +3. You can see the current TiCDC Changefeed Capacity Units (CCUs) in the **Specification** area of the page. + +## Scale a changefeed + +You can change the TiCDC Changefeed Capacity Units (CCUs) of a changefeed by scaling up or down the changfeed. + +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB instance. +2. Locate the corresponding changefeed you want to scale, and click **...** > **Scale Up/Down** in the **Action** column. +3. Select a new specification. +4. Click **Submit**. 
+ +It takes about 10 minutes to complete the scaling process (during which the changfeed works normally) and a few seconds to switch to the new specification (during which the changefeed will be paused and resumed automatically). + +## Pause or resume a changefeed + +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB instance. +2. Locate the corresponding changefeed you want to pause or resume, and click **...** > **Pause/Resume** in the **Action** column. + +## Edit a changefeed + +> **Note:** +> +> TiDB Cloud currently only allows editing changefeeds in the paused status. + +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB instance. +2. Locate the changefeed you want to pause, and click **...** > **Pause** in the **Action** column. +3. When the changefeed status changes to `Paused`, click **...** > **Edit** to edit the corresponding changefeed. + + TiDB Cloud populates the changefeed configuration by default. You can modify the following configurations: + + - Apache Kafka sink: all configurations. + - MySQL sink: **MySQL Connection**, **Table Filter**, and **Event Filter**. + - TiDB Cloud sink: **TiDB Cloud Connection**, **Table Filter**, and **Event Filter**. + - Cloud storage sink: **Storage Endpoint**, **Table Filter**, and **Event Filter**. + +4. After editing the configuration, click **...** > **Resume** to resume the corresponding changefeed. + +## Delete a changefeed + +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB instance. +2. Locate the corresponding changefeed you want to delete, and click **...** > **Delete** in the **Action** column. + +## Changefeed billing + +To learn the billing for changefeeds in TiDB Cloud, see [Changefeed billing](/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md). + +## Changefeed states + +The state of a replication task represents the running state of the replication task. During the running process, replication tasks might fail with errors, or be manually paused or resumed. These behaviors can lead to changes of the replication task state. + +The states are described as follows: + +- `CREATING`: the replication task is being created. +- `RUNNING`: the replication task runs normally and the checkpoint-ts proceeds normally. +- `EDITING`: the replication task is being edited. +- `PAUSING`: the replication task is being paused. +- `PAUSED`: the replication task is paused. +- `RESUMING`: the replication task is being resumed. +- `DELETING`: the replication task is being deleted. +- `DELETED`: the replication task is deleted. +- `WARNING`: the replication task returns a warning. The replication cannot continue due to some recoverable errors. The changefeed in this state keeps trying to resume until the state transfers to `RUNNING`. The changefeed in this state blocks [GC operations](https://docs.pingcap.com/tidb/stable/garbage-collection-overview). +- `FAILED`: the replication task fails. Due to some errors, the replication task cannot resume and cannot be recovered automatically. If the issues are resolved before the garbage collection (GC) of the incremental data, you can manually resume the failed changefeed. The default Time-To-Live (TTL) duration for incremental data is 24 hours, which means that the GC mechanism does not delete any data within 24 hours after the changefeed is interrupted. 
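+
+If a changefeed stays in the `FAILED` state, it can help to confirm whether the data at the changefeed checkpoint is still retained before you try to resume it. The following is a minimal sketch that only reads the GC-related settings from a SQL client; the connection parameters are placeholders for your own TiDB instance.
+
+```shell
+# Placeholder connection parameters; replace them with the endpoint and account of your TiDB instance.
+# Show the configured GC lifetime.
+mysql -h <tidb-endpoint> -P 4000 -u <user> -p -e "SHOW VARIABLES LIKE 'tidb_gc_life_time';"
+# Show the GC safe point and related status recorded by TiDB.
+mysql -h <tidb-endpoint> -P 4000 -u <user> -p -e "SELECT VARIABLE_NAME, VARIABLE_VALUE FROM mysql.tidb WHERE VARIABLE_NAME LIKE 'tikv_gc%';"
+```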
diff --git a/tidb-cloud/changefeed-sink-to-apache-kafka-premium.md b/tidb-cloud/changefeed-sink-to-apache-kafka-premium.md new file mode 100644 index 0000000000000..5e632b7c5c43c --- /dev/null +++ b/tidb-cloud/changefeed-sink-to-apache-kafka-premium.md @@ -0,0 +1,218 @@ +--- +title: Sink to Apache Kafka +summary: This document explains how to create a changefeed to stream data from TiDB Cloud to Apache Kafka. It includes restrictions, prerequisites, and steps to configure the changefeed for Apache Kafka. The process involves setting up network connections, adding permissions for Kafka ACL authorization, and configuring the changefeed specification. +--- + +# Sink to Apache Kafka + +This document describes how to create a changefeed to stream data from TiDB Cloud to Apache Kafka. + +> **Note:** +> +> - For [{{{ .starter }}}](/tidb-cloud/select-cluster-tier.md#starter) and [{{{ .essential }}}](/tidb-cloud/select-cluster-tier.md#essential) instances, the changefeed feature is unavailable. + +## Restrictions + +- For each TiDB Cloud instance, you can create up to 100 changefeeds. +- Currently, TiDB Cloud does not support uploading self-signed TLS certificates to connect to Kafka brokers. +- Because TiDB Cloud uses TiCDC to establish changefeeds, it has the same [restrictions as TiCDC](https://docs.pingcap.com/tidb/stable/ticdc-overview#unsupported-scenarios). +- If the table to be replicated does not have a primary key or a non-null unique index, the absence of a unique constraint during replication could result in duplicated data being inserted downstream in some retry scenarios. + +## Prerequisites + +Before creating a changefeed to stream data to Apache Kafka, you need to complete the following prerequisites: + +- Set up your network connection +- Add permissions for Kafka ACL authorization + +### Network + +Ensure that your TiDB instance can connect to the Apache Kafka service. You can choose one of the following connection methods: + +- Private Connect: ideal for avoiding VPC CIDR conflicts and meeting security compliance, but incurs additional [Private Data Link Cost](/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md#private-data-link-cost). +- VPC Peering: suitable as a cost-effective option, but requires managing potential VPC CIDR conflicts and security considerations. +- Public IP: suitable for a quick setup. + + +
+ +Private Connect leverages **Private Link** or **Private Service Connect** technologies from cloud providers to enable resources in your VPC to connect to services in other VPCs using private IP addresses, as if those services were hosted directly within your VPC. + +TiDB Cloud currently supports Private Connect only for self-hosted Kafka. It does not support direct integration with MSK, Confluent Kafka, or other Kafka SaaS services. To connect to these Kafka SaaS services via Private Connect, you can deploy a [kafka-proxy](https://github.com/grepplabs/kafka-proxy) as an intermediary, effectively exposing the Kafka service as self-hosted Kafka. + +- If your Apache Kafka service is hosted on AWS, follow [Set Up Self-Hosted Kafka Private Link Service in AWS](/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service-premium.md) to configure the network connection and obtain the **Bootstrap Ports** information, and then follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/set-up-sink-private-endpoint-premium.md) to create a private endpoint. + +
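+
+For reference, the following is a minimal, untested sketch of running [kafka-proxy](https://github.com/grepplabs/kafka-proxy) on a VM in front of a Kafka SaaS cluster. The broker addresses and local ports are placeholders; you still need to expose the proxy ports through a load balancer and Private Link Service as described in the linked setup guides.
+
+```shell
+# Placeholder mappings in the form "remote-broker:port,local-listen-address:port".
+# Replace the remote broker endpoints and local ports with your own values.
+./kafka-proxy server \
+    --bootstrap-server-mapping "broker1.your-kafka-saas.example.com:9092,0.0.0.0:9093" \
+    --bootstrap-server-mapping "broker2.your-kafka-saas.example.com:9092,0.0.0.0:9094" \
+    --bootstrap-server-mapping "broker3.your-kafka-saas.example.com:9092,0.0.0.0:9095"
+```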
+ +
+ +If you want to provide Public IP access to your Apache Kafka service, assign Public IP addresses to all your Kafka brokers. + +It is **NOT** recommended to use Public IP in a production environment. + +
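+
+Before creating the changefeed, you can optionally verify that the brokers are reachable on their public addresses, for example with the `kafka-broker-api-versions.sh` tool shipped with Kafka. The broker addresses and the Kafka installation path below are placeholders.
+
+```shell
+# Placeholder public broker addresses; replace them with your own.
+./kafka_2.13-3.7.1/bin/kafka-broker-api-versions.sh \
+    --bootstrap-server "<broker1-public-ip>:9092,<broker2-public-ip>:9092,<broker3-public-ip>:9092"
+```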
+
+ +To submit an VPC Peering request, perform the steps in [TiDB Cloud Support](/tidb-cloud/tidb-cloud-support.md) to contact our support team. + +
+
+ +### Kafka ACL authorization + +To allow TiDB Cloud changefeeds to stream data to Apache Kafka and create Kafka topics automatically, ensure that the following permissions are added in Kafka: + +- The `Create` and `Write` permissions are added for the topic resource type in Kafka. +- The `DescribeConfigs` permission is added for the instance resource type in Kafka. + +For example, if your Kafka instance is in Confluent Cloud, you can see [Resources](https://docs.confluent.io/platform/current/kafka/authorization.html#resources) and [Adding ACLs](https://docs.confluent.io/platform/current/kafka/authorization.html#adding-acls) in Confluent documentation for more information. + +## Step 1. Open the Changefeed page for Apache Kafka + +1. Log in to the [TiDB Cloud console](https://tidbcloud.com). +2. Navigate to the instance overview page of the target TiDB instance, and then click **Data** > **Changefeed** in the left navigation pane. +3. Click **Create Changefeed**, and select **Kafka** as **Destination**. + +## Step 2. Configure the changefeed target + +The steps vary depending on the connectivity method you select. + + +
+ +1. In **Connectivity Method**, select **VPC Peering** or **Public IP**, fill in your Kafka brokers endpoints. You can use commas `,` to separate multiple endpoints. +2. Select an **Authentication** option according to your Kafka authentication configuration. + + - If your Kafka does not require authentication, keep the default option **Disable**. + - If your Kafka requires authentication, select the corresponding authentication type, and then fill in the **user name** and **password** of your Kafka account for authentication. + +3. Select your **Kafka Version**. If you do not know which one to use, use **Kafka v2**. +4. Select a **Compression** type for the data in this changefeed. +5. Enable the **TLS Encryption** option if your Kafka has enabled TLS encryption and you want to use TLS encryption for the Kafka connection. +6. Click **Next** to test the network connection. If the test succeeds, you will be directed to the next page. + +
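+
+If your Kafka cluster uses SASL/SCRAM authentication, you might create a dedicated credential for the changefeed before filling in the **Authentication** fields. The following is a hedged sketch using Kafka's `kafka-configs.sh`; the bootstrap address, user name, and password are placeholders, and your cluster might use a different SASL mechanism.
+
+```shell
+# Placeholder values; adjust the bootstrap server, user name, and password for your cluster.
+./kafka_2.13-3.7.1/bin/kafka-configs.sh --bootstrap-server <broker-address>:9092 \
+    --alter --add-config 'SCRAM-SHA-256=[password=<changefeed-password>]' \
+    --entity-type users --entity-name ticdc-changefeed
+```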
+
+ +1. In **Connectivity Method**, select **Private Link**. +2. In **Private Endpoint**, select the private endpoint that you created in the [Network](#network) section. Make sure the AZs of the private endpoint match the AZs of the Kafka deployment. +3. Fill in the **Bootstrap Ports** that you obtained from the [Network](#network) section. It is recommended that you set at least one port for one AZ. You can use commas `,` to separate multiple ports. +4. Select an **Authentication** option according to your Kafka authentication configuration. + + - If your Kafka does not require authentication, keep the default option **Disable**. + - If your Kafka requires authentication, select the corresponding authentication type, and then fill in the **user name** and **password** of your Kafka account for authentication. +5. Select your **Kafka Version**. If you do not know which one to use, use **Kafka v2**. +6. Select a **Compression** type for the data in this changefeed. +7. Enable the **TLS Encryption** option if your Kafka has enabled TLS encryption and you want to use TLS encryption for the Kafka connection. +8. Click **Next** to test the network connection. If the test succeeds, you will be directed to the next page. +9. TiDB Cloud creates the endpoint for **Private Link**, which might take several minutes. +10. Once the endpoint is created, log in to your cloud provider console and accept the connection request. +11. Return to the [TiDB Cloud console](https://tidbcloud.com) to confirm that you have accepted the connection request. TiDB Cloud will test the connection and proceed to the next page if the test succeeds. + +
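+
+If you prefer the AWS CLI over the console for step 10, the following is a minimal sketch for finding and accepting the pending connection request on your endpoint service. The service ID and endpoint ID are placeholders.
+
+```shell
+# List pending connection requests on your endpoint service (placeholder service ID).
+aws ec2 describe-vpc-endpoint-connections \
+    --filters "Name=service-id,Values=<vpce-svc-xxxxxxxxxxxxxxxxx>"
+
+# Accept the connection request from TiDB Cloud (placeholder IDs).
+aws ec2 accept-vpc-endpoint-connections \
+    --service-id <vpce-svc-xxxxxxxxxxxxxxxxx> \
+    --vpc-endpoint-ids <vpce-xxxxxxxxxxxxxxxxx>
+```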
+ +
+ +## Step 3. Set the changefeed + +1. Customize **Table Filter** to filter the tables that you want to replicate. For the rule syntax, refer to [table filter rules](/table-filter.md). + + - **Case Sensitive**: you can set whether the matching of database and table names in filter rules is case-sensitive. By default, matching is case-insensitive. + - **Filter Rules**: you can set filter rules in this column. By default, there is a rule `*.*`, which stands for replicating all tables. When you add a new rule, TiDB Cloud queries all the tables in TiDB and displays only the tables that match the rules in the box on the right. You can add up to 100 filter rules. + - **Tables with valid keys**: this column displays the tables that have valid keys, including primary keys or unique indexes. + - **Tables without valid keys**: this column shows tables that lack primary keys or unique keys. These tables present a challenge during replication because the absence of a unique identifier can result in inconsistent data when the downstream handles duplicate events. To ensure data consistency, it is recommended to add unique keys or primary keys to these tables before initiating the replication. Alternatively, you can add filter rules to exclude these tables. For example, you can exclude the table `test.tbl1` by using the rule `"!test.tbl1"`. + +2. Customize **Event Filter** to filter the events that you want to replicate. + + - **Tables matching**: you can set which tables the event filter will be applied to in this column. The rule syntax is the same as that used for the preceding **Table Filter** area. You can add up to 10 event filter rules per changefeed. + - **Event Filter**: you can use the following event filters to exclude specific events from the changefeed: + - **Ignore event**: excludes specified event types. + - **Ignore SQL**: excludes DDL events that match specified expressions. For example, `^drop` excludes statements starting with `DROP`, and `add column` excludes statements containing `ADD COLUMN`. + - **Ignore insert value expression**: excludes `INSERT` statements that meet specific conditions. For example, `id >= 100` excludes `INSERT` statements where `id` is greater than or equal to 100. + - **Ignore update new value expression**: excludes `UPDATE` statements where the new value matches a specified condition. For example, `gender = 'male'` excludes updates that result in `gender` being `male`. + - **Ignore update old value expression**: excludes `UPDATE` statements where the old value matches a specified condition. For example, `age < 18` excludes updates where the old value of `age` is less than 18. + - **Ignore delete value expression**: excludes `DELETE` statements that meet a specified condition. For example, `name = 'john'` excludes `DELETE` statements where `name` is `'john'`. + +3. Customize **Column Selector** to select columns from events and send only the data changes related to those columns to the downstream. + + - **Tables matching**: specify which tables the column selector applies to. For tables that do not match any rule, all columns are sent. + - **Column Selector**: specify which columns of the matched tables will be sent to the downstream. + + For more information about the matching rules, see [Column selectors](https://docs.pingcap.com/tidb/stable/ticdc-sink-to-kafka/#column-selectors). + +4. In the **Data Format** area, select your desired format of Kafka messages. 
+ + - Avro is a compact, fast, and binary data format with rich data structures, which is widely used in various flow systems. For more information, see [Avro data format](https://docs.pingcap.com/tidb/stable/ticdc-avro-protocol). + - Canal-JSON is a plain JSON text format, which is easy to parse. For more information, see [Canal-JSON data format](https://docs.pingcap.com/tidb/stable/ticdc-canal-json). + - Open Protocol is a row-level data change notification protocol that provides data sources for monitoring, caching, full-text indexing, analysis engines, and primary-secondary replication between different databases. For more information, see [Open Protocol data format](https://docs.pingcap.com/tidb/stable/ticdc-open-protocol). + - Debezium is a tool for capturing database changes. It converts each captured database change into a message called an "event" and sends these events to Kafka. For more information, see [Debezium data format](https://docs.pingcap.com/tidb/stable/ticdc-debezium). + +5. Enable the **TiDB Extension** option if you want to add TiDB-extension fields to the Kafka message body. + + For more information about TiDB-extension fields, see [TiDB extension fields in Avro data format](https://docs.pingcap.com/tidb/stable/ticdc-avro-protocol#tidb-extension-fields) and [TiDB extension fields in Canal-JSON data format](https://docs.pingcap.com/tidb/stable/ticdc-canal-json#tidb-extension-field). + +6. If you select **Avro** as your data format, you will see some Avro-specific configurations on the page. You can fill in these configurations as follows: + + - In the **Decimal** and **Unsigned BigInt** configurations, specify how TiDB Cloud handles the decimal and unsigned bigint data types in Kafka messages. + - In the **Schema Registry** area, fill in your schema registry endpoint. If you enable **HTTP Authentication**, the fields for user name and password are displayed and automatically filled in with your TiDB instance endpoint and password. + +7. In the **Topic Distribution** area, select a distribution mode, and then fill in the topic name configurations according to the mode. + + If you select **Avro** as your data format, you can only choose the **Distribute changelogs by table to Kafka Topics** mode in the **Distribution Mode** drop-down list. + + The distribution mode controls how the changefeed creates Kafka topics, by table, by database, or creating one topic for all changelogs. + + - **Distribute changelogs by table to Kafka Topics** + + If you want the changefeed to create a dedicated Kafka topic for each table, choose this mode. Then, all Kafka messages of a table are sent to a dedicated Kafka topic. You can customize topic names for tables by setting a topic prefix, a separator between a database name and table name, and a suffix. For example, if you set the separator as `_`, the topic names are in the format of `_`. + + For changelogs of non-row events, such as Create Schema Event, you can specify a topic name in the **Default Topic Name** field. The changefeed will create a topic accordingly to collect such changelogs. + + - **Distribute changelogs by database to Kafka Topics** + + If you want the changefeed to create a dedicated Kafka topic for each database, choose this mode. Then, all Kafka messages of a database are sent to a dedicated Kafka topic. You can customize topic names of databases by setting a topic prefix and a suffix. + + For changelogs of non-row events, such as Resolved Ts Event, you can specify a topic name in the **Default Topic Name** field. 
The changefeed will create a topic accordingly to collect such changelogs. + + - **Send all changelogs to one specified Kafka Topic** + + If you want the changefeed to create one Kafka topic for all changelogs, choose this mode. Then, all Kafka messages in the changefeed will be sent to one Kafka topic. You can define the topic name in the **Topic Name** field. + +8. In the **Partition Distribution** area, you can decide which partition a Kafka message will be sent to. You can define **a single partition dispatcher for all tables**, or **different partition dispatchers for different tables**. TiDB Cloud provides four types of dispatchers: + + - **Distribute changelogs by primary key or index value to Kafka partition** + + If you want the changefeed to send Kafka messages of a table to different partitions, choose this distribution method. The primary key or index value of a row changelog will determine which partition the changelog is sent to. This distribution method provides a better partition balance and ensures row-level orderliness. + + - **Distribute changelogs by table to Kafka partition** + + If you want the changefeed to send Kafka messages of a table to one Kafka partition, choose this distribution method. The table name of a row changelog will determine which partition the changelog is sent to. This distribution method ensures table orderliness but might cause unbalanced partitions. + + - **Distribute changelogs by timestamp to Kafka partition** + + If you want the changefeed to send Kafka messages to different Kafka partitions randomly, choose this distribution method. The commitTs of a row changelog will determine which partition the changelog is sent to. This distribution method provides a better partition balance and ensures orderliness in each partition. However, multiple changes of a data item might be sent to different partitions and the consumer progress of different consumers might be different, which might cause data inconsistency. Therefore, the consumer needs to sort the data from multiple partitions by commitTs before consuming. + + - **Distribute changelogs by column value to Kafka partition** + + If you want the changefeed to send Kafka messages of a table to different partitions, choose this distribution method. The specified column values of a row changelog will determine which partition the changelog is sent to. This distribution method ensures orderliness in each partition and guarantees that the changelog with the same column values is send to the same partition. + +9. In the **Topic Configuration** area, configure the following numbers. The changefeed will automatically create the Kafka topics according to the numbers. + + - **Replication Factor**: controls how many Kafka servers each Kafka message is replicated to. The valid value ranges from [`min.insync.replicas`](https://kafka.apache.org/33/documentation.html#brokerconfigs_min.insync.replicas) to the number of Kafka brokers. + - **Partition Number**: controls how many partitions exist in a topic. The valid value range is `[1, 10 * the number of Kafka brokers]`. + +10. In the **Split Event** area, choose whether to split `UPDATE` events into separate `DELETE` and `INSERT` events or keep as raw `UPDATE` events. For more information, see [Split primary or unique key UPDATE events for non-MySQL sinks](https://docs.pingcap.com/tidb/stable/ticdc-split-update-behavior/#split-primary-or-unique-key-update-events-for-non-mysql-sinks). + +11. Click **Next**. + +## Step 4. 
Configure your changefeed specification + +1. In the **Changefeed Specification** area, specify the number of Changefeed Capacity Units (CCUs) to be used by the changefeed. +2. In the **Changefeed Name** area, specify a name for the changefeed. +3. Click **Next** to check the configurations you set and go to the next page. + +## Step 5. Review the configurations + +On this page, you can review all the changefeed configurations that you set. + +If you find any error, you can go back to fix the error. If there is no error, you can click the check box at the bottom, and then click **Create** to create the changefeed. diff --git a/tidb-cloud/changefeed-sink-to-mysql-premium.md b/tidb-cloud/changefeed-sink-to-mysql-premium.md new file mode 100644 index 0000000000000..4f36d226aec93 --- /dev/null +++ b/tidb-cloud/changefeed-sink-to-mysql-premium.md @@ -0,0 +1,151 @@ +--- +title: Sink to MySQL +summary: This document explains how to stream data from TiDB Cloud to MySQL using the Sink to MySQL changefeed. It includes restrictions, prerequisites, and steps to create a MySQL sink for data replication. The process involves setting up network connections, loading existing data to MySQL, and creating target tables in MySQL. After completing the prerequisites, users can create a MySQL sink to replicate data to MySQL. +--- + +# Sink to MySQL + +This document describes how to stream data from TiDB Cloud to MySQL using the **Sink to MySQL** changefeed. + +> **Note:** +> +> - For [{{{ .starter }}}](/tidb-cloud/select-cluster-tier.md#starter) and [{{{ .essential }}}](/tidb-cloud/select-cluster-tier.md#essential) instances, the changefeed feature is unavailable. + +## Restrictions + +- For each TiDB Cloud instance, you can create up to 100 changefeeds. +- Because TiDB Cloud uses TiCDC to establish changefeeds, it has the same [restrictions as TiCDC](https://docs.pingcap.com/tidb/stable/ticdc-overview#unsupported-scenarios). +- If the table to be replicated does not have a primary key or a non-null unique index, the absence of a unique constraint during replication could result in duplicated data being inserted downstream in some retry scenarios. + +## Prerequisites + +Before creating a changefeed, you need to complete the following prerequisites: + +- Set up your network connection +- Export and load the existing data to MySQL (optional) +- Create corresponding target tables in MySQL if you do not load the existing data and only want to replicate incremental data to MySQL + +### Network + +Make sure that your TiDB Cloud instance can connect to the MySQL service. + + +
+ +To submit an VPC Peering request, perform the steps in [TiDB Cloud Support](/tidb-cloud/tidb-cloud-support.md) to contact our support team. + +
+ +
+ +Private endpoints leverage **Private Link** or **Private Service Connect** technologies from cloud providers, enabling resources in your VPC to connect to services in other VPCs through private IP addresses, as if those services were hosted directly within your VPC. + +You can connect your TiDB Cloud instance to your MySQL service securely through a private endpoint. If the private endpoint is not available for your MySQL service, follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/set-up-sink-private-endpoint-premium.md) to create one. + +
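+
+Before creating the changefeed, you might also verify from a host inside the MySQL VPC that the MySQL service accepts connections and that the replication account has sufficient privileges. A minimal sketch with placeholder connection parameters:
+
+```shell
+# Placeholder connection parameters; replace them with your MySQL endpoint and account.
+mysql -h <mysql-host> -P 3306 -u <replication-user> -p -e "SELECT VERSION();"
+mysql -h <mysql-host> -P 3306 -u <replication-user> -p -e "SHOW GRANTS FOR CURRENT_USER();"
+```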
+ +
+ +### Load existing data (optional) + +The **Sink to MySQL** connector can only sink incremental data from your TiDB instance to MySQL after a certain timestamp. If you already have data in your TiDB instance, you can export and load the existing data of your TiDB instance into MySQL before enabling **Sink to MySQL**. + +To load the existing data: + +1. Extend the [tidb_gc_life_time](https://docs.pingcap.com/tidb/stable/system-variables#tidb_gc_life_time-new-in-v50) to be longer than the total time of the following two operations, so that historical data during the time is not garbage collected by TiDB. + + - The time to export and import the existing data + - The time to create **Sink to MySQL** + + For example: + + {{< copyable "sql" >}} + + ```sql + SET GLOBAL tidb_gc_life_time = '720h'; + ``` + +2. Use [Dumpling](https://docs.pingcap.com/tidb/stable/dumpling-overview) to export data from your TiDB instance, then use community tools such as [mydumper/myloader](https://centminmod.com/mydumper.html) to load data to the MySQL service. + +3. From the [exported files of Dumpling](https://docs.pingcap.com/tidb/stable/dumpling-overview#format-of-exported-files), get the start position of MySQL sink from the metadata file: + + The following is a part of an example metadata file. The `Pos` of `SHOW MASTER STATUS` is the TSO of the existing data, which is also the start position of MySQL sink. + + ``` + Started dump at: 2020-11-10 10:40:19 + SHOW MASTER STATUS: + Log: tidb-binlog + Pos: 420747102018863124 + Finished dump at: 2020-11-10 10:40:20 + ``` + +### Create target tables in MySQL + +If you do not load the existing data, you need to create corresponding target tables in MySQL manually to store the incremental data from TiDB. Otherwise, the data will not be replicated. + +## Create a MySQL sink + +After completing the prerequisites, you can sink your data to MySQL. + +1. Navigate to the instance overview page of the target TiDB instance, and then click **Data** > **Changefeed** in the left navigation pane. + +2. Click **Create Changefeed**, and select **MySQL** as **Destination**. + +3. In **Connectivity Method**, choose the method to connect to your MySQL service. + + - If you choose **VPC Peering** or **Public IP**, fill in your MySQL endpoint. + - If you choose **Private Link**, select the private endpoint that you created in the [Network](#network) section, and then fill in the MySQL port for your MySQL service. + +4. In **Authentication**, fill in the MySQL user name and password of your MySQL service. + +5. Click **Next** to test whether TiDB can connect to MySQL successfully: + + - If yes, you are directed to the next step of configuration. + - If not, a connectivity error is displayed, and you need to handle the error. After the error is resolved, click **Next** again. + +6. Customize **Table Filter** to filter the tables that you want to replicate. For the rule syntax, refer to [table filter rules](/table-filter.md). + + - **Case Sensitive**: you can set whether the matching of database and table names in filter rules is case-sensitive. By default, matching is case-insensitive. + - **Filter Rules**: you can set filter rules in this column. By default, there is a rule `*.*`, which stands for replicating all tables. When you add a new rule, TiDB Cloud queries all the tables in TiDB and displays only the tables that match the rules in the box on the right. You can add up to 100 filter rules. 
+ - **Tables with valid keys**: this column displays the tables that have valid keys, including primary keys or unique indexes. + - **Tables without valid keys**: this column shows tables that lack primary keys or unique keys. These tables present a challenge during replication because the absence of a unique identifier can result in inconsistent data when the downstream handles duplicate events. To ensure data consistency, it is recommended to add unique keys or primary keys to these tables before initiating the replication. Alternatively, you can add filter rules to exclude these tables. For example, you can exclude the table `test.tbl1` by using the rule `"!test.tbl1"`. + +7. Customize **Event Filter** to filter the events that you want to replicate. + + - **Tables matching**: you can set which tables the event filter will be applied to in this column. The rule syntax is the same as that used for the preceding **Table Filter** area. You can add up to 10 event filter rules per changefeed. + - **Event Filter**: you can use the following event filters to exclude specific events from the changefeed: + - **Ignore event**: excludes specified event types. + - **Ignore SQL**: excludes DDL events that match specified expressions. For example, `^drop` excludes statements starting with `DROP`, and `add column` excludes statements containing `ADD COLUMN`. + - **Ignore insert value expression**: excludes `INSERT` statements that meet specific conditions. For example, `id >= 100` excludes `INSERT` statements where `id` is greater than or equal to 100. + - **Ignore update new value expression**: excludes `UPDATE` statements where the new value matches a specified condition. For example, `gender = 'male'` excludes updates that result in `gender` being `male`. + - **Ignore update old value expression**: excludes `UPDATE` statements where the old value matches a specified condition. For example, `age < 18` excludes updates where the old value of `age` is less than 18. + - **Ignore delete value expression**: excludes `DELETE` statements that meet a specified condition. For example, `name = 'john'` excludes `DELETE` statements where `name` is `'john'`. + +8. In **Start Replication Position**, configure the starting position for your MySQL sink. + + - If you have [loaded the existing data](#load-existing-data-optional) using Dumpling, select **Start replication from a specific TSO** and fill in the TSO that you get from Dumpling exported metadata files. + - If you do not have any data in the upstream TiDB instance, select **Start replication from now on**. + - Otherwise, you can customize the start time point by choosing **Start replication from a specific time**. + +9. Click **Next** to configure your changefeed specification. + + - In the **Changefeed Specification** area, specify the number of Changefeed Capacity Units (CCUs) to be used by the changefeed. + - In the **Changefeed Name** area, specify a name for the changefeed. + +10. Click **Next** to review the changefeed configuration. + + If you confirm that all configurations are correct, check the compliance of cross-region replication, and click **Create**. + + If you want to modify some configurations, click **Previous** to go back to the previous configuration page. + +11. The sink starts soon, and you can see the status of the sink changes from **Creating** to **Running**. + + Click the changefeed name, and you can see more details about the changefeed, such as the checkpoint, replication latency, and other metrics. + +12. 
If you have [loaded the existing data](#load-existing-data-optional) using Dumpling, you need to restore the GC time to its original value (the default value is `10m`) after the sink is created: + +{{< copyable "sql" >}} + +```sql +SET GLOBAL tidb_gc_life_time = '10m'; +``` diff --git a/tidb-cloud/set-up-sink-private-endpoint-premium.md b/tidb-cloud/set-up-sink-private-endpoint-premium.md new file mode 100644 index 0000000000000..1865e2d88b798 --- /dev/null +++ b/tidb-cloud/set-up-sink-private-endpoint-premium.md @@ -0,0 +1,99 @@ +--- +title: Set Up Private Endpoint for Changefeeds +summary: Learn how to set up a private endpoint for changefeeds. +--- + +# Set Up Private Endpoint for Changefeeds + +This document describes how to create a private endpoint for changefeeds in your TiDB Cloud Premium instance, enabling you to securely stream data to self-hosted Kafka or MySQL through private connectivity. + +## Prerequisites + +- Check permissions for private endpoint creation +- Set up your network connection + +### Permissions + +Only users with any of the following roles in your organization can create private endpoints for changefeeds: + +- `Organization Owner` +- `Instance Admin` for corresponding instance + +For more information about roles in TiDB Cloud, see [User roles](/tidb-cloud/manage-user-access.md#user-roles). + +### Network + +Private endpoints leverage **Private Link** technology from cloud providers, enabling resources in your VPC to connect to services in other VPCs through private IP addresses, as if those services were hosted directly within your VPC. + + +
+ +If your changefeed downstream service is hosted on AWS, collect the following information: + +- The name of the Private Endpoint Service for your downstream service +- The availability zones (AZs) where your downstream service is deployed + +If the Private Endpoint Service is not available for your downstream service, follow [Step 2. Expose the Kafka instance as Private Link Service](/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service-premium.md#step-2-expose-the-kafka-instance-as-private-link-service) to set up the load balancer and the Private Link Service. + +
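+
+If you manage the downstream service with the AWS CLI, the following is a minimal sketch for collecting this information. The commands are standard AWS CLI calls; the fields you need are the endpoint service name and its availability zones.
+
+```shell
+# List your endpoint service configurations with their service names and availability zones.
+aws ec2 describe-vpc-endpoint-service-configurations \
+    --query 'ServiceConfigurations[].{Name:ServiceName,AZs:AvailabilityZones}'
+
+# Map availability zone names to AZ IDs in the current region.
+aws ec2 describe-availability-zones \
+    --query 'AvailabilityZones[].{Name:ZoneName,Id:ZoneId}'
+```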
+
+ +If your changefeed downstream service is hosted on Alibaba Cloud, collect the following information: + +- The name of the Private Endpoint Service for your downstream service +- The availability zones (AZs) where your downstream service is deployed + +
+ +
+ +## Step 1. Open the Networking page for your instance + +1. Log in to the [TiDB Cloud console](https://tidbcloud.com/). + +2. On the **instances** page, click the name of your target instance to go to its overview page. + + > **Tip:** + > + > You can use the combo box in the upper-left corner to switch between organizations and instances. + +3. In the left navigation pane, click **Settings** > **Networking**. + +## Step 2. Configure the private endpoint for changefeeds + +The configuration steps vary depending on the cloud provider where your instance is deployed. + + +
+ +1. On the **Networking** page, click **Create Private Endpoint** in the **AWS Private Endpoint for Changefeed** section. +2. In the **Create Private Endpoint for Changefeed** dialog, enter a name for the private endpoint. +3. Follow the reminder to authorize the [AWS Principal](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_elements_principal.html#principal-accounts) of TiDB Cloud to create an endpoint. +4. Enter the **Endpoint Service Name** that you collected in the [Network](#network) section. +5. Select the **Number of AZs**. Ensure that the number of AZs and the AZ IDs match your Kafka deployment. +6. If this private endpoint is created for Apache Kafka, enable the **Advertised Listener for Kafka** option. +7. Configure the advertised listener for Kafka using either the **TiDB Managed** domain or the **Custom** domain. + + - To use the **TiDB Managed** domain for advertised listeners, enter a unique string in the **Domain Pattern** field, and then click **Generate**. TiDB will generate broker addresses with subdomains for each availability zone. + - To use your own **Custom** domain for advertised listeners, switch the domain type to **Custom**, enter the root domain in the **Custom Domain** field, click **Check**, and then specify the broker subdomains for each availability zone. + +8. Click **Create** to validate the configurations and create the private endpoint. + +
+
+ +1. On the **Networking** page, click **Create Private Endpoint** in the **Alibaba Cloud Private Endpoint for Changefeed** section. +2. In the **Create Private Endpoint for Changefeed** dialog, enter a name for the private endpoint. +3. Follow the reminder to to whitelist TiDB Cloud's Alibaba Cloud account ID for your endpoint service to grant the TiDB Cloud VPC access. +4. Enter the **Endpoint Service Name** that you collected in the [Network](#network) section. +5. Select the **Number of AZs**. Ensure that the number of AZs and the AZ IDs match your Kafka deployment. +6. If this private endpoint is created for Apache Kafka, enable the **Advertised Listener for Kafka** option. +7. Configure the advertised listener for Kafka using either the **TiDB Managed** domain or the **Custom** domain. + + - To use the **TiDB Managed** domain for advertised listeners, enter a unique string in the **Domain Pattern** field, and then click **Generate**. TiDB will generate broker addresses with subdomains for each availability zone. + - To use your own **Custom** domain for advertised listeners, switch the domain type to **Custom**, enter the root domain in the **Custom Domain** field, click **Check**, and then specify the broker subdomains for each availability zone. + +8. Click **Create** to validate the configurations and create the private endpoint. + +
+
diff --git a/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service-premium.md b/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service-premium.md new file mode 100644 index 0000000000000..cfb8a5e994655 --- /dev/null +++ b/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service-premium.md @@ -0,0 +1,756 @@ +--- +title: Set Up Self-Hosted Kafka Private Link Service in AWS +summary: This document explains how to set up Private Link service for self-hosted Kafka in AWS and how to make it work with TiDB Cloud. +aliases: ['/tidbcloud/setup-self-hosted-kafka-private-link-service'] +--- + +# Set Up Self-Hosted Kafka Private Link Service in AWS + +This document describes how to set up Private Link service for self-hosted Kafka in AWS, and how to make it work with TiDB Cloud. + +The mechanism works as follows: + +1. The TiDB Cloud VPC connects to the Kafka VPC through private endpoints. +2. Kafka clients need to communicate directly to all Kafka brokers. +3. Each Kafka broker is mapped to a unique port of endpoints within the TiDB Cloud VPC. +4. Leverage the Kafka bootstrap mechanism and AWS resources to achieve the mapping. + +The following diagram shows the mechanism. + +![Connect to AWS Self-Hosted Kafka Private Link Service](/media/tidb-cloud/changefeed/connect-to-aws-self-hosted-kafka-privatelink-service.jpeg) + +The document provides an example of connecting to a Kafka Private Link service deployed across three availability zones (AZ) in AWS. While other configurations are possible based on similar port-mapping principles, this document covers the fundamental setup process of the Kafka Private Link service. For production environments, a more resilient Kafka Private Link service with enhanced operational maintainability and observability is recommended. + +## Prerequisites + +1. Ensure that you have the following authorization to set up a Kafka Private Link service in your own AWS account. + + - Manage EC2 nodes + - Manage VPC + - Manage subnets + - Manage security groups + - Manage load balancer + - Manage endpoint services + - Connect to EC2 nodes to configure Kafka nodes + +2. [Create a TiDB Cloud Premium instance](/tidb-cloud/create-tidb-cluster-premium.md) if you do not have one. + +3. Get the Kafka deployment information from your TiDB Cloud Premium instance. + + 1. In the [TiDB Cloud console](https://tidbcloud.com), navigate to the instance overview page of the TiDB instance, and then click **Data** > **Changefeed** in the left navigation pane. + 2. On the overview page, find the region of the TiDB instance. Ensure that your Kafka cluster will be deployed to the same region. + 3. To create a changefeed, refer to the tutorials: + + - [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md) + +Note down all the deployment information. You need to use it to configure your Kafka Private Link service later. + +The following table shows an example of the deployment information. + +| Information | Value | Note | +|--------|-----------------|---------------------------| +| Region | Oregon (`us-west-2`) | N/A | +| Principal of TiDB Cloud AWS Account | `arn:aws:iam:::root` | N/A | +| AZ IDs |
  • `usw2-az1`
  • `usw2-az2`
  • `usw2-az3`
| Align AZ IDs to AZ names in your AWS account.
Example:
  • `usw2-az1` => `us-west-2a`
  • `usw2-az2` => `us-west-2c`
  • `usw2-az3` => `us-west-2b`
| +| Kafka Advertised Listener Pattern | The unique random string: `abc`
Generated pattern for AZs:
  • `usw2-az1` => <broker_id>.usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:<port>
  • `usw2-az2` => <broker_id>.usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:<port>
  • `usw2-az3` => <broker_id>.usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:<port>
| Map AZ names to AZ-specified patterns. Make sure that you configure the right pattern to the broker in a specific AZ later.
  • `us-west-2a` => <broker_id>.usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:<port>
  • `us-west-2c` => <broker_id>.usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:<port>
  • `us-west-2b` => <broker_id>.usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:<port>
| + +## Step 1. Set up a Kafka cluster + +If you need to deploy a new cluster, follow the instructions in [Deploy a new Kafka cluster](#deploy-a-new-kafka-cluster). + +If you need to expose an existing cluster, follow the instructions in [Reconfigure a running Kafka cluster](#reconfigure-a-running-kafka-cluster). + +### Deploy a new Kafka cluster + +#### 1. Set up the Kafka VPC + +The Kafka VPC requires the following: + +- Three private subnets for brokers, one for each AZ. +- One public subnet in any AZ with a bastion node that can connect to the internet and three private subnets, which makes it easy to set up the Kafka cluster. In a production environment, you might have your own bastion node that can connect to the Kafka VPC. + +Before creating subnets, create subnets in AZs based on the mappings of AZ IDs and AZ names. Take the following mapping as an example. + +- `usw2-az1` => `us-west-2a` +- `usw2-az2` => `us-west-2c` +- `usw2-az3` => `us-west-2b` + +Create private subnets in the following AZs: + +- `us-west-2a` +- `us-west-2c` +- `us-west-2b` + +Take the following steps to create the Kafka VPC. + +**1.1. Create the Kafka VPC** + +1. Go to [AWS Console > VPC dashboard](https://console.aws.amazon.com/vpcconsole/home?#vpcs:), and switch to the region in which you want to deploy Kafka. + +2. Click **Create VPC**. Fill in the information on the **VPC settings** page as follows. + + 1. Select **VPC only**. + 2. Enter a tag in **Name tag**, for example, `Kafka VPC`. + 3. Select **IPv4 CIDR manual input**, and enter the IPv4 CIDR, for example, `10.0.0.0/16`. + 4. Use the default values for other options. Click **Create VPC**. + 5. On the VPC detail page, take note of the VPC ID, for example, `vpc-01f50b790fa01dffa`. + +**1.2. Create private subnets in the Kafka VPC** + +1. Go to the [Subnets Listing page](https://console.aws.amazon.com/vpcconsole/home?#subnets:). +2. Click **Create subnet**. +3. Select **VPC ID** (`vpc-01f50b790fa01dffa` in this example) that you noted down before. +4. Add three subnets with the following information. It is recommended that you put the AZ IDs in the subnet names to make it easy to configure the brokers later, because TiDB Cloud requires encoding the AZ IDs in the broker's `advertised.listener` configuration. + + - Subnet1 in `us-west-2a` + - **Subnet name**: `broker-usw2-az1` + - **Availability Zone**: `us-west-2a` + - **IPv4 subnet CIDR block**: `10.0.0.0/18` + + - Subnet2 in `us-west-2c` + - **Subnet name**: `broker-usw2-az2` + - **Availability Zone**: `us-west-2c` + - **IPv4 subnet CIDR block**: `10.0.64.0/18` + + - Subnet3 in `us-west-2b` + - **Subnet name**: `broker-usw2-az3` + - **Availability Zone**: `us-west-2b` + - **IPv4 subnet CIDR block**: `10.0.128.0/18` + +5. Click **Create subnet**. The **Subnets Listing** page is displayed. + +**1.3. Create the public subnet in the Kafka VPC** + +1. Click **Create subnet**. +2. Select **VPC ID** (`vpc-01f50b790fa01dffa` in this example) that you noted down before. +3. Add the public subnet in any AZ with the following information: + + - **Subnet name**: `bastion` + - **IPv4 subnet CIDR block**: `10.0.192.0/18` + +4. Configure the bastion subnet to the Public subnet. + + 1. Go to [VPC dashboard > Internet gateways](https://console.aws.amazon.com/vpcconsole/home#igws:). Create an Internet Gateway with the name `kafka-vpc-igw`. + 2. On the **Internet gateways Detail** page, in **Actions**, click **Attach to VPC** to attach the Internet Gateway to the Kafka VPC. + 3. 
Go to [VPC dashboard > Route tables](https://console.aws.amazon.com/vpcconsole/home#CreateRouteTable:). Create a route table to the Internet Gateway in Kafka VPC and add a new route with the following information: + + - **Name**: `kafka-vpc-igw-route-table` + - **VPC**: `Kafka VPC` + - **Route**: + - **Destination**: `0.0.0.0/0` + - **Target**: `Internet Gateway`, `kafka-vpc-igw` + + 4. Attach the route table to the bastion subnet. On the **Detail** page of the route table, click **Subnet associations > Edit subnet associations** to add the bastion subnet and save changes. + +#### 2. Set up Kafka brokers + +**2.1. Create a bastion node** + +Go to the [EC2 Listing page](https://console.aws.amazon.com/ec2/home#Instances:). Create the bastion node in the bastion subnet. + +- **Name**: `bastion-node` +- **Amazon Machine Image**: `Amazon linux` +- **Instance Type**: `t2.small` +- **Key pair**: `kafka-vpc-key-pair`. Create a new key pair named `kafka-vpc-key-pair`. Download **kafka-vpc-key-pair.pem** to your local for later configuration. +- Network settings + + - **VPC**: `Kafka VPC` + - **Subnet**: `bastion` + - **Auto-assign public IP**: `Enable` + - **Security Group**: create a new security group allow SSH login from anywhere. You can narrow the rule for safety in the production environment. + +**2.2. Create broker nodes** + +Go to the [EC2 Listing page](https://console.aws.amazon.com/ec2/home#Instances:). Create three broker nodes in broker subnets, one for each AZ. + +- Broker 1 in subnet `broker-usw2-az1` + + - **Name**: `broker-node1` + - **Amazon Machine Image**: `Amazon linux` + - **Instance Type**: `t2.large` + - **Key pair**: reuse `kafka-vpc-key-pair` + - Network settings + + - **VPC**: `Kafka VPC` + - **Subnet**: `broker-usw2-az1` + - **Auto-assign public IP**: `Disable` + - **Security Group**: create a new security group to allow all TCP from Kafka VPC. You can narrow the rule for safety in the production environment. + - **Protocol**: `TCP` + - **Port range**: `0 - 65535` + - **Source**: `10.0.0.0/16` + +- Broker 2 in subnet `broker-usw2-az2` + + - **Name**: `broker-node2` + - **Amazon Machine Image**: `Amazon linux` + - **Instance Type**: `t2.large` + - **Key pair**: reuse `kafka-vpc-key-pair` + - Network settings + + - **VPC**: `Kafka VPC` + - **Subnet**: `broker-usw2-az2` + - **Auto-assign public IP**: `Disable` + - **Security Group**: create a new security group to allow all TCP from Kafka VPC. You can narrow the rule for safety in the production environment. + - **Protocol**: `TCP` + - **Port range**: `0 - 65535` + - **Source**: `10.0.0.0/16` + +- Broker 3 in subnet `broker-usw2-az3` + + - **Name**: `broker-node3` + - **Amazon Machine Image**: `Amazon linux` + - **Instance Type**: `t2.large` + - **Key pair**: reuse `kafka-vpc-key-pair` + - Network settings + + - **VPC**: `Kafka VPC` + - **Subnet**: `broker-usw2-az3` + - **Auto-assign public IP**: `Disable` + - **Security Group**: create a new security group to allow all TCP from Kafka VPC. You can narrow the rule for safety in the production environment. + - **Protocol**: `TCP` + - **Port range**: `0 - 65535` + - **Source**: `10.0.0.0/16` + +**2.3. Prepare Kafka runtime binaries** + +1. Go to the detail page of the bastion node. Get the **Public IPv4 address**. Use SSH to log in to the node with the previously downloaded `kafka-vpc-key-pair.pem`. 
+ + ```shell + chmod 400 kafka-vpc-key-pair.pem + ssh -i "kafka-vpc-key-pair.pem" ec2-user@{bastion_public_ip} # replace {bastion_public_ip} with the IP address of your bastion node, for example, 54.186.149.187 + scp -i "kafka-vpc-key-pair.pem" kafka-vpc-key-pair.pem ec2-user@{bastion_public_ip}:~/ + ``` + +2. Download binaries. + + ```shell + # Download Kafka and OpenJDK, and then extract the files. You can choose the binary version based on your preference. + wget https://archive.apache.org/dist/kafka/3.7.1/kafka_2.13-3.7.1.tgz + tar -zxf kafka_2.13-3.7.1.tgz + wget https://download.java.net/java/GA/jdk22.0.2/c9ecb94cd31b495da20a27d4581645e8/9/GPL/openjdk-22.0.2_linux-x64_bin.tar.gz + tar -zxf openjdk-22.0.2_linux-x64_bin.tar.gz + ``` + +3. Copy binaries to each broker node. + + ```shell + # Replace {broker-node1-ip} with your broker-node1 IP address + scp -i "kafka-vpc-key-pair.pem" kafka_2.13-3.7.1.tgz ec2-user@{broker-node1-ip}:~/ + ssh -i "kafka-vpc-key-pair.pem" ec2-user@{broker-node1-ip} "tar -zxf kafka_2.13-3.7.1.tgz" + scp -i "kafka-vpc-key-pair.pem" openjdk-22.0.2_linux-x64_bin.tar.gz ec2-user@{broker-node1-ip}:~/ + ssh -i "kafka-vpc-key-pair.pem" ec2-user@{broker-node1-ip} "tar -zxf openjdk-22.0.2_linux-x64_bin.tar.gz" + + # Replace {broker-node2-ip} with your broker-node2 IP address + scp -i "kafka-vpc-key-pair.pem" kafka_2.13-3.7.1.tgz ec2-user@{broker-node2-ip}:~/ + ssh -i "kafka-vpc-key-pair.pem" ec2-user@{broker-node2-ip} "tar -zxf kafka_2.13-3.7.1.tgz" + scp -i "kafka-vpc-key-pair.pem" openjdk-22.0.2_linux-x64_bin.tar.gz ec2-user@{broker-node2-ip}:~/ + ssh -i "kafka-vpc-key-pair.pem" ec2-user@{broker-node2-ip} "tar -zxf openjdk-22.0.2_linux-x64_bin.tar.gz" + + # Replace {broker-node3-ip} with your broker-node3 IP address + scp -i "kafka-vpc-key-pair.pem" kafka_2.13-3.7.1.tgz ec2-user@{broker-node3-ip}:~/ + ssh -i "kafka-vpc-key-pair.pem" ec2-user@{broker-node3-ip} "tar -zxf kafka_2.13-3.7.1.tgz" + scp -i "kafka-vpc-key-pair.pem" openjdk-22.0.2_linux-x64_bin.tar.gz ec2-user@{broker-node3-ip}:~/ + ssh -i "kafka-vpc-key-pair.pem" ec2-user@{broker-node3-ip} "tar -zxf openjdk-22.0.2_linux-x64_bin.tar.gz" + ``` + +**2.4. Set up Kafka nodes on each broker node** + +**2.4.1 Set up a KRaft Kafka cluster with three nodes** + +Each node will act as a broker and controller role. Do the following for each broker: + +1. For the `listeners` item, all three brokers are the same and act as broker and controller roles: + + 1. Configure the same CONTROLLER listener for all **controller** role nodes. If you only want to add the **broker** role nodes, you do not need the CONTROLLER listener in `server.properties`. + 2. Configure two **broker** listeners, `INTERNAL` for internal access and `EXTERNAL` for external access from TiDB Cloud. + +2. For the `advertised.listeners` item, do the following: + + 1. Configure an INTERNAL advertised listener for every broker with the internal IP of the broker node. Advertised internal Kafka clients use this address to visit the broker. + 2. Configure an EXTERNAL advertised listener based on **Kafka Advertised Listener Pattern** you get from TiDB Cloud for each broker node to help TiDB Cloud differentiate between different brokers. Different EXTERNAL advertised listeners help the Kafka client from TiDB Cloud route requests to the right broker. + + - `` differentiates brokers from Kafka Private Link Service access points. Plan a port range for EXTERNAL advertised listeners of all brokers. These ports do not have to be actual ports listened to by brokers. 
They are ports listened to by the load balancer for Private Link Service that will forward requests to different brokers. + - `AZ ID` in **Kafka Advertised Listener Pattern** indicates where the broker is deployed. TiDB Cloud will route requests to different endpoint DNS names based on the AZ ID. + + It is recommended to configure different broker IDs for different brokers to make it easy for troubleshooting. + +3. The planning values are as follows: + + - **CONTROLLER port**: `29092` + - **INTERNAL port**: `9092` + - **EXTERNAL**: `39092` + - **EXTERNAL advertised listener ports range**: `9093~9095` + +**2.4.2. Create a configuration file** + +Use SSH to log in to every broker node. Create a configuration file `~/config/server.properties` with the following content. + +```properties +# brokers in usw2-az1 + +# broker-node1 ~/config/server.properties +# 1. Replace {broker-node1-ip}, {broker-node2-ip}, {broker-node3-ip} with the actual IP addresses. +# 2. Configure EXTERNAL in "advertised.listeners" based on the "Kafka Advertised Listener Pattern" in the "Prerequisites" section. +# 2.1 The pattern for AZ(ID: usw2-az1) is ".usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:". +# 2.2 So the EXTERNAL can be "b1.usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:9093". Replace with "b" prefix plus "node.id" properties, and replace with a unique port (9093) in the port range of the EXTERNAL advertised listener. +# 2.3 If there are more broker role nodes in the same AZ, you can configure them in the same way. +process.roles=broker,controller +node.id=1 +controller.quorum.voters=1@{broker-node1-ip}:29092,2@{broker-node2-ip}:29092,3@{broker-node3-ip}:29092 +listeners=INTERNAL://0.0.0.0:9092,CONTROLLER://0.0.0.0:29092,EXTERNAL://0.0.0.0:39092 +inter.broker.listener.name=INTERNAL +advertised.listeners=INTERNAL://{broker-node1-ip}:9092,EXTERNAL://b1.usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:9093 +controller.listener.names=CONTROLLER +listener.security.protocol.map=INTERNAL:PLAINTEXT,CONTROLLER:PLAINTEXT,EXTERNAL:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL +log.dirs=./data +``` + +```properties +# brokers in usw2-az2 + +# broker-node2 ~/config/server.properties +# 1. Replace {broker-node1-ip}, {broker-node2-ip}, {broker-node3-ip} with the actual IP addresses. +# 2. Configure EXTERNAL in "advertised.listeners" based on the "Kafka Advertised Listener Pattern" in the "Prerequisites" section. +# 2.1 The pattern for AZ(ID: usw2-az2) is ".usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:". +# 2.2 So the EXTERNAL can be "b2.usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:9094". Replace with "b" prefix plus "node.id" properties, and replace with a unique port (9094) in the port range of the EXTERNAL advertised listener. +# 2.3 If there are more broker role nodes in the same AZ, you can configure them in the same way. 
+process.roles=broker,controller +node.id=2 +controller.quorum.voters=1@{broker-node1-ip}:29092,2@{broker-node2-ip}:29092,3@{broker-node3-ip}:29092 +listeners=INTERNAL://0.0.0.0:9092,CONTROLLER://0.0.0.0:29092,EXTERNAL://0.0.0.0:39092 +inter.broker.listener.name=INTERNAL +advertised.listeners=INTERNAL://{broker-node2-ip}:9092,EXTERNAL://b2.usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:9094 +controller.listener.names=CONTROLLER +listener.security.protocol.map=INTERNAL:PLAINTEXT,CONTROLLER:PLAINTEXT,EXTERNAL:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL +log.dirs=./data +``` + +```properties +# brokers in usw2-az3 + +# broker-node3 ~/config/server.properties +# 1. Replace {broker-node1-ip}, {broker-node2-ip}, {broker-node3-ip} with the actual IP addresses. +# 2. Configure EXTERNAL in "advertised.listeners" based on the "Kafka Advertised Listener Pattern" in the "Prerequisites" section. +# 2.1 The pattern for AZ(ID: usw2-az3) is ".usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:". +# 2.2 So the EXTERNAL can be "b3.usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:9095". Replace with "b" prefix plus "node.id" properties, and replace with a unique port (9095) in the port range of the EXTERNAL advertised listener. +# 2.3 If there are more broker role nodes in the same AZ, you can configure them in the same way. +process.roles=broker,controller +node.id=3 +controller.quorum.voters=1@{broker-node1-ip}:29092,2@{broker-node2-ip}:29092,3@{broker-node3-ip}:29092 +listeners=INTERNAL://0.0.0.0:9092,CONTROLLER://0.0.0.0:29092,EXTERNAL://0.0.0.0:39092 +inter.broker.listener.name=INTERNAL +advertised.listeners=INTERNAL://{broker-node3-ip}:9092,EXTERNAL://b3.usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:9095 +controller.listener.names=CONTROLLER +listener.security.protocol.map=INTERNAL:PLAINTEXT,CONTROLLER:PLAINTEXT,EXTERNAL:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL +log.dirs=./data +``` + +**2.4.3 Start Kafka brokers** + +Create a script, and then execute it to start the Kafka broker in each broker node. + +```shell +#!/bin/bash + +# Get the directory of the current script +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +# Set JAVA_HOME to the Java installation within the script directory +export JAVA_HOME="$SCRIPT_DIR/jdk-22.0.2" +# Define the vars +KAFKA_DIR="$SCRIPT_DIR/kafka_2.13-3.7.1/bin" +KAFKA_STORAGE_CMD=$KAFKA_DIR/kafka-storage.sh +KAFKA_START_CMD=$KAFKA_DIR/kafka-server-start.sh +KAFKA_DATA_DIR=$SCRIPT_DIR/data +KAFKA_LOG_DIR=$SCRIPT_DIR/log +KAFKA_CONFIG_DIR=$SCRIPT_DIR/config + +# Cleanup step, which makes it easy for multiple experiments +# Find all Kafka process IDs +KAFKA_PIDS=$(ps aux | grep 'kafka.Kafka' | grep -v grep | awk '{print $2}') +if [ -z "$KAFKA_PIDS" ]; then + echo "No Kafka processes are running." +else + # Kill each Kafka process + echo "Killing Kafka processes with PIDs: $KAFKA_PIDS" + for PID in $KAFKA_PIDS; do + kill -9 $PID + echo "Killed Kafka process with PID: $PID" + done + echo "All Kafka processes have been killed." +fi + +rm -rf $KAFKA_DATA_DIR +mkdir -p $KAFKA_DATA_DIR +rm -rf $KAFKA_LOG_DIR +mkdir -p $KAFKA_LOG_DIR + +# Magic id: BRl69zcmTFmiPaoaANybiw, you can use your own +$KAFKA_STORAGE_CMD format -t "BRl69zcmTFmiPaoaANybiw" -c "$KAFKA_CONFIG_DIR/server.properties" > $KAFKA_LOG_DIR/server_format.log +LOG_DIR=$KAFKA_LOG_DIR nohup $KAFKA_START_CMD "$KAFKA_CONFIG_DIR/server.properties" & +``` + +**2.5. Test the cluster setting in the bastion node** + +1. Test the Kafka bootstrap. 
+ + ```shell + export JAVA_HOME=/home/ec2-user/jdk-22.0.2 + + # Bootstrap from INTERNAL listener + ./kafka_2.13-3.7.1/bin/kafka-broker-api-versions.sh --bootstrap-server {one_of_broker_ip}:9092 | grep 9092 + # Expected output (the actual order might be different) + {broker-node1-ip}:9092 (id: 1 rack: null) -> ( + {broker-node2-ip}:9092 (id: 2 rack: null) -> ( + {broker-node3-ip}:9092 (id: 3 rack: null) -> ( + + # Bootstrap from EXTERNAL listener + ./kafka_2.13-3.7.1/bin/kafka-broker-api-versions.sh --bootstrap-server {one_of_broker_ip}:39092 + # Expected output for the last 3 lines (the actual order might be different) + # The difference in the output from "bootstrap from INTERNAL listener" is that exceptions or errors might occur because advertised listeners cannot be resolved in Kafka VPC. + # We will make them resolvable in TiDB Cloud side and make it route to the right broker when you create a changefeed connect to this Kafka cluster by Private Link. + b1.usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:9093 (id: 1 rack: null) -> ERROR: org.apache.kafka.common.errors.DisconnectException + b2.usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:9094 (id: 2 rack: null) -> ERROR: org.apache.kafka.common.errors.DisconnectException + b3.usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:9095 (id: 3 rack: null) -> ERROR: org.apache.kafka.common.errors.DisconnectException + ``` + +2. Create a producer script `produce.sh` in the bastion node. + + ```shell + #!/bin/bash + BROKER_LIST=$1 # "{broker_address1},{broker_address2}..." + + # Get the directory of the current script + SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" + # Set JAVA_HOME to the Java installation within the script directory + export JAVA_HOME="$SCRIPT_DIR/jdk-22.0.2" + # Define the Kafka directory + KAFKA_DIR="$SCRIPT_DIR/kafka_2.13-3.7.1/bin" + TOPIC="test-topic" + + # Create a topic if it does not exist + create_topic() { + echo "Creating topic if it does not exist..." + $KAFKA_DIR/kafka-topics.sh --create --topic $TOPIC --bootstrap-server $BROKER_LIST --if-not-exists --partitions 3 --replication-factor 3 + } + + # Produce messages to the topic + produce_messages() { + echo "Producing messages to the topic..." + for ((chrono=1; chrono <= 10; chrono++)); do + message="Test message "$chrono + echo "Create "$message + echo $message | $KAFKA_DIR/kafka-console-producer.sh --broker-list $BROKER_LIST --topic $TOPIC + done + } + create_topic + produce_messages + ``` + +3. Create a consumer script `consume.sh` in the bastion node. + + ```shell + #!/bin/bash + + BROKER_LIST=$1 # "{broker_address1},{broker_address2}..." + + # Get the directory of the current script + SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" + # Set JAVA_HOME to the Java installation within the script directory + export JAVA_HOME="$SCRIPT_DIR/jdk-22.0.2" + # Define the Kafka directory + KAFKA_DIR="$SCRIPT_DIR/kafka_2.13-3.7.1/bin" + TOPIC="test-topic" + CONSUMER_GROUP="test-group" + # Consume messages from the topic + consume_messages() { + echo "Consuming messages from the topic..." + $KAFKA_DIR/kafka-console-consumer.sh --bootstrap-server $BROKER_LIST --topic $TOPIC --from-beginning --timeout-ms 5000 --consumer-property group.id=$CONSUMER_GROUP + } + consume_messages + ``` + +4. Execute `produce.sh` and `consume.sh` to verify that the Kafka cluster is running. These scripts will also be reused for later network connection testing. The script will create a topic with `--partitions 3 --replication-factor 3`. 
Ensure that all these three brokers contain data. Ensure that the script will connect to all three brokers to guarantee that network connection will be tested. + + ```shell + # Test write message. + ./produce.sh {one_of_broker_ip}:9092 + ``` + + ```shell + # Expected output + Creating topic if it does not exist... + + Producing messages to the topic... + Create Test message 1 + >>Create Test message 2 + >>Create Test message 3 + >>Create Test message 4 + >>Create Test message 5 + >>Create Test message 6 + >>Create Test message 7 + >>Create Test message 8 + >>Create Test message 9 + >>Create Test message 10 + ``` + + ```shell + # Test read message + ./consume.sh {one_of_broker_ip}:9092 + ``` + + ```shell + # Expected example output (the actual message order might be different) + Consuming messages from the topic... + Test message 3 + Test message 4 + Test message 5 + Test message 9 + Test message 10 + Test message 6 + Test message 8 + Test message 1 + Test message 2 + Test message 7 + [2024-11-01 08:54:27,547] ERROR Error processing message, terminating consumer process: (kafka.tools.ConsoleConsumer$) + org.apache.kafka.common.errors.TimeoutException + Processed a total of 10 messages + ``` + +### Reconfigure a running Kafka cluster + +Ensure that your Kafka cluster is deployed in the same region and AZs as the TiDB instance. If any brokers are in different AZs, move them to the correct ones. + +#### 1. Configure the EXTERNAL listener for brokers + +The following configuration applies to a Kafka KRaft cluster. The ZK mode configuration is similar. + +1. Plan configuration changes. + + 1. Configure an EXTERNAL **listener** for every broker for external access from TiDB Cloud. Select a unique port as the EXTERNAL port, for example, `39092`. + 2. Configure an EXTERNAL **advertised listener** based on **Kafka Advertised Listener Pattern** you get from TiDB Cloud for every broker node to help TiDB Cloud differentiate between different brokers. Different EXTERNAL advertised listeners help Kafka clients from TiDB Cloud route requests to the right broker. + + - `` differentiates brokers from Kafka Private Link Service access points. Plan a port range for EXTERNAL advertised listeners of all brokers, for example, `range from 9093`. These ports do not have to be actual ports listened to by brokers. They are ports listened to by the load balancer for Private Link Service that will forward requests to different brokers. + - `AZ ID` in **Kafka Advertised Listener Pattern** indicates where the broker is deployed. TiDB Cloud will route requests to different endpoint DNS names based on the AZ ID. + + It is recommended to configure different broker IDs for different brokers to make it easy for troubleshooting. + +2. Use SSH to log in to each broker node. Modify the configuration file of each broker with the following content: + + ```properties + # brokers in usw2-az1 + + # Add EXTERNAL listener + listeners=INTERNAL:...,EXTERNAL://0.0.0.0:39092 + + # Add EXTERNAL advertised listeners based on the "Kafka Advertised Listener Pattern" in "Prerequisites" section + # 1. The pattern for AZ(ID: usw2-az1) is ".usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:" + # 2. 
So the EXTERNAL can be "b1.usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:9093". Replace <broker_id> with the "b" prefix plus the "node.id" property, and replace <port> with a unique port (9093) in the EXTERNAL advertised listener port range.
+    advertised.listeners=...,EXTERNAL://b1.usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:9093
+
+    # Configure EXTERNAL map
+    listener.security.protocol.map=...,EXTERNAL:PLAINTEXT
+    ```
+
+    ```properties
+    # brokers in usw2-az2
+
+    # Add EXTERNAL listener
+    listeners=INTERNAL:...,EXTERNAL://0.0.0.0:39092
+
+    # Add EXTERNAL advertised listeners based on the "Kafka Advertised Listener Pattern" in "Prerequisites" section
+    # 1. The pattern for AZ(ID: usw2-az2) is "<broker_id>.usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:<port>"
+    # 2. So the EXTERNAL can be "b2.usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:9094". Replace <broker_id> with the "b" prefix plus the "node.id" property, and replace <port> with a unique port (9094) in the EXTERNAL advertised listener port range.
+    advertised.listeners=...,EXTERNAL://b2.usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:9094
+
+    # Configure EXTERNAL map
+    listener.security.protocol.map=...,EXTERNAL:PLAINTEXT
+    ```
+
+    ```properties
+    # brokers in usw2-az3
+
+    # Add EXTERNAL listener
+    listeners=INTERNAL:...,EXTERNAL://0.0.0.0:39092
+
+    # Add EXTERNAL advertised listeners based on the "Kafka Advertised Listener Pattern" in "Prerequisites" section
+    # 1. The pattern for AZ(ID: usw2-az3) is "<broker_id>.usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:<port>"
+    # 2. So the EXTERNAL can be "b3.usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:9095". Replace <broker_id> with the "b" prefix plus the "node.id" property, and replace <port> with a unique port (9095) in the EXTERNAL advertised listener port range.
+    advertised.listeners=...,EXTERNAL://b3.usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:9095
+
+    # Configure EXTERNAL map
+    listener.security.protocol.map=...,EXTERNAL:PLAINTEXT
+    ```
+
+3. After you reconfigure all the brokers, restart your Kafka brokers one by one.
+
+#### 2. Test EXTERNAL listener settings in your internal network
+
+You can download Kafka and OpenJDK on your Kafka client node.
+
+```shell
+# Download Kafka and OpenJDK, and then extract the files. You can choose the binary version based on your preference.
+wget https://archive.apache.org/dist/kafka/3.7.1/kafka_2.13-3.7.1.tgz
+tar -zxf kafka_2.13-3.7.1.tgz
+wget https://download.java.net/java/GA/jdk22.0.2/c9ecb94cd31b495da20a27d4581645e8/9/GPL/openjdk-22.0.2_linux-x64_bin.tar.gz
+tar -zxf openjdk-22.0.2_linux-x64_bin.tar.gz
+```
+
+Execute the following script to test if the bootstrap works as expected.
+
+```shell
+export JAVA_HOME=/home/ec2-user/jdk-22.0.2
+
+# Bootstrap from the EXTERNAL listener
+./kafka_2.13-3.7.1/bin/kafka-broker-api-versions.sh --bootstrap-server {one_of_broker_ip}:39092
+
+# Expected output for the last 3 lines (the actual order might be different)
+# There will be some exceptions or errors because advertised listeners cannot be resolved in your Kafka network.
+# TiDB Cloud will make them resolvable and route requests to the right broker when you create a changefeed that connects to this Kafka cluster by Private Link.
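+# (Optional) If the command hangs or times out instead of printing lines similar to the following,
+# you can first check plain TCP reachability of the EXTERNAL port from this node, for example with
+# netcat (if installed): nc -zv {one_of_broker_ip} 39092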
+b1.usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:9093 (id: 1 rack: null) -> ERROR: org.apache.kafka.common.errors.DisconnectException +b2.usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:9094 (id: 2 rack: null) -> ERROR: org.apache.kafka.common.errors.DisconnectException +b3.usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:9095 (id: 3 rack: null) -> ERROR: org.apache.kafka.common.errors.DisconnectException +``` + +## Step 2. Expose the Kafka cluster as Private Link Service + +### 1. Set up the load balancer + +Create a network load balancer with four target groups with different ports. One target group is for bootstrap, and the others will map to different brokers. + +1. bootstrap target group => 9092 => broker-node1:39092,broker-node2:39092,broker-node3:39092 +2. broker target group 1 => 9093 => broker-node1:39092 +3. broker target group 2 => 9094 => broker-node2:39092 +4. broker target group 3 => 9095 => broker-node3:39092 + +If you have more broker role nodes, you need to add more mappings. Ensure that you have at least one node in the bootstrap target group. It is recommended to add three nodes, one for each AZ for resilience. + +Do the following to set up the load balancer: + +1. Go to [Target groups](https://console.aws.amazon.com/ec2/home#CreateTargetGroup:) to create four target groups. + + - Bootstrap target group + + - **Target type**: `Instances` + - **Target group name**: `bootstrap-target-group` + - **Protocol**: `TCP` + - **Port**: `9092` + - **IP address type**: `IPv4` + - **VPC**: `Kafka VPC` + - **Health check protocol**: `TCP` + - **Register targets**: `broker-node1:39092`, `broker-node2:39092`, `broker-node3:39092` + + - Broker target group 1 + + - **Target type**: `Instances` + - **Target group name**: `broker-target-group-1` + - **Protocol**: `TCP` + - **Port**: `9093` + - **IP address type**: `IPv4` + - **VPC**: `Kafka VPC` + - **Health check protocol**: `TCP` + - **Register targets**: `broker-node1:39092` + + - Broker target group 2 + + - **Target type**: `Instances` + - **Target group name**: `broker-target-group-2` + - **Protocol**: `TCP` + - **Port**: `9094` + - **IP address type**: `IPv4` + - **VPC**: `Kafka VPC` + - **Health check protocol**: `TCP` + - **Register targets**: `broker-node2:39092` + + - Broker target group 3 + + - **Target type**: `Instances` + - **Target group name**: `broker-target-group-3` + - **Protocol**: `TCP` + - **Port**: `9095` + - **IP address type**: `IPv4` + - **VPC**: `Kafka VPC` + - **Health check protocol**: `TCP` + - **Register targets**: `broker-node3:39092` + +2. Go to [Load balancers](https://console.aws.amazon.com/ec2/home#LoadBalancers:) to create a network load balancer. + + - **Load balancer name**: `kafka-lb` + - **Schema**: `Internal` + - **Load balancer IP address type**: `IPv4` + - **VPC**: `Kafka VPC` + - **Availability Zones**: + - `usw2-az1` with `broker-usw2-az1 subnet` + - `usw2-az2` with `broker-usw2-az2 subnet` + - `usw2-az3` with `broker-usw2-az3 subnet` + - **Security groups**: create a new security group with the following rules. + - Inbound rule allows all TCP from Kafka VPC: Type - `{ports of target groups}`, for example, `9092-9095`; Source - `{CIDR of TiDB Cloud}`. To get the CIDR of TiDB Cloud in the region, switch to your target project using the combo box in the upper-left corner of the [TiDB Cloud console](https://tidbcloud.com), click **Project Settings** > **Network Access** in the left navigation pane, and then click **Project CIDR** > **AWS**. 
+ - Outbound rule allows all TCP to Kafka VPC: Type - `All TCP`; Destination - `Anywhere-IPv4` + - Listeners and routing: + - Protocol: `TCP`; Port: `9092`; Forward to: `bootstrap-target-group` + - Protocol: `TCP`; Port: `9093`; Forward to: `broker-target-group-1` + - Protocol: `TCP`; Port: `9094`; Forward to: `broker-target-group-2` + - Protocol: `TCP`; Port: `9095`; Forward to: `broker-target-group-3` + +3. Test the load balancer in the bastion node. This example only tests the Kafka bootstrap. Because the load balancer is listening on the Kafka EXTERNAL listener, the addresses of EXTERNAL advertised listeners can not be resolved in the bastion node. Note down the `kafka-lb` DNS name from the load balancer detail page, for example `kafka-lb-77405fa57191adcb.elb.us-west-2.amazonaws.com`. Execute the script in the bastion node. + + ```shell + # Replace {lb_dns_name} to your actual value + export JAVA_HOME=/home/ec2-user/jdk-22.0.2 + ./kafka_2.13-3.7.1/bin/kafka-broker-api-versions.sh --bootstrap-server {lb_dns_name}:9092 + + # Expected output for the last 3 lines (the actual order might be different) + b1.usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:9093 (id: 1 rack: null) -> ERROR: org.apache.kafka.common.errors.DisconnectException + b2.usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:9094 (id: 2 rack: null) -> ERROR: org.apache.kafka.common.errors.DisconnectException + b3.usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:9095 (id: 3 rack: null) -> ERROR: org.apache.kafka.common.errors.DisconnectException + + # You can also try bootstrap in other ports 9093/9094/9095. It will succeed probabilistically because NLB in AWS resolves LB DNS to the IP address of any availability zone and disables cross-zone load balancing by default. + # If you enable cross-zone load balancing in LB, it will succeed. However, it is unnecessary and might cause additional cross-AZ traffic. + ``` + +### 2. Set up Private Link Service + +1. Go to [Endpoint service](https://console.aws.amazon.com/vpcconsole/home#EndpointServices:). Click **Create endpoint service** to create a Private Link service for the Kafka load balancer. + + - **Name**: `kafka-pl-service` + - **Load balancer type**: `Network` + - **Load balancers**: `kafka-lb` + - **Included Availability Zones**: `usw2-az1`,`usw2-az2`, `usw2-az3` + - **Require acceptance for endpoint**: `Acceptance required` + - **Enable private DNS name**: `No` + +2. Note down the **Service name**. You need to provide it to TiDB Cloud, for example `com.amazonaws.vpce.us-west-2.vpce-svc-0f49e37e1f022cd45`. + +3. On the detail page of the kafka-pl-service, click the **Allow principals** tab, and allow the AWS account of TiDB Cloud to create the endpoint. You can get the AWS account of TiDB Cloud in [Prerequisites](#prerequisites), for example, `arn:aws:iam:::root`. + +## Step 3. Connect from TiDB Cloud + +1. Return to the [TiDB Cloud console](https://tidbcloud.com) to create a changefeed for the instance to connect to the Kafka cluster by **Private Link**. For more information, see [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md). + +2. When you proceed to **Configure the changefeed target > Connectivity Method > Private Link**, fill in the following fields with corresponding values and other fields as needed. + + - **Kafka Type**: `3 AZs`. Ensure that your Kafka cluster is deployed in the same three AZs. + - **Kafka Advertised Listener Pattern**: `abc`. 
It is the same unique random string that you used to generate **Kafka Advertised Listener Pattern** in [Prerequisites](#prerequisites).
+    - **Endpoint Service Name**: the Kafka service name.
+    - **Bootstrap Ports**: `9092`. A single port is sufficient because you configure a dedicated bootstrap target group behind it.
+
+3. Proceed with the steps in [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md).
+
+Now you have successfully finished the task.
+
+## FAQ
+
+### How to connect to the same Kafka Private Link service from two different TiDB Cloud projects?
+
+If you have already followed this document to successfully set up the connection from the first project, you can connect to the same Kafka Private Link service from the second project as follows:
+
+1. Follow instructions from the beginning of this document.
+
+2. When you proceed to [Step 1. Set up a Kafka cluster](#step-1-set-up-a-kafka-cluster), follow [Reconfigure a running Kafka cluster](#reconfigure-a-running-kafka-cluster) to create another group of EXTERNAL listeners and advertised listeners. You can name it **EXTERNAL2**. Note that the port range of **EXTERNAL2** cannot overlap with that of **EXTERNAL**.
+
+3. After reconfiguring the brokers, add another group of target groups in the load balancer, including the bootstrap and broker target groups.
+
+4. Configure the TiDB Cloud connection with the following information:
+
+    - New Bootstrap port
+    - New Kafka Advertised Listener Group
+    - The same Endpoint Service
diff --git a/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md b/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md
index 270d6aada47ed..d037ba22b0f65 100644
--- a/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md
+++ b/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md
@@ -6,9 +6,11 @@ aliases: ['/tidbcloud/tidb-cloud-billing-tcu']
 
 # Changefeed Billing
 
-## RCU cost
+
 
-TiDB Cloud measures the capacity of [changefeeds](/tidb-cloud/changefeed-overview.md) in TiCDC Replication Capacity Units (RCUs). When you [create a changefeed](/tidb-cloud/changefeed-overview.md#create-a-changefeed) for a cluster, you can select an appropriate specification. The higher the RCU, the better the replication performance. You will be charged for these TiCDC changefeed RCUs.
+## RCU cost for TiDB Cloud Dedicated
+
+TiDB Cloud Dedicated measures the capacity of [changefeeds](/tidb-cloud/changefeed-overview.md) in TiCDC Replication Capacity Units (RCUs). When you [create a changefeed](/tidb-cloud/changefeed-overview.md#create-a-changefeed) for a cluster, you can select an appropriate specification. The higher the RCU, the better the replication performance. You will be charged for these TiCDC changefeed RCUs.
 
 ### Number of TiCDC RCUs
 
@@ -39,6 +41,44 @@ The following table lists the specifications and corresponding replication perfo
 
 To learn about the supported regions and the price of TiDB Cloud for each TiCDC RCU, see [Changefeed Cost](https://www.pingcap.com/tidb-dedicated-pricing-details/#changefeed-cost).
 
+
+
+
+## CCU cost for TiDB Cloud Premium
+
+TiDB Cloud Premium measures the capacity of [changefeeds](/tidb-cloud/changefeed-overview-premium.md) in TiCDC Changefeed Capacity Units (CCUs). When you [create a changefeed](/tidb-cloud/changefeed-overview-premium.md#create-a-changefeed) for an instance, you can select an appropriate specification. The higher the CCU, the better the replication performance. You will be charged for these TiCDC CCUs.
+ +### Number of TiCDC CCUs + +The following table lists the specifications and corresponding replication performances for changefeeds: + +| Specification | Maximum replication performance | +|---------------|---------------------------------| +| 2 CCUs | 5,000 rows/s | +| 4 CCUs | 10,000 rows/s | +| 8 CCUs | 20,000 rows/s | +| 16 CCUs | 40,000 rows/s | +| 24 CCUs | 60,000 rows/s | +| 32 CCUs | 80,000 rows/s | +| 40 CCUs | 100,000 rows/s | +| 64 CCUs | 160,000 rows/s | +| 96 CCUs | 240,000 rows/s | +| 128 CCUs | 320,000 rows/s | +| 192 CCUs | 480,000 rows/s | +| 256 CCUs | 640,000 rows/s | +| 320 CCUs | 800,000 rows/s | +| 384 CCUs | 960,000 rows/s | + +> **Note:** +> +> The preceding performance data is for reference only and might vary in different scenarios. It is strongly recommended that you conduct a real workload test before using the changefeed feature in a production environment. For further assistance, contact [TiDB Cloud support](/tidb-cloud/tidb-cloud-support.md). + +### Price + +As Premium is currently in private preview, you can [contact our sales](https://www.pingcap.com/contact-us/) for pricing details. + + + ## Private Data Link cost If you choose the **Private Link** or **Private Service Connect** network connectivity method, additional **Private Data Link** costs will be incurred. These charges fall under the [Data Transfer Cost](https://www.pingcap.com/tidb-dedicated-pricing-details/#data-transfer-cost) category. From 00c129a9078347500f010c90402f80d27396767a Mon Sep 17 00:00:00 2001 From: qiancai Date: Fri, 24 Oct 2025 17:44:58 +0800 Subject: [PATCH 02/30] Update setup-aws-self-hosted-kafka-private-link-service.md --- ...-self-hosted-kafka-private-link-service.md | 40 ++++++++++++++++++- 1 file changed, 38 insertions(+), 2 deletions(-) diff --git a/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md b/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md index e183587662098..59848945bdfdf 100644 --- a/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md +++ b/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md @@ -23,7 +23,9 @@ The document provides an example of connecting to a Kafka Private Link service d ## Prerequisites -1. Ensure that you have the following authorization to set up a Kafka Private Link service in your own AWS account. + + +1. Ensure that you have the following authorization to set up a Kafka Private Link service in your own AWS account. - Manage EC2 nodes - Manage VPC @@ -48,6 +50,31 @@ The document provides an example of connecting to a Kafka Private Link service d 1. Input a unique random string. It can only include numbers or lowercase letters. You will use it to generate **Kafka Advertised Listener Pattern** later. 2. Click **Check usage and generate** to check if the random string is unique and generate **Kafka Advertised Listener Pattern** that will be used to assemble the EXTERNAL advertised listener for Kafka brokers. + + + +1. Ensure that you have the following authorization to set up a Kafka Private Link service in your own AWS account. + + - Manage EC2 nodes + - Manage VPC + - Manage subnets + - Manage security groups + - Manage load balancer + - Manage endpoint services + - Connect to EC2 nodes to configure Kafka nodes + +2. [Create a TiDB Cloud Premium instance](/tidb-cloud/create-tidb-cluster-premium.md) if you do not have one. + +3. Get the Kafka deployment information from your TiDB Cloud Premium instance. + + 1. 
In the [TiDB Cloud console](https://tidbcloud.com), navigate to the instance overview page of the TiDB instance, and then click **Data** > **Changefeed** in the left navigation pane.
+    2. On the overview page, find the region of the TiDB instance. Ensure that your Kafka cluster will be deployed to the same region.
+    3. To create a changefeed, refer to the tutorials:
+
+        - [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md)
+
+
 Note down all the deployment information. You need to use it to configure your Kafka Private Link service later.
 
 The following table shows an example of the deployment information.
 
@@ -523,8 +550,17 @@ LOG_DIR=$KAFKA_LOG_DIR nohup $KAFKA_START_CMD "$KAFKA_CONFIG_DIR/server.properti
 
 ### Reconfigure a running Kafka cluster
 
+
+
 Ensure that your Kafka cluster is deployed in the same region and AZs as the TiDB cluster. If any brokers are in different AZs, move them to the correct ones.
 
+
+
+
+
+Ensure that your Kafka cluster is deployed in the same region and AZs as the TiDB instance. If any brokers are in different AZs, move them to the correct ones.
+
+
 
 #### 1. Configure the EXTERNAL listener for brokers
 
 The following configuration applies to a Kafka KRaft cluster. The ZK mode configuration is similar.
@@ -729,7 +765,7 @@ Do the following to set up the load balancer:
 
 ## Step 3. Connect from TiDB Cloud
 
-1. Return to the [TiDB Cloud console](https://tidbcloud.com) to create a changefeed for the cluster to connect to the Kafka cluster by **Private Link**. For more information, see [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md).
+1. Return to the [TiDB Cloud console](https://tidbcloud.com) to create a changefeed for the cluster or instance to connect to the Kafka cluster by **Private Link**. For more information, see [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md).
 
 2. When you proceed to **Configure the changefeed target > Connectivity Method > Private Link**, fill in the following fields with corresponding values and other fields as needed.

From 36181cf1771ad3262d49988884e1647ef59feb39 Mon Sep 17 00:00:00 2001
From: qiancai
Date: Mon, 27 Oct 2025 16:45:09 +0800
Subject: [PATCH 03/30] delete setup-aws-self-hosted-kafka-private-link-service-premium.md as the content is now included in tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md

---
 ...sted-kafka-private-link-service-premium.md | 756 ------------------
 ...-self-hosted-kafka-private-link-service.md |  23 +-
 2 files changed, 22 insertions(+), 757 deletions(-)
 delete mode 100644 tidb-cloud/setup-aws-self-hosted-kafka-private-link-service-premium.md

diff --git a/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service-premium.md b/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service-premium.md
deleted file mode 100644
index cfb8a5e994655..0000000000000
--- a/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service-premium.md
+++ /dev/null
@@ -1,756 +0,0 @@
----
-title: Set Up Self-Hosted Kafka Private Link Service in AWS
-summary: This document explains how to set up Private Link service for self-hosted Kafka in AWS and how to make it work with TiDB Cloud.
-aliases: ['/tidbcloud/setup-self-hosted-kafka-private-link-service']
----
-
-# Set Up Self-Hosted Kafka Private Link Service in AWS
-
-This document describes how to set up Private Link service for self-hosted Kafka in AWS, and how to make it work with TiDB Cloud.
-
-The mechanism works as follows:
-
-1. The TiDB Cloud VPC connects to the Kafka VPC through private endpoints.
-2. 
Kafka clients need to communicate directly to all Kafka brokers. -3. Each Kafka broker is mapped to a unique port of endpoints within the TiDB Cloud VPC. -4. Leverage the Kafka bootstrap mechanism and AWS resources to achieve the mapping. - -The following diagram shows the mechanism. - -![Connect to AWS Self-Hosted Kafka Private Link Service](/media/tidb-cloud/changefeed/connect-to-aws-self-hosted-kafka-privatelink-service.jpeg) - -The document provides an example of connecting to a Kafka Private Link service deployed across three availability zones (AZ) in AWS. While other configurations are possible based on similar port-mapping principles, this document covers the fundamental setup process of the Kafka Private Link service. For production environments, a more resilient Kafka Private Link service with enhanced operational maintainability and observability is recommended. - -## Prerequisites - -1. Ensure that you have the following authorization to set up a Kafka Private Link service in your own AWS account. - - - Manage EC2 nodes - - Manage VPC - - Manage subnets - - Manage security groups - - Manage load balancer - - Manage endpoint services - - Connect to EC2 nodes to configure Kafka nodes - -2. [Create a TiDB Cloud Premium instance](/tidb-cloud/create-tidb-cluster-premium.md) if you do not have one. - -3. Get the Kafka deployment information from your TiDB Cloud Premium instance. - - 1. In the [TiDB Cloud console](https://tidbcloud.com), navigate to the instance overview page of the TiDB instance, and then click **Data** > **Changefeed** in the left navigation pane. - 2. On the overview page, find the region of the TiDB instance. Ensure that your Kafka cluster will be deployed to the same region. - 3. To create a changefeed, refer to the tutorials: - - - [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md) - -Note down all the deployment information. You need to use it to configure your Kafka Private Link service later. - -The following table shows an example of the deployment information. - -| Information | Value | Note | -|--------|-----------------|---------------------------| -| Region | Oregon (`us-west-2`) | N/A | -| Principal of TiDB Cloud AWS Account | `arn:aws:iam:::root` | N/A | -| AZ IDs |
  • `usw2-az1`
  • `usw2-az2`
  • `usw2-az3`
| Align AZ IDs to AZ names in your AWS account.
Example:
  • `usw2-az1` => `us-west-2a`
  • `usw2-az2` => `us-west-2c`
  • `usw2-az3` => `us-west-2b`
| -| Kafka Advertised Listener Pattern | The unique random string: `abc`
Generated pattern for AZs:
  • `usw2-az1` => <broker_id>.usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:<port>
  • `usw2-az2` => <broker_id>.usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:<port>
  • `usw2-az3` => <broker_id>.usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:<port>
| Map AZ names to AZ-specified patterns. Make sure that you configure the right pattern to the broker in a specific AZ later.
  • `us-west-2a` => <broker_id>.usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:<port>
  • `us-west-2c` => <broker_id>.usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:<port>
  • `us-west-2b` => <broker_id>.usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:<port>
| - -## Step 1. Set up a Kafka cluster - -If you need to deploy a new cluster, follow the instructions in [Deploy a new Kafka cluster](#deploy-a-new-kafka-cluster). - -If you need to expose an existing cluster, follow the instructions in [Reconfigure a running Kafka cluster](#reconfigure-a-running-kafka-cluster). - -### Deploy a new Kafka cluster - -#### 1. Set up the Kafka VPC - -The Kafka VPC requires the following: - -- Three private subnets for brokers, one for each AZ. -- One public subnet in any AZ with a bastion node that can connect to the internet and three private subnets, which makes it easy to set up the Kafka cluster. In a production environment, you might have your own bastion node that can connect to the Kafka VPC. - -Before creating subnets, create subnets in AZs based on the mappings of AZ IDs and AZ names. Take the following mapping as an example. - -- `usw2-az1` => `us-west-2a` -- `usw2-az2` => `us-west-2c` -- `usw2-az3` => `us-west-2b` - -Create private subnets in the following AZs: - -- `us-west-2a` -- `us-west-2c` -- `us-west-2b` - -Take the following steps to create the Kafka VPC. - -**1.1. Create the Kafka VPC** - -1. Go to [AWS Console > VPC dashboard](https://console.aws.amazon.com/vpcconsole/home?#vpcs:), and switch to the region in which you want to deploy Kafka. - -2. Click **Create VPC**. Fill in the information on the **VPC settings** page as follows. - - 1. Select **VPC only**. - 2. Enter a tag in **Name tag**, for example, `Kafka VPC`. - 3. Select **IPv4 CIDR manual input**, and enter the IPv4 CIDR, for example, `10.0.0.0/16`. - 4. Use the default values for other options. Click **Create VPC**. - 5. On the VPC detail page, take note of the VPC ID, for example, `vpc-01f50b790fa01dffa`. - -**1.2. Create private subnets in the Kafka VPC** - -1. Go to the [Subnets Listing page](https://console.aws.amazon.com/vpcconsole/home?#subnets:). -2. Click **Create subnet**. -3. Select **VPC ID** (`vpc-01f50b790fa01dffa` in this example) that you noted down before. -4. Add three subnets with the following information. It is recommended that you put the AZ IDs in the subnet names to make it easy to configure the brokers later, because TiDB Cloud requires encoding the AZ IDs in the broker's `advertised.listener` configuration. - - - Subnet1 in `us-west-2a` - - **Subnet name**: `broker-usw2-az1` - - **Availability Zone**: `us-west-2a` - - **IPv4 subnet CIDR block**: `10.0.0.0/18` - - - Subnet2 in `us-west-2c` - - **Subnet name**: `broker-usw2-az2` - - **Availability Zone**: `us-west-2c` - - **IPv4 subnet CIDR block**: `10.0.64.0/18` - - - Subnet3 in `us-west-2b` - - **Subnet name**: `broker-usw2-az3` - - **Availability Zone**: `us-west-2b` - - **IPv4 subnet CIDR block**: `10.0.128.0/18` - -5. Click **Create subnet**. The **Subnets Listing** page is displayed. - -**1.3. Create the public subnet in the Kafka VPC** - -1. Click **Create subnet**. -2. Select **VPC ID** (`vpc-01f50b790fa01dffa` in this example) that you noted down before. -3. Add the public subnet in any AZ with the following information: - - - **Subnet name**: `bastion` - - **IPv4 subnet CIDR block**: `10.0.192.0/18` - -4. Configure the bastion subnet to the Public subnet. - - 1. Go to [VPC dashboard > Internet gateways](https://console.aws.amazon.com/vpcconsole/home#igws:). Create an Internet Gateway with the name `kafka-vpc-igw`. - 2. On the **Internet gateways Detail** page, in **Actions**, click **Attach to VPC** to attach the Internet Gateway to the Kafka VPC. - 3. 
Go to [VPC dashboard > Route tables](https://console.aws.amazon.com/vpcconsole/home#CreateRouteTable:). Create a route table to the Internet Gateway in Kafka VPC and add a new route with the following information: - - - **Name**: `kafka-vpc-igw-route-table` - - **VPC**: `Kafka VPC` - - **Route**: - - **Destination**: `0.0.0.0/0` - - **Target**: `Internet Gateway`, `kafka-vpc-igw` - - 4. Attach the route table to the bastion subnet. On the **Detail** page of the route table, click **Subnet associations > Edit subnet associations** to add the bastion subnet and save changes. - -#### 2. Set up Kafka brokers - -**2.1. Create a bastion node** - -Go to the [EC2 Listing page](https://console.aws.amazon.com/ec2/home#Instances:). Create the bastion node in the bastion subnet. - -- **Name**: `bastion-node` -- **Amazon Machine Image**: `Amazon linux` -- **Instance Type**: `t2.small` -- **Key pair**: `kafka-vpc-key-pair`. Create a new key pair named `kafka-vpc-key-pair`. Download **kafka-vpc-key-pair.pem** to your local for later configuration. -- Network settings - - - **VPC**: `Kafka VPC` - - **Subnet**: `bastion` - - **Auto-assign public IP**: `Enable` - - **Security Group**: create a new security group allow SSH login from anywhere. You can narrow the rule for safety in the production environment. - -**2.2. Create broker nodes** - -Go to the [EC2 Listing page](https://console.aws.amazon.com/ec2/home#Instances:). Create three broker nodes in broker subnets, one for each AZ. - -- Broker 1 in subnet `broker-usw2-az1` - - - **Name**: `broker-node1` - - **Amazon Machine Image**: `Amazon linux` - - **Instance Type**: `t2.large` - - **Key pair**: reuse `kafka-vpc-key-pair` - - Network settings - - - **VPC**: `Kafka VPC` - - **Subnet**: `broker-usw2-az1` - - **Auto-assign public IP**: `Disable` - - **Security Group**: create a new security group to allow all TCP from Kafka VPC. You can narrow the rule for safety in the production environment. - - **Protocol**: `TCP` - - **Port range**: `0 - 65535` - - **Source**: `10.0.0.0/16` - -- Broker 2 in subnet `broker-usw2-az2` - - - **Name**: `broker-node2` - - **Amazon Machine Image**: `Amazon linux` - - **Instance Type**: `t2.large` - - **Key pair**: reuse `kafka-vpc-key-pair` - - Network settings - - - **VPC**: `Kafka VPC` - - **Subnet**: `broker-usw2-az2` - - **Auto-assign public IP**: `Disable` - - **Security Group**: create a new security group to allow all TCP from Kafka VPC. You can narrow the rule for safety in the production environment. - - **Protocol**: `TCP` - - **Port range**: `0 - 65535` - - **Source**: `10.0.0.0/16` - -- Broker 3 in subnet `broker-usw2-az3` - - - **Name**: `broker-node3` - - **Amazon Machine Image**: `Amazon linux` - - **Instance Type**: `t2.large` - - **Key pair**: reuse `kafka-vpc-key-pair` - - Network settings - - - **VPC**: `Kafka VPC` - - **Subnet**: `broker-usw2-az3` - - **Auto-assign public IP**: `Disable` - - **Security Group**: create a new security group to allow all TCP from Kafka VPC. You can narrow the rule for safety in the production environment. - - **Protocol**: `TCP` - - **Port range**: `0 - 65535` - - **Source**: `10.0.0.0/16` - -**2.3. Prepare Kafka runtime binaries** - -1. Go to the detail page of the bastion node. Get the **Public IPv4 address**. Use SSH to log in to the node with the previously downloaded `kafka-vpc-key-pair.pem`. 
- - ```shell - chmod 400 kafka-vpc-key-pair.pem - ssh -i "kafka-vpc-key-pair.pem" ec2-user@{bastion_public_ip} # replace {bastion_public_ip} with the IP address of your bastion node, for example, 54.186.149.187 - scp -i "kafka-vpc-key-pair.pem" kafka-vpc-key-pair.pem ec2-user@{bastion_public_ip}:~/ - ``` - -2. Download binaries. - - ```shell - # Download Kafka and OpenJDK, and then extract the files. You can choose the binary version based on your preference. - wget https://archive.apache.org/dist/kafka/3.7.1/kafka_2.13-3.7.1.tgz - tar -zxf kafka_2.13-3.7.1.tgz - wget https://download.java.net/java/GA/jdk22.0.2/c9ecb94cd31b495da20a27d4581645e8/9/GPL/openjdk-22.0.2_linux-x64_bin.tar.gz - tar -zxf openjdk-22.0.2_linux-x64_bin.tar.gz - ``` - -3. Copy binaries to each broker node. - - ```shell - # Replace {broker-node1-ip} with your broker-node1 IP address - scp -i "kafka-vpc-key-pair.pem" kafka_2.13-3.7.1.tgz ec2-user@{broker-node1-ip}:~/ - ssh -i "kafka-vpc-key-pair.pem" ec2-user@{broker-node1-ip} "tar -zxf kafka_2.13-3.7.1.tgz" - scp -i "kafka-vpc-key-pair.pem" openjdk-22.0.2_linux-x64_bin.tar.gz ec2-user@{broker-node1-ip}:~/ - ssh -i "kafka-vpc-key-pair.pem" ec2-user@{broker-node1-ip} "tar -zxf openjdk-22.0.2_linux-x64_bin.tar.gz" - - # Replace {broker-node2-ip} with your broker-node2 IP address - scp -i "kafka-vpc-key-pair.pem" kafka_2.13-3.7.1.tgz ec2-user@{broker-node2-ip}:~/ - ssh -i "kafka-vpc-key-pair.pem" ec2-user@{broker-node2-ip} "tar -zxf kafka_2.13-3.7.1.tgz" - scp -i "kafka-vpc-key-pair.pem" openjdk-22.0.2_linux-x64_bin.tar.gz ec2-user@{broker-node2-ip}:~/ - ssh -i "kafka-vpc-key-pair.pem" ec2-user@{broker-node2-ip} "tar -zxf openjdk-22.0.2_linux-x64_bin.tar.gz" - - # Replace {broker-node3-ip} with your broker-node3 IP address - scp -i "kafka-vpc-key-pair.pem" kafka_2.13-3.7.1.tgz ec2-user@{broker-node3-ip}:~/ - ssh -i "kafka-vpc-key-pair.pem" ec2-user@{broker-node3-ip} "tar -zxf kafka_2.13-3.7.1.tgz" - scp -i "kafka-vpc-key-pair.pem" openjdk-22.0.2_linux-x64_bin.tar.gz ec2-user@{broker-node3-ip}:~/ - ssh -i "kafka-vpc-key-pair.pem" ec2-user@{broker-node3-ip} "tar -zxf openjdk-22.0.2_linux-x64_bin.tar.gz" - ``` - -**2.4. Set up Kafka nodes on each broker node** - -**2.4.1 Set up a KRaft Kafka cluster with three nodes** - -Each node will act as a broker and controller role. Do the following for each broker: - -1. For the `listeners` item, all three brokers are the same and act as broker and controller roles: - - 1. Configure the same CONTROLLER listener for all **controller** role nodes. If you only want to add the **broker** role nodes, you do not need the CONTROLLER listener in `server.properties`. - 2. Configure two **broker** listeners, `INTERNAL` for internal access and `EXTERNAL` for external access from TiDB Cloud. - -2. For the `advertised.listeners` item, do the following: - - 1. Configure an INTERNAL advertised listener for every broker with the internal IP of the broker node. Advertised internal Kafka clients use this address to visit the broker. - 2. Configure an EXTERNAL advertised listener based on **Kafka Advertised Listener Pattern** you get from TiDB Cloud for each broker node to help TiDB Cloud differentiate between different brokers. Different EXTERNAL advertised listeners help the Kafka client from TiDB Cloud route requests to the right broker. - - - `` differentiates brokers from Kafka Private Link Service access points. Plan a port range for EXTERNAL advertised listeners of all brokers. These ports do not have to be actual ports listened to by brokers. 
They are ports listened to by the load balancer for Private Link Service that will forward requests to different brokers. - - `AZ ID` in **Kafka Advertised Listener Pattern** indicates where the broker is deployed. TiDB Cloud will route requests to different endpoint DNS names based on the AZ ID. - - It is recommended to configure different broker IDs for different brokers to make it easy for troubleshooting. - -3. The planning values are as follows: - - - **CONTROLLER port**: `29092` - - **INTERNAL port**: `9092` - - **EXTERNAL**: `39092` - - **EXTERNAL advertised listener ports range**: `9093~9095` - -**2.4.2. Create a configuration file** - -Use SSH to log in to every broker node. Create a configuration file `~/config/server.properties` with the following content. - -```properties -# brokers in usw2-az1 - -# broker-node1 ~/config/server.properties -# 1. Replace {broker-node1-ip}, {broker-node2-ip}, {broker-node3-ip} with the actual IP addresses. -# 2. Configure EXTERNAL in "advertised.listeners" based on the "Kafka Advertised Listener Pattern" in the "Prerequisites" section. -# 2.1 The pattern for AZ(ID: usw2-az1) is ".usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:". -# 2.2 So the EXTERNAL can be "b1.usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:9093". Replace with "b" prefix plus "node.id" properties, and replace with a unique port (9093) in the port range of the EXTERNAL advertised listener. -# 2.3 If there are more broker role nodes in the same AZ, you can configure them in the same way. -process.roles=broker,controller -node.id=1 -controller.quorum.voters=1@{broker-node1-ip}:29092,2@{broker-node2-ip}:29092,3@{broker-node3-ip}:29092 -listeners=INTERNAL://0.0.0.0:9092,CONTROLLER://0.0.0.0:29092,EXTERNAL://0.0.0.0:39092 -inter.broker.listener.name=INTERNAL -advertised.listeners=INTERNAL://{broker-node1-ip}:9092,EXTERNAL://b1.usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:9093 -controller.listener.names=CONTROLLER -listener.security.protocol.map=INTERNAL:PLAINTEXT,CONTROLLER:PLAINTEXT,EXTERNAL:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL -log.dirs=./data -``` - -```properties -# brokers in usw2-az2 - -# broker-node2 ~/config/server.properties -# 1. Replace {broker-node1-ip}, {broker-node2-ip}, {broker-node3-ip} with the actual IP addresses. -# 2. Configure EXTERNAL in "advertised.listeners" based on the "Kafka Advertised Listener Pattern" in the "Prerequisites" section. -# 2.1 The pattern for AZ(ID: usw2-az2) is ".usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:". -# 2.2 So the EXTERNAL can be "b2.usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:9094". Replace with "b" prefix plus "node.id" properties, and replace with a unique port (9094) in the port range of the EXTERNAL advertised listener. -# 2.3 If there are more broker role nodes in the same AZ, you can configure them in the same way. 
-process.roles=broker,controller -node.id=2 -controller.quorum.voters=1@{broker-node1-ip}:29092,2@{broker-node2-ip}:29092,3@{broker-node3-ip}:29092 -listeners=INTERNAL://0.0.0.0:9092,CONTROLLER://0.0.0.0:29092,EXTERNAL://0.0.0.0:39092 -inter.broker.listener.name=INTERNAL -advertised.listeners=INTERNAL://{broker-node2-ip}:9092,EXTERNAL://b2.usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:9094 -controller.listener.names=CONTROLLER -listener.security.protocol.map=INTERNAL:PLAINTEXT,CONTROLLER:PLAINTEXT,EXTERNAL:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL -log.dirs=./data -``` - -```properties -# brokers in usw2-az3 - -# broker-node3 ~/config/server.properties -# 1. Replace {broker-node1-ip}, {broker-node2-ip}, {broker-node3-ip} with the actual IP addresses. -# 2. Configure EXTERNAL in "advertised.listeners" based on the "Kafka Advertised Listener Pattern" in the "Prerequisites" section. -# 2.1 The pattern for AZ(ID: usw2-az3) is ".usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:". -# 2.2 So the EXTERNAL can be "b3.usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:9095". Replace with "b" prefix plus "node.id" properties, and replace with a unique port (9095) in the port range of the EXTERNAL advertised listener. -# 2.3 If there are more broker role nodes in the same AZ, you can configure them in the same way. -process.roles=broker,controller -node.id=3 -controller.quorum.voters=1@{broker-node1-ip}:29092,2@{broker-node2-ip}:29092,3@{broker-node3-ip}:29092 -listeners=INTERNAL://0.0.0.0:9092,CONTROLLER://0.0.0.0:29092,EXTERNAL://0.0.0.0:39092 -inter.broker.listener.name=INTERNAL -advertised.listeners=INTERNAL://{broker-node3-ip}:9092,EXTERNAL://b3.usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:9095 -controller.listener.names=CONTROLLER -listener.security.protocol.map=INTERNAL:PLAINTEXT,CONTROLLER:PLAINTEXT,EXTERNAL:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL -log.dirs=./data -``` - -**2.4.3 Start Kafka brokers** - -Create a script, and then execute it to start the Kafka broker in each broker node. - -```shell -#!/bin/bash - -# Get the directory of the current script -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -# Set JAVA_HOME to the Java installation within the script directory -export JAVA_HOME="$SCRIPT_DIR/jdk-22.0.2" -# Define the vars -KAFKA_DIR="$SCRIPT_DIR/kafka_2.13-3.7.1/bin" -KAFKA_STORAGE_CMD=$KAFKA_DIR/kafka-storage.sh -KAFKA_START_CMD=$KAFKA_DIR/kafka-server-start.sh -KAFKA_DATA_DIR=$SCRIPT_DIR/data -KAFKA_LOG_DIR=$SCRIPT_DIR/log -KAFKA_CONFIG_DIR=$SCRIPT_DIR/config - -# Cleanup step, which makes it easy for multiple experiments -# Find all Kafka process IDs -KAFKA_PIDS=$(ps aux | grep 'kafka.Kafka' | grep -v grep | awk '{print $2}') -if [ -z "$KAFKA_PIDS" ]; then - echo "No Kafka processes are running." -else - # Kill each Kafka process - echo "Killing Kafka processes with PIDs: $KAFKA_PIDS" - for PID in $KAFKA_PIDS; do - kill -9 $PID - echo "Killed Kafka process with PID: $PID" - done - echo "All Kafka processes have been killed." -fi - -rm -rf $KAFKA_DATA_DIR -mkdir -p $KAFKA_DATA_DIR -rm -rf $KAFKA_LOG_DIR -mkdir -p $KAFKA_LOG_DIR - -# Magic id: BRl69zcmTFmiPaoaANybiw, you can use your own -$KAFKA_STORAGE_CMD format -t "BRl69zcmTFmiPaoaANybiw" -c "$KAFKA_CONFIG_DIR/server.properties" > $KAFKA_LOG_DIR/server_format.log -LOG_DIR=$KAFKA_LOG_DIR nohup $KAFKA_START_CMD "$KAFKA_CONFIG_DIR/server.properties" & -``` - -**2.5. Test the cluster setting in the bastion node** - -1. Test the Kafka bootstrap. 
- - ```shell - export JAVA_HOME=/home/ec2-user/jdk-22.0.2 - - # Bootstrap from INTERNAL listener - ./kafka_2.13-3.7.1/bin/kafka-broker-api-versions.sh --bootstrap-server {one_of_broker_ip}:9092 | grep 9092 - # Expected output (the actual order might be different) - {broker-node1-ip}:9092 (id: 1 rack: null) -> ( - {broker-node2-ip}:9092 (id: 2 rack: null) -> ( - {broker-node3-ip}:9092 (id: 3 rack: null) -> ( - - # Bootstrap from EXTERNAL listener - ./kafka_2.13-3.7.1/bin/kafka-broker-api-versions.sh --bootstrap-server {one_of_broker_ip}:39092 - # Expected output for the last 3 lines (the actual order might be different) - # The difference in the output from "bootstrap from INTERNAL listener" is that exceptions or errors might occur because advertised listeners cannot be resolved in Kafka VPC. - # We will make them resolvable in TiDB Cloud side and make it route to the right broker when you create a changefeed connect to this Kafka cluster by Private Link. - b1.usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:9093 (id: 1 rack: null) -> ERROR: org.apache.kafka.common.errors.DisconnectException - b2.usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:9094 (id: 2 rack: null) -> ERROR: org.apache.kafka.common.errors.DisconnectException - b3.usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:9095 (id: 3 rack: null) -> ERROR: org.apache.kafka.common.errors.DisconnectException - ``` - -2. Create a producer script `produce.sh` in the bastion node. - - ```shell - #!/bin/bash - BROKER_LIST=$1 # "{broker_address1},{broker_address2}..." - - # Get the directory of the current script - SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" - # Set JAVA_HOME to the Java installation within the script directory - export JAVA_HOME="$SCRIPT_DIR/jdk-22.0.2" - # Define the Kafka directory - KAFKA_DIR="$SCRIPT_DIR/kafka_2.13-3.7.1/bin" - TOPIC="test-topic" - - # Create a topic if it does not exist - create_topic() { - echo "Creating topic if it does not exist..." - $KAFKA_DIR/kafka-topics.sh --create --topic $TOPIC --bootstrap-server $BROKER_LIST --if-not-exists --partitions 3 --replication-factor 3 - } - - # Produce messages to the topic - produce_messages() { - echo "Producing messages to the topic..." - for ((chrono=1; chrono <= 10; chrono++)); do - message="Test message "$chrono - echo "Create "$message - echo $message | $KAFKA_DIR/kafka-console-producer.sh --broker-list $BROKER_LIST --topic $TOPIC - done - } - create_topic - produce_messages - ``` - -3. Create a consumer script `consume.sh` in the bastion node. - - ```shell - #!/bin/bash - - BROKER_LIST=$1 # "{broker_address1},{broker_address2}..." - - # Get the directory of the current script - SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" - # Set JAVA_HOME to the Java installation within the script directory - export JAVA_HOME="$SCRIPT_DIR/jdk-22.0.2" - # Define the Kafka directory - KAFKA_DIR="$SCRIPT_DIR/kafka_2.13-3.7.1/bin" - TOPIC="test-topic" - CONSUMER_GROUP="test-group" - # Consume messages from the topic - consume_messages() { - echo "Consuming messages from the topic..." - $KAFKA_DIR/kafka-console-consumer.sh --bootstrap-server $BROKER_LIST --topic $TOPIC --from-beginning --timeout-ms 5000 --consumer-property group.id=$CONSUMER_GROUP - } - consume_messages - ``` - -4. Execute `produce.sh` and `consume.sh` to verify that the Kafka cluster is running. These scripts will also be reused for later network connection testing. The script will create a topic with `--partitions 3 --replication-factor 3`. 
Ensure that all these three brokers contain data. Ensure that the script will connect to all three brokers to guarantee that network connection will be tested. - - ```shell - # Test write message. - ./produce.sh {one_of_broker_ip}:9092 - ``` - - ```shell - # Expected output - Creating topic if it does not exist... - - Producing messages to the topic... - Create Test message 1 - >>Create Test message 2 - >>Create Test message 3 - >>Create Test message 4 - >>Create Test message 5 - >>Create Test message 6 - >>Create Test message 7 - >>Create Test message 8 - >>Create Test message 9 - >>Create Test message 10 - ``` - - ```shell - # Test read message - ./consume.sh {one_of_broker_ip}:9092 - ``` - - ```shell - # Expected example output (the actual message order might be different) - Consuming messages from the topic... - Test message 3 - Test message 4 - Test message 5 - Test message 9 - Test message 10 - Test message 6 - Test message 8 - Test message 1 - Test message 2 - Test message 7 - [2024-11-01 08:54:27,547] ERROR Error processing message, terminating consumer process: (kafka.tools.ConsoleConsumer$) - org.apache.kafka.common.errors.TimeoutException - Processed a total of 10 messages - ``` - -### Reconfigure a running Kafka cluster - -Ensure that your Kafka cluster is deployed in the same region and AZs as the TiDB instance. If any brokers are in different AZs, move them to the correct ones. - -#### 1. Configure the EXTERNAL listener for brokers - -The following configuration applies to a Kafka KRaft cluster. The ZK mode configuration is similar. - -1. Plan configuration changes. - - 1. Configure an EXTERNAL **listener** for every broker for external access from TiDB Cloud. Select a unique port as the EXTERNAL port, for example, `39092`. - 2. Configure an EXTERNAL **advertised listener** based on **Kafka Advertised Listener Pattern** you get from TiDB Cloud for every broker node to help TiDB Cloud differentiate between different brokers. Different EXTERNAL advertised listeners help Kafka clients from TiDB Cloud route requests to the right broker. - - - `` differentiates brokers from Kafka Private Link Service access points. Plan a port range for EXTERNAL advertised listeners of all brokers, for example, `range from 9093`. These ports do not have to be actual ports listened to by brokers. They are ports listened to by the load balancer for Private Link Service that will forward requests to different brokers. - - `AZ ID` in **Kafka Advertised Listener Pattern** indicates where the broker is deployed. TiDB Cloud will route requests to different endpoint DNS names based on the AZ ID. - - It is recommended to configure different broker IDs for different brokers to make it easy for troubleshooting. - -2. Use SSH to log in to each broker node. Modify the configuration file of each broker with the following content: - - ```properties - # brokers in usw2-az1 - - # Add EXTERNAL listener - listeners=INTERNAL:...,EXTERNAL://0.0.0.0:39092 - - # Add EXTERNAL advertised listeners based on the "Kafka Advertised Listener Pattern" in "Prerequisites" section - # 1. The pattern for AZ(ID: usw2-az1) is ".usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:" - # 2. 
So the EXTERNAL advertised listener can be "b1.usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:9093": the broker part is "b" plus the broker's "node.id" property, and the port (9093) is a unique port in the EXTERNAL advertised listener port range
    advertised.listeners=...,EXTERNAL://b1.usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:9093

    # Configure EXTERNAL map
    listener.security.protocol.map=...,EXTERNAL:PLAINTEXT
    ```

    ```properties
    # brokers in usw2-az2

    # Add EXTERNAL listener
    listeners=INTERNAL:...,EXTERNAL://0.0.0.0:39092

    # Add EXTERNAL advertised listeners based on the "Kafka Advertised Listener Pattern" in "Prerequisites" section
    # 1. The pattern for AZ(ID: usw2-az2) is ".usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:"
    # 2. So the EXTERNAL advertised listener can be "b2.usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:9094": the broker part is "b" plus the broker's "node.id" property, and the port (9094) is a unique port in the EXTERNAL advertised listener port range.
    advertised.listeners=...,EXTERNAL://b2.usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:9094

    # Configure EXTERNAL map
    listener.security.protocol.map=...,EXTERNAL:PLAINTEXT
    ```

    ```properties
    # brokers in usw2-az3

    # Add EXTERNAL listener
    listeners=INTERNAL:...,EXTERNAL://0.0.0.0:39092

    # Add EXTERNAL advertised listeners based on the "Kafka Advertised Listener Pattern" in "Prerequisites" section
    # 1. The pattern for AZ(ID: usw2-az3) is ".usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:"
    # 2. So the EXTERNAL advertised listener can be "b3.usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:9095": the broker part is "b" plus the broker's "node.id" property, and the port (9095) is a unique port in the EXTERNAL advertised listener port range.
    advertised.listeners=...,EXTERNAL://b3.usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:9095

    # Configure EXTERNAL map
    listener.security.protocol.map=...,EXTERNAL:PLAINTEXT
    ```

3. After you reconfigure all the brokers, restart your Kafka brokers one by one.

#### 2. Test EXTERNAL listener settings in your internal network

You can download Kafka and OpenJDK on your Kafka client node.

```shell
# Download Kafka and OpenJDK, and then extract the files. You can choose the binary versions based on your preference.
wget https://archive.apache.org/dist/kafka/3.7.1/kafka_2.13-3.7.1.tgz
tar -zxf kafka_2.13-3.7.1.tgz
wget https://download.java.net/java/GA/jdk22.0.2/c9ecb94cd31b495da20a27d4581645e8/9/GPL/openjdk-22.0.2_linux-x64_bin.tar.gz
tar -zxf openjdk-22.0.2_linux-x64_bin.tar.gz
```

Execute the following script to test whether the bootstrap works as expected.

```shell
export JAVA_HOME=/home/ec2-user/jdk-22.0.2

# Bootstrap from the EXTERNAL listener
./kafka_2.13-3.7.1/bin/kafka-broker-api-versions.sh --bootstrap-server {one_of_broker_ip}:39092

# Expected output for the last 3 lines (the actual order might be different)
# There will be some exceptions or errors because advertised listeners cannot be resolved in your Kafka network.
# TiDB Cloud will make them resolvable and route requests to the right broker when you create a changefeed that connects to this Kafka cluster by Private Link.
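# Note: the DisconnectException errors in the expected output below are normal at this point because the
# advertised listener addresses cannot be resolved yet. If the command cannot connect at all (for example,
# "connection refused" or a timeout), check that every broker was restarted with the EXTERNAL listener on
# port 39092 and that the broker security group allows that port from your client node.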
-b1.usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:9093 (id: 1 rack: null) -> ERROR: org.apache.kafka.common.errors.DisconnectException -b2.usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:9094 (id: 2 rack: null) -> ERROR: org.apache.kafka.common.errors.DisconnectException -b3.usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:9095 (id: 3 rack: null) -> ERROR: org.apache.kafka.common.errors.DisconnectException -``` - -## Step 2. Expose the Kafka cluster as Private Link Service - -### 1. Set up the load balancer - -Create a network load balancer with four target groups with different ports. One target group is for bootstrap, and the others will map to different brokers. - -1. bootstrap target group => 9092 => broker-node1:39092,broker-node2:39092,broker-node3:39092 -2. broker target group 1 => 9093 => broker-node1:39092 -3. broker target group 2 => 9094 => broker-node2:39092 -4. broker target group 3 => 9095 => broker-node3:39092 - -If you have more broker role nodes, you need to add more mappings. Ensure that you have at least one node in the bootstrap target group. It is recommended to add three nodes, one for each AZ for resilience. - -Do the following to set up the load balancer: - -1. Go to [Target groups](https://console.aws.amazon.com/ec2/home#CreateTargetGroup:) to create four target groups. - - - Bootstrap target group - - - **Target type**: `Instances` - - **Target group name**: `bootstrap-target-group` - - **Protocol**: `TCP` - - **Port**: `9092` - - **IP address type**: `IPv4` - - **VPC**: `Kafka VPC` - - **Health check protocol**: `TCP` - - **Register targets**: `broker-node1:39092`, `broker-node2:39092`, `broker-node3:39092` - - - Broker target group 1 - - - **Target type**: `Instances` - - **Target group name**: `broker-target-group-1` - - **Protocol**: `TCP` - - **Port**: `9093` - - **IP address type**: `IPv4` - - **VPC**: `Kafka VPC` - - **Health check protocol**: `TCP` - - **Register targets**: `broker-node1:39092` - - - Broker target group 2 - - - **Target type**: `Instances` - - **Target group name**: `broker-target-group-2` - - **Protocol**: `TCP` - - **Port**: `9094` - - **IP address type**: `IPv4` - - **VPC**: `Kafka VPC` - - **Health check protocol**: `TCP` - - **Register targets**: `broker-node2:39092` - - - Broker target group 3 - - - **Target type**: `Instances` - - **Target group name**: `broker-target-group-3` - - **Protocol**: `TCP` - - **Port**: `9095` - - **IP address type**: `IPv4` - - **VPC**: `Kafka VPC` - - **Health check protocol**: `TCP` - - **Register targets**: `broker-node3:39092` - -2. Go to [Load balancers](https://console.aws.amazon.com/ec2/home#LoadBalancers:) to create a network load balancer. - - - **Load balancer name**: `kafka-lb` - - **Schema**: `Internal` - - **Load balancer IP address type**: `IPv4` - - **VPC**: `Kafka VPC` - - **Availability Zones**: - - `usw2-az1` with `broker-usw2-az1 subnet` - - `usw2-az2` with `broker-usw2-az2 subnet` - - `usw2-az3` with `broker-usw2-az3 subnet` - - **Security groups**: create a new security group with the following rules. - - Inbound rule allows all TCP from Kafka VPC: Type - `{ports of target groups}`, for example, `9092-9095`; Source - `{CIDR of TiDB Cloud}`. To get the CIDR of TiDB Cloud in the region, switch to your target project using the combo box in the upper-left corner of the [TiDB Cloud console](https://tidbcloud.com), click **Project Settings** > **Network Access** in the left navigation pane, and then click **Project CIDR** > **AWS**. 
- - Outbound rule allows all TCP to Kafka VPC: Type - `All TCP`; Destination - `Anywhere-IPv4` - - Listeners and routing: - - Protocol: `TCP`; Port: `9092`; Forward to: `bootstrap-target-group` - - Protocol: `TCP`; Port: `9093`; Forward to: `broker-target-group-1` - - Protocol: `TCP`; Port: `9094`; Forward to: `broker-target-group-2` - - Protocol: `TCP`; Port: `9095`; Forward to: `broker-target-group-3` - -3. Test the load balancer in the bastion node. This example only tests the Kafka bootstrap. Because the load balancer is listening on the Kafka EXTERNAL listener, the addresses of EXTERNAL advertised listeners can not be resolved in the bastion node. Note down the `kafka-lb` DNS name from the load balancer detail page, for example `kafka-lb-77405fa57191adcb.elb.us-west-2.amazonaws.com`. Execute the script in the bastion node. - - ```shell - # Replace {lb_dns_name} to your actual value - export JAVA_HOME=/home/ec2-user/jdk-22.0.2 - ./kafka_2.13-3.7.1/bin/kafka-broker-api-versions.sh --bootstrap-server {lb_dns_name}:9092 - - # Expected output for the last 3 lines (the actual order might be different) - b1.usw2-az1.abc.us-west-2.aws.3199015.tidbcloud.com:9093 (id: 1 rack: null) -> ERROR: org.apache.kafka.common.errors.DisconnectException - b2.usw2-az2.abc.us-west-2.aws.3199015.tidbcloud.com:9094 (id: 2 rack: null) -> ERROR: org.apache.kafka.common.errors.DisconnectException - b3.usw2-az3.abc.us-west-2.aws.3199015.tidbcloud.com:9095 (id: 3 rack: null) -> ERROR: org.apache.kafka.common.errors.DisconnectException - - # You can also try bootstrap in other ports 9093/9094/9095. It will succeed probabilistically because NLB in AWS resolves LB DNS to the IP address of any availability zone and disables cross-zone load balancing by default. - # If you enable cross-zone load balancing in LB, it will succeed. However, it is unnecessary and might cause additional cross-AZ traffic. - ``` - -### 2. Set up Private Link Service - -1. Go to [Endpoint service](https://console.aws.amazon.com/vpcconsole/home#EndpointServices:). Click **Create endpoint service** to create a Private Link service for the Kafka load balancer. - - - **Name**: `kafka-pl-service` - - **Load balancer type**: `Network` - - **Load balancers**: `kafka-lb` - - **Included Availability Zones**: `usw2-az1`,`usw2-az2`, `usw2-az3` - - **Require acceptance for endpoint**: `Acceptance required` - - **Enable private DNS name**: `No` - -2. Note down the **Service name**. You need to provide it to TiDB Cloud, for example `com.amazonaws.vpce.us-west-2.vpce-svc-0f49e37e1f022cd45`. - -3. On the detail page of the kafka-pl-service, click the **Allow principals** tab, and allow the AWS account of TiDB Cloud to create the endpoint. You can get the AWS account of TiDB Cloud in [Prerequisites](#prerequisites), for example, `arn:aws:iam:::root`. - -## Step 3. Connect from TiDB Cloud - -1. Return to the [TiDB Cloud console](https://tidbcloud.com) to create a changefeed for the instance to connect to the Kafka cluster by **Private Link**. For more information, see [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md). - -2. When you proceed to **Configure the changefeed target > Connectivity Method > Private Link**, fill in the following fields with corresponding values and other fields as needed. - - - **Kafka Type**: `3 AZs`. Ensure that your Kafka cluster is deployed in the same three AZs. - - **Kafka Advertised Listener Pattern**: `abc`. 
It is the same as the unique random string you use to generate **Kafka Advertised Listener Pattern** in [Prerequisites](#prerequisites). - - **Endpoint Service Name**: the Kafka service name. - - **Bootstrap Ports**: `9092`. A single port is sufficient because you configure a dedicated bootstrap target group behind it. - -3. Proceed with the steps in [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md). - -Now you have successfully finished the task. - -## FAQ - -### How to connect to the same Kafka Private Link service from two different TiDB Cloud projects? - -If you have already followed this document to successfully set up the connection from the first project, you can connect to the same Kafka Private Link service from the second project as follows: - -1. Follow instructions from the beginning of this document. - -2. When you proceed to [Step 1. Set up a Kafka cluster](#step-1-set-up-a-kafka-cluster), follow [Reconfigure a running Kafka cluster](#reconfigure-a-running-kafka-cluster) to create another group of EXTERNAL listeners and advertised listeners. You can name it as **EXTERNAL2**. Note that the port range of **EXTERNAL2** cannot overlap with the **EXTERNAL**. - -3. After reconfiguring brokers, add another target group in the load balancer, including the bootstrap and broker target groups. - -4. Configure the TiDB Cloud connection with the following information: - - - New Bootstrap port - - New Kafka Advertised Listener Group - - The same Endpoint Service diff --git a/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md b/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md index 59848945bdfdf..58bb4740d84c8 100644 --- a/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md +++ b/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md @@ -765,7 +765,26 @@ Do the following to set up the load balancer: ## Step 3. Connect from TiDB Cloud -1. Return to the [TiDB Cloud console](https://tidbcloud.com) to create a changefeed for the clusterinstance to connect to the Kafka cluster by **Private Link**. For more information, see [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md). + + +1. Return to the [TiDB Cloud console](https://tidbcloud.com) to create a changefeed for the cluster to connect to the Kafka cluster by **Private Link**. For more information, see [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md). + +2. When you proceed to **Configure the changefeed target > Connectivity Method > Private Link**, fill in the following fields with corresponding values and other fields as needed. + + - **Kafka Type**: `3 AZs`. Ensure that your Kafka cluster is deployed in the same three AZs. + - **Kafka Advertised Listener Pattern**: `abc`. It is the same as the unique random string you use to generate **Kafka Advertised Listener Pattern** in [Prerequisites](#prerequisites). + - **Endpoint Service Name**: the Kafka service name. + - **Bootstrap Ports**: `9092`. A single port is sufficient because you configure a dedicated bootstrap target group behind it. + +3. Proceed with the steps in [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md). + +Now you have successfully finished the task. + + + + + +1. Return to the [TiDB Cloud console](https://tidbcloud.com) to create a changefeed for the instance to connect to the Kafka cluster by **Private Link**. For more information, see [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md). 2. 
When you proceed to **Configure the changefeed target > Connectivity Method > Private Link**, fill in the following fields with corresponding values and other fields as needed. @@ -778,6 +797,8 @@ Do the following to set up the load balancer: Now you have successfully finished the task. + + ## FAQ ### How to connect to the same Kafka Private Link service from two different TiDB Cloud projects? From 7dcd7c405cb2e5b7c4165c95a2438e01b7108ed7 Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Mon, 27 Oct 2025 16:47:03 +0800 Subject: [PATCH 04/30] Apply suggestions from code review Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --- tidb-cloud/tidb-cloud-billing-ticdc-rcu.md | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md b/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md index d037ba22b0f65..c4110563fa798 100644 --- a/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md +++ b/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md @@ -8,9 +8,11 @@ aliases: ['/tidbcloud/tidb-cloud-billing-tcu'] -## RCU cost for TiDB Cloud Dedicate +## RCU cost for TiDB Cloud Dedicated + + +TiDB Cloud Dedicated measures the capacity of [changefeeds](/tidb-cloud/changefeed-overview.md) in TiCDC Replication Capacity Units (RCUs). When you [create a changefeed](/tidb-cloud/changefeed-overview.md#create-a-changefeed) for a cluster, you can select an appropriate specification. The higher the RCU, the better the replication performance. You will be charged for these TiCDC changefeed RCUs. -TiDB Cloud Dedicate measures the capacity of [changefeeds](/tidb-cloud/changefeed-overview.md) in TiCDC Replication Capacity Units (RCUs). When you [create a changefeed](/tidb-cloud/changefeed-overview.md#create-a-changefeed) for a cluster, you can select an appropriate specification. The higher the RCU, the better the replication performance. You will be charged for these TiCDC changefeed RCUs. ### Number of TiCDC RCUs @@ -46,7 +48,8 @@ To learn about the supported regions and the price of TiDB Cloud for each TiCDC ## CCU cost for TiDB Cloud Premium -TiDB Cloud Premium measures the capacity of [changefeeds](/tidb-cloud/changefeed-overview-premium.md) in TiCDC Changefeed Capacity Units (CCUs). When you [create a changefeed](/tidb-cloud/changefeed-overview-premium.md#create-a-changefeed) for a instance, you can select an appropriate specification. The higher the CCU, the better the replication performance. You will be charged for these TiCDC CCUs. +TiDB Cloud Premium measures the capacity of [changefeeds](/tidb-cloud/changefeed-overview-premium.md) in TiCDC Changefeed Capacity Units (CCUs). When you [create a changefeed](/tidb-cloud/changefeed-overview-premium.md#create-a-changefeed) for an instance, you can select an appropriate specification. The higher the CCU, the better the replication performance. You will be charged for these TiCDC CCUs. 
+ ### Number of TiCDC CCUs From 32a665546c266dc67ca1066d5cac9e6b22708b0e Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Mon, 27 Oct 2025 16:49:05 +0800 Subject: [PATCH 05/30] wrap ali content with custom content and make minor wording updates Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --- .../set-up-sink-private-endpoint-premium.md | 20 ++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/tidb-cloud/set-up-sink-private-endpoint-premium.md b/tidb-cloud/set-up-sink-private-endpoint-premium.md index 1865e2d88b798..6f30e04ee0435 100644 --- a/tidb-cloud/set-up-sink-private-endpoint-premium.md +++ b/tidb-cloud/set-up-sink-private-endpoint-premium.md @@ -5,7 +5,7 @@ summary: Learn how to set up a private endpoint for changefeeds. # Set Up Private Endpoint for Changefeeds -This document describes how to create a private endpoint for changefeeds in your TiDB Cloud Premium instance, enabling you to securely stream data to self-hosted Kafka or MySQL through private connectivity. +This document describes how to create a private endpoint for changefeeds in your TiDB Cloud Premium instances, enabling you to securely stream data to self-hosted Kafka or MySQL through private connectivity. ## Prerequisites @@ -17,13 +17,14 @@ This document describes how to create a private endpoint for changefeeds in your Only users with any of the following roles in your organization can create private endpoints for changefeeds: - `Organization Owner` -- `Instance Admin` for corresponding instance +- `Instance Admin` for the corresponding instance + For more information about roles in TiDB Cloud, see [User roles](/tidb-cloud/manage-user-access.md#user-roles). ### Network -Private endpoints leverage **Private Link** technology from cloud providers, enabling resources in your VPC to connect to services in other VPCs through private IP addresses, as if those services were hosted directly within your VPC. +Private endpoints leverage the **Private Link** technology from cloud providers, enabling resources in your VPC to connect to services in other VPCs through private IP addresses, as if those services were hosted directly within your VPC.
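For example, if your downstream service is already exposed through an AWS endpoint service, you can confirm the service name, its supported AZs, and whether acceptance is required with the AWS CLI before you continue. The service name and region below are placeholders; replace them with your own values:

```shell
# Describe a specific endpoint service to confirm its name, supported AZ IDs, and acceptance setting.
aws ec2 describe-vpc-endpoint-services \
  --region us-west-2 \
  --service-names com.amazonaws.vpce.us-west-2.vpce-svc-0f49e37e1f022cd45 \
  --query 'ServiceDetails[0].{Name:ServiceName,AZs:AvailabilityZones,AcceptanceRequired:AcceptanceRequired}'
```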
@@ -36,6 +37,9 @@ If your changefeed downstream service is hosted on AWS, collect the following in
 If the Private Endpoint Service is not available for your downstream service, follow [Step 2. Expose the Kafka instance as Private Link Service](/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service-premium.md#step-2-expose-the-kafka-instance-as-private-link-service) to set up the load balancer and the Private Link Service.
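The AZ IDs that TiDB Cloud expects (for example, `usw2-az1`) are account-independent zone IDs rather than zone names, so you might need to map the zone names used by your deployment to zone IDs. The region below is a placeholder; use the region of your downstream service:

```shell
# List the zone name to zone ID mapping for the region.
aws ec2 describe-availability-zones \
  --region us-west-2 \
  --query 'AvailabilityZones[].{Name:ZoneName,Id:ZoneId}' \
  --output table
```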
+ + +
 If your changefeed downstream service is hosted on Alibaba Cloud, collect the following information:
@@ -44,6 +48,7 @@ If your changefeed downstream service is hosted on Alibaba Cloud, collect the fo
 - The availability zones (AZs) where your downstream service is deployed
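If you are not sure which zone IDs your downstream service occupies, you can list the zones of the region with the Alibaba Cloud CLI, assuming `aliyun` is installed and configured for your account. The region below is a placeholder; replace it with your own region:

```shell
# List the zone IDs available in the region.
aliyun ecs DescribeZones --RegionId cn-hangzhou
```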
+
@@ -51,7 +56,7 @@ If your changefeed downstream service is hosted on Alibaba Cloud, collect the fo 1. Log in to the [TiDB Cloud console](https://tidbcloud.com/). -2. On the **instances** page, click the name of your target instance to go to its overview page. +2. On the [**TiDB Instances**](https://tidbcloud.com/tidbs) page, click the name of your target instance to go to its overview page. > **Tip:** > @@ -80,11 +85,15 @@ The configuration steps vary depending on the cloud provider where your instance 8. Click **Create** to validate the configurations and create the private endpoint. + + +
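If you configured your endpoint service with **Acceptance required**, the connection request that TiDB Cloud creates stays pending until you accept it on the AWS side. The IDs below are placeholders; replace them with your endpoint service ID and the pending endpoint ID shown in the output of the first command:

```shell
# List pending connection requests for your endpoint service.
aws ec2 describe-vpc-endpoint-connections \
  --filters Name=service-id,Values=vpce-svc-0f49e37e1f022cd45

# Accept the connection request from TiDB Cloud.
aws ec2 accept-vpc-endpoint-connections \
  --service-id vpce-svc-0f49e37e1f022cd45 \
  --vpc-endpoint-ids vpce-0123456789abcdef0
```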
 1. On the **Networking** page, click **Create Private Endpoint** in the **Alibaba Cloud Private Endpoint for Changefeed** section.
 2. In the **Create Private Endpoint for Changefeed** dialog, enter a name for the private endpoint.
-3. Follow the reminder to to whitelist TiDB Cloud's Alibaba Cloud account ID for your endpoint service to grant the TiDB Cloud VPC access.
+3. Follow the reminder to whitelist TiDB Cloud's Alibaba Cloud account ID for your endpoint service to grant the TiDB Cloud VPC access.
+
 4. Enter the **Endpoint Service Name** that you collected in the [Network](#network) section.
 5. Select the **Number of AZs**. Ensure that the number of AZs and the AZ IDs match your Kafka deployment.
 6. If this private endpoint is created for Apache Kafka, enable the **Advertised Listener for Kafka** option.
@@ -96,4 +105,5 @@ The configuration steps vary depending on the cloud provider where your instance
 8. Click **Create** to validate the configurations and create the private endpoint.
+
From 7bd4f0e9cf56c8c9ace1a3236945c7c8c0164e08 Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Mon, 27 Oct 2025 17:27:58 +0800 Subject: [PATCH 06/30] Apply suggestions from code review Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --- tidb-cloud/changefeed-overview-premium.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/tidb-cloud/changefeed-overview-premium.md b/tidb-cloud/changefeed-overview-premium.md index 4f14bbf34039b..692aededc9a61 100644 --- a/tidb-cloud/changefeed-overview-premium.md +++ b/tidb-cloud/changefeed-overview-premium.md @@ -40,14 +40,16 @@ To create a changefeed, refer to the tutorials: ## Scale a changefeed -You can change the TiCDC Changefeed Capacity Units (CCUs) of a changefeed by scaling up or down the changfeed. +You can change the TiCDC Changefeed Capacity Units (CCUs) of a changefeed by scaling up or down the changefeed. + 1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB instance. 2. Locate the corresponding changefeed you want to scale, and click **...** > **Scale Up/Down** in the **Action** column. 3. Select a new specification. 4. Click **Submit**. -It takes about 10 minutes to complete the scaling process (during which the changfeed works normally) and a few seconds to switch to the new specification (during which the changefeed will be paused and resumed automatically). +It takes about 10 minutes to complete the scaling process (during which the changefeed works normally) and a few seconds to switch to the new specification (during which the changefeed will be paused and resumed automatically). + ## Pause or resume a changefeed From 547a121713618652a0bfa91d698318fa531dd8ba Mon Sep 17 00:00:00 2001 From: qiancai Date: Mon, 27 Oct 2025 17:43:55 +0800 Subject: [PATCH 07/30] Update setup-aws-self-hosted-kafka-private-link-service.md --- ...-self-hosted-kafka-private-link-service.md | 23 +------------------ 1 file changed, 1 insertion(+), 22 deletions(-) diff --git a/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md b/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md index 58bb4740d84c8..59848945bdfdf 100644 --- a/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md +++ b/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md @@ -765,26 +765,7 @@ Do the following to set up the load balancer: ## Step 3. Connect from TiDB Cloud - - -1. Return to the [TiDB Cloud console](https://tidbcloud.com) to create a changefeed for the cluster to connect to the Kafka cluster by **Private Link**. For more information, see [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md). - -2. When you proceed to **Configure the changefeed target > Connectivity Method > Private Link**, fill in the following fields with corresponding values and other fields as needed. - - - **Kafka Type**: `3 AZs`. Ensure that your Kafka cluster is deployed in the same three AZs. - - **Kafka Advertised Listener Pattern**: `abc`. It is the same as the unique random string you use to generate **Kafka Advertised Listener Pattern** in [Prerequisites](#prerequisites). - - **Endpoint Service Name**: the Kafka service name. - - **Bootstrap Ports**: `9092`. A single port is sufficient because you configure a dedicated bootstrap target group behind it. - -3. Proceed with the steps in [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md). - -Now you have successfully finished the task. - - - - - -1. 
Return to the [TiDB Cloud console](https://tidbcloud.com) to create a changefeed for the instance to connect to the Kafka cluster by **Private Link**. For more information, see [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md). +1. Return to the [TiDB Cloud console](https://tidbcloud.com) to create a changefeed for the clusterinstance to connect to the Kafka cluster by **Private Link**. For more information, see [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md). 2. When you proceed to **Configure the changefeed target > Connectivity Method > Private Link**, fill in the following fields with corresponding values and other fields as needed. @@ -797,8 +778,6 @@ Now you have successfully finished the task. Now you have successfully finished the task. - - ## FAQ ### How to connect to the same Kafka Private Link service from two different TiDB Cloud projects? From 092d87f7e57339a96c2f57672a5006043be55be0 Mon Sep 17 00:00:00 2001 From: qiancai Date: Mon, 27 Oct 2025 18:03:41 +0800 Subject: [PATCH 08/30] Update changefeed-overview.md --- tidb-cloud/changefeed-overview.md | 37 +++++++++++++++++++++++++------ 1 file changed, 30 insertions(+), 7 deletions(-) diff --git a/tidb-cloud/changefeed-overview.md b/tidb-cloud/changefeed-overview.md index d7343c97464d2..10f26f63164d7 100644 --- a/tidb-cloud/changefeed-overview.md +++ b/tidb-cloud/changefeed-overview.md @@ -9,7 +9,7 @@ TiDB Cloud changefeed helps you stream data from TiDB Cloud to other data servic > **Note:** > -> - Currently, TiDB Cloud only allows up to 100 changefeeds per cluster. +> - Currently, TiDB Cloud only allows up to 100 changefeeds per clusterinstance. > - Currently, TiDB Cloud only allows up to 100 table filter rules per changefeed. > - For [{{{ .starter }}}](/tidb-cloud/select-cluster-tier.md#starter) and [{{{ .essential }}}](/tidb-cloud/select-cluster-tier.md#essential) clusters, the changefeed feature is unavailable. @@ -17,13 +17,13 @@ TiDB Cloud changefeed helps you stream data from TiDB Cloud to other data servic To access the changefeed feature, take the following steps: -1. In the [TiDB Cloud console](https://tidbcloud.com), navigate to the [**Clusters**](https://tidbcloud.com/project/clusters) page of your project. +1. In the [TiDB Cloud console](https://tidbcloud.com), navigate to the [**Clusters**](https://tidbcloud.com/project/clusters) page of your project.In the [TiDB Cloud console](https://tidbcloud.com), navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page. > **Tip:** > > You can use the combo box in the upper-left corner to switch between organizations, projects, and clusters. -2. Click the name of your target cluster to go to its overview page, and then click **Data** > **Changefeed** in the left navigation pane. The changefeed page is displayed. +2. Click the name of your target clusterinstance to go to its overview page, and then click **Data** > **Changefeed** in the left navigation pane. The changefeed page is displayed. On the **Changefeed** page, you can create a changefeed, view a list of existing changefeeds, and operate the existing changefeeds (such as scaling, pausing, resuming, editing, and deleting a changefeed). @@ -38,12 +38,28 @@ To create a changefeed, refer to the tutorials: ## Query Changefeed RCUs + + +For TiDB Cloud Dedicated, you can query the TiCDC Replication Capacity Units (RCUs) of a changefeed. + 1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. 2. 
Locate the corresponding changefeed you want to query, and click **...** > **View** in the **Action** column. 3. You can see the current TiCDC Replication Capacity Units (RCUs) in the **Specification** area of the page. + + + +For TiDB Cloud Premium, you can query the TiCDC Changefeed Capacity Units (CCUs) of a changefeed. + +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB instance. +2. Locate the corresponding changefeed you want to query, and click **...** > **View** in the **Action** column. +3. You can see the current TiCDC Changefeed Capacity Units (CCUs) in the **Specification** area of the page. + + ## Scale a changefeed + + You can change the TiCDC Replication Capacity Units (RCUs) of a changefeed by scaling up or down the changfeed. > **Note:** @@ -51,7 +67,14 @@ You can change the TiCDC Replication Capacity Units (RCUs) of a changefeed by sc > - To scale a changefeed for a cluster, make sure that all changefeeds for this cluster are created after March 28, 2023. > - If a cluster has changefeeds created before March 28, 2023, neither the existing changefeeds nor newly created changefeeds for this cluster support scaling up or down. -1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. + + + +You can change the TiCDC Changefeed Capacity Units (CCUs) of a changefeed by scaling up or down the changfeed. + + + +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB clusterinstance. 2. Locate the corresponding changefeed you want to scale, and click **...** > **Scale Up/Down** in the **Action** column. 3. Select a new specification. 4. Click **Submit**. @@ -60,7 +83,7 @@ It takes about 10 minutes to complete the scaling process (during which the chan ## Pause or resume a changefeed -1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB clusterinstance. 2. Locate the corresponding changefeed you want to pause or resume, and click **...** > **Pause/Resume** in the **Action** column. ## Edit a changefeed @@ -69,7 +92,7 @@ It takes about 10 minutes to complete the scaling process (during which the chan > > TiDB Cloud currently only allows editing changefeeds in the paused status. -1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB clusterinstance. 2. Locate the changefeed you want to pause, and click **...** > **Pause** in the **Action** column. 3. When the changefeed status changes to `Paused`, click **...** > **Edit** to edit the corresponding changefeed. @@ -84,7 +107,7 @@ It takes about 10 minutes to complete the scaling process (during which the chan ## Delete a changefeed -1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB clusterinstance. 2. Locate the corresponding changefeed you want to delete, and click **...** > **Delete** in the **Action** column. 
## Changefeed billing From 5b88014386ae89d8ad37218ae20e303dbb812142 Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Tue, 28 Oct 2025 09:41:24 +0800 Subject: [PATCH 09/30] Update tidb-cloud/changefeed-overview.md --- tidb-cloud/changefeed-overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tidb-cloud/changefeed-overview.md b/tidb-cloud/changefeed-overview.md index 10f26f63164d7..0ea3696dcca1c 100644 --- a/tidb-cloud/changefeed-overview.md +++ b/tidb-cloud/changefeed-overview.md @@ -36,7 +36,7 @@ To create a changefeed, refer to the tutorials: - [Sink to TiDB Cloud](/tidb-cloud/changefeed-sink-to-tidb-cloud.md) - [Sink to cloud storage](/tidb-cloud/changefeed-sink-to-cloud-storage.md) -## Query Changefeed RCUs +## Query changefeed capacity From 195d9dd9de44e65e67d10edfd6e01c5900188288 Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Tue, 28 Oct 2025 09:43:23 +0800 Subject: [PATCH 10/30] Delete tidb-cloud/changefeed-overview-premium.md --- tidb-cloud/changefeed-overview-premium.md | 102 ---------------------- 1 file changed, 102 deletions(-) delete mode 100644 tidb-cloud/changefeed-overview-premium.md diff --git a/tidb-cloud/changefeed-overview-premium.md b/tidb-cloud/changefeed-overview-premium.md deleted file mode 100644 index 692aededc9a61..0000000000000 --- a/tidb-cloud/changefeed-overview-premium.md +++ /dev/null @@ -1,102 +0,0 @@ ---- -title: Changefeed -summary: TiDB Cloud changefeed helps you stream data from TiDB Cloud to other data services. ---- - -# Changefeed - -TiDB Cloud changefeed helps you stream data from TiDB Cloud to other data services. Currently, TiDB Cloud supports streaming data to Apache Kafka, MySQL, TiDB Cloud and cloud storage. - -> **Note:** -> -> - Currently, TiDB Cloud only allows up to 100 changefeeds per instance. -> - Currently, TiDB Cloud only allows up to 100 table filter rules per changefeed. -> - For [{{{ .starter }}}](/tidb-cloud/select-cluster-tier.md#starter) and [{{{ .essential }}}](/tidb-cloud/select-cluster-tier.md#essential) instances, the changefeed feature is unavailable. - -## View the Changefeed page - -To access the changefeed feature, take the following steps: - -1. In the [TiDB Cloud console](https://tidbcloud.com), navigate to the **TiDB Instance** page. - -2. Click the name of your target instance to go to its overview page, and then click **Data** > **Changefeed** in the left navigation pane. The changefeed page is displayed. - -On the **Changefeed** page, you can create a changefeed, view a list of existing changefeeds, and operate the existing changefeeds (such as scaling, pausing, resuming, editing, and deleting a changefeed). - -## Create a changefeed - -To create a changefeed, refer to the tutorials: - -- [Sink to Apache Kafka](/tidb-cloud/changefeed-sink-to-apache-kafka.md) -- [Sink to MySQL](/tidb-cloud/changefeed-sink-to-mysql.md) -- [Sink to TiDB Cloud](/tidb-cloud/changefeed-sink-to-tidb-cloud.md) -- [Sink to cloud storage](/tidb-cloud/changefeed-sink-to-cloud-storage.md) - -## Query Changefeed Capacity Units - -1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB instance. -2. Locate the corresponding changefeed you want to query, and click **...** > **View** in the **Action** column. -3. You can see the current TiCDC Changefeed Capacity Units (CCUs) in the **Specification** area of the page. - -## Scale a changefeed - -You can change the TiCDC Changefeed Capacity Units (CCUs) of a changefeed by scaling up or down the changefeed. - - -1. 
Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB instance. -2. Locate the corresponding changefeed you want to scale, and click **...** > **Scale Up/Down** in the **Action** column. -3. Select a new specification. -4. Click **Submit**. - -It takes about 10 minutes to complete the scaling process (during which the changefeed works normally) and a few seconds to switch to the new specification (during which the changefeed will be paused and resumed automatically). - - -## Pause or resume a changefeed - -1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB instance. -2. Locate the corresponding changefeed you want to pause or resume, and click **...** > **Pause/Resume** in the **Action** column. - -## Edit a changefeed - -> **Note:** -> -> TiDB Cloud currently only allows editing changefeeds in the paused status. - -1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB instance. -2. Locate the changefeed you want to pause, and click **...** > **Pause** in the **Action** column. -3. When the changefeed status changes to `Paused`, click **...** > **Edit** to edit the corresponding changefeed. - - TiDB Cloud populates the changefeed configuration by default. You can modify the following configurations: - - - Apache Kafka sink: all configurations. - - MySQL sink: **MySQL Connection**, **Table Filter**, and **Event Filter**. - - TiDB Cloud sink: **TiDB Cloud Connection**, **Table Filter**, and **Event Filter**. - - Cloud storage sink: **Storage Endpoint**, **Table Filter**, and **Event Filter**. - -4. After editing the configuration, click **...** > **Resume** to resume the corresponding changefeed. - -## Delete a changefeed - -1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB instance. -2. Locate the corresponding changefeed you want to delete, and click **...** > **Delete** in the **Action** column. - -## Changefeed billing - -To learn the billing for changefeeds in TiDB Cloud, see [Changefeed billing](/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md). - -## Changefeed states - -The state of a replication task represents the running state of the replication task. During the running process, replication tasks might fail with errors, or be manually paused or resumed. These behaviors can lead to changes of the replication task state. - -The states are described as follows: - -- `CREATING`: the replication task is being created. -- `RUNNING`: the replication task runs normally and the checkpoint-ts proceeds normally. -- `EDITING`: the replication task is being edited. -- `PAUSING`: the replication task is being paused. -- `PAUSED`: the replication task is paused. -- `RESUMING`: the replication task is being resumed. -- `DELETING`: the replication task is being deleted. -- `DELETED`: the replication task is deleted. -- `WARNING`: the replication task returns a warning. The replication cannot continue due to some recoverable errors. The changefeed in this state keeps trying to resume until the state transfers to `RUNNING`. The changefeed in this state blocks [GC operations](https://docs.pingcap.com/tidb/stable/garbage-collection-overview). -- `FAILED`: the replication task fails. Due to some errors, the replication task cannot resume and cannot be recovered automatically. If the issues are resolved before the garbage collection (GC) of the incremental data, you can manually resume the failed changefeed. 
The default Time-To-Live (TTL) duration for incremental data is 24 hours, which means that the GC mechanism does not delete any data within 24 hours after the changefeed is interrupted. From d7b12c1e2fde9a998c85315f8cb4821d047ebb85 Mon Sep 17 00:00:00 2001 From: qiancai Date: Mon, 27 Oct 2025 18:03:41 +0800 Subject: [PATCH 11/30] Update changefeed-overview.md --- tidb-cloud/changefeed-overview.md | 37 ++++++++++++++++---- tidb-cloud/changefeed-sink-to-mysql.md | 48 +++++++++++++++++++++++--- 2 files changed, 73 insertions(+), 12 deletions(-) diff --git a/tidb-cloud/changefeed-overview.md b/tidb-cloud/changefeed-overview.md index d7343c97464d2..10f26f63164d7 100644 --- a/tidb-cloud/changefeed-overview.md +++ b/tidb-cloud/changefeed-overview.md @@ -9,7 +9,7 @@ TiDB Cloud changefeed helps you stream data from TiDB Cloud to other data servic > **Note:** > -> - Currently, TiDB Cloud only allows up to 100 changefeeds per cluster. +> - Currently, TiDB Cloud only allows up to 100 changefeeds per clusterinstance. > - Currently, TiDB Cloud only allows up to 100 table filter rules per changefeed. > - For [{{{ .starter }}}](/tidb-cloud/select-cluster-tier.md#starter) and [{{{ .essential }}}](/tidb-cloud/select-cluster-tier.md#essential) clusters, the changefeed feature is unavailable. @@ -17,13 +17,13 @@ TiDB Cloud changefeed helps you stream data from TiDB Cloud to other data servic To access the changefeed feature, take the following steps: -1. In the [TiDB Cloud console](https://tidbcloud.com), navigate to the [**Clusters**](https://tidbcloud.com/project/clusters) page of your project. +1. In the [TiDB Cloud console](https://tidbcloud.com), navigate to the [**Clusters**](https://tidbcloud.com/project/clusters) page of your project.In the [TiDB Cloud console](https://tidbcloud.com), navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page. > **Tip:** > > You can use the combo box in the upper-left corner to switch between organizations, projects, and clusters. -2. Click the name of your target cluster to go to its overview page, and then click **Data** > **Changefeed** in the left navigation pane. The changefeed page is displayed. +2. Click the name of your target clusterinstance to go to its overview page, and then click **Data** > **Changefeed** in the left navigation pane. The changefeed page is displayed. On the **Changefeed** page, you can create a changefeed, view a list of existing changefeeds, and operate the existing changefeeds (such as scaling, pausing, resuming, editing, and deleting a changefeed). @@ -38,12 +38,28 @@ To create a changefeed, refer to the tutorials: ## Query Changefeed RCUs + + +For TiDB Cloud Dedicated, you can query the TiCDC Replication Capacity Units (RCUs) of a changefeed. + 1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. 2. Locate the corresponding changefeed you want to query, and click **...** > **View** in the **Action** column. 3. You can see the current TiCDC Replication Capacity Units (RCUs) in the **Specification** area of the page. + + + +For TiDB Cloud Premium, you can query the TiCDC Changefeed Capacity Units (CCUs) of a changefeed. + +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB instance. +2. Locate the corresponding changefeed you want to query, and click **...** > **View** in the **Action** column. +3. You can see the current TiCDC Changefeed Capacity Units (CCUs) in the **Specification** area of the page. 
+ + ## Scale a changefeed + + You can change the TiCDC Replication Capacity Units (RCUs) of a changefeed by scaling up or down the changfeed. > **Note:** @@ -51,7 +67,14 @@ You can change the TiCDC Replication Capacity Units (RCUs) of a changefeed by sc > - To scale a changefeed for a cluster, make sure that all changefeeds for this cluster are created after March 28, 2023. > - If a cluster has changefeeds created before March 28, 2023, neither the existing changefeeds nor newly created changefeeds for this cluster support scaling up or down. -1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. + + + +You can change the TiCDC Changefeed Capacity Units (CCUs) of a changefeed by scaling up or down the changfeed. + + + +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB clusterinstance. 2. Locate the corresponding changefeed you want to scale, and click **...** > **Scale Up/Down** in the **Action** column. 3. Select a new specification. 4. Click **Submit**. @@ -60,7 +83,7 @@ It takes about 10 minutes to complete the scaling process (during which the chan ## Pause or resume a changefeed -1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB clusterinstance. 2. Locate the corresponding changefeed you want to pause or resume, and click **...** > **Pause/Resume** in the **Action** column. ## Edit a changefeed @@ -69,7 +92,7 @@ It takes about 10 minutes to complete the scaling process (during which the chan > > TiDB Cloud currently only allows editing changefeeds in the paused status. -1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB clusterinstance. 2. Locate the changefeed you want to pause, and click **...** > **Pause** in the **Action** column. 3. When the changefeed status changes to `Paused`, click **...** > **Edit** to edit the corresponding changefeed. @@ -84,7 +107,7 @@ It takes about 10 minutes to complete the scaling process (during which the chan ## Delete a changefeed -1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. +1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB clusterinstance. 2. Locate the corresponding changefeed you want to delete, and click **...** > **Delete** in the **Action** column. ## Changefeed billing diff --git a/tidb-cloud/changefeed-sink-to-mysql.md b/tidb-cloud/changefeed-sink-to-mysql.md index 136fed436889e..4201ba14701fa 100644 --- a/tidb-cloud/changefeed-sink-to-mysql.md +++ b/tidb-cloud/changefeed-sink-to-mysql.md @@ -7,14 +7,25 @@ summary: This document explains how to stream data from TiDB Cloud to MySQL usin This document describes how to stream data from TiDB Cloud to MySQL using the **Sink to MySQL** changefeed. + + > **Note:** > > - To use the changefeed feature, make sure that your TiDB Cloud Dedicated cluster version is v6.1.3 or later. > - For [{{{ .starter }}}](/tidb-cloud/select-cluster-tier.md#starter) and [{{{ .essential }}}](/tidb-cloud/select-cluster-tier.md#essential) clusters, the changefeed feature is unavailable. 
+ + + +> **Note:** +> +> For [{{{ .starter }}}](/tidb-cloud/select-cluster-tier.md#starter) and [{{{ .essential }}}](/tidb-cloud/select-cluster-tier.md#essential) clusters, the changefeed feature is unavailable. + + + ## Restrictions -- For each TiDB Cloud cluster, you can create up to 100 changefeeds. +- For each TiDB Cloud clusterinstance, you can create up to 100 changefeeds. - Because TiDB Cloud uses TiCDC to establish changefeeds, it has the same [restrictions as TiCDC](https://docs.pingcap.com/tidb/stable/ticdc-overview#unsupported-scenarios). - If the table to be replicated does not have a primary key or a non-null unique index, the absence of a unique constraint during replication could result in duplicated data being inserted downstream in some retry scenarios. @@ -28,6 +39,8 @@ Before creating a changefeed, you need to complete the following prerequisites: ### Network + + Make sure that your TiDB Cloud cluster can connect to the MySQL service. @@ -65,10 +78,35 @@ You can connect your TiDB Cloud cluster to your MySQL service securely through a + + + + +Make sure that your TiDB Cloud instance can connect to the MySQL service. + +> **Note:** +> +> Currently, the VPC Peering feature for TiDB Cloud Premium instances is only available upon request. To request this feature, click **?** in the lower-right corner of the [TiDB Cloud console](https://tidbcloud.com) and click **Request Support**. Then, fill in "Apply for VPC Peering for TiDB Cloud Premium instance" in the **Description** field and click **Submit**. + +Private endpoints leverage **Private Link** or **Private Service Connect** technologies from cloud providers, enabling resources in your VPC to connect to services in other VPCs through private IP addresses, as if those services were hosted directly within your VPC. + +You can connect your TiDB Cloud instance to your MySQL service securely through a private endpoint. If the private endpoint is not available for your MySQL service, follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/set-up-sink-private-endpoint-premium.md) to create one. + + + ### Load existing data (optional) + + The **Sink to MySQL** connector can only sink incremental data from your TiDB cluster to MySQL after a certain timestamp. If you already have data in your TiDB cluster, you can export and load the existing data of your TiDB cluster into MySQL before enabling **Sink to MySQL**. + + + +The **Sink to MySQL** connector can only sink incremental data from your TiDB instance to MySQL after a certain timestamp. If you already have data in your TiDB instance, you can export and load the existing data of your TiDB instance into MySQL before enabling **Sink to MySQL**. + + + To load the existing data: 1. Extend the [tidb_gc_life_time](https://docs.pingcap.com/tidb/stable/system-variables#tidb_gc_life_time-new-in-v50) to be longer than the total time of the following two operations, so that historical data during the time is not garbage collected by TiDB. @@ -84,7 +122,7 @@ To load the existing data: SET GLOBAL tidb_gc_life_time = '720h'; ``` -2. Use [Dumpling](https://docs.pingcap.com/tidb/stable/dumpling-overview) to export data from your TiDB cluster, then use community tools such as [mydumper/myloader](https://centminmod.com/mydumper.html) to load data to the MySQL service. +2. 
Use [Dumpling](https://docs.pingcap.com/tidb/stable/dumpling-overview) to export data from your TiDB clusterinstance, then use community tools such as [mydumper/myloader](https://centminmod.com/mydumper.html) to load data to the MySQL service. 3. From the [exported files of Dumpling](https://docs.pingcap.com/tidb/stable/dumpling-overview#format-of-exported-files), get the start position of MySQL sink from the metadata file: @@ -106,7 +144,7 @@ If you do not load the existing data, you need to create corresponding target ta After completing the prerequisites, you can sink your data to MySQL. -1. Navigate to the cluster overview page of the target TiDB cluster, and then click **Data** > **Changefeed** in the left navigation pane. +1. Navigate to the overview page of the target TiDB clusterinstance, and then click **Data** > **Changefeed** in the left navigation pane. 2. Click **Create Changefeed**, and select **MySQL** as **Destination**. @@ -143,12 +181,12 @@ After completing the prerequisites, you can sink your data to MySQL. 8. In **Start Replication Position**, configure the starting position for your MySQL sink. - If you have [loaded the existing data](#load-existing-data-optional) using Dumpling, select **Start replication from a specific TSO** and fill in the TSO that you get from Dumpling exported metadata files. - - If you do not have any data in the upstream TiDB cluster, select **Start replication from now on**. + - If you do not have any data in the upstream TiDB clusterinstance, select **Start replication from now on**. - Otherwise, you can customize the start time point by choosing **Start replication from a specific time**. 9. Click **Next** to configure your changefeed specification. - - In the **Changefeed Specification** area, specify the number of Replication Capacity Units (RCUs) to be used by the changefeed. + - In the **Changefeed Specification** area, specify the number of Replication Capacity Units (RCUs)Changefeed Capacity Units (CCUs) to be used by the changefeed. - In the **Changefeed Name** area, specify a name for the changefeed. 10. Click **Next** to review the changefeed configuration. From 93ec858034f7d66fcf124ed10a99e1a86af99ec2 Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Tue, 28 Oct 2025 10:07:34 +0800 Subject: [PATCH 12/30] Delete tidb-cloud/changefeed-sink-to-mysql-premium.md --- .../changefeed-sink-to-mysql-premium.md | 151 ------------------ 1 file changed, 151 deletions(-) delete mode 100644 tidb-cloud/changefeed-sink-to-mysql-premium.md diff --git a/tidb-cloud/changefeed-sink-to-mysql-premium.md b/tidb-cloud/changefeed-sink-to-mysql-premium.md deleted file mode 100644 index 4f36d226aec93..0000000000000 --- a/tidb-cloud/changefeed-sink-to-mysql-premium.md +++ /dev/null @@ -1,151 +0,0 @@ ---- -title: Sink to MySQL -summary: This document explains how to stream data from TiDB Cloud to MySQL using the Sink to MySQL changefeed. It includes restrictions, prerequisites, and steps to create a MySQL sink for data replication. The process involves setting up network connections, loading existing data to MySQL, and creating target tables in MySQL. After completing the prerequisites, users can create a MySQL sink to replicate data to MySQL. ---- - -# Sink to MySQL - -This document describes how to stream data from TiDB Cloud to MySQL using the **Sink to MySQL** changefeed. 
- -> **Note:** -> -> - For [{{{ .starter }}}](/tidb-cloud/select-cluster-tier.md#starter) and [{{{ .essential }}}](/tidb-cloud/select-cluster-tier.md#essential) instances, the changefeed feature is unavailable. - -## Restrictions - -- For each TiDB Cloud instance, you can create up to 100 changefeeds. -- Because TiDB Cloud uses TiCDC to establish changefeeds, it has the same [restrictions as TiCDC](https://docs.pingcap.com/tidb/stable/ticdc-overview#unsupported-scenarios). -- If the table to be replicated does not have a primary key or a non-null unique index, the absence of a unique constraint during replication could result in duplicated data being inserted downstream in some retry scenarios. - -## Prerequisites - -Before creating a changefeed, you need to complete the following prerequisites: - -- Set up your network connection -- Export and load the existing data to MySQL (optional) -- Create corresponding target tables in MySQL if you do not load the existing data and only want to replicate incremental data to MySQL - -### Network - -Make sure that your TiDB Cloud instance can connect to the MySQL service. - - -
- -To submit an VPC Peering request, perform the steps in [TiDB Cloud Support](/tidb-cloud/tidb-cloud-support.md) to contact our support team. - -
- -
- -Private endpoints leverage **Private Link** or **Private Service Connect** technologies from cloud providers, enabling resources in your VPC to connect to services in other VPCs through private IP addresses, as if those services were hosted directly within your VPC. - -You can connect your TiDB Cloud instance to your MySQL service securely through a private endpoint. If the private endpoint is not available for your MySQL service, follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/set-up-sink-private-endpoint-premium.md) to create one. - -
- -
- -### Load existing data (optional) - -The **Sink to MySQL** connector can only sink incremental data from your TiDB instance to MySQL after a certain timestamp. If you already have data in your TiDB instance, you can export and load the existing data of your TiDB instance into MySQL before enabling **Sink to MySQL**. - -To load the existing data: - -1. Extend the [tidb_gc_life_time](https://docs.pingcap.com/tidb/stable/system-variables#tidb_gc_life_time-new-in-v50) to be longer than the total time of the following two operations, so that historical data during the time is not garbage collected by TiDB. - - - The time to export and import the existing data - - The time to create **Sink to MySQL** - - For example: - - {{< copyable "sql" >}} - - ```sql - SET GLOBAL tidb_gc_life_time = '720h'; - ``` - -2. Use [Dumpling](https://docs.pingcap.com/tidb/stable/dumpling-overview) to export data from your TiDB instance, then use community tools such as [mydumper/myloader](https://centminmod.com/mydumper.html) to load data to the MySQL service. - -3. From the [exported files of Dumpling](https://docs.pingcap.com/tidb/stable/dumpling-overview#format-of-exported-files), get the start position of MySQL sink from the metadata file: - - The following is a part of an example metadata file. The `Pos` of `SHOW MASTER STATUS` is the TSO of the existing data, which is also the start position of MySQL sink. - - ``` - Started dump at: 2020-11-10 10:40:19 - SHOW MASTER STATUS: - Log: tidb-binlog - Pos: 420747102018863124 - Finished dump at: 2020-11-10 10:40:20 - ``` - -### Create target tables in MySQL - -If you do not load the existing data, you need to create corresponding target tables in MySQL manually to store the incremental data from TiDB. Otherwise, the data will not be replicated. - -## Create a MySQL sink - -After completing the prerequisites, you can sink your data to MySQL. - -1. Navigate to the instance overview page of the target TiDB instance, and then click **Data** > **Changefeed** in the left navigation pane. - -2. Click **Create Changefeed**, and select **MySQL** as **Destination**. - -3. In **Connectivity Method**, choose the method to connect to your MySQL service. - - - If you choose **VPC Peering** or **Public IP**, fill in your MySQL endpoint. - - If you choose **Private Link**, select the private endpoint that you created in the [Network](#network) section, and then fill in the MySQL port for your MySQL service. - -4. In **Authentication**, fill in the MySQL user name and password of your MySQL service. - -5. Click **Next** to test whether TiDB can connect to MySQL successfully: - - - If yes, you are directed to the next step of configuration. - - If not, a connectivity error is displayed, and you need to handle the error. After the error is resolved, click **Next** again. - -6. Customize **Table Filter** to filter the tables that you want to replicate. For the rule syntax, refer to [table filter rules](/table-filter.md). - - - **Case Sensitive**: you can set whether the matching of database and table names in filter rules is case-sensitive. By default, matching is case-insensitive. - - **Filter Rules**: you can set filter rules in this column. By default, there is a rule `*.*`, which stands for replicating all tables. When you add a new rule, TiDB Cloud queries all the tables in TiDB and displays only the tables that match the rules in the box on the right. You can add up to 100 filter rules. 
- - **Tables with valid keys**: this column displays the tables that have valid keys, including primary keys or unique indexes. - - **Tables without valid keys**: this column shows tables that lack primary keys or unique keys. These tables present a challenge during replication because the absence of a unique identifier can result in inconsistent data when the downstream handles duplicate events. To ensure data consistency, it is recommended to add unique keys or primary keys to these tables before initiating the replication. Alternatively, you can add filter rules to exclude these tables. For example, you can exclude the table `test.tbl1` by using the rule `"!test.tbl1"`. - -7. Customize **Event Filter** to filter the events that you want to replicate. - - - **Tables matching**: you can set which tables the event filter will be applied to in this column. The rule syntax is the same as that used for the preceding **Table Filter** area. You can add up to 10 event filter rules per changefeed. - - **Event Filter**: you can use the following event filters to exclude specific events from the changefeed: - - **Ignore event**: excludes specified event types. - - **Ignore SQL**: excludes DDL events that match specified expressions. For example, `^drop` excludes statements starting with `DROP`, and `add column` excludes statements containing `ADD COLUMN`. - - **Ignore insert value expression**: excludes `INSERT` statements that meet specific conditions. For example, `id >= 100` excludes `INSERT` statements where `id` is greater than or equal to 100. - - **Ignore update new value expression**: excludes `UPDATE` statements where the new value matches a specified condition. For example, `gender = 'male'` excludes updates that result in `gender` being `male`. - - **Ignore update old value expression**: excludes `UPDATE` statements where the old value matches a specified condition. For example, `age < 18` excludes updates where the old value of `age` is less than 18. - - **Ignore delete value expression**: excludes `DELETE` statements that meet a specified condition. For example, `name = 'john'` excludes `DELETE` statements where `name` is `'john'`. - -8. In **Start Replication Position**, configure the starting position for your MySQL sink. - - - If you have [loaded the existing data](#load-existing-data-optional) using Dumpling, select **Start replication from a specific TSO** and fill in the TSO that you get from Dumpling exported metadata files. - - If you do not have any data in the upstream TiDB instance, select **Start replication from now on**. - - Otherwise, you can customize the start time point by choosing **Start replication from a specific time**. - -9. Click **Next** to configure your changefeed specification. - - - In the **Changefeed Specification** area, specify the number of Changefeed Capacity Units (CCUs) to be used by the changefeed. - - In the **Changefeed Name** area, specify a name for the changefeed. - -10. Click **Next** to review the changefeed configuration. - - If you confirm that all configurations are correct, check the compliance of cross-region replication, and click **Create**. - - If you want to modify some configurations, click **Previous** to go back to the previous configuration page. - -11. The sink starts soon, and you can see the status of the sink changes from **Creating** to **Running**. - - Click the changefeed name, and you can see more details about the changefeed, such as the checkpoint, replication latency, and other metrics. - -12. 
If you have [loaded the existing data](#load-existing-data-optional) using Dumpling, you need to restore the GC time to its original value (the default value is `10m`) after the sink is created: - -{{< copyable "sql" >}} - -```sql -SET GLOBAL tidb_gc_life_time = '10m'; -``` From 6a948c322a6deb3d996ee9b32e680d5e6d7d412f Mon Sep 17 00:00:00 2001 From: qiancai Date: Tue, 28 Oct 2025 10:36:32 +0800 Subject: [PATCH 13/30] Update changefeed-sink-to-apache-kafka.md --- tidb-cloud/changefeed-sink-to-apache-kafka.md | 65 +++++++++++++++++-- 1 file changed, 59 insertions(+), 6 deletions(-) diff --git a/tidb-cloud/changefeed-sink-to-apache-kafka.md b/tidb-cloud/changefeed-sink-to-apache-kafka.md index 0bdb4b2bb64ac..ad2d5c2ac2f65 100644 --- a/tidb-cloud/changefeed-sink-to-apache-kafka.md +++ b/tidb-cloud/changefeed-sink-to-apache-kafka.md @@ -7,17 +7,31 @@ summary: This document explains how to create a changefeed to stream data from T This document describes how to create a changefeed to stream data from TiDB Cloud to Apache Kafka. + + > **Note:** > > - To use the changefeed feature, make sure that your TiDB Cloud Dedicated cluster version is v6.1.3 or later. > - For [{{{ .starter }}}](/tidb-cloud/select-cluster-tier.md#starter) and [{{{ .essential }}}](/tidb-cloud/select-cluster-tier.md#essential) clusters, the changefeed feature is unavailable. + + + +> **Note:** +> +> For [{{{ .starter }}}](/tidb-cloud/select-cluster-tier.md#starter) and [{{{ .essential }}}](/tidb-cloud/select-cluster-tier.md#essential) clusters, the changefeed feature is unavailable. + + + ## Restrictions -- For each TiDB Cloud cluster, you can create up to 100 changefeeds. +- For each TiDB Cloud clusterinstance, you can create up to 100 changefeeds. - Currently, TiDB Cloud does not support uploading self-signed TLS certificates to connect to Kafka brokers. - Because TiDB Cloud uses TiCDC to establish changefeeds, it has the same [restrictions as TiCDC](https://docs.pingcap.com/tidb/stable/ticdc-overview#unsupported-scenarios). - If the table to be replicated does not have a primary key or a non-null unique index, the absence of a unique constraint during replication could result in duplicated data being inserted downstream in some retry scenarios. + + + - If you choose Private Link or Private Service Connect as the network connectivity method, ensure that your TiDB cluster version meets the following requirements: - For v6.5.x: version v6.5.9 or later @@ -30,6 +44,8 @@ This document describes how to create a changefeed to stream data from TiDB Clou - If you want to distribute changelogs by primary key or index value to Kafka partition with a specified index name, make sure the version of your TiDB cluster is v7.5.0 or later. - If you want to distribute changelogs by column value to Kafka partition, make sure the version of your TiDB cluster is v7.5.0 or later. + + ## Prerequisites Before creating a changefeed to stream data to Apache Kafka, you need to complete the following prerequisites: @@ -39,12 +55,14 @@ Before creating a changefeed to stream data to Apache Kafka, you need to complet ### Network -Ensure that your TiDB cluster can connect to the Apache Kafka service. You can choose one of the following connection methods: +Ensure that your TiDB clusterinstance can connect to the Apache Kafka service. 
You can choose one of the following connection methods: - Private Connect: ideal for avoiding VPC CIDR conflicts and meeting security compliance, but incurs additional [Private Data Link Cost](/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md#private-data-link-cost). - VPC Peering: suitable as a cost-effective option, but requires managing potential VPC CIDR conflicts and security considerations. - Public IP: suitable for a quick setup. + +
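Several of these options have minimum TiDB version requirements (see the **Restrictions** section above, for example the Private Link and Private Service Connect requirements). If you are not sure which version your TiDB cluster or instance runs, you can check it from any SQL client before choosing a connectivity method. The following is a minimal sketch; the version string in the comment is illustrative only:

```sql
-- Check the TiDB version before selecting a connectivity method or a
-- distribution mode that has a minimum version requirement.
SELECT VERSION();       -- for example: 8.0.11-TiDB-v7.5.0
SELECT TIDB_VERSION();  -- detailed build information
```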
@@ -87,20 +105,49 @@ It is **NOT** recommended to use Public IP in a production environment.
+
+ + + + +
+ +Private Connect leverages **Private Link** or **Private Service Connect** technologies from cloud providers to enable resources in your VPC to connect to services in other VPCs using private IP addresses, as if those services were hosted directly within your VPC. + +TiDB Cloud currently supports Private Connect only for self-hosted Kafka. It does not support direct integration with MSK, Confluent Kafka, or other Kafka SaaS services. To connect to these Kafka SaaS services via Private Connect, you can deploy a [kafka-proxy](https://github.com/grepplabs/kafka-proxy) as an intermediary, effectively exposing the Kafka service as self-hosted Kafka. + +If your Apache Kafka service is hosted on AWS, follow [Set Up Self-Hosted Kafka Private Link Service in AWS](/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md) to configure the network connection and obtain the **Bootstrap Ports** information, and then follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/set-up-sink-private-endpoint-premium.md) to create a private endpoint. + +
+
+ +If you want to provide Public IP access to your Apache Kafka service, assign Public IP addresses to all your Kafka brokers. + +It is **NOT** recommended to use Public IP in a production environment. + +
+ +
+ +Currently, the VPC Peering feature for TiDB Cloud Premium instances is only available upon request. To request this feature, click **?** in the lower-right corner of the [TiDB Cloud console](https://tidbcloud.com) and click **Request Support**. Then, fill in "Apply for VPC Peering for TiDB Cloud Premium instance" in the **Description** field and click **Submit**. + +
+
+ ### Kafka ACL authorization To allow TiDB Cloud changefeeds to stream data to Apache Kafka and create Kafka topics automatically, ensure that the following permissions are added in Kafka: - The `Create` and `Write` permissions are added for the topic resource type in Kafka. -- The `DescribeConfigs` permission is added for the cluster resource type in Kafka. +- The `DescribeConfigs` permission is added for the clusterinstance resource type in Kafka. For example, if your Kafka cluster is in Confluent Cloud, you can see [Resources](https://docs.confluent.io/platform/current/kafka/authorization.html#resources) and [Adding ACLs](https://docs.confluent.io/platform/current/kafka/authorization.html#adding-acls) in Confluent documentation for more information. ## Step 1. Open the Changefeed page for Apache Kafka 1. Log in to the [TiDB Cloud console](https://tidbcloud.com). -2. Navigate to the cluster overview page of the target TiDB cluster, and then click **Data** > **Changefeed** in the left navigation pane. +2. Navigate to the overview page of the target TiDB clusterinstance, and then click **Data** > **Changefeed** in the left navigation pane. 3. Click **Create Changefeed**, and select **Kafka** as **Destination**. ## Step 2. Configure the changefeed target @@ -140,6 +187,8 @@ The steps vary depending on the connectivity method you select. 11. Return to the [TiDB Cloud console](https://tidbcloud.com) to confirm that you have accepted the connection request. TiDB Cloud will test the connection and proceed to the next page if the test succeeds. + +
1. In **Connectivity Method**, select **Private Service Connect**. @@ -158,6 +207,9 @@ The steps vary depending on the connectivity method you select. 11. Return to the [TiDB Cloud console](https://tidbcloud.com) to confirm that you have accepted the connection request. TiDB Cloud will test the connection and proceed to the next page if the test succeeds.
+
+ +
1. In **Connectivity Method**, select **Private Link**. @@ -176,6 +228,7 @@ The steps vary depending on the connectivity method you select. 11. Return to the [TiDB Cloud console](https://tidbcloud.com) to confirm that you have accepted the connection request. TiDB Cloud will test the connection and proceed to the next page if the test succeeds.
+
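Step 3 below lists **Tables without valid keys**, that is, tables with no primary key and no non-null unique index, which can lead to inconsistent data downstream. Before you configure the table filter, you can locate such tables from a SQL client and add a suitable key. This is a minimal sketch only, not part of the console flow: the `test` schema and the `tbl1`/`id` names are placeholders for your own objects.

```sql
-- List base tables in a schema (here `test`, a placeholder) that have
-- neither a PRIMARY KEY nor a UNIQUE constraint.
SELECT t.table_schema, t.table_name
FROM information_schema.tables t
LEFT JOIN information_schema.table_constraints c
  ON  c.table_schema = t.table_schema
  AND c.table_name   = t.table_name
  AND c.constraint_type IN ('PRIMARY KEY', 'UNIQUE')
WHERE t.table_schema = 'test'
  AND t.table_type = 'BASE TABLE'
  AND c.constraint_name IS NULL;

-- A UNIQUE index on nullable columns still does not count as a valid key,
-- so review such tables separately.

-- Add a primary key to a flagged table (placeholder names).
ALTER TABLE test.tbl1 ADD PRIMARY KEY (id);
```

Alternatively, you can exclude such tables with a filter rule such as `"!test.tbl1"`, as described in Step 3.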
## Step 3. Set the changefeed @@ -219,7 +272,7 @@ The steps vary depending on the connectivity method you select. 6. If you select **Avro** as your data format, you will see some Avro-specific configurations on the page. You can fill in these configurations as follows: - In the **Decimal** and **Unsigned BigInt** configurations, specify how TiDB Cloud handles the decimal and unsigned bigint data types in Kafka messages. - - In the **Schema Registry** area, fill in your schema registry endpoint. If you enable **HTTP Authentication**, the fields for user name and password are displayed and automatically filled in with your TiDB cluster endpoint and password. + - In the **Schema Registry** area, fill in your schema registry endpoint. If you enable **HTTP Authentication**, the fields for user name and password are displayed and automatically filled in with your TiDB clusterinstance endpoint and password. 7. In the **Topic Distribution** area, select a distribution mode, and then fill in the topic name configurations according to the mode. @@ -272,7 +325,7 @@ The steps vary depending on the connectivity method you select. ## Step 4. Configure your changefeed specification -1. In the **Changefeed Specification** area, specify the number of Replication Capacity Units (RCUs) to be used by the changefeed. +1. In the **Changefeed Specification** area, specify the number of Replication Capacity Units (RCUs)Changefeed Capacity Units (CCUs) to be used by the changefeed. 2. In the **Changefeed Name** area, specify a name for the changefeed. 3. Click **Next** to check the configurations you set and go to the next page. From 94def80cdcaafb6445937fa20cc157a5db5145ad Mon Sep 17 00:00:00 2001 From: qiancai Date: Tue, 28 Oct 2025 10:37:03 +0800 Subject: [PATCH 14/30] Delete changefeed-sink-to-apache-kafka-premium.md --- ...changefeed-sink-to-apache-kafka-premium.md | 218 ------------------ 1 file changed, 218 deletions(-) delete mode 100644 tidb-cloud/changefeed-sink-to-apache-kafka-premium.md diff --git a/tidb-cloud/changefeed-sink-to-apache-kafka-premium.md b/tidb-cloud/changefeed-sink-to-apache-kafka-premium.md deleted file mode 100644 index 5e632b7c5c43c..0000000000000 --- a/tidb-cloud/changefeed-sink-to-apache-kafka-premium.md +++ /dev/null @@ -1,218 +0,0 @@ ---- -title: Sink to Apache Kafka -summary: This document explains how to create a changefeed to stream data from TiDB Cloud to Apache Kafka. It includes restrictions, prerequisites, and steps to configure the changefeed for Apache Kafka. The process involves setting up network connections, adding permissions for Kafka ACL authorization, and configuring the changefeed specification. ---- - -# Sink to Apache Kafka - -This document describes how to create a changefeed to stream data from TiDB Cloud to Apache Kafka. - -> **Note:** -> -> - For [{{{ .starter }}}](/tidb-cloud/select-cluster-tier.md#starter) and [{{{ .essential }}}](/tidb-cloud/select-cluster-tier.md#essential) instances, the changefeed feature is unavailable. - -## Restrictions - -- For each TiDB Cloud instance, you can create up to 100 changefeeds. -- Currently, TiDB Cloud does not support uploading self-signed TLS certificates to connect to Kafka brokers. -- Because TiDB Cloud uses TiCDC to establish changefeeds, it has the same [restrictions as TiCDC](https://docs.pingcap.com/tidb/stable/ticdc-overview#unsupported-scenarios). 
-- If the table to be replicated does not have a primary key or a non-null unique index, the absence of a unique constraint during replication could result in duplicated data being inserted downstream in some retry scenarios. - -## Prerequisites - -Before creating a changefeed to stream data to Apache Kafka, you need to complete the following prerequisites: - -- Set up your network connection -- Add permissions for Kafka ACL authorization - -### Network - -Ensure that your TiDB instance can connect to the Apache Kafka service. You can choose one of the following connection methods: - -- Private Connect: ideal for avoiding VPC CIDR conflicts and meeting security compliance, but incurs additional [Private Data Link Cost](/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md#private-data-link-cost). -- VPC Peering: suitable as a cost-effective option, but requires managing potential VPC CIDR conflicts and security considerations. -- Public IP: suitable for a quick setup. - - -
- -Private Connect leverages **Private Link** or **Private Service Connect** technologies from cloud providers to enable resources in your VPC to connect to services in other VPCs using private IP addresses, as if those services were hosted directly within your VPC. - -TiDB Cloud currently supports Private Connect only for self-hosted Kafka. It does not support direct integration with MSK, Confluent Kafka, or other Kafka SaaS services. To connect to these Kafka SaaS services via Private Connect, you can deploy a [kafka-proxy](https://github.com/grepplabs/kafka-proxy) as an intermediary, effectively exposing the Kafka service as self-hosted Kafka. - -- If your Apache Kafka service is hosted on AWS, follow [Set Up Self-Hosted Kafka Private Link Service in AWS](/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service-premium.md) to configure the network connection and obtain the **Bootstrap Ports** information, and then follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/set-up-sink-private-endpoint-premium.md) to create a private endpoint. - -
- -
- -If you want to provide Public IP access to your Apache Kafka service, assign Public IP addresses to all your Kafka brokers. - -It is **NOT** recommended to use Public IP in a production environment. - -
-
- -To submit a VPC Peering request, perform the steps in [TiDB Cloud Support](/tidb-cloud/tidb-cloud-support.md) to contact our support team. - -
-
- -### Kafka ACL authorization - -To allow TiDB Cloud changefeeds to stream data to Apache Kafka and create Kafka topics automatically, ensure that the following permissions are added in Kafka: - -- The `Create` and `Write` permissions are added for the topic resource type in Kafka. -- The `DescribeConfigs` permission is added for the instance resource type in Kafka. - -For example, if your Kafka instance is in Confluent Cloud, you can see [Resources](https://docs.confluent.io/platform/current/kafka/authorization.html#resources) and [Adding ACLs](https://docs.confluent.io/platform/current/kafka/authorization.html#adding-acls) in Confluent documentation for more information. - -## Step 1. Open the Changefeed page for Apache Kafka - -1. Log in to the [TiDB Cloud console](https://tidbcloud.com). -2. Navigate to the instance overview page of the target TiDB instance, and then click **Data** > **Changefeed** in the left navigation pane. -3. Click **Create Changefeed**, and select **Kafka** as **Destination**. - -## Step 2. Configure the changefeed target - -The steps vary depending on the connectivity method you select. - - -
- -1. In **Connectivity Method**, select **VPC Peering** or **Public IP**, fill in your Kafka brokers endpoints. You can use commas `,` to separate multiple endpoints. -2. Select an **Authentication** option according to your Kafka authentication configuration. - - - If your Kafka does not require authentication, keep the default option **Disable**. - - If your Kafka requires authentication, select the corresponding authentication type, and then fill in the **user name** and **password** of your Kafka account for authentication. - -3. Select your **Kafka Version**. If you do not know which one to use, use **Kafka v2**. -4. Select a **Compression** type for the data in this changefeed. -5. Enable the **TLS Encryption** option if your Kafka has enabled TLS encryption and you want to use TLS encryption for the Kafka connection. -6. Click **Next** to test the network connection. If the test succeeds, you will be directed to the next page. - -
-
- -1. In **Connectivity Method**, select **Private Link**. -2. In **Private Endpoint**, select the private endpoint that you created in the [Network](#network) section. Make sure the AZs of the private endpoint match the AZs of the Kafka deployment. -3. Fill in the **Bootstrap Ports** that you obtained from the [Network](#network) section. It is recommended that you set at least one port for one AZ. You can use commas `,` to separate multiple ports. -4. Select an **Authentication** option according to your Kafka authentication configuration. - - - If your Kafka does not require authentication, keep the default option **Disable**. - - If your Kafka requires authentication, select the corresponding authentication type, and then fill in the **user name** and **password** of your Kafka account for authentication. -5. Select your **Kafka Version**. If you do not know which one to use, use **Kafka v2**. -6. Select a **Compression** type for the data in this changefeed. -7. Enable the **TLS Encryption** option if your Kafka has enabled TLS encryption and you want to use TLS encryption for the Kafka connection. -8. Click **Next** to test the network connection. If the test succeeds, you will be directed to the next page. -9. TiDB Cloud creates the endpoint for **Private Link**, which might take several minutes. -10. Once the endpoint is created, log in to your cloud provider console and accept the connection request. -11. Return to the [TiDB Cloud console](https://tidbcloud.com) to confirm that you have accepted the connection request. TiDB Cloud will test the connection and proceed to the next page if the test succeeds. - -
- -
- -## Step 3. Set the changefeed - -1. Customize **Table Filter** to filter the tables that you want to replicate. For the rule syntax, refer to [table filter rules](/table-filter.md). - - - **Case Sensitive**: you can set whether the matching of database and table names in filter rules is case-sensitive. By default, matching is case-insensitive. - - **Filter Rules**: you can set filter rules in this column. By default, there is a rule `*.*`, which stands for replicating all tables. When you add a new rule, TiDB Cloud queries all the tables in TiDB and displays only the tables that match the rules in the box on the right. You can add up to 100 filter rules. - - **Tables with valid keys**: this column displays the tables that have valid keys, including primary keys or unique indexes. - - **Tables without valid keys**: this column shows tables that lack primary keys or unique keys. These tables present a challenge during replication because the absence of a unique identifier can result in inconsistent data when the downstream handles duplicate events. To ensure data consistency, it is recommended to add unique keys or primary keys to these tables before initiating the replication. Alternatively, you can add filter rules to exclude these tables. For example, you can exclude the table `test.tbl1` by using the rule `"!test.tbl1"`. - -2. Customize **Event Filter** to filter the events that you want to replicate. - - - **Tables matching**: you can set which tables the event filter will be applied to in this column. The rule syntax is the same as that used for the preceding **Table Filter** area. You can add up to 10 event filter rules per changefeed. - - **Event Filter**: you can use the following event filters to exclude specific events from the changefeed: - - **Ignore event**: excludes specified event types. - - **Ignore SQL**: excludes DDL events that match specified expressions. For example, `^drop` excludes statements starting with `DROP`, and `add column` excludes statements containing `ADD COLUMN`. - - **Ignore insert value expression**: excludes `INSERT` statements that meet specific conditions. For example, `id >= 100` excludes `INSERT` statements where `id` is greater than or equal to 100. - - **Ignore update new value expression**: excludes `UPDATE` statements where the new value matches a specified condition. For example, `gender = 'male'` excludes updates that result in `gender` being `male`. - - **Ignore update old value expression**: excludes `UPDATE` statements where the old value matches a specified condition. For example, `age < 18` excludes updates where the old value of `age` is less than 18. - - **Ignore delete value expression**: excludes `DELETE` statements that meet a specified condition. For example, `name = 'john'` excludes `DELETE` statements where `name` is `'john'`. - -3. Customize **Column Selector** to select columns from events and send only the data changes related to those columns to the downstream. - - - **Tables matching**: specify which tables the column selector applies to. For tables that do not match any rule, all columns are sent. - - **Column Selector**: specify which columns of the matched tables will be sent to the downstream. - - For more information about the matching rules, see [Column selectors](https://docs.pingcap.com/tidb/stable/ticdc-sink-to-kafka/#column-selectors). - -4. In the **Data Format** area, select your desired format of Kafka messages. 
- - - Avro is a compact, fast, and binary data format with rich data structures, which is widely used in various flow systems. For more information, see [Avro data format](https://docs.pingcap.com/tidb/stable/ticdc-avro-protocol). - - Canal-JSON is a plain JSON text format, which is easy to parse. For more information, see [Canal-JSON data format](https://docs.pingcap.com/tidb/stable/ticdc-canal-json). - - Open Protocol is a row-level data change notification protocol that provides data sources for monitoring, caching, full-text indexing, analysis engines, and primary-secondary replication between different databases. For more information, see [Open Protocol data format](https://docs.pingcap.com/tidb/stable/ticdc-open-protocol). - - Debezium is a tool for capturing database changes. It converts each captured database change into a message called an "event" and sends these events to Kafka. For more information, see [Debezium data format](https://docs.pingcap.com/tidb/stable/ticdc-debezium). - -5. Enable the **TiDB Extension** option if you want to add TiDB-extension fields to the Kafka message body. - - For more information about TiDB-extension fields, see [TiDB extension fields in Avro data format](https://docs.pingcap.com/tidb/stable/ticdc-avro-protocol#tidb-extension-fields) and [TiDB extension fields in Canal-JSON data format](https://docs.pingcap.com/tidb/stable/ticdc-canal-json#tidb-extension-field). - -6. If you select **Avro** as your data format, you will see some Avro-specific configurations on the page. You can fill in these configurations as follows: - - - In the **Decimal** and **Unsigned BigInt** configurations, specify how TiDB Cloud handles the decimal and unsigned bigint data types in Kafka messages. - - In the **Schema Registry** area, fill in your schema registry endpoint. If you enable **HTTP Authentication**, the fields for user name and password are displayed and automatically filled in with your TiDB instance endpoint and password. - -7. In the **Topic Distribution** area, select a distribution mode, and then fill in the topic name configurations according to the mode. - - If you select **Avro** as your data format, you can only choose the **Distribute changelogs by table to Kafka Topics** mode in the **Distribution Mode** drop-down list. - - The distribution mode controls how the changefeed creates Kafka topics, by table, by database, or creating one topic for all changelogs. - - - **Distribute changelogs by table to Kafka Topics** - - If you want the changefeed to create a dedicated Kafka topic for each table, choose this mode. Then, all Kafka messages of a table are sent to a dedicated Kafka topic. You can customize topic names for tables by setting a topic prefix, a separator between a database name and table name, and a suffix. For example, if you set the separator as `_`, the topic names are in the format of `_`. - - For changelogs of non-row events, such as Create Schema Event, you can specify a topic name in the **Default Topic Name** field. The changefeed will create a topic accordingly to collect such changelogs. - - - **Distribute changelogs by database to Kafka Topics** - - If you want the changefeed to create a dedicated Kafka topic for each database, choose this mode. Then, all Kafka messages of a database are sent to a dedicated Kafka topic. You can customize topic names of databases by setting a topic prefix and a suffix. - - For changelogs of non-row events, such as Resolved Ts Event, you can specify a topic name in the **Default Topic Name** field. 
The changefeed will create a topic accordingly to collect such changelogs. - - - **Send all changelogs to one specified Kafka Topic** - - If you want the changefeed to create one Kafka topic for all changelogs, choose this mode. Then, all Kafka messages in the changefeed will be sent to one Kafka topic. You can define the topic name in the **Topic Name** field. - -8. In the **Partition Distribution** area, you can decide which partition a Kafka message will be sent to. You can define **a single partition dispatcher for all tables**, or **different partition dispatchers for different tables**. TiDB Cloud provides four types of dispatchers: - - - **Distribute changelogs by primary key or index value to Kafka partition** - - If you want the changefeed to send Kafka messages of a table to different partitions, choose this distribution method. The primary key or index value of a row changelog will determine which partition the changelog is sent to. This distribution method provides a better partition balance and ensures row-level orderliness. - - - **Distribute changelogs by table to Kafka partition** - - If you want the changefeed to send Kafka messages of a table to one Kafka partition, choose this distribution method. The table name of a row changelog will determine which partition the changelog is sent to. This distribution method ensures table orderliness but might cause unbalanced partitions. - - - **Distribute changelogs by timestamp to Kafka partition** - - If you want the changefeed to send Kafka messages to different Kafka partitions randomly, choose this distribution method. The commitTs of a row changelog will determine which partition the changelog is sent to. This distribution method provides a better partition balance and ensures orderliness in each partition. However, multiple changes of a data item might be sent to different partitions and the consumer progress of different consumers might be different, which might cause data inconsistency. Therefore, the consumer needs to sort the data from multiple partitions by commitTs before consuming. - - - **Distribute changelogs by column value to Kafka partition** - - If you want the changefeed to send Kafka messages of a table to different partitions, choose this distribution method. The specified column values of a row changelog will determine which partition the changelog is sent to. This distribution method ensures orderliness in each partition and guarantees that the changelog with the same column values is send to the same partition. - -9. In the **Topic Configuration** area, configure the following numbers. The changefeed will automatically create the Kafka topics according to the numbers. - - - **Replication Factor**: controls how many Kafka servers each Kafka message is replicated to. The valid value ranges from [`min.insync.replicas`](https://kafka.apache.org/33/documentation.html#brokerconfigs_min.insync.replicas) to the number of Kafka brokers. - - **Partition Number**: controls how many partitions exist in a topic. The valid value range is `[1, 10 * the number of Kafka brokers]`. - -10. In the **Split Event** area, choose whether to split `UPDATE` events into separate `DELETE` and `INSERT` events or keep as raw `UPDATE` events. For more information, see [Split primary or unique key UPDATE events for non-MySQL sinks](https://docs.pingcap.com/tidb/stable/ticdc-split-update-behavior/#split-primary-or-unique-key-update-events-for-non-mysql-sinks). - -11. Click **Next**. - -## Step 4. 
Configure your changefeed specification - -1. In the **Changefeed Specification** area, specify the number of Changefeed Capacity Units (CCUs) to be used by the changefeed. -2. In the **Changefeed Name** area, specify a name for the changefeed. -3. Click **Next** to check the configurations you set and go to the next page. - -## Step 5. Review the configurations - -On this page, you can review all the changefeed configurations that you set. - -If you find any error, you can go back to fix the error. If there is no error, you can click the check box at the bottom, and then click **Create** to create the changefeed. From 9a724ae7c33196aa6340df86788fc3c046434ed9 Mon Sep 17 00:00:00 2001 From: qiancai Date: Tue, 28 Oct 2025 10:46:33 +0800 Subject: [PATCH 15/30] split Changefeed Billing into two docs --- tidb-cloud/tidb-cloud-billing-ticdc-ccu.md | 43 ++++++++++++++++++ tidb-cloud/tidb-cloud-billing-ticdc-rcu.md | 51 ++-------------------- 2 files changed, 46 insertions(+), 48 deletions(-) create mode 100644 tidb-cloud/tidb-cloud-billing-ticdc-ccu.md diff --git a/tidb-cloud/tidb-cloud-billing-ticdc-ccu.md b/tidb-cloud/tidb-cloud-billing-ticdc-ccu.md new file mode 100644 index 0000000000000..a0f6cfc13dbb5 --- /dev/null +++ b/tidb-cloud/tidb-cloud-billing-ticdc-ccu.md @@ -0,0 +1,43 @@ +--- +title: Changefeed Billing +summary: Learn about billing for changefeeds in TiDB Cloud. +--- + +# Changefeed Billing for TiDB Cloud Premium + +TiDB Cloud Premium measures the capacity of [changefeeds](/tidb-cloud/changefeed-overview-premium.md) in TiCDC Changefeed Capacity Units (CCUs). When you [create a changefeed](/tidb-cloud/changefeed-overview-premium.md#create-a-changefeed) for an instance, you can select an appropriate specification. The higher the CCU, the better the replication performance. You will be charged for these TiCDC CCUs. + +## Number of TiCDC CCUs + +The following table lists the specifications and corresponding replication performances for changefeeds: + +| Specification | Maximum replication performance | +|---------------|---------------------------------| +| 2 CCUs | 5,000 rows/s | +| 4 CCUs | 10,000 rows/s | +| 8 CCUs | 20,000 rows/s | +| 16 CCUs | 40,000 rows/s | +| 24 CCUs | 60,000 rows/s | +| 32 CCUs | 80,000 rows/s | +| 40 CCUs | 100,000 rows/s | +| 64 CCUs | 160,000 rows/s | +| 96 CCUs | 240,000 rows/s | +| 128 CCUs | 320,000 rows/s | +| 192 CCUs | 480,000 rows/s | +| 256 CCUs | 640,000 rows/s | +| 320 CCUs | 800,000 rows/s | +| 384 CCUs | 960,000 rows/s | + +> **Note:** +> +> The preceding performance data is for reference only and might vary in different scenarios. It is strongly recommended that you conduct a real workload test before using the changefeed feature in a production environment. For further assistance, contact [TiDB Cloud support](/tidb-cloud/tidb-cloud-support.md). + +## Price + +As Premium is currently in private preview, you can [contact our sales](https://www.pingcap.com/contact-us/) for pricing details. + +## Private Data Link cost + +If you choose the **Private Link** or **Private Service Connect** network connectivity method, additional **Private Data Link** costs will be incurred. These charges fall under the [Data Transfer Cost](https://www.pingcap.com/tidb-dedicated-pricing-details/#data-transfer-cost) category. 
+ +The price of **Private Data Link** is **$0.01/GiB**, the same as **Data Processed** of [AWS Interface Endpoint pricing](https://aws.amazon.com/privatelink/pricing/#Interface_Endpoint_pricing), **Consumer data processing** of [Google Cloud Private Service Connect pricing](https://cloud.google.com/vpc/pricing#psc-forwarding-rules), and **Inbound/Outbound Data Processed** of [Azure Private Link pricing](https://azure.microsoft.com/en-us/pricing/details/private-link/). diff --git a/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md b/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md index c4110563fa798..426ece00fb87e 100644 --- a/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md +++ b/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md @@ -4,17 +4,11 @@ summary: Learn about billing for changefeeds in TiDB Cloud. aliases: ['/tidbcloud/tidb-cloud-billing-tcu'] --- -# Changefeed Billing - - - -## RCU cost for TiDB Cloud Dedicated - +# Changefeed Billing for TiDB Cloud Dedicated TiDB Cloud Dedicated measures the capacity of [changefeeds](/tidb-cloud/changefeed-overview.md) in TiCDC Replication Capacity Units (RCUs). When you [create a changefeed](/tidb-cloud/changefeed-overview.md#create-a-changefeed) for a cluster, you can select an appropriate specification. The higher the RCU, the better the replication performance. You will be charged for these TiCDC changefeed RCUs. - -### Number of TiCDC RCUs +## Number of TiCDC RCUs The following table lists the specifications and corresponding replication performances for changefeeds: @@ -39,49 +33,10 @@ The following table lists the specifications and corresponding replication perfo > > The preceding performance data is for reference only and might vary in different scenarios. It is strongly recommended that you conduct a real workload test before using the changefeed feature in a production environment. For further assistance, contact [TiDB Cloud support](/tidb-cloud/tidb-cloud-support.md). -### Price +## Price To learn about the supported regions and the price of TiDB Cloud for each TiCDC RCU, see [Changefeed Cost](https://www.pingcap.com/tidb-dedicated-pricing-details/#changefeed-cost). - - - -## CCU cost for TiDB Cloud Premium - -TiDB Cloud Premium measures the capacity of [changefeeds](/tidb-cloud/changefeed-overview-premium.md) in TiCDC Changefeed Capacity Units (CCUs). When you [create a changefeed](/tidb-cloud/changefeed-overview-premium.md#create-a-changefeed) for an instance, you can select an appropriate specification. The higher the CCU, the better the replication performance. You will be charged for these TiCDC CCUs. - - -### Number of TiCDC CCUs - -The following table lists the specifications and corresponding replication performances for changefeeds: - -| Specification | Maximum replication performance | -|---------------|---------------------------------| -| 2 CCUs | 5,000 rows/s | -| 4 CCUs | 10,000 rows/s | -| 8 CCUs | 20,000 rows/s | -| 16 CCUs | 40,000 rows/s | -| 24 CCUs | 60,000 rows/s | -| 32 CCUs | 80,000 rows/s | -| 40 CCUs | 100,000 rows/s | -| 64 CCUs | 160,000 rows/s | -| 96 CCUs | 240,000 rows/s | -| 128 CCUs | 320,000 rows/s | -| 192 CCUs | 480,000 rows/s | -| 256 CCUs | 640,000 rows/s | -| 320 CCUs | 800,000 rows/s | -| 384 CCUs | 960,000 rows/s | - -> **Note:** -> -> The preceding performance data is for reference only and might vary in different scenarios. It is strongly recommended that you conduct a real workload test before using the changefeed feature in a production environment. 
For further assistance, contact [TiDB Cloud support](/tidb-cloud/tidb-cloud-support.md). - -### Price - -As Premium is currently in private preview, you can [contact our sales](https://www.pingcap.com/contact-us/) for pricing details. - - - ## Private Data Link cost If you choose the **Private Link** or **Private Service Connect** network connectivity method, additional **Private Data Link** costs will be incurred. These charges fall under the [Data Transfer Cost](https://www.pingcap.com/tidb-dedicated-pricing-details/#data-transfer-cost) category. From 2b2def9ee41bbf2ff9703c8cc75f0f650f5799c6 Mon Sep 17 00:00:00 2001 From: qiancai Date: Tue, 28 Oct 2025 10:49:07 +0800 Subject: [PATCH 16/30] add the cost sections --- tidb-cloud/set-up-sink-private-endpoint-premium.md | 2 +- tidb-cloud/tidb-cloud-billing-ticdc-ccu.md | 12 ++++++++---- tidb-cloud/tidb-cloud-billing-ticdc-rcu.md | 8 ++++++-- 3 files changed, 15 insertions(+), 7 deletions(-) diff --git a/tidb-cloud/set-up-sink-private-endpoint-premium.md b/tidb-cloud/set-up-sink-private-endpoint-premium.md index 6f30e04ee0435..b04709368560d 100644 --- a/tidb-cloud/set-up-sink-private-endpoint-premium.md +++ b/tidb-cloud/set-up-sink-private-endpoint-premium.md @@ -34,7 +34,7 @@ If your changefeed downstream service is hosted on AWS, collect the following in - The name of the Private Endpoint Service for your downstream service - The availability zones (AZs) where your downstream service is deployed -If the Private Endpoint Service is not available for your downstream service, follow [Step 2. Expose the Kafka instance as Private Link Service](/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service-premium.md#step-2-expose-the-kafka-instance-as-private-link-service) to set up the load balancer and the Private Link Service. +If the Private Endpoint Service is not available for your downstream service, follow [Step 2. Expose the Kafka instance as Private Link Service](/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md#step-2-expose-the-kafka-instance-as-private-link-service) to set up the load balancer and the Private Link Service. diff --git a/tidb-cloud/tidb-cloud-billing-ticdc-ccu.md b/tidb-cloud/tidb-cloud-billing-ticdc-ccu.md index a0f6cfc13dbb5..f23d49bb27172 100644 --- a/tidb-cloud/tidb-cloud-billing-ticdc-ccu.md +++ b/tidb-cloud/tidb-cloud-billing-ticdc-ccu.md @@ -5,9 +5,13 @@ summary: Learn about billing for changefeeds in TiDB Cloud. # Changefeed Billing for TiDB Cloud Premium -TiDB Cloud Premium measures the capacity of [changefeeds](/tidb-cloud/changefeed-overview-premium.md) in TiCDC Changefeed Capacity Units (CCUs). When you [create a changefeed](/tidb-cloud/changefeed-overview-premium.md#create-a-changefeed) for an instance, you can select an appropriate specification. The higher the CCU, the better the replication performance. You will be charged for these TiCDC CCUs. +This document describes the billing details for changefeeds in TiDB Cloud Premium. -## Number of TiCDC CCUs +## CCU cost + +TiDB Cloud Premium measures the capacity of [changefeeds](/tidb-cloud/changefeed-overview.md) in TiCDC Changefeed Capacity Units (CCUs). When you [create a changefeed](/tidb-cloud/changefeed-overview.md#create-a-changefeed) for an instance, you can select an appropriate specification. The higher the CCU, the better the replication performance. You will be charged for these TiCDC CCUs. 
+ +### Number of TiCDC CCUs The following table lists the specifications and corresponding replication performances for changefeeds: @@ -32,9 +36,9 @@ The following table lists the specifications and corresponding replication perfo > > The preceding performance data is for reference only and might vary in different scenarios. It is strongly recommended that you conduct a real workload test before using the changefeed feature in a production environment. For further assistance, contact [TiDB Cloud support](/tidb-cloud/tidb-cloud-support.md). -## Price +### Price -As Premium is currently in private preview, you can [contact our sales](https://www.pingcap.com/contact-us/) for pricing details. +Currently, TiDB Cloud Premium is in private preview. You can [contact our sales](https://www.pingcap.com/contact-us/) for pricing details. ## Private Data Link cost diff --git a/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md b/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md index 426ece00fb87e..4179c276a17c5 100644 --- a/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md +++ b/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md @@ -6,9 +6,13 @@ aliases: ['/tidbcloud/tidb-cloud-billing-tcu'] # Changefeed Billing for TiDB Cloud Dedicated +This document describes the billing details for changefeeds in TiDB Cloud Dedicated. + +## RCU cost + TiDB Cloud Dedicated measures the capacity of [changefeeds](/tidb-cloud/changefeed-overview.md) in TiCDC Replication Capacity Units (RCUs). When you [create a changefeed](/tidb-cloud/changefeed-overview.md#create-a-changefeed) for a cluster, you can select an appropriate specification. The higher the RCU, the better the replication performance. You will be charged for these TiCDC changefeed RCUs. -## Number of TiCDC RCUs +### Number of TiCDC RCUs The following table lists the specifications and corresponding replication performances for changefeeds: @@ -33,7 +37,7 @@ The following table lists the specifications and corresponding replication perfo > > The preceding performance data is for reference only and might vary in different scenarios. It is strongly recommended that you conduct a real workload test before using the changefeed feature in a production environment. For further assistance, contact [TiDB Cloud support](/tidb-cloud/tidb-cloud-support.md). -## Price +### Price To learn about the supported regions and the price of TiDB Cloud for each TiCDC RCU, see [Changefeed Cost](https://www.pingcap.com/tidb-dedicated-pricing-details/#changefeed-cost). From 7a10cec98bc778f17ef57a0c3bc0e98134b19676 Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Tue, 28 Oct 2025 10:55:39 +0800 Subject: [PATCH 17/30] Update tidb-cloud/changefeed-overview.md --- tidb-cloud/changefeed-overview.md | 4 ---- 1 file changed, 4 deletions(-) diff --git a/tidb-cloud/changefeed-overview.md b/tidb-cloud/changefeed-overview.md index 9912b268aa493..0ea3696dcca1c 100644 --- a/tidb-cloud/changefeed-overview.md +++ b/tidb-cloud/changefeed-overview.md @@ -42,10 +42,6 @@ To create a changefeed, refer to the tutorials: For TiDB Cloud Dedicated, you can query the TiCDC Replication Capacity Units (RCUs) of a changefeed. - - -For TiDB Cloud Dedicated, you can query the TiCDC Replication Capacity Units (RCUs) of a changefeed. - 1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB cluster. 2. Locate the corresponding changefeed you want to query, and click **...** > **View** in the **Action** column. 3. 
You can see the current TiCDC Replication Capacity Units (RCUs) in the **Specification** area of the page. From ffa7d69837c8d84eea6a05dfa205738f4594eb23 Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Tue, 28 Oct 2025 10:59:44 +0800 Subject: [PATCH 18/30] Apply suggestions from code review --- tidb-cloud/changefeed-overview.md | 1 + tidb-cloud/tidb-cloud-billing-ticdc-ccu.md | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/tidb-cloud/changefeed-overview.md b/tidb-cloud/changefeed-overview.md index 0ea3696dcca1c..7a01e03166d7b 100644 --- a/tidb-cloud/changefeed-overview.md +++ b/tidb-cloud/changefeed-overview.md @@ -54,6 +54,7 @@ For TiDB Cloud Premium, you can query the TiCDC Changefeed Capacity Units (CCUs) 1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB instance. 2. Locate the corresponding changefeed you want to query, and click **...** > **View** in the **Action** column. 3. You can see the current TiCDC Changefeed Capacity Units (CCUs) in the **Specification** area of the page. + ## Scale a changefeed diff --git a/tidb-cloud/tidb-cloud-billing-ticdc-ccu.md b/tidb-cloud/tidb-cloud-billing-ticdc-ccu.md index f23d49bb27172..39fb707e038ea 100644 --- a/tidb-cloud/tidb-cloud-billing-ticdc-ccu.md +++ b/tidb-cloud/tidb-cloud-billing-ticdc-ccu.md @@ -1,6 +1,6 @@ --- -title: Changefeed Billing -summary: Learn about billing for changefeeds in TiDB Cloud. +title: Changefeed Billing for TiDB Cloud Premium +summary: Learn about billing for changefeeds in TiDB Cloud Premium. --- # Changefeed Billing for TiDB Cloud Premium From 02179a5448403b34eec2af35d89036fe182af87b Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Tue, 28 Oct 2025 11:00:12 +0800 Subject: [PATCH 19/30] Update tidb-cloud-billing-ticdc-rcu.md --- tidb-cloud/tidb-cloud-billing-ticdc-rcu.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md b/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md index 4179c276a17c5..01a4a40c4b8ea 100644 --- a/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md +++ b/tidb-cloud/tidb-cloud-billing-ticdc-rcu.md @@ -1,5 +1,5 @@ --- -title: Changefeed Billing +title: Changefeed Billing for TiDB Cloud Dedicated summary: Learn about billing for changefeeds in TiDB Cloud. aliases: ['/tidbcloud/tidb-cloud-billing-tcu'] --- From 69444bb78f553164934164367beddbdb597ab2cb Mon Sep 17 00:00:00 2001 From: Grace Cai Date: Tue, 28 Oct 2025 11:06:51 +0800 Subject: [PATCH 20/30] Update tidb-cloud/changefeed-sink-to-apache-kafka.md --- tidb-cloud/changefeed-sink-to-apache-kafka.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tidb-cloud/changefeed-sink-to-apache-kafka.md b/tidb-cloud/changefeed-sink-to-apache-kafka.md index ad2d5c2ac2f65..997cd63796085 100644 --- a/tidb-cloud/changefeed-sink-to-apache-kafka.md +++ b/tidb-cloud/changefeed-sink-to-apache-kafka.md @@ -140,7 +140,7 @@ Currently, the VPC Peering feature for TiDB Cloud Premium instances is only avai To allow TiDB Cloud changefeeds to stream data to Apache Kafka and create Kafka topics automatically, ensure that the following permissions are added in Kafka: - The `Create` and `Write` permissions are added for the topic resource type in Kafka. -- The `DescribeConfigs` permission is added for the clusterinstance resource type in Kafka. +- The `DescribeConfigs` permission is added for the cluster resource type in Kafka. 
For example, if your Kafka cluster is in Confluent Cloud, you can see [Resources](https://docs.confluent.io/platform/current/kafka/authorization.html#resources) and [Adding ACLs](https://docs.confluent.io/platform/current/kafka/authorization.html#adding-acls) in Confluent documentation for more information. From 88c3d6ddacf1e7c13b61c76699e4cdd930f53832 Mon Sep 17 00:00:00 2001 From: qiancai Date: Tue, 28 Oct 2025 13:48:42 +0800 Subject: [PATCH 21/30] Update changefeed-sink-to-apache-kafka.md --- tidb-cloud/changefeed-sink-to-apache-kafka.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tidb-cloud/changefeed-sink-to-apache-kafka.md b/tidb-cloud/changefeed-sink-to-apache-kafka.md index 997cd63796085..270560f0bd4ca 100644 --- a/tidb-cloud/changefeed-sink-to-apache-kafka.md +++ b/tidb-cloud/changefeed-sink-to-apache-kafka.md @@ -133,7 +133,7 @@ Currently, the VPC Peering feature for TiDB Cloud Premium instances is only avai - +
### Kafka ACL authorization From 4f7bd56ccfa0fbcd9bd6c2c34dd409786112521b Mon Sep 17 00:00:00 2001 From: qiancai Date: Tue, 28 Oct 2025 14:36:54 +0800 Subject: [PATCH 22/30] Update changefeed-overview.md --- tidb-cloud/changefeed-overview.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tidb-cloud/changefeed-overview.md b/tidb-cloud/changefeed-overview.md index 7a01e03166d7b..7ba1efecbf1e2 100644 --- a/tidb-cloud/changefeed-overview.md +++ b/tidb-cloud/changefeed-overview.md @@ -17,7 +17,7 @@ TiDB Cloud changefeed helps you stream data from TiDB Cloud to other data servic To access the changefeed feature, take the following steps: -1. In the [TiDB Cloud console](https://tidbcloud.com), navigate to the [**Clusters**](https://tidbcloud.com/project/clusters) page of your project.In the [TiDB Cloud console](https://tidbcloud.com), navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page. +1. In the [TiDB Cloud console](https://tidbcloud.com), navigate to the [**Clusters**](https://tidbcloud.com/project/clusters) page of your project.navigate to the [**TiDB Instances**](https://tidbcloud.com/tidbs) page. > **Tip:** > From 9a3848fcecc9fc6bd05debd5f0a3e00e0127a1f6 Mon Sep 17 00:00:00 2001 From: qiancai Date: Tue, 28 Oct 2025 15:09:37 +0800 Subject: [PATCH 23/30] replace with TiDB Cloud Premium with {{{ .premium }}} --- tidb-cloud/changefeed-overview.md | 2 +- tidb-cloud/changefeed-sink-to-apache-kafka.md | 2 +- tidb-cloud/changefeed-sink-to-mysql.md | 2 +- tidb-cloud/set-up-sink-private-endpoint-premium.md | 2 +- ...tup-aws-self-hosted-kafka-private-link-service.md | 4 ++-- tidb-cloud/tidb-cloud-billing-ticdc-ccu.md | 12 ++++++------ 6 files changed, 12 insertions(+), 12 deletions(-) diff --git a/tidb-cloud/changefeed-overview.md b/tidb-cloud/changefeed-overview.md index 7ba1efecbf1e2..18bf45a1b7207 100644 --- a/tidb-cloud/changefeed-overview.md +++ b/tidb-cloud/changefeed-overview.md @@ -49,7 +49,7 @@ For TiDB Cloud Dedicated, you can query the TiCDC Replication Capacity Units (RC
-For TiDB Cloud Premium, you can query the TiCDC Changefeed Capacity Units (CCUs) of a changefeed. +For {{{ .premium }}}, you can query the TiCDC Changefeed Capacity Units (CCUs) of a changefeed. 1. Navigate to the [**Changefeed**](#view-the-changefeed-page) page of your target TiDB instance. 2. Locate the corresponding changefeed you want to query, and click **...** > **View** in the **Action** column. diff --git a/tidb-cloud/changefeed-sink-to-apache-kafka.md b/tidb-cloud/changefeed-sink-to-apache-kafka.md index 270560f0bd4ca..6b800cf463f84 100644 --- a/tidb-cloud/changefeed-sink-to-apache-kafka.md +++ b/tidb-cloud/changefeed-sink-to-apache-kafka.md @@ -129,7 +129,7 @@ It is **NOT** recommended to use Public IP in a production environment.
-Currently, the VPC Peering feature for TiDB Cloud Premium instances is only available upon request. To request this feature, click **?** in the lower-right corner of the [TiDB Cloud console](https://tidbcloud.com) and click **Request Support**. Then, fill in "Apply for VPC Peering for TiDB Cloud Premium instance" in the **Description** field and click **Submit**. +Currently, the VPC Peering feature for {{{ .premium }}} instances is only available upon request. To request this feature, click **?** in the lower-right corner of the [TiDB Cloud console](https://tidbcloud.com) and click **Request Support**. Then, fill in "Apply for VPC Peering for {{{ .premium }}} instance" in the **Description** field and click **Submit**.
diff --git a/tidb-cloud/changefeed-sink-to-mysql.md b/tidb-cloud/changefeed-sink-to-mysql.md index 4201ba14701fa..5eccbbdcad5b0 100644 --- a/tidb-cloud/changefeed-sink-to-mysql.md +++ b/tidb-cloud/changefeed-sink-to-mysql.md @@ -86,7 +86,7 @@ Make sure that your TiDB Cloud instance can connect to the MySQL service. > **Note:** > -> Currently, the VPC Peering feature for TiDB Cloud Premium instances is only available upon request. To request this feature, click **?** in the lower-right corner of the [TiDB Cloud console](https://tidbcloud.com) and click **Request Support**. Then, fill in "Apply for VPC Peering for TiDB Cloud Premium instance" in the **Description** field and click **Submit**. +> Currently, the VPC Peering feature for {{{ .premium }}} instances is only available upon request. To request this feature, click **?** in the lower-right corner of the [TiDB Cloud console](https://tidbcloud.com) and click **Request Support**. Then, fill in "Apply for VPC Peering for {{{ .premium }}} instance" in the **Description** field and click **Submit**. Private endpoints leverage **Private Link** or **Private Service Connect** technologies from cloud providers, enabling resources in your VPC to connect to services in other VPCs through private IP addresses, as if those services were hosted directly within your VPC. diff --git a/tidb-cloud/set-up-sink-private-endpoint-premium.md b/tidb-cloud/set-up-sink-private-endpoint-premium.md index b04709368560d..32ae362a526ae 100644 --- a/tidb-cloud/set-up-sink-private-endpoint-premium.md +++ b/tidb-cloud/set-up-sink-private-endpoint-premium.md @@ -5,7 +5,7 @@ summary: Learn how to set up a private endpoint for changefeeds. # Set Up Private Endpoint for Changefeeds -This document describes how to create a private endpoint for changefeeds in your TiDB Cloud Premium instances, enabling you to securely stream data to self-hosted Kafka or MySQL through private connectivity. +This document describes how to create a private endpoint for changefeeds in your {{{ .premium }}} instances, enabling you to securely stream data to self-hosted Kafka or MySQL through private connectivity. ## Prerequisites diff --git a/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md b/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md index 59848945bdfdf..3fa8d2db7b4e6 100644 --- a/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md +++ b/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md @@ -63,9 +63,9 @@ The document provides an example of connecting to a Kafka Private Link service d - Manage endpoint services - Connect to EC2 nodes to configure Kafka nodes -2. [Create a TiDB Cloud Premium instance](/tidb-cloud/create-tidb-cluster-premium.md) if you do not have one. +2. [Create a {{{ .premium }}} instance](/tidb-cloud/create-tidb-cluster-premium.md) if you do not have one. -3. Get the Kafka deployment information from your TiDB Cloud Premium instance. +3. Get the Kafka deployment information from your {{{ .premium }}} instance. 1. In the [TiDB Cloud console](https://tidbcloud.com), navigate to the instance overview page of the TiDB instance, and then click **Data** > **Changefeed** in the left navigation pane. 2. On the overview page, find the region of the TiDB instance. Ensure that your Kafka cluster will be deployed to the same region. 
diff --git a/tidb-cloud/tidb-cloud-billing-ticdc-ccu.md b/tidb-cloud/tidb-cloud-billing-ticdc-ccu.md index 39fb707e038ea..70be0ec5bd7c6 100644 --- a/tidb-cloud/tidb-cloud-billing-ticdc-ccu.md +++ b/tidb-cloud/tidb-cloud-billing-ticdc-ccu.md @@ -1,15 +1,15 @@ --- -title: Changefeed Billing for TiDB Cloud Premium -summary: Learn about billing for changefeeds in TiDB Cloud Premium. +title: Changefeed Billing for {{{ .premium }}} +summary: Learn about billing for changefeeds in {{{ .premium }}}. --- -# Changefeed Billing for TiDB Cloud Premium +# Changefeed Billing for {{{ .premium }}} -This document describes the billing details for changefeeds in TiDB Cloud Premium. +This document describes the billing details for changefeeds in {{{ .premium }}}. ## CCU cost -TiDB Cloud Premium measures the capacity of [changefeeds](/tidb-cloud/changefeed-overview.md) in TiCDC Changefeed Capacity Units (CCUs). When you [create a changefeed](/tidb-cloud/changefeed-overview.md#create-a-changefeed) for an instance, you can select an appropriate specification. The higher the CCU, the better the replication performance. You will be charged for these TiCDC CCUs. +{{{ .premium }}} measures the capacity of [changefeeds](/tidb-cloud/changefeed-overview.md) in TiCDC Changefeed Capacity Units (CCUs). When you [create a changefeed](/tidb-cloud/changefeed-overview.md#create-a-changefeed) for an instance, you can select an appropriate specification. The higher the CCU, the better the replication performance. You will be charged for these TiCDC CCUs. ### Number of TiCDC CCUs @@ -38,7 +38,7 @@ The following table lists the specifications and corresponding replication perfo ### Price -Currently, TiDB Cloud Premium is in private preview. You can [contact our sales](https://www.pingcap.com/contact-us/) for pricing details. +Currently, {{{ .premium }}} is in private preview. You can [contact our sales](https://www.pingcap.com/contact-us/) for pricing details. 
## Private Data Link cost From 333c54b436b7c38226e87e35f464c3f94ca76fb5 Mon Sep 17 00:00:00 2001 From: qiancai Date: Tue, 28 Oct 2025 15:12:45 +0800 Subject: [PATCH 24/30] move the two premium docs to the premium folder --- tidb-cloud/{ => premium}/set-up-sink-private-endpoint-premium.md | 0 tidb-cloud/{ => premium}/tidb-cloud-billing-ticdc-ccu.md | 0 2 files changed, 0 insertions(+), 0 deletions(-) rename tidb-cloud/{ => premium}/set-up-sink-private-endpoint-premium.md (100%) rename tidb-cloud/{ => premium}/tidb-cloud-billing-ticdc-ccu.md (100%) diff --git a/tidb-cloud/set-up-sink-private-endpoint-premium.md b/tidb-cloud/premium/set-up-sink-private-endpoint-premium.md similarity index 100% rename from tidb-cloud/set-up-sink-private-endpoint-premium.md rename to tidb-cloud/premium/set-up-sink-private-endpoint-premium.md diff --git a/tidb-cloud/tidb-cloud-billing-ticdc-ccu.md b/tidb-cloud/premium/tidb-cloud-billing-ticdc-ccu.md similarity index 100% rename from tidb-cloud/tidb-cloud-billing-ticdc-ccu.md rename to tidb-cloud/premium/tidb-cloud-billing-ticdc-ccu.md From 2627833202e0131319b7c2e17c54def87ba42b2d Mon Sep 17 00:00:00 2001 From: qiancai Date: Tue, 28 Oct 2025 15:17:02 +0800 Subject: [PATCH 25/30] Update TOC-tidb-cloud-premium.md --- TOC-tidb-cloud-premium.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/TOC-tidb-cloud-premium.md b/TOC-tidb-cloud-premium.md index b797bf126cbf5..c77f2c52e4451 100644 --- a/TOC-tidb-cloud-premium.md +++ b/TOC-tidb-cloud-premium.md @@ -219,6 +219,13 @@ - [CSV Configurations for Importing Data](/tidb-cloud/csv-config-for-import-data.md) - [Troubleshoot Access Denied Errors during Data Import from Amazon S3](/tidb-cloud/troubleshoot-import-access-denied-error.md) - [Connect AWS DMS to TiDB Cloud clusters](/tidb-cloud/tidb-cloud-connect-aws-dms.md) +- Stream Data + - [Changefeed Overview](/tidb-cloud/changefeed-overview.md) + - [To MySQL Sink](/tidb-cloud/changefeed-sink-to-mysql.md) + - [To Kafka Sink](/tidb-cloud/changefeed-sink-to-apache-kafka.md) + - Reference + - [Set Up Self-Hosted Kafka Private Link Service in AWS](/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md) + - [Set Up Private Endpoint for Changefeeds](/tidb-cloud/set-up-sink-private-endpoint-premium.md) - Security - [Security Overview](/tidb-cloud/security-overview.md) - Identity Access Control @@ -241,6 +248,7 @@ - [Credits](/tidb-cloud/tidb-cloud-billing.md#credits) - [Payment Method Setting](/tidb-cloud/tidb-cloud-billing.md#payment-method) - [Billing from Cloud Provider Marketplace](/tidb-cloud/tidb-cloud-billing.md#billing-from-cloud-provider-marketplace) + - [Billing for Changefeed](/tidb-cloud/tidb-cloud-billing-ticdc-ccu.md) - [Manage Budgets](/tidb-cloud/tidb-cloud-budget.md) - Integrations - [Airbyte](/tidb-cloud/integrate-tidbcloud-with-airbyte.md) From 431e7a776617f0275d0af219cbdd4590675160b1 Mon Sep 17 00:00:00 2001 From: qiancai Date: Tue, 28 Oct 2025 15:40:46 +0800 Subject: [PATCH 26/30] fix two doc links --- TOC-tidb-cloud-premium.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/TOC-tidb-cloud-premium.md b/TOC-tidb-cloud-premium.md index c77f2c52e4451..4b46b6897b5c6 100644 --- a/TOC-tidb-cloud-premium.md +++ b/TOC-tidb-cloud-premium.md @@ -225,7 +225,7 @@ - [To Kafka Sink](/tidb-cloud/changefeed-sink-to-apache-kafka.md) - Reference - [Set Up Self-Hosted Kafka Private Link Service in AWS](/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md) - - [Set Up Private Endpoint for 
Changefeeds](/tidb-cloud/set-up-sink-private-endpoint-premium.md) + - [Set Up Private Endpoint for Changefeeds](/tidb-cloud/premium/set-up-sink-private-endpoint-premium.md) - Security - [Security Overview](/tidb-cloud/security-overview.md) - Identity Access Control @@ -248,7 +248,7 @@ - [Credits](/tidb-cloud/tidb-cloud-billing.md#credits) - [Payment Method Setting](/tidb-cloud/tidb-cloud-billing.md#payment-method) - [Billing from Cloud Provider Marketplace](/tidb-cloud/tidb-cloud-billing.md#billing-from-cloud-provider-marketplace) - - [Billing for Changefeed](/tidb-cloud/tidb-cloud-billing-ticdc-ccu.md) + - [Billing for Changefeed](/tidb-cloud/premium/tidb-cloud-billing-ticdc-ccu.md) - [Manage Budgets](/tidb-cloud/tidb-cloud-budget.md) - Integrations - [Airbyte](/tidb-cloud/integrate-tidbcloud-with-airbyte.md) From 9598c3caf219ec3efedea2aed0c68e88033bb805 Mon Sep 17 00:00:00 2001 From: qiancai Date: Wed, 29 Oct 2025 20:39:39 +0800 Subject: [PATCH 27/30] fix broken links and a role name --- tidb-cloud/changefeed-sink-to-apache-kafka.md | 2 +- tidb-cloud/changefeed-sink-to-mysql.md | 2 +- tidb-cloud/premium/set-up-sink-private-endpoint-premium.md | 3 +-- tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md | 2 +- 4 files changed, 4 insertions(+), 5 deletions(-) diff --git a/tidb-cloud/changefeed-sink-to-apache-kafka.md b/tidb-cloud/changefeed-sink-to-apache-kafka.md index 6b800cf463f84..9bb1195bb029a 100644 --- a/tidb-cloud/changefeed-sink-to-apache-kafka.md +++ b/tidb-cloud/changefeed-sink-to-apache-kafka.md @@ -116,7 +116,7 @@ Private Connect leverages **Private Link** or **Private Service Connect** techno TiDB Cloud currently supports Private Connect only for self-hosted Kafka. It does not support direct integration with MSK, Confluent Kafka, or other Kafka SaaS services. To connect to these Kafka SaaS services via Private Connect, you can deploy a [kafka-proxy](https://github.com/grepplabs/kafka-proxy) as an intermediary, effectively exposing the Kafka service as self-hosted Kafka. -If your Apache Kafka service is hosted on AWS, follow [Set Up Self-Hosted Kafka Private Link Service in AWS](/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md) to configure the network connection and obtain the **Bootstrap Ports** information, and then follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/set-up-sink-private-endpoint-premium.md) to create a private endpoint. +If your Apache Kafka service is hosted on AWS, follow [Set Up Self-Hosted Kafka Private Link Service in AWS](/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md) to configure the network connection and obtain the **Bootstrap Ports** information, and then follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/premium/set-up-sink-private-endpoint-premium.md) to create a private endpoint.
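The hunk above routes readers to the **Bootstrap Ports** collected while exposing the Kafka cluster over Private Link. As a quick illustration (standard-library Python only; the hostname and port list are hypothetical), a reachability check along these lines can confirm that the advertised bootstrap ports answer through the endpoint DNS name before the changefeed is created.

```python
# Illustrative reachability check for Kafka bootstrap ports behind a
# private endpoint. The hostname and ports below are placeholders.
import socket

ENDPOINT_HOST = "vpce-0123456789abcdef0.example.internal"  # hypothetical endpoint DNS name
BOOTSTRAP_PORTS = [9092, 9093, 9094]                        # hypothetical per-broker bootstrap ports

for port in BOOTSTRAP_PORTS:
    try:
        # Open and immediately close a TCP connection to each bootstrap port.
        with socket.create_connection((ENDPOINT_HOST, port), timeout=5):
            print(f"{ENDPOINT_HOST}:{port} reachable")
    except OSError as err:
        print(f"{ENDPOINT_HOST}:{port} NOT reachable: {err}")
```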
diff --git a/tidb-cloud/changefeed-sink-to-mysql.md b/tidb-cloud/changefeed-sink-to-mysql.md index 5eccbbdcad5b0..676bf3d62dafa 100644 --- a/tidb-cloud/changefeed-sink-to-mysql.md +++ b/tidb-cloud/changefeed-sink-to-mysql.md @@ -90,7 +90,7 @@ Make sure that your TiDB Cloud instance can connect to the MySQL service. Private endpoints leverage **Private Link** or **Private Service Connect** technologies from cloud providers, enabling resources in your VPC to connect to services in other VPCs through private IP addresses, as if those services were hosted directly within your VPC. -You can connect your TiDB Cloud instance to your MySQL service securely through a private endpoint. If the private endpoint is not available for your MySQL service, follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/set-up-sink-private-endpoint-premium.md) to create one. +You can connect your TiDB Cloud instance to your MySQL service securely through a private endpoint. If the private endpoint is not available for your MySQL service, follow [Set Up Private Endpoint for Changefeeds](/tidb-cloud/premium/set-up-sink-private-endpoint-premium.md) to create one. diff --git a/tidb-cloud/premium/set-up-sink-private-endpoint-premium.md b/tidb-cloud/premium/set-up-sink-private-endpoint-premium.md index 32ae362a526ae..3e422e8afc008 100644 --- a/tidb-cloud/premium/set-up-sink-private-endpoint-premium.md +++ b/tidb-cloud/premium/set-up-sink-private-endpoint-premium.md @@ -17,8 +17,7 @@ This document describes how to create a private endpoint for changefeeds in your Only users with any of the following roles in your organization can create private endpoints for changefeeds: - `Organization Owner` -- `Instance Admin` for the corresponding instance - +- `Instance Administrator` for the corresponding instance For more information about roles in TiDB Cloud, see [User roles](/tidb-cloud/manage-user-access.md#user-roles). diff --git a/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md b/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md index 3fa8d2db7b4e6..ca6c2a320acba 100644 --- a/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md +++ b/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md @@ -63,7 +63,7 @@ The document provides an example of connecting to a Kafka Private Link service d - Manage endpoint services - Connect to EC2 nodes to configure Kafka nodes -2. [Create a {{{ .premium }}} instance](/tidb-cloud/create-tidb-cluster-premium.md) if you do not have one. +2. [Create a {{{ .premium }}} instance](/tidb-cloud/premium/create-tidb-cluster-premium.md) if you do not have one. 3. Get the Kafka deployment information from your {{{ .premium }}} instance. 
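The changefeed-sink-to-mysql.md hunk in this patch points readers to the private-endpoint guide before they create a MySQL sink. A rough pre-flight check, assuming PyMySQL is installed and using placeholder host and credentials, might look like the following; it only verifies that the endpoint resolves and that the sink user can authenticate.

```python
# Illustrative downstream check before creating a MySQL sink changefeed.
# Requires PyMySQL (`pip install pymysql`); host, user, and password are placeholders.
import pymysql

conn = pymysql.connect(
    host="vpce-mysql.example.internal",  # hypothetical private endpoint DNS name
    port=3306,
    user="ticdc_sink",                   # hypothetical replication user
    password="change-me",
    connect_timeout=5,
)
try:
    with conn.cursor() as cur:
        # A trivial query proves the connection and credentials work end to end.
        cur.execute("SELECT VERSION()")
        print("Connected to MySQL", cur.fetchone()[0])
finally:
    conn.close()
```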
From d15565d9a44892f165451464ab8b9098047452fc Mon Sep 17 00:00:00 2001 From: qiancai Date: Wed, 29 Oct 2025 20:43:10 +0800 Subject: [PATCH 28/30] fix a wrong link --- tidb-cloud/premium/set-up-sink-private-endpoint-premium.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tidb-cloud/premium/set-up-sink-private-endpoint-premium.md b/tidb-cloud/premium/set-up-sink-private-endpoint-premium.md index 3e422e8afc008..4487a6504407d 100644 --- a/tidb-cloud/premium/set-up-sink-private-endpoint-premium.md +++ b/tidb-cloud/premium/set-up-sink-private-endpoint-premium.md @@ -19,7 +19,7 @@ Only users with any of the following roles in your organization can create priva - `Organization Owner` - `Instance Administrator` for the corresponding instance -For more information about roles in TiDB Cloud, see [User roles](/tidb-cloud/manage-user-access.md#user-roles). +For more information about roles in TiDB Cloud, see [User roles](/tidb-cloud/premium/manage-user-access-premium.md#user-roles). ### Network From fa1973fa13c53fd4bd9fbbf3f5492a7a9faf6493 Mon Sep 17 00:00:00 2001 From: qiancai Date: Wed, 29 Oct 2025 20:48:38 +0800 Subject: [PATCH 29/30] Update setup-aws-self-hosted-kafka-private-link-service.md --- tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md b/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md index ca6c2a320acba..9221c6c4036fb 100644 --- a/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md +++ b/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md @@ -63,7 +63,7 @@ The document provides an example of connecting to a Kafka Private Link service d - Manage endpoint services - Connect to EC2 nodes to configure Kafka nodes -2. [Create a {{{ .premium }}} instance](/tidb-cloud/premium/create-tidb-cluster-premium.md) if you do not have one. +2. [Create a {{{ .premium }}} instance](/tidb-cloud/premium/create-tidb-instance-premium.md) if you do not have one. 3. Get the Kafka deployment information from your {{{ .premium }}} instance. From 39b4242f6331d0494bfc830d7c6c168c08fe0b26 Mon Sep 17 00:00:00 2001 From: qiancai Date: Wed, 29 Oct 2025 20:53:14 +0800 Subject: [PATCH 30/30] fix a broken link --- tidb-cloud/premium/set-up-sink-private-endpoint-premium.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tidb-cloud/premium/set-up-sink-private-endpoint-premium.md b/tidb-cloud/premium/set-up-sink-private-endpoint-premium.md index 4487a6504407d..3c9a6ecaab6a5 100644 --- a/tidb-cloud/premium/set-up-sink-private-endpoint-premium.md +++ b/tidb-cloud/premium/set-up-sink-private-endpoint-premium.md @@ -33,7 +33,7 @@ If your changefeed downstream service is hosted on AWS, collect the following in - The name of the Private Endpoint Service for your downstream service - The availability zones (AZs) where your downstream service is deployed -If the Private Endpoint Service is not available for your downstream service, follow [Step 2. Expose the Kafka instance as Private Link Service](/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md#step-2-expose-the-kafka-instance-as-private-link-service) to set up the load balancer and the Private Link Service. +If the Private Endpoint Service is not available for your downstream service, follow [Step 2. 
Expose the Kafka cluster as Private Link Service](/tidb-cloud/setup-aws-self-hosted-kafka-private-link-service.md#step-2-expose-the-kafka-cluster-as-private-link-service) to set up the load balancer and the Private Link Service.
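The corrected anchor above targets "Step 2. Expose the Kafka cluster as Private Link Service". As a non-authoritative sketch of that step (boto3, with a placeholder NLB ARN and allowed principal), registering an internal Network Load Balancer as an endpoint service looks roughly like this:

```python
# Illustrative only: register an existing internal Network Load Balancer
# as a Private Link endpoint service and allow one consumer principal.
# The ARNs and account ID are hypothetical placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

svc = ec2.create_vpc_endpoint_service_configuration(
    NetworkLoadBalancerArns=[
        "arn:aws:elasticloadbalancing:us-west-2:123456789012:loadbalancer/net/kafka-nlb/0123456789abcdef"
    ],
    AcceptanceRequired=True,  # each endpoint connection must be approved explicitly
)
service_id = svc["ServiceConfiguration"]["ServiceId"]

# Allow the consumer account to create endpoints against this service.
ec2.modify_vpc_endpoint_service_permissions(
    ServiceId=service_id,
    AddAllowedPrincipals=["arn:aws:iam::123456789012:root"],  # hypothetical principal
)
print("Endpoint service:", svc["ServiceConfiguration"]["ServiceName"])
```

Keeping `AcceptanceRequired=True` means each incoming endpoint connection has to be approved before traffic can flow, which is the safer default when the consumer account is not known in advance.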