Add docs for session/cluster/table cdc filtering for row-level ttl v23.2 & v24.1 (#18496)
kathancox committed Apr 30, 2024
1 parent cf21100 commit df17963
Showing 11 changed files with 108 additions and 13 deletions.
2 changes: 1 addition & 1 deletion src/current/_includes/releases/v24.1/v24.1.0-alpha.1.md
@@ -64,7 +64,7 @@ Release Date: March 7, 2024
- `OUT` and `INOUT` parameter classes are now supported in [user-defined functions]({% link v23.2/user-defined-functions.md %}). [#118610][#118610]
- Out-of-process SQL servers will now start exporting a new `sql.aggregated_livebytes` [metric]({% link v23.2/metrics.md %}). This metric gets updated once every 60 seconds by default, and its update interval can be configured via the `tenant_global_metrics_exporter_interval` [cluster setting]({% link v23.2/cluster-settings.md %}). [#119140][#119140]
- Added support for index hints with [`INSERT`]({% link v23.2/insert.md %}) and [`UPSERT`]({% link v23.2/upsert.md %}) statements. This allows `INSERT ... ON CONFLICT` and `UPSERT` queries to use index hints in the same way they are already supported for [`UPDATE`]({% link v23.2/update.md %}) and [`DELETE`]({% link v23.2/delete.md %}) statements. [#119104][#119104]
- Added a new `ttl_disable_changefeed_replication` table storage parameter that can be used to disable changefeed replication for [row-level TTL]({% link v23.2/row-level-ttl.md %}) on a per-table basis. [#119611][#119611]
- Added a new [`ttl_disable_changefeed_replication`]({% link v24.1/row-level-ttl.md %}#filter-changefeeds-for-tables-using-row-level-ttl) table storage parameter that can be used to disable changefeed replication for [row-level TTL]({% link v23.2/row-level-ttl.md %}) on a per-table basis. [#119611][#119611]

<h3 id="v24-1-0-alpha-1-operational-changes">Operational changes</h3>

20 changes: 10 additions & 10 deletions src/current/_includes/releases/v24.1/v24.1.0-alpha.4.md
@@ -10,15 +10,15 @@ Release Date: March 25, 2024

<h3 id="v24-1-0-alpha-4-general-changes">General changes</h3>

- The following [metrics](../v24.1/metrics.html) were added for observability of per-store disk events:
- `storage.disk.read.count`
- `storage.disk.read.bytes`
- `storage.disk.read.time`
- `storage.disk.write.count`
- `storage.disk.write.bytes`
- `storage.disk.write.time`
- `storage.disk.io.time`
- `storage.disk.weightedio.time`
- The following [metrics](../v24.1/metrics.html) were added for observability of per-store disk events:
- `storage.disk.read.count`
- `storage.disk.read.bytes`
- `storage.disk.read.time`
- `storage.disk.write.count`
- `storage.disk.write.bytes`
- `storage.disk.write.time`
- `storage.disk.io.time`
- `storage.disk.weightedio.time`
- `storage.disk.iopsinprogress`

The metrics match the definitions of the `sys.host.disk.*` system metrics. [#119885][#119885]
@@ -27,7 +27,7 @@ Release Date: March 25, 2024

- `server.controller.default_target_cluster` can now be set to any virtual cluster name by default, including a virtual cluster yet to be created or have service started. [#120080][#120080]
- The [`READ COMMITTED`](../v24.1/read-committed.html) isolation level now requires the cluster to have a valid enterprise license. [#120154][#120154]
- The new boolean changefeed option `ignore_disable_changefeed_replication`, when set to `true`, prevents the changefeed from filtering events even if CDC filtering is configured via the `disable_changefeed_replication` [session variable](../v24.1/session-variables.html), `sql.ttl.changefeed_replication.disabled` [cluster setting](../v24.1/cluster-settings.html), or the `ttl_disable_changefeed_replication` [table storage parameter](../v24.1/alter-table.html#table-storage-parameters). [#120255][#120255]
- The new boolean changefeed option [`ignore_disable_changefeed_replication`](../v24.1/create-changefeed.html#ignore-disable-changefeed), when set to `true`, prevents the changefeed from filtering events even if CDC filtering is configured via the `disable_changefeed_replication` [session variable](../v24.1/session-variables.html), `sql.ttl.changefeed_replication.disabled` [cluster setting](../v24.1/cluster-settings.html), or the `ttl_disable_changefeed_replication` [table storage parameter](../v24.1/alter-table.html#table-storage-parameters). [#120255][#120255]

<h3 id="v24-1-0-alpha-4-sql-language-changes">SQL language changes</h3>

1 change: 1 addition & 0 deletions src/current/_includes/v23.2/cdc/disable-replication-ttl.md
@@ -0,0 +1 @@
{% include_cached new-in.html version="v23.2" %} To prevent changefeeds from emitting deletes issued by all TTL jobs on a cluster, set the `sql.ttl.changefeed_replication.disabled` [cluster setting]({% link {{ page.version.version }}/cluster-settings.md %}) to `true`.
1 change: 1 addition & 0 deletions src/current/_includes/v23.2/misc/session-vars.md
@@ -15,6 +15,7 @@
| <a id="default-transaction-quality-of-service"></a> `default_transaction_quality_of_service` | The default transaction quality of service for the current session. The supported options are `regular`, `critical`, and `background`. See [Set quality of service level]({% link {{ page.version.version }}/admission-control.md %}#set-quality-of-service-level-for-a-session). | `regular` | Yes | Yes |
| <a id="default-transaction-read-only"></a> `default_transaction_read_only` | The default transaction access mode for the current session. <br/>If set to `on`, only read operations are allowed in transactions in the current session; if set to `off`, both read and write operations are allowed. See [`SET TRANSACTION`]({% link {{ page.version.version }}/set-transaction.md %}) for more details. | `off` | Yes | Yes |
| <a id="default-transaction-use-follower-reads"></a> `default_transaction_use_follower_reads` | If set to `on`, all read-only transactions use [`AS OF SYSTEM TIME follower_read_timestamp()`]({% link {{ page.version.version }}/as-of-system-time.md %}) to allow the transaction to use follower reads. <br/>If set to `off`, read-only transactions will only use follower reads if an `AS OF SYSTEM TIME` clause is specified in the statement, with an interval of at least 4.8 seconds. | `off` | Yes | Yes |
| <a id="disable-changefeed-replication"></a> `disable_changefeed_replication` | When `true`, [changefeeds]({% link {{ page.version.version }}/changefeed-messages.md %}#filtering-changefeed-messages) will not emit messages for any changes (e.g., `INSERT`, `UPDATE`) issued to watched tables during that session. | `false` | Yes | Yes |
| <a id="disallow-full-table-scans"></a> `disallow_full_table_scans` | If set to `on`, queries on "large" tables with a row count greater than [`large_full_scan_rows`](#large-full-scan-rows) will not use full table or index scans. If no other query plan is possible, queries will return an error message. This setting does not apply to internal queries, which may plan full table or index scans without checking the session variable. | `off` | Yes | Yes |
| <a id="distsql"></a> `distsql` | The query distribution mode for the session. By default, CockroachDB determines which queries are faster to execute if distributed across multiple nodes, and all other queries are run through the gateway node. | `auto` | Yes | Yes |
| <a id="enable-auto-rehoming"></a> `enable_auto_rehoming` | When enabled, the [home regions]({% link {{ page.version.version }}/alter-table.md %}#crdb_region) of rows in [`REGIONAL BY ROW`]({% link {{ page.version.version }}/alter-table.md %}#set-the-table-locality-to-regional-by-row) tables are automatically set to the region of the [gateway node]({% link {{ page.version.version }}/ui-sessions-page.md %}#session-details-gateway-node) from which any [`UPDATE`]({% link {{ page.version.version }}/update.md %}) or [`UPSERT`]({% link {{ page.version.version }}/upsert.md %}) statements that operate on those rows originate. | `off` | Yes | Yes |
26 changes: 26 additions & 0 deletions src/current/_includes/v24.1/cdc/disable-replication-ttl.md
@@ -0,0 +1,26 @@
{% include_cached new-in.html version="v24.1" %} Use the `ttl_disable_changefeed_replication` table storage parameter to prevent changefeeds from sending `DELETE` messages issued by row-level TTL jobs for a table. Include the storage parameter when you create or alter the table. For example:

{% include_cached copy-clipboard.html %}
~~~ sql
CREATE TABLE tbl (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  value TEXT
) WITH (ttl_expire_after = '3 weeks', ttl_job_cron = '@daily', ttl_disable_changefeed_replication = 'true');
~~~

{% include_cached copy-clipboard.html %}
~~~ sql
ALTER TABLE events SET (ttl_expire_after = '1 year', ttl_disable_changefeed_replication = 'true');
~~~

You can also widen the scope to the cluster by setting the `sql.ttl.changefeed_replication.disabled` [cluster setting]({% link {{ page.version.version }}/cluster-settings.md %}) to `true`. This will prevent changefeeds from emitting deletes issued by all TTL jobs on a cluster.

To have a changefeed ignore the storage parameter or cluster setting that disables changefeed replication, set the changefeed option `ignore_disable_changefeed_replication` to `true`:

{% include_cached copy-clipboard.html %}
~~~ sql
CREATE CHANGEFEED FOR TABLE table_name INTO 'external://changefeed-sink'
WITH resolved, ignore_disable_changefeed_replication = true;
~~~

This is useful when different changefeeds on the same table serve different use cases. For example, one changefeed streams changes to another database for analytics workflows, where you do not want row-level TTL deletes reflected, while a second changefeed on the same table feeds audit logging and must persist every change.
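
The two-changefeed scenario above could look roughly like the following. This is a sketch only: the table `tbl` is borrowed from the earlier `CREATE TABLE` example, and the external connection names `analytics-sink` and `audit-sink` are hypothetical placeholders rather than names from the documentation. It assumes TTL delete filtering is already configured for `tbl` through the storage parameter or the cluster setting.

{% include_cached copy-clipboard.html %}
~~~ sql
-- Analytics changefeed: does not set the override, so TTL deletes are
-- filtered out when filtering is configured for the table or cluster.
-- 'external://analytics-sink' is a hypothetical external connection.
CREATE CHANGEFEED FOR TABLE tbl INTO 'external://analytics-sink'
  WITH resolved;

-- Audit changefeed: ignores the filtering configuration and emits every
-- change, including deletes issued by row-level TTL jobs.
-- 'external://audit-sink' is a hypothetical external connection.
CREATE CHANGEFEED FOR TABLE tbl INTO 'external://audit-sink'
  WITH resolved, ignore_disable_changefeed_replication = true;
~~~

Putting the override on the changefeed rather than on the table means the audit feed stays complete without loosening filtering for the other consumers of the same table.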
1 change: 1 addition & 0 deletions src/current/_includes/v24.1/misc/session-vars.md
@@ -16,6 +16,7 @@
| <a id="default-transaction-quality-of-service"></a> `default_transaction_quality_of_service` | The default transaction quality of service for the current session. The supported options are `regular`, `critical`, and `background`. See [Set quality of service level]({% link {{ page.version.version }}/admission-control.md %}#set-quality-of-service-level-for-a-session). | `regular` | Yes | Yes |
| <a id="default-transaction-read-only"></a> `default_transaction_read_only` | The default transaction access mode for the current session. <br/>If set to `on`, only read operations are allowed in transactions in the current session; if set to `off`, both read and write operations are allowed. See [`SET TRANSACTION`]({% link {{ page.version.version }}/set-transaction.md %}) for more details. | `off` | Yes | Yes |
| <a id="default-transaction-use-follower-reads"></a> `default_transaction_use_follower_reads` | If set to `on`, all read-only transactions use [`AS OF SYSTEM TIME follower_read_timestamp()`]({% link {{ page.version.version }}/as-of-system-time.md %}) to allow the transaction to use follower reads. <br/>If set to `off`, read-only transactions will only use follower reads if an `AS OF SYSTEM TIME` clause is specified in the statement, with an interval of at least 4.8 seconds. | `off` | Yes | Yes |
| <a id="disable-changefeed-replication"></a> `disable_changefeed_replication` | When `true`, [changefeeds]({% link {{ page.version.version }}/changefeed-messages.md %}#filtering-changefeed-messages) will not emit messages for any changes (e.g., `INSERT`, `UPDATE`) issued to watched tables during that session. | `false` | Yes | Yes |
| <a id="disallow-full-table-scans"></a> `disallow_full_table_scans` | If set to `on`, queries on "large" tables with a row count greater than [`large_full_scan_rows`](#large-full-scan-rows) will not use full table or index scans. If no other query plan is possible, queries will return an error message. This setting does not apply to internal queries, which may plan full table or index scans without checking the session variable. | `off` | Yes | Yes |
| <a id="distsql"></a> `distsql` | The query distribution mode for the session. By default, CockroachDB determines which queries are faster to execute if distributed across multiple nodes, and all other queries are run through the gateway node. | `auto` | Yes | Yes |
| <a id="enable-auto-rehoming"></a> `enable_auto_rehoming` | When enabled, the [home regions]({% link {{ page.version.version }}/alter-table.md %}#crdb_region) of rows in [`REGIONAL BY ROW`]({% link {{ page.version.version }}/alter-table.md %}#set-the-table-locality-to-regional-by-row) tables are automatically set to the region of the [gateway node]({% link {{ page.version.version }}/ui-sessions-page.md %}#session-details-gateway-node) from which any [`UPDATE`]({% link {{ page.version.version }}/update.md %}) or [`UPSERT`]({% link {{ page.version.version }}/upsert.md %}) statements that operate on those rows originate. | `off` | Yes | Yes |
| <a id="enable-durable-locking-for-serializable"></a> `enable_durable_locking_for_serializable` | Indicates whether CockroachDB replicates [`FOR UPDATE` and `FOR SHARE`]({% link {{ page.version.version }}/select-for-update.md %}#lock-strengths) locks via [Raft]({% link {{ page.version.version }}/architecture/replication-layer.md %}#raft), allowing locks to be preserved when leases are transferred. Note that replicating `FOR UPDATE` and `FOR SHARE` locks will add latency to those statements. This setting only affects `SERIALIZABLE` transactions and matches the default `READ COMMITTED` behavior when enabled. | `off` | Yes | Yes |
26 changes: 26 additions & 0 deletions src/current/v23.2/changefeed-messages.md
@@ -17,6 +17,7 @@ This page describes the format and behavior of changefeed messages. You will fin
- [Resolved messages](#resolved-messages): The resolved timestamp option and how to configure it.
- [Duplicate messages](#duplicate-messages): The causes of duplicate messages from a changefeed.
- [Schema changes](#schema-changes): The effect of schema changes on a changefeed.
- [Filtering changefeed messages](#filtering-changefeed-messages): The settings and syntax to prevent and filter the messages that changefeeds emit.
- [Message formats](#message-formats): The limitations and type mapping when creating a changefeed with different message formats.

{{site.data.alerts.callout_info}}
@@ -478,6 +479,31 @@ Refer to the [`CREATE CHANGEFEED` option table]({% link {{ page.version.version
{% include {{ page.version.version }}/cdc/virtual-computed-column-cdc.md %}
{{site.data.alerts.end}}

## Filtering changefeed messages

There are several ways to define messages, filter different types of messages, or prevent all changefeed messages from emitting to the sink. The following sections outline configurable settings and SQL syntax to handle different use cases.

### Prevent changefeeds from emitting row-level TTL deletes

{% include_cached new-in.html version="v23.2" %} To prevent changefeeds from emitting deletes issued by all [TTL jobs]({% link {{ page.version.version }}/row-level-ttl.md %}) on a cluster, set the `sql.ttl.changefeed_replication.disabled` [cluster setting]({% link {{ page.version.version }}/cluster-settings.md %}) to `true`.
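
As a minimal sketch, the setting could be changed and then checked as follows. The setting name comes from the text above; the statements use the standard `SET CLUSTER SETTING` and `SHOW CLUSTER SETTING` syntax and assume a user with privileges to modify cluster settings.

{% include_cached copy-clipboard.html %}
~~~ sql
-- Stop changefeeds from emitting deletes issued by TTL jobs, cluster-wide.
SET CLUSTER SETTING sql.ttl.changefeed_replication.disabled = true;

-- Confirm the current value of the setting.
SHOW CLUSTER SETTING sql.ttl.changefeed_replication.disabled;
~~~

Because the setting is cluster-wide, it applies to TTL jobs on every table.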

### Disable changefeeds from emitting messages

{% include_cached new-in.html version="v23.2" %} To prevent changefeeds from emitting messages for any changes (e.g., `INSERT`, `UPDATE`) issued to watched tables during a session, set the `disable_changefeed_replication` [session variable]({% link {{ page.version.version }}/session-variables.md %}) to `true` in that session.
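
For illustration, a session might wrap a one-off maintenance write like this. The table `tbl` and its `value` column are hypothetical, and the `UPDATE` only stands in for whatever writes you want to keep out of the changefeed.

{% include_cached copy-clipboard.html %}
~~~ sql
-- Turn off changefeed replication for writes made in this session only.
SET disable_changefeed_replication = true;

-- Hypothetical maintenance write: changefeeds watching tbl are not expected
-- to emit messages for these rows.
UPDATE tbl SET value = 'backfilled' WHERE value IS NULL;

-- Restore the default so later writes in this session are emitted again.
RESET disable_changefeed_replication;
~~~

The variable is scoped to the session, so writes from other sessions to the same table are still emitted as usual.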

### Define the change data emitted to a sink

When you create a changefeed, use change data capture queries to define the change data emitted to your sink.

For example:

{% include_cached copy-clipboard.html %}
~~~ sql
CREATE CHANGEFEED INTO 'scheme://sink-URI' WITH updated AS SELECT column, column FROM table;
~~~

For details on syntax and examples, refer to the [Change Data Capture Queries]({% link {{ page.version.version }}/cdc-queries.md %}) page.

## Message formats

{% include {{ page.version.version }}/cdc/message-format-list.md %}
4 changes: 3 additions & 1 deletion src/current/v23.2/row-level-ttl.md
@@ -135,7 +135,7 @@ SHOW CREATE TABLE ttl_test_per_table;

The settings that control the behavior of Row-Level TTL are provided using [storage parameters]({% link {{ page.version.version }}/sql-grammar.md %}#opt_with_storage_parameter_list). These parameters can be set during table creation using [`CREATE TABLE`](#create-a-table-with-a-ttl_expiration_expression), added to an existing table using the [`ALTER TABLE`](#add-or-update-the-row-level-ttl-for-an-existing-table) statement, or [reset to default values](#reset-a-storage-parameter-to-its-default-value).

| Description | Option | Associated cluster setting |
| Option | Description | Associated cluster setting |
|----------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------|
| `ttl_expiration_expression` <a name="param-ttl-expiration-expression"></a> | **Recommended in v22.2+**. SQL expression that defines the TTL expiration. Must evaluate to a [`TIMESTAMPTZ`]({% link {{ page.version.version }}/timestamp.md %}). This and/or [`ttl_expire_after`](#param-ttl-expire-after) are required to enable TTL. This parameter is useful when you want to set the TTL for individual rows in the table. For an example, see [Create a table with a `ttl_expiration_expression`](#create-a-table-with-a-ttl_expiration_expression). | N/A |
| `ttl_expire_after` <a name="param-ttl-expire-after"></a> | The [interval]({% link {{ page.version.version }}/interval.md %}) when a TTL will expire. This and/or [`ttl_expiration_expression`](#param-ttl-expiration-expression) are required to enable TTL. Minimum value: `'1 microsecond'`. | N/A |
@@ -556,6 +556,8 @@ Row-level TTL interacts with [changefeeds]({% link {{ page.version.version }}/cr

- When expired rows are deleted, a [changefeed delete message]({% link {{ page.version.version }}/changefeed-messages.md %}#delete-messages) is emitted.

{% include {{ page.version.version }}/cdc/disable-replication-ttl.md %}

For guidance on how to filter changefeed messages to emit row-level TTL deletes only, refer to [Change Data Capture Queries]({% link {{ page.version.version }}/cdc-queries.md %}#reference-ttl-in-a-cdc-query).

## Backup and restore