Skip to content

Commit

Permalink
Read Committed 40001 errors; lock promotion (#18522)
Browse files Browse the repository at this point in the history
* add enterprise callout; remove preview callout
* read committed txns don't need client-side retry handling
* lock promotion
* Read Committed cluster setting is true by default
* update various docs for Read Committed
* check whether transactions are being upgraded
  • Loading branch information
taroface committed May 14, 2024
1 parent fa1aba9 commit e255837
Show file tree
Hide file tree
Showing 28 changed files with 141 additions and 77 deletions.
2 changes: 2 additions & 0 deletions src/current/_includes/v23.2/metric-names.md
Original file line number Diff line number Diff line change
Expand Up @@ -269,6 +269,8 @@ Name | Description
`sql.txn.begin.count` | Number of SQL transaction BEGIN statements
`sql.txn.commit.count` | Number of SQL transaction COMMIT statements
`sql.txn.contended.count` | Number of SQL transactions that experienced contention
`sql.txn.isolation.executed_at.read_committed` | Number of times a [`READ COMMITTED`]({% link {{ page.version.version }}/read-committed.md %}) transaction was executed.
`sql.txn.isolation.upgraded_from.read_committed` | Number of times a [`READ COMMITTED`]({% link {{ page.version.version }}/read-committed.md %}) transaction was automatically upgraded to a stronger isolation level.
`sql.txn.rollback.count` | Number of SQL transaction ROLLBACK statements
`sql.update.count` | Number of SQL UPDATE statements
`storage.l0-level-score` | Compaction score of level 0
Expand Down
2 changes: 1 addition & 1 deletion src/current/_includes/v24.1/app/retry-errors.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,3 @@
{{site.data.alerts.callout_info}}
Your application should [use a retry loop to handle transaction errors]({% link {{ page.version.version }}/query-behavior-troubleshooting.md %}#transaction-retry-errors) that can occur under [contention]({{ link_prefix }}performance-best-practices-overview.html#understanding-and-avoiding-transaction-contention).
When running under the default [`SERIALIZABLE`]({% link {{ page.version.version }}/demo-serializable.md %}) isolation level, your application should [use a retry loop to handle transaction errors]({% link {{ page.version.version }}/query-behavior-troubleshooting.md %}#transaction-retry-errors) that can occur under [contention]({{ link_prefix }}performance-best-practices-overview.html#understanding-and-avoiding-transaction-contention). Client-side retry handling is **not** necessary under [`READ COMMITTED`]({% link {{ page.version.version }}/read-committed.md %}) isolation.
{{site.data.alerts.end}}
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
CockroachDB requires moderate levels of clock synchronization to preserve data consistency. For this reason, when a node detects that its clock is out of sync with at least half of the other nodes in the cluster by 80% of the maximum offset allowed, it spontaneously shuts down. This offset defaults to 500ms but can be changed via the [`--max-offset`]({% link {{ page.version.version }}/cockroach-start.md %}#flags-max-offset) flag when starting each node.

While [serializable consistency](https://wikipedia.org/wiki/Serializability) is maintained regardless of clock skew, skew outside the configured clock offset bounds can result in violations of single-key linearizability between causally dependent transactions. It's therefore important to prevent clocks from drifting too far by running [NTP](http://www.ntp.org/) or other clock synchronization software on each node.
Regardless of clock skew, [`SERIALIZABLE`]({% link {{ page.version.version }}/demo-serializable.md %}) and [`READ COMMITTED`]({% link {{ page.version.version }}/read-committed.md %}) transactions both serve globally consistent ("non-stale") reads and [commit atomically]({% link {{ page.version.version }}/developer-basics.md %}#how-transactions-work-in-cockroachdb). However, skew outside the configured clock offset bounds can result in violations of single-key linearizability between causally dependent transactions. It's therefore important to prevent clocks from drifting too far by running [NTP](http://www.ntp.org/) or other clock synchronization software on each node.

In very rare cases, CockroachDB can momentarily run with a stale clock. This can happen when using vMotion, which can suspend a VM running CockroachDB, migrate it to different hardware, and resume it. This will cause CockroachDB to be out of sync for a short period before it jumps to the correct time. During this window, it would be possible for a client to read stale data and write data derived from stale reads. By enabling the `server.clock.forward_jump_check_enabled` [cluster setting]({% link {{ page.version.version }}/cluster-settings.md %}), you can be alerted when the CockroachDB clock jumps forward, indicating it had been running with a stale clock. To protect against this on vMotion, however, use the [`--clock-device`](cockroach-start.html#general) flag to specify a [PTP hardware clock](https://www.kernel.org/doc/html/latest/driver-api/ptp.html) for CockroachDB to use when querying the current time. When doing so, you should not enable `server.clock.forward_jump_check_enabled` because forward jumps will be expected and harmless. For more information on how `--clock-device` interacts with vMotion, see [this blog post](https://core.vmware.com/blog/cockroachdb-vmotion-support-vsphere-7-using-precise-timekeeping).

Expand Down
2 changes: 2 additions & 0 deletions src/current/_includes/v24.1/metric-names.md
Original file line number Diff line number Diff line change
Expand Up @@ -269,6 +269,8 @@ Name | Description
`sql.txn.begin.count` | Number of SQL transaction BEGIN statements
`sql.txn.commit.count` | Number of SQL transaction COMMIT statements
`sql.txn.contended.count` | Number of SQL transactions that experienced contention
`sql.txn.isolation.executed_at.read_committed` | Number of times a [`READ COMMITTED`]({% link {{ page.version.version }}/read-committed.md %}) transaction was executed.
`sql.txn.isolation.upgraded_from.read_committed` | Number of times a [`READ COMMITTED`]({% link {{ page.version.version }}/read-committed.md %}) transaction was automatically upgraded to a stronger isolation level.
`sql.txn.rollback.count` | Number of SQL transaction ROLLBACK statements
`sql.update.count` | Number of SQL UPDATE statements
`storage.l0-level-score` | Compaction score of level 0
Expand Down
2 changes: 1 addition & 1 deletion src/current/_includes/v24.1/misc/database-terms.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ A set of operations performed on a database that satisfy the requirements of [AC
<a name="architecture-overview-contention"></a> A [state of conflict]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) that occurs when:

- A [transaction]({% link {{ page.version.version }}/transactions.md %}) is unable to complete due to another concurrent or recent transaction attempting to write to the same data. This is also called *lock contention*.
- A transaction is [automatically retried]({% link {{ page.version.version }}/transactions.md %}#automatic-retries) because it could not be placed into a [serializable ordering]({% link {{ page.version.version }}/demo-serializable.md %}) among all of the currently executing transactions. This is also called a *serializability conflict*. If the automatic retry is not possible or fails, a [*transaction retry error*](../transaction-retry-error-reference.html) is emitted to the client, requiring the client application to [retry the transaction](../transaction-retry-error-reference.html#client-side-retry-handling).
- A transaction is [automatically retried]({% link {{ page.version.version }}/transactions.md %}#automatic-retries) because it could not be placed into a [serializable ordering]({% link {{ page.version.version }}/demo-serializable.md %}) among all of the currently executing transactions. This is also called a *serialization conflict*. If the automatic retry is not possible or fails, a [*transaction retry error*](../transaction-retry-error-reference.html) is emitted to the client, requiring a client application running under `SERIALIZABLE` isolation to [retry the transaction](../transaction-retry-error-reference.html#client-side-retry-handling).

Steps should be taken to [reduce transaction contention]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#reduce-transaction-contention) in the first place.

Expand Down
1 change: 1 addition & 0 deletions src/current/_includes/v24.1/misc/enterprise-features.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@

Feature | Description
--------+-------------------------
[Read Committed isolation]({% link {{ page.version.version }}/read-committed.md %}) | Achieve predictable query performance at high workload concurrencies, but without guaranteed transaction serializability.
[Follower Reads]({% link {{ page.version.version }}/follower-reads.md %}) | Reduce read latency in multi-region deployments by using the closest replica at the expense of reading slightly historical data.
[Multi-Region Capabilities]({% link {{ page.version.version }}/multiregion-overview.md %}) | Row-level control over where your data is stored to help you reduce read and write latency and meet regulatory requirements.
[PL/pgSQL]({% link {{ page.version.version }}/plpgsql.md %}) | Use a procedural language in [user-defined functions]({% link {{ page.version.version }}/user-defined-functions.md %}) and [stored procedures]({% link {{ page.version.version }}/stored-procedures.md %}) to improve performance and enable more complex queries.
Expand Down
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
- [Send statements in transactions as a single batch]({% link {{ page.version.version }}/transactions.md %}#batched-statements). Batching allows CockroachDB to [automatically retry]({% link {{ page.version.version }}/transactions.md %}#automatic-retries) a transaction when [previous reads are invalidated]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#read-refreshing) at a [pushed timestamp]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#timestamp-cache). When a multi-statement transaction is not batched, and takes more than a single round trip, CockroachDB cannot automatically retry the transaction. For an example showing how to break up large transactions in an application, see [Break up large transactions into smaller units of work](build-a-python-app-with-cockroachdb-sqlalchemy.html#break-up-large-transactions-into-smaller-units-of-work).

<a id="result-buffer-size"></a>

- Limit the size of the result sets of your transactions to under 16KB, so that CockroachDB is more likely to [automatically retry]({% link {{ page.version.version }}/transactions.md %}#automatic-retries) when [previous reads are invalidated]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#read-refreshing) at a [pushed timestamp]({% link {{ page.version.version }}/architecture/transaction-layer.md %}#timestamp-cache). When a transaction returns a result set over 16KB, even if that transaction has been sent as a single batch, CockroachDB cannot automatically retry the transaction. You can change the results buffer size for all new sessions using the `sql.defaults.results_buffer.size` [cluster setting](cluster-settings.html), or for a specific session using the `results_buffer_size` [session variable](set-vars.html).
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
In most cases, the correct actions to take when encountering transaction retry errors are:

1. Update your application to support [client-side retry handling]({% link {{ page.version.version }}/transaction-retry-error-reference.md %}#client-side-retry-handling) when transaction retry errors are encountered. Follow the guidance for the [specific error type]({% link {{ page.version.version }}/transaction-retry-error-reference.md %}#transaction-retry-error-reference).
1. Under `SERIALIZABLE` isolation, update your application to support [client-side retry handling]({% link {{ page.version.version }}/transaction-retry-error-reference.md %}#client-side-retry-handling) when transaction retry errors are encountered. Follow the guidance for the [specific error type]({% link {{ page.version.version }}/transaction-retry-error-reference.md %}#transaction-retry-error-reference).

1. Take steps to [minimize transaction retry errors]({% link {{ page.version.version }}/transaction-retry-error-reference.md %}#minimize-transaction-retry-errors) in the first place. This means reducing transaction contention overall, and increasing the likelihood that CockroachDB can [automatically retry]({% link {{ page.version.version }}/transactions.md %}#automatic-retries) a failed transaction.
4 changes: 2 additions & 2 deletions src/current/_includes/v24.1/sql/isolation-levels.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
Isolation is an element of [ACID transactions](https://en.wikipedia.org/wiki/ACID) that determines how concurrency is controlled, and ultimately guarantees consistency. CockroachDB offers two transaction isolation levels: [`SERIALIZABLE`]({% link {{ page.version.version }}/demo-serializable.md %}) and [`READ COMMITTED`]({% link {{ page.version.version }}/read-committed.md %}).

By default, CockroachDB executes all transactions at the strongest ANSI transaction isolation level: `SERIALIZABLE`, which permits no concurrency anomalies. To place all transactions in a serializable ordering, `SERIALIZABLE` isolation may require [transaction restarts]({% link {{ page.version.version }}/transaction-retry-error-reference.md %}). For a demonstration of how `SERIALIZABLE` prevents write skew anomalies, see [Serializable Transactions]({% link {{ page.version.version }}/demo-serializable.md %}).
By default, CockroachDB executes all transactions at the strongest ANSI transaction isolation level: `SERIALIZABLE`, which permits no concurrency anomalies. To place all transactions in a serializable ordering, `SERIALIZABLE` isolation may require [transaction restarts]({% link {{ page.version.version }}/transaction-retry-error-reference.md %}) and [client-side retry handling]({% link {{ page.version.version }}/transaction-retry-error-reference.md %}#client-side-retry-handling). For a demonstration of how `SERIALIZABLE` prevents anomalies such as write skew, see [Serializable Transactions]({% link {{ page.version.version }}/demo-serializable.md %}).

CockroachDB can be configured to execute transactions at [`READ COMMITTED`]({% link {{ page.version.version }}/read-committed.md %}) instead of `SERIALIZABLE` isolation. If [enabled]({% link {{ page.version.version }}/read-committed.md %}#enable-read-committed-isolation), `READ COMMITTED` is no longer an alias for `SERIALIZABLE` . `READ COMMITTED` permits some concurrency anomalies in exchange for minimizing transaction aborts and [retries]({% link {{ page.version.version }}/developer-basics.md %}#transaction-retries). Depending on your workload requirements, this may be desirable. For more information, see [Read Committed Transactions]({% link {{ page.version.version }}/read-committed.md %}).
CockroachDB can be configured to execute transactions at [`READ COMMITTED`]({% link {{ page.version.version }}/read-committed.md %}) instead of `SERIALIZABLE` isolation. If [enabled]({% link {{ page.version.version }}/read-committed.md %}#enable-read-committed-isolation), `READ COMMITTED` is no longer an alias for `SERIALIZABLE` . `READ COMMITTED` permits some concurrency anomalies in exchange for minimizing transaction aborts and removing the need for client-side retries. Depending on your workload requirements, this may be desirable. For more information, see [Read Committed Transactions]({% link {{ page.version.version }}/read-committed.md %}).
8 changes: 5 additions & 3 deletions src/current/v23.2/read-committed.md
Original file line number Diff line number Diff line change
Expand Up @@ -34,12 +34,14 @@ To make `READ COMMITTED` isolation available to use on a cluster, enable the fol
SET CLUSTER SETTING sql.txn.read_committed_isolation.enabled = 'true';
~~~

After you enable the cluster setting, you can set `READ COMMITTED` as the [default isolation level](#set-the-default-isolation-level-to-read-committed) or [begin a transaction](#set-the-current-transaction-to-read-committed) as `READ COMMITTED`.
In v23.2, `sql.txn.read_committed_isolation.enabled` is `false` by default. As a result, `READ COMMITTED` transactions are [automatically upgraded to `SERIALIZABLE`]({% link {{ page.version.version }}/transactions.md %}#aliases) unless this setting is enabled. **This differs in v24.1 and later**, where `sql.txn.read_committed_isolation.enabled` is `true` by default.

{{site.data.alerts.callout_info}}
If the cluster setting is not enabled, `READ COMMITTED` transactions will run as `SERIALIZABLE`.
{{site.data.alerts.callout_success}}
Because of this change, upgrading to a later CockroachDB version may affect your application behavior. Check the [**Upgrades of SQL Transaction Isolation Level**]({% link {{ page.version.version }}/ui-sql-dashboard.md %}#upgrades-of-sql-transaction-isolation-level) graph in the DB Console to see whether any transactions are being upgraded to `SERIALIZABLE`. On v24.1 and later, `READ COMMITTED` transactions will run as `READ COMMITTED` unless you set `sql.txn.read_committed_isolation.enabled` explicitly to `false`.
{{site.data.alerts.end}}

After you enable the cluster setting, you can set `READ COMMITTED` as the [default isolation level](#set-the-default-isolation-level-to-read-committed) or [begin a transaction](#set-the-current-transaction-to-read-committed) as `READ COMMITTED`.

### Set the default isolation level to `READ COMMITTED`

To set all future transactions to run at `READ COMMITTED` isolation, use one of the following options:
Expand Down
8 changes: 8 additions & 0 deletions src/current/v23.2/ui-sql-dashboard.md
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,14 @@ The **SQL Connection Rate** is an average of the number of connection attempts p

- In the cluster view, the graph shows the rate of SQL connection attempts to all nodes, with lines for each node.

## Upgrades of SQL Transaction Isolation Level

- In the node view, the graph shows the total number of times a SQL transaction was upgraded to a stronger isolation level on the selected node.

- In the cluster view, the graph shows the total number of times a SQL transaction was upgraded to a stronger isolation level across all nodes.

If this metric is non-zero, then transactions at weaker isolation levels (such as [`READ COMMITTED`]({% link {{ page.version.version }}/read-committed.md %})) are being upgraded to [`SERIALIZABLE`]({% link {{ page.version.version }}/demo-serializable.md %}) instead. To ensure that `READ COMMITTED` transactions run as `READ COMMITTED`, see [Enable `READ COMMITTED` isolation]({% link {{ page.version.version }}/read-committed.md %}#enable-read-committed-isolation).

## Open SQL Transactions

- In the node view, the graph shows the total number of open SQL transactions on the node.
Expand Down
4 changes: 4 additions & 0 deletions src/current/v24.1/advanced-client-side-transaction-retries.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,10 @@ toc: true
docs_area: develop
---

{{site.data.alerts.callout_info}}
Client-side retry handling is **not** necessary under [`READ COMMITTED`]({% link {{ page.version.version }}/read-committed.md %}) isolation.
{{site.data.alerts.end}}

This page has instructions for authors of [database drivers and ORMs]({% link {{ page.version.version }}/install-client-drivers.md %}) who would like to implement client-side retries in their database driver or ORM for maximum efficiency and ease of use by application developers.

{{site.data.alerts.callout_info}}
Expand Down

0 comments on commit e255837

Please sign in to comment.