Add docs on storage engine WAL failover #18511

Merged 15 commits on May 16, 2024
3 changes: 1 addition & 2 deletions src/current/_includes/releases/v24.1/v24.1.0-alpha.4.md
Release Date: March 25, 2024

<h3 id="v24-1-0-alpha-4-operational-changes">Operational changes</h3>

- The [cluster setting](../v24.1/cluster-settings.html) `admission.wal.failover.unlimited_tokens.enabled` can be set to `true` to grant unlimited admission tokens during write-ahead log (WAL) failover. Do not change this setting without significant expertise; the default, which preserves the token counts from the preceding non-WAL-failover interval, is expected to be safer. [#120135][#120135]
- The new [`cockroach start`](../v24.1/cockroach-start.html) option [`--wal-failover=among-stores` (or the `COCKROACH_WAL_FAILOVER=among-stores` environment variable)]({% link v24.1/cockroach-start.md %}#write-ahead-log-wal-failover) configures a multi-store CockroachDB node to fail over a store's write-ahead log (WAL) to another store's data directory. Failing over the write-ahead log may allow some operations against a store to continue completing, even if the underlying storage is temporarily unavailable. This feature is in [preview]({% link v24.1/cockroachdb-feature-availability.md %}#features-in-preview). [#120509][#120509]
- The new `storage.wal_failover.unhealthy_op_threshold` [cluster setting](../v24.1/cluster-settings.html) allows configuring the latency threshold at which a WAL write is considered unhealthy. [#120509][#120509]
- Two new metrics track the status of the SQL Activity Update job, which pre-aggregates top K information within the SQL statistics subsystem and writes the results to `system.statement_activity` and `system.transaction_activity`:
- `sql.stats.activity.updates.successful`: Number of successful updates made by the SQL activity updater job.
58 changes: 58 additions & 0 deletions src/current/v24.1/cockroach-start.md
Flag | Description
-----|------------
<a name="flags-max-tsdb-memory"></a>`--max-tsdb-memory` | Maximum memory capacity available to store temporary data for use by the time-series database to display metrics in the [DB Console]({% link {{ page.version.version }}/ui-overview.md %}). Consider raising this value if your cluster is comprised of a large number of nodes where individual nodes have very limited memory available (e.g., under `8 GiB`). Insufficient memory capacity for the time-series database can constrain the ability of the DB Console to process the time-series queries used to render metrics for the entire cluster. This capacity constraint does not affect SQL query execution. This flag accepts numbers interpreted as bytes, size suffixes (e.g., `1GB` and `1GiB`) or a percentage of physical memory (e.g., `0.01`).<br><br>**Note:** The sum of `--cache`, `--max-sql-memory`, and `--max-tsdb-memory` should not exceed 75% of the memory available to the `cockroach` process.<br><br>**Default:** `0.01` (i.e., 1%) of physical memory or `64 MiB`, whichever is greater.
`--pid-file` | The file to which the node's process ID will be written as soon as the node is ready to accept connections. When `--background` is used, this happens before the process detaches from the terminal. When this flag is not set, the process ID is not written to file.
<a name="flags-store"></a> `--store`<br>`-s` | The file path to a storage device and, optionally, store attributes and maximum size. When using multiple storage devices for a node, this flag must be specified separately for each device, for example: <br><br>`--store=/mnt/ssd01 --store=/mnt/ssd02` <br><br>For more details, see [Store](#store) below.
`--wal-failover` <a name="flag-wal-failover"></a> | Used to configure [WAL failover](#write-ahead-log-wal-failover) on [nodes]({% link {{ page.version.version }}/architecture/overview.md %}#node) with [multiple stores](#store). To enable WAL failover, pass `--wal-failover=among-stores`. To disable, pass `--wal-failover=disabled` on [node restart]({% link {{ page.version.version }}/node-shutdown.md %}#stop-and-restart-a-node). This feature is in [preview]({% link {{page.version.version}}/cockroachdb-feature-availability.md %}#features-in-preview).
<a name="flags-spatial-libs"></a>`--spatial-libs` | The location on disk where CockroachDB looks for [spatial]({% link {{ page.version.version }}/spatial-data-overview.md %}) libraries.<br/><br/>**Defaults:** <br/><ul><li>`/usr/local/lib/cockroach`</li><li>A `lib` subdirectory of the CockroachDB binary's current directory.</li></ul><br/>
`--temp-dir` <a name="temp-dir"></a> | The path of the node's temporary store directory. On node start up, the location for the temporary files is printed to the standard output. <br><br>**Default:** Subdirectory of the first [store](#store)


### Storage

- [Storage engine](#storage-engine)
- [Store](#store)
- [Write Ahead Log (WAL) Failover](#write-ahead-log-wal-failover)

#### Storage engine

The `--storage-engine` flag is used to choose the storage engine used by the node. Note that this setting applies to all [stores](#store) on the node, including the [temp store](#temp-dir).
Field | Description
------|------------
<a name="fields-ballast-size"></a> `ballast-size` | Configure the size of the automatically created emergency ballast file. Accepts the same value formats as the [`size` field](#store-size). For more details, see [Automatic ballast files]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %}#automatic-ballast-files).<br><br>To disable automatic ballast file creation, set the value to `0`:<br><br>`--store=path=/mnt/ssd01,ballast-size=0`
<a name="store-provisioned-rate"></a> `provisioned-rate` | A mapping of a store name to a bandwidth limit, expressed in bytes per second. This constrains the bandwidth used for [admission control]({% link {{ page.version.version }}/admission-control.md %}) for operations on the store. The disk name is separated from the bandwidth value by a colon (`:`). A value of `0` (the default) represents unlimited bandwidth. For example: <br /><br />`--store=provisioned-rate=disk-name=/mnt/ssd01:200`<br /><br />**Default:** 0<br /><br />If the bandwidth value is omitted, bandwidth is limited to the value of the [`kv.store.admission.provisioned_bandwidth` cluster setting]({% link {{ page.version.version }}/cluster-settings.md %}#settings). <strong>Modify this setting only in consultation with your <a href="https://support.cockroachlabs.com/hc/en-us">support team</a>.</strong>

#### Write Ahead Log (WAL) Failover

{% include_cached new-in.html version="v24.1" %} On a CockroachDB [node]({% link {{ page.version.version }}/architecture/overview.md %}#node) with [multiple stores](#store), you can mitigate some effects of [disk stalls]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %}#disk-stalls) by configuring the node to failover each store's [write-ahead log (WAL)]({% link {{ page.version.version }}/architecture/storage-layer.md %}#memtable-and-write-ahead-log) to another store's data directory using the `--wal-failover` flag.

Failing over the WAL may allow some operations against a store to continue to complete despite temporary unavailability of the underlying storage. For example, if the node's primary store is stalled, and the node can't read from or write to it, the node can still write to the WAL on another store. This can give the node a chance to eventually catch up once the disk stall has been resolved.

When WAL failover is enabled, CockroachDB takes the following actions:

- At node startup, each store is assigned another store to be its failover destination.
- CockroachDB will begin monitoring the latency of all WAL writes. If latency to the WAL exceeds the value of the [cluster setting `storage.wal_failover.unhealthy_op_threshold`]({% link {{page.version.version}}/cluster-settings.md %}#setting-storage-wal-failover-unhealthy-op-threshold), the node will attempt to write WAL entries to a secondary store's volume.
- CockroachDB will update the [store status endpoint]({% link {{ page.version.version }}/monitoring-and-alerting.md %}#store-status-endpoint) at `/_status/stores` so you can monitor the store's status.
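The failover threshold described above can be adjusted with a standard `SET CLUSTER SETTING` statement. A minimal sketch follows; the `100ms` value is purely illustrative, not a recommendation, and the `--certs-dir` path is a placeholder for your own deployment:

~~~ shell
# Adjust the latency threshold at which a WAL write is considered
# unhealthy, triggering failover to the secondary store.
# The '100ms' value shown is illustrative only.
cockroach sql --certs-dir=certs -e \
  "SET CLUSTER SETTING storage.wal_failover.unhealthy_op_threshold = '100ms';"
~~~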

{{site.data.alerts.callout_info}}
{% include feature-phases/preview.md %}

When this feature exits preview status and is generally available (GA), it will be an [Enterprise feature]({% link {{ page.version.version }}/enterprise-licensing.md %}).
{{site.data.alerts.end}}

##### Enable WAL failover

To enable WAL failover, you must take one of the following actions:

- Pass [`--wal-failover=among-stores`](#flag-wal-failover) to `cockroach start`, or
- Set the environment variable `COCKROACH_WAL_FAILOVER=among-stores` before starting the node.
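As a sketch, a multi-store node might be started with WAL failover enabled as follows. The store paths, certificates directory, and addresses are placeholders for your own values:

~~~ shell
# Start a node with two stores; each store's WAL can fail over
# to the other store's data directory.
cockroach start \
  --certs-dir=certs \
  --store=/mnt/ssd01 \
  --store=/mnt/ssd02 \
  --wal-failover=among-stores \
  --advertise-addr={node hostname} \
  --join={node1 hostname},{node2 hostname},{node3 hostname}

# Equivalently, set the environment variable instead of the flag:
# COCKROACH_WAL_FAILOVER=among-stores cockroach start ...
~~~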

[Writing log files to local disk]({% link {{ page.version.version }}/configure-logs.md %}#output-to-files) using the default configuration can lead to cluster instability in the event of a [disk stall]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %}#disk-stalls). It is not enough to fail over your WAL writes to another disk; you must also write your log files in such a way that the forward progress of your cluster is not stalled by disk unavailability.

Therefore, if you enable WAL failover, you must also update your [logging]({% link {{page.version.version}}/logging-overview.md %}) configuration as follows:

- (**Recommended**) Configure [remote log sinks]({% link {{page.version.version}}/logging-use-cases.md %}#network-logging) that are not correlated with the availability of your cluster's local disks.
- If you must log to local disks:
    1. Disable [audit logging]({% link {{ page.version.version }}/sql-audit-logging.md %}). File-based audit logging and the WAL failover feature cannot coexist: audit logging guarantees that every log message is written to disk, or else CockroachDB must shut down. Resuming operations in the face of disk unavailability is therefore incompatible with audit logging.
    1. Enable asynchronous buffering of [`file-groups` log sinks]({% link {{ page.version.version }}/configure-logs.md %}#output-to-files) using the `buffering` configuration option. The `buffering` configuration can be applied to [`file-defaults`]({% link {{ page.version.version }}/configure-logs.md %}#configure-logging-defaults) or individual `file-groups` as needed. Note that enabling asynchronous buffering of `file-groups` log sinks is in [preview]({% link {{ page.version.version }}/cockroachdb-feature-availability.md %}#features-in-preview).
1. Set `max-staleness: 1s` and `flush-trigger-size: 256KiB`.
1. When `buffering` is enabled, `buffered-writes` must be explicitly disabled as shown below. This is necessary because `buffered-writes` does not provide true asynchronous disk access, but rather a small buffer. If the small buffer fills up, it can cause internal routines performing logging operations to hang. This in turn will cause internal routines doing other important work to hang, potentially affecting cluster stability.
1. The recommended logging configuration for using file-based logging with WAL failover is as follows:

    ~~~ yaml
file-defaults:
buffered-writes: false
buffering:
max-staleness: 1s
flush-trigger-size: 256KiB
max-buffer-size: 50MiB
~~~
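Assuming the configuration above is saved to a file (named, say, `logs.yaml`), one way to apply it is to pass it at node start via the `--log-config-file` flag; all other flags shown are placeholders for your own values:

~~~ shell
# Start the node with WAL failover and a file-based logging
# configuration (file name 'logs.yaml' is illustrative).
cockroach start \
  --certs-dir=certs \
  --store=/mnt/ssd01 \
  --store=/mnt/ssd02 \
  --wal-failover=among-stores \
  --log-config-file=logs.yaml \
  --advertise-addr={node hostname} \
  --join={node1 hostname},{node2 hostname},{node3 hostname}
~~~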

##### Disable WAL failover

To disable WAL failover, you must [restart the node]({% link {{ page.version.version }}/node-shutdown.md %}#stop-and-restart-a-node) and either:

- Pass the [`--wal-failover=disabled`](#flag-wal-failover) flag to `cockroach start`, or
- Set the environment variable `COCKROACH_WAL_FAILOVER=disabled` before restarting the node.

### Logging

By [default]({% link {{ page.version.version }}/configure-logs.md %}#default-logging-configuration), `cockroach start` writes all messages to log files, and prints nothing to `stderr`. This includes events with `INFO` [severity]({% link {{ page.version.version }}/logging.md %}#logging-levels-severities) and higher. However, you can [customize the logging behavior]({% link {{ page.version.version }}/configure-logs.md %}) of this command by using the `--log` flag:
6 changes: 6 additions & 0 deletions src/current/v24.1/cockroachdb-feature-availability.md

The multiple active portals feature of the Postgres wire protocol (pgwire) is available, with limitations. For more information, see [Multiple active portals]({% link {{ page.version.version }}/postgresql-compatibility.md %}#multiple-active-portals).

### Write Ahead Log (WAL) Failover

When a CockroachDB [node]({% link {{ page.version.version }}/architecture/overview.md %}#node) is configured to run with [multiple stores]({% link {{ page.version.version }}/cockroach-start.md %}#store), you can mitigate some effects of [disk stalls]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %}#disk-stalls) by configuring the node to failover each store's [write-ahead log (WAL)]({% link {{ page.version.version }}/architecture/storage-layer.md %}#memtable-and-write-ahead-log) to another store's data directory.

For more information, see [Write Ahead Log (WAL) Failover]({% link {{ page.version.version }}/cockroach-start.md %}#write-ahead-log-wal-failover).

## See Also

- [`SHOW {session variable}`]({% link {{ page.version.version }}/show-vars.md %})
43 changes: 43 additions & 0 deletions src/current/v24.1/monitoring-and-alerting.md

### Store status endpoint

The store status endpoint at `/_status/stores` provides information about the [stores]({% link {{ page.version.version }}/cockroach-start.md %}#store) attached to each [node]({% link {{ page.version.version }}/architecture/overview.md %}#node) of your cluster.

The response is a JSON object containing a `stores` array of objects. Each store object has the following fields:

Field | Description
------|------------
`storeId` | The [store ID]({% link {{ page.version.version }}/alter-range.md %}#find-the-cluster-store-ids) associated with this [store]({% link {{ page.version.version }}/cockroach-start.md %}#store).
`nodeId` | The [node ID]({% link {{ page.version.version }}/cockroach-node.md %}#list-node-ids) associated with this [store]({% link {{ page.version.version }}/cockroach-start.md %}#store).
`encryptionStatus` | The [encryption status]({% link {{ page.version.version }}/encryption.md %}#checking-encryption-status) of this [store]({% link {{ page.version.version }}/cockroach-start.md %}#store).
`totalFiles` | If the store is [encrypted]({% link {{ page.version.version }}/encryption.md %}), the total number of encrypted files on the store.
`totalBytes` | If the store is [encrypted]({% link {{ page.version.version }}/encryption.md %}), the total number of encrypted bytes on the store.
`activeKeyFiles` | If the store is [encrypted]({% link {{ page.version.version }}/encryption.md %}), the number of files using the [active data key]({% link {{ page.version.version }}/encryption.md %}#changing-encryption-algorithm-or-keys).
`activeKeyBytes` | If the store is [encrypted]({% link {{ page.version.version }}/encryption.md %}), the number of bytes using the [active data key]({% link {{ page.version.version }}/encryption.md %}#changing-encryption-algorithm-or-keys).
`dir` | The directory on disk where the [store]({% link {{ page.version.version }}/cockroach-start.md %}#store) is located.
`walFailoverPath` | If [WAL failover is enabled]({% link {{ page.version.version }}/cockroach-start.md %}#enable-wal-failover), this field contains the path to the secondary WAL directory used for failover in the event of high write latency to the primary WAL.

For example, to get the status of the stores of nodeID `1`, use the following:

{% include_cached copy-clipboard.html %}
~~~ shell
curl http://localhost:8080/_status/stores/1
~~~

~~~ json
{
"stores": [
{
"storeId": 1,
"nodeId": 1,
"encryptionStatus": null,
"totalFiles": "0",
"totalBytes": "0",
"activeKeyFiles": "0",
"activeKeyBytes": "0",
"dir": "/tmp/node0",
"walFailoverPath": ""
}
]
}
~~~
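Scripted checks can read `walFailoverPath` from this response to confirm that failover is configured on each store. A minimal sketch using a saved copy of the response follows; the sample values, including the failover path, are hypothetical:

~~~ shell
# Save a sample /_status/stores response. In practice, replace the
# heredoc with:
#   curl -s http://localhost:8080/_status/stores/1 > /tmp/stores.json
cat > /tmp/stores.json <<'EOF'
{
  "stores": [
    { "storeId": 1, "nodeId": 1, "dir": "/mnt/ssd01", "walFailoverPath": "/mnt/ssd02/wals" },
    { "storeId": 2, "nodeId": 1, "dir": "/mnt/ssd02", "walFailoverPath": "" }
  ]
}
EOF

# Print each store's failover path; an empty value means WAL failover
# is not configured for that store.
grep -o '"walFailoverPath": *"[^"]*"' /tmp/stores.json
~~~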

## Alerting tools

In addition to actively monitoring the overall health and performance of a cluster, it is also essential to configure alerting rules that promptly send notifications when CockroachDB experiences events that require investigation or intervention.