storage: change the default sstable compression algorithm to zstd
In cockroachdb#120784, we now allow for the `zstd` compression algorithm to be used
as the codec for SSTables. Initial testing has shown this algorithm to
provide a better compression ratio than snappy, at the cost of a minor
increase in CPU. This unlocks benefits such as an improved node density,
as more data can reside in each store.

Alter the default SSTable compression algorithm to `zstd`. Note that
this change only applies to SSTs written directly into a local Pebble
store, or generated to send over the wire for ingestion into a remote
store (e.g. `AddSSTable`). The defaults for backup-related SSTs are left
unchanged.

Closes cockroachdb#123953.

Release note (performance improvement): The default SSTable compression
algorithm was changed from snappy to zstd. The latter has been shown to
improve a store's compression ratio at the cost of a minor CPU increase.
Existing clusters can opt into this new compression algorithm by setting
`storage.sstable.compression_algorithm` to `zstd`. Existing SSTables
compressed with the snappy compression algorithm will NOT be actively
re-written. Instead, SSTables are re-written over time with the new
compression algorithm as existing SSTables are compacted.
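
The opt-in described in the release note can be sketched as follows. This is a minimal illustration, not code from this commit: `optInStatement` and `supportedAlgorithms` are hypothetical names, and the sketch assumes the setting is changed through a standard `SET CLUSTER SETTING` statement. The setting name and the supported values ("snappy", "zstd") are taken from the setting's description in the diff.

```go
package main

import (
	"fmt"
	"os"
)

// settingName is the cluster setting whose default this commit changes.
const settingName = "storage.sstable.compression_algorithm"

// supportedAlgorithms mirrors the values listed in the setting's description
// (snappy = 1, zstd = 2). The map itself is an illustrative assumption.
var supportedAlgorithms = map[string]int{"snappy": 1, "zstd": 2}

// optInStatement is a hypothetical helper (not part of CockroachDB). It
// returns the SQL statement that selects the given compression algorithm,
// or an error if the algorithm is not one of the supported values.
func optInStatement(algorithm string) (string, error) {
	if _, ok := supportedAlgorithms[algorithm]; !ok {
		return "", fmt.Errorf("unsupported compression algorithm %q", algorithm)
	}
	return fmt.Sprintf("SET CLUSTER SETTING %s = '%s';", settingName, algorithm), nil
}

func main() {
	stmt, err := optInStatement("zstd")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Print the statement an operator would run against an existing cluster.
	fmt.Println(stmt)
}
```

Running the resulting statement changes only how new SSTables are written; as noted above, existing snappy-compressed SSTables are picked up by later compactions rather than re-written immediately.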
nicktrav committed May 15, 2024
1 parent 31f3095 commit 76b201f
Showing 7 changed files with 7 additions and 7 deletions.
2 changes: 1 addition & 1 deletion docs/generated/settings/settings-for-tenants.txt
@@ -337,7 +337,7 @@ sql.txn.read_committed_isolation.enabled boolean true set to true to allow trans
sql.txn_fingerprint_id_cache.capacity integer 100 the maximum number of txn fingerprint IDs stored application
storage.max_sync_duration duration 20s maximum duration for disk operations; any operations that take longer than this setting trigger a warning log entry or process crash system-visible
storage.max_sync_duration.fatal.enabled boolean true if true, fatal the process when a disk operation exceeds storage.max_sync_duration application
-storage.sstable.compression_algorithm enumeration snappy "determines the compression algorithm to use when compressing sstable data blocks for use in a Pebble store; supported values: ""snappy"", ""zstd"" [snappy = 1, zstd = 2]" system-visible
+storage.sstable.compression_algorithm enumeration zstd "determines the compression algorithm to use when compressing sstable data blocks for use in a Pebble store; supported values: ""snappy"", ""zstd"" [snappy = 1, zstd = 2]" system-visible
storage.sstable.compression_algorithm_backup_storage enumeration snappy "determines the compression algorithm to use when compressing sstable data blocks for backup row data storage; supported values: ""snappy"", ""zstd"" [snappy = 1, zstd = 2]" system-visible
storage.sstable.compression_algorithm_backup_transport enumeration snappy "determines the compression algorithm to use when compressing sstable data blocks for backup transport; supported values: ""snappy"", ""zstd"" [snappy = 1, zstd = 2]" system-visible
timeseries.storage.resolution_10s.ttl duration 240h0m0s the maximum age of time series data stored at the 10 second resolution. Data older than this is subject to rollup and deletion. system-visible
2 changes: 1 addition & 1 deletion docs/generated/settings/settings.html
@@ -290,7 +290,7 @@
<tr><td><div id="setting-storage-ingest-split-enabled" class="anchored"><code>storage.ingest_split.enabled</code></div></td><td>boolean</td><td><code>true</code></td><td>set to false to disable ingest-time splitting that lowers write-amplification</td><td>Dedicated/Self-Hosted</td></tr>
<tr><td><div id="setting-storage-max-sync-duration" class="anchored"><code>storage.max_sync_duration</code></div></td><td>duration</td><td><code>20s</code></td><td>maximum duration for disk operations; any operations that take longer than this setting trigger a warning log entry or process crash</td><td>Dedicated/Self-hosted (read-write); Serverless (read-only)</td></tr>
<tr><td><div id="setting-storage-max-sync-duration-fatal-enabled" class="anchored"><code>storage.max_sync_duration.fatal.enabled</code></div></td><td>boolean</td><td><code>true</code></td><td>if true, fatal the process when a disk operation exceeds storage.max_sync_duration</td><td>Serverless/Dedicated/Self-Hosted</td></tr>
-<tr><td><div id="setting-storage-sstable-compression-algorithm" class="anchored"><code>storage.sstable.compression_algorithm</code></div></td><td>enumeration</td><td><code>snappy</code></td><td>determines the compression algorithm to use when compressing sstable data blocks for use in a Pebble store; supported values: &#34;snappy&#34;, &#34;zstd&#34; [snappy = 1, zstd = 2]</td><td>Dedicated/Self-hosted (read-write); Serverless (read-only)</td></tr>
+<tr><td><div id="setting-storage-sstable-compression-algorithm" class="anchored"><code>storage.sstable.compression_algorithm</code></div></td><td>enumeration</td><td><code>zstd</code></td><td>determines the compression algorithm to use when compressing sstable data blocks for use in a Pebble store; supported values: &#34;snappy&#34;, &#34;zstd&#34; [snappy = 1, zstd = 2]</td><td>Dedicated/Self-hosted (read-write); Serverless (read-only)</td></tr>
<tr><td><div id="setting-storage-sstable-compression-algorithm-backup-storage" class="anchored"><code>storage.sstable.compression_algorithm_backup_storage</code></div></td><td>enumeration</td><td><code>snappy</code></td><td>determines the compression algorithm to use when compressing sstable data blocks for backup row data storage; supported values: &#34;snappy&#34;, &#34;zstd&#34; [snappy = 1, zstd = 2]</td><td>Dedicated/Self-hosted (read-write); Serverless (read-only)</td></tr>
<tr><td><div id="setting-storage-sstable-compression-algorithm-backup-transport" class="anchored"><code>storage.sstable.compression_algorithm_backup_transport</code></div></td><td>enumeration</td><td><code>snappy</code></td><td>determines the compression algorithm to use when compressing sstable data blocks for backup transport; supported values: &#34;snappy&#34;, &#34;zstd&#34; [snappy = 1, zstd = 2]</td><td>Dedicated/Self-hosted (read-write); Serverless (read-only)</td></tr>
<tr><td><div id="setting-storage-wal-failover-unhealthy-op-threshold" class="anchored"><code>storage.wal_failover.unhealthy_op_threshold</code></div></td><td>duration</td><td><code>100ms</code></td><td>the latency of a WAL write considered unhealthy and triggers a failover to a secondary WAL location</td><td>Dedicated/Self-Hosted</td></tr>
2 changes: 1 addition & 1 deletion pkg/kv/kvnemesis/testdata/TestApplier/addsstable
@@ -1,6 +1,6 @@
echo
----
-db0.AddSSTable(ctx, tk(1), tk(4), ... /* @s1 */) // 1004 bytes (as writes)
+db0.AddSSTable(ctx, tk(1), tk(4), ... /* @s1 */) // 1009 bytes (as writes)
// ^-- tk(1) -> sv(s1): /Table/100/"0000000000000001"/<ts> -> /BYTES/v1
// ^-- tk(2) -> sv(s1): /Table/100/"0000000000000002"/<ts> -> /<empty>
// ^-- [tk(3), tk(4)) -> sv(s1): /Table/100/"000000000000000{3"-4"} -> /<empty>
2 changes: 1 addition & 1 deletion pkg/kv/kvnemesis/testdata/TestOperationsFormat/4
@@ -1,6 +1,6 @@
echo
----
-···db0.AddSSTable(ctx, tk(1), tk(4), ... /* @s1 */) // 1004 bytes (as writes)
+···db0.AddSSTable(ctx, tk(1), tk(4), ... /* @s1 */) // 1009 bytes (as writes)
···// ^-- tk(1) -> sv(s1): /Table/100/"0000000000000001"/0.000000001,0 -> /BYTES/v1
···// ^-- tk(2) -> sv(s1): /Table/100/"0000000000000002"/0.000000001,0 -> /<empty>
···// ^-- [tk(3), tk(4)) -> sv(s1): /Table/100/"000000000000000{3"-4"} -> /<empty>
@@ -1,6 +1,6 @@
echo
----
-db0.AddSSTable(ctx, tk(1), tk(4), ... /* @s1 */) // 1004 bytes
+db0.AddSSTable(ctx, tk(1), tk(4), ... /* @s1 */) // 1009 bytes
// ^-- tk(1) -> sv(s1): /Table/100/"0000000000000001"/0.000000001,0 -> /BYTES/v1
// ^-- tk(2) -> sv(s1): /Table/100/"0000000000000002"/0.000000001,0 -> /<empty>
// ^-- [tk(3), tk(4)) -> sv(s1): /Table/100/"000000000000000{3"-4"} -> /<empty>
@@ -4,7 +4,7 @@ db0.Put(ctx, tk(1), sv(1)) // @0.000000001,0 <nil>
db0.Put(ctx, tk(2), sv(1)) // @0.000000001,0 <nil>
db0.Put(ctx, tk(3), sv(1)) // @0.000000001,0 <nil>
db0.Put(ctx, tk(4), sv(1)) // @0.000000001,0 <nil>
-db0.AddSSTable(ctx, tk(1), tk(4), ... /* @s2 */) // 1005 bytes
+db0.AddSSTable(ctx, tk(1), tk(4), ... /* @s2 */) // 1009 bytes
// ^-- tk(1) -> sv(s2): /Table/100/"0000000000000001"/0.000000001,0 -> /BYTES/v2
// ^-- tk(2) -> sv(s2): /Table/100/"0000000000000002"/0.000000001,0 -> /<empty>
// ^-- [tk(3), tk(4)) -> sv(s2): /Table/100/"000000000000000{3"-4"} -> /<empty>
2 changes: 1 addition & 1 deletion pkg/storage/pebble.go
@@ -171,7 +171,7 @@ var CompressionAlgorithmStorage = RegisterCompressionAlgorithmClusterSetting(
"storage.sstable.compression_algorithm",
`determines the compression algorithm to use when compressing sstable data blocks for use in a Pebble store;`+
` supported values: "snappy", "zstd"`,
-CompressionAlgorithmSnappy.String(), // Default.
+CompressionAlgorithmZstd.String(), // Default.
)

// CompressionAlgorithmBackupStorage determines the compression algorithm used