storage: add storage.sstable.compression_algorithm cluster setting #120784
Conversation
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am Blathers, a bot for CockroachDB. My owner is dev-inf.
LGTM
pkg/storage/pebble.go
Outdated
// Pre-24.1 Pebble's implementation of zstd had bugs that could cause
// in-memory corruption. We require that the 24.1 version be finalized
// before zstd will have any effect.
if settings.Version.ActiveVersionOrEmpty(context.TODO()).IsActive(clusterversion.V24_1Start) {
you do have a ctx in scope in newPebble where you call this if you wanted it.
Force-pushed from 62e54c5 to 0da9eb4.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @itsbilal)
pkg/storage/pebble.go
line 144 at r1 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
maybe settings.SystemOnly is safer? but also good with this
I originally had SystemOnly, but we build sstables within tenants for the purpose of AddSSTable, so it needs to be visible to the tenant when the tenant calls MakeIngestionWriterOptions. I added a comment describing this.
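For illustration, the visibility trade-off above can be sketched in plain Go. This is not the real CockroachDB settings API; `class`, `systemOnly`, `systemVisible`, `setting`, and `readFromTenant` are all hypothetical stand-ins for the setting-class machinery being discussed.

```go
package main

import "fmt"

// class is a hypothetical stand-in for a cluster setting's class, which
// controls whether application-level (tenant) code can read the setting.
type class int

const (
	systemOnly    class = iota // readable only by the system tenant
	systemVisible              // set by the system, readable by all tenants
)

type setting struct {
	name  string
	class class
	value string
}

// readFromTenant models a secondary tenant reading the setting, as it must
// when it builds sstables for AddSSTable (e.g. in MakeIngestionWriterOptions).
func readFromTenant(s setting) (string, bool) {
	if s.class == systemOnly {
		// A SystemOnly setting would be invisible here, which is why the
		// PR avoids that class for this setting.
		return "", false
	}
	return s.value, true
}

func main() {
	s := setting{"storage.sstable.compression_algorithm", systemVisible, "zstd"}
	v, ok := readFromTenant(s)
	fmt.Println(v, ok)
}
```

The design point: the stricter class would be "safer" in the sense of a smaller audience, but it would break the tenant-side sstable construction path described above.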
pkg/storage/pebble.go
line 164 at r1 (raw file):
Previously, dt (David Taylor) wrote…
you do have a ctx in scope in newPebble where you call this if you wanted it.
Done.
pkg/storage/sst_writer.go
line 121 at r2 (raw file):
// reduce compression ratio.
opts.BlockSize = 128 << 10
opts.MergerName = "nullptr"
@dt @stevendanna — any reason not to apply this cluster setting to the backup SSTs too?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt, @itsbilal, and @stevendanna)
pkg/storage/sst_writer.go
line 121 at r2 (raw file):
Previously, jbowens (Jackson Owens) wrote…
@dt @stevendanna — any reason not to apply this cluster setting to the backup SSTs too?
We should only do that if the cluster version is 24.1 - if we're in a mixed cluster, it's possible that the backup would be restored on a node running 23.2 code right?
No concerns about using it in backups. It is already gated on 24.1 here.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt, @itsbilal, @RaduBerinde, and @stevendanna)
pkg/storage/sst_writer.go
line 121 at r2 (raw file):
Previously, RaduBerinde wrote…
We should only do that if the cluster version is 24.1 - if we're in a mixed cluster, it's possible that the backup would be restored on a node running 23.2 code right?
Yeah, it's already gated on the 24.1 cluster version for all usages because earlier versions of Pebble could corrupt memory when decompressing zstd (if the zstd library re-allocated rather than filling the block cache value we passed it).
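The gating logic discussed here can be sketched as follows. This is a hypothetical mirror, not the real CockroachDB code: `compressionAlgorithm`, `blockCompression`, and `v24_1Active` are illustrative names.

```go
package main

import "fmt"

type compressionAlgorithm int

const (
	compressionAlgorithmSnappy compressionAlgorithm = iota
	compressionAlgorithmZstd
)

// blockCompression returns the algorithm actually used for sstable blocks.
// Even when the setting requests zstd, it only takes effect once the 24.1
// upgrade guarantees no node is running pre-24.1 Pebble code, whose zstd
// path could corrupt memory when decompressing.
func blockCompression(alg compressionAlgorithm, v24_1Active bool) string {
	switch alg {
	case compressionAlgorithmZstd:
		if v24_1Active {
			return "zstd"
		}
		// Fall back to snappy until the version gate is crossed.
		return "snappy"
	default:
		return "snappy"
	}
}

func main() {
	fmt.Println(blockCompression(compressionAlgorithmZstd, false))
	fmt.Println(blockCompression(compressionAlgorithmZstd, true))
}
```

Because every writer (including backup SSTs) routes through the same gate, a mixed-version cluster never produces zstd blocks that a 23.2 node might have to read.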
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @dt, @itsbilal, and @jbowens)
pkg/storage/pebble.go
line 165 at r4 (raw file):
case compressionAlgorithmZstd:
// Pre-24.1 Pebble's implementation of zstd had bugs that could cause
// in-memory corruption. We require that the 24.1 version be finalized
"24.1 version be finalized" would suggest V24_1 not V24_1Start. I'd rephrase to "We require that the cluster version is 24.1 or an upgrade version towards 24.1 which implies that all nodes are running 24.1 code".
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @dt, @itsbilal, and @RaduBerinde)
pkg/storage/pebble.go
line 165 at r4 (raw file):
Previously, RaduBerinde wrote…
"24.1 version be finalized" would suggest V24_1 not V24_1Start. I'd rephrase to "We require that the cluster version is 24.1 or an upgrade version towards 24.1 which implies that all nodes are running 24.1 code".
Done with a tweak to clarify that we're relying on the invariant that we will never run < 24.1 code again.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @dt, @itsbilal, and @msbutler)
pkg/storage/pebble.go
line 165 at r4 (raw file):
Previously, jbowens (Jackson Owens) wrote…
Done with a tweak to clarify that we're relying on the invariant that we will never run < 24.1 code again.
👍 👍
TFTRs! bors r+
In cockroachdb#120784, we now allow the `zstd` compression algorithm to be used as the codec for SSTables. Initial testing has shown this algorithm to provide a better compression ratio than snappy, at the cost of a minor increase in CPU. This unlocks benefits such as improved node density, as more data can reside in each store. Alter the default SSTable compression algorithm to `zstd`. Note that this change only applies to SSTs written directly into a local Pebble store, or generated to send over the wire for ingestion into a remote store (e.g. `AddSSTable`). The defaults for backup-related SSTs are left unchanged. Closes cockroachdb#123953. Release note (performance improvement): The default SSTable compression algorithm was changed from snappy to zstd. The latter has been shown to improve a store's compression ratio at the cost of a minor CPU increase. Existing clusters can opt into this new compression algorithm by setting `storage.sstable.compression_algorithm` to `zstd`. Existing SSTables compressed with the snappy compression algorithm will NOT be actively re-written. Instead, SSTables will be re-written with the new compression algorithm over time, as existing SSTables are compacted.
Introduce a new cluster setting that allows the operator to configure the compression algorithm used when compressing sstable blocks. This allows operators to opt into use of zstd (as opposed to the previous setting of snappy). ZSTD typically achieves better compression ratios than snappy, and operators may find that they can achieve higher node densities through enabling zstd. Future releases may change the default compression algorithm.
In a side-by-side comparison of a 10000-warehouse tpcc import, the zstd cluster achieved a higher import speed of 146 MiB/s versus snappy's 135 MiB/s. The zstd cluster's physical database size was significantly less (~30%).
Informs #105568.
Epic: none
Release note (ops change): Add `storage.sstable.compression_algorithm` cluster setting that configures the compression algorithm to use when compressing sstable blocks.