Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: add storage.sstable.compression_algorithm cluster setting #120784

Merged
merged 1 commit into from Mar 21, 2024

Conversation

jbowens
Copy link
Collaborator

@jbowens jbowens commented Mar 20, 2024

Introduce a new cluster setting that allows the operator to configure the compression algorithm used when compressing sstable blocks. This allows operators to opt into use of zstd (as opposed to the previous setting of snappy). ZSTD typically achieves better compression ratios than snappy, and operators may find that they can achieve higher node densities through enabling zstd. Future releases may change the default compression algorithm.

In a side-by-side comparison of a 10000-warehouse tpcc import, the zstd cluster achieved a higher import speed of 146 MiB/s versus snappy's 135 MiB/s. The zstd cluster's physical database size was significantly less (~30%).

Screenshot 2024-03-20 at 3 07 13 PM Screenshot 2024-03-20 at 3 06 49 PM

Informs #105568.
Epic: none
Release note (ops change): Add storage.sstable.compression_algorithm cluster setting that configures the compression algorithm to use when compressing sstable blocks.

@jbowens jbowens requested a review from a team as a code owner March 20, 2024 21:11
@jbowens jbowens requested a review from itsbilal March 20, 2024 21:11
Copy link

blathers-crl bot commented Mar 20, 2024

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

pkg/storage/pebble.go Show resolved Hide resolved
// Pre-24.1 Pebble's implementation of zstd had bugs that could cause
// in-memory corruption. We require that the 24.1 version be finalized
// before zstd will have any effect.
if settings.Version.ActiveVersionOrEmpty(context.TODO()).IsActive(clusterversion.V24_1Start) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you do have a ctx in scope in newPebble where you call this if you wanted it.

@jbowens jbowens force-pushed the compression-cs branch 2 times, most recently from 62e54c5 to 0da9eb4 Compare March 21, 2024 14:38
Copy link
Collaborator Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @itsbilal)


pkg/storage/pebble.go line 144 at r1 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

maybe settings.SystemOnly is safer? but also good with this

I originally had SystemOnly, but we build sstables within tenants for the purpose of AddSSTable, so it needs to be visible to the tenant when the tenant calls MakeIngestionWriterOptions. I added a comment describing this.


pkg/storage/pebble.go line 164 at r1 (raw file):

Previously, dt (David Taylor) wrote…

you do have a ctx in scope in newPebble where you call this if you wanted it.

Done.


pkg/storage/sst_writer.go line 121 at r2 (raw file):

	// reduce compression ratio.
	opts.BlockSize = 128 << 10
	opts.MergerName = "nullptr"

@dt @stevendanna — any reason not to apply this cluster setting to the backup SSTs too?

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @itsbilal, and @stevendanna)


pkg/storage/sst_writer.go line 121 at r2 (raw file):

Previously, jbowens (Jackson Owens) wrote…

@dt @stevendanna — any reason not to apply this cluster setting to the backup SSTs too?

We should only do that if the cluster version is 24.1 - if we're in a mixed cluster, it's possible that the backup would be restored on a node running 23.2 code right?

@dt
Copy link
Member

dt commented Mar 21, 2024

No concerns about using it in backups.

It is already gated on 24.1 here.

Copy link
Collaborator Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @itsbilal, @RaduBerinde, and @stevendanna)


pkg/storage/sst_writer.go line 121 at r2 (raw file):

Previously, RaduBerinde wrote…

We should only do that if the cluster version is 24.1 - if we're in a mixed cluster, it's possible that the backup would be restored on a node running 23.2 code right?

Yeah, it's already gated on the 24.1 cluster version for all usages because earlier versions of Pebble could corrupt memory when decompressing zstd (if the zstd library re-allocated rather than filling the block cache value we passed it).

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @dt, @itsbilal, and @jbowens)


pkg/storage/pebble.go line 165 at r4 (raw file):

	case compressionAlgorithmZstd:
		// Pre-24.1 Pebble's implementation of zstd had bugs that could cause
		// in-memory corruption. We require that the 24.1 version be finalized

"24.1 version be finalized" would suggest V24_1 not V24_1Start. I'd rephrase to "We require that the cluster version is 24.1 or an upgrade version towards 24.1 which implies that all nodes are running 24.1 code".

Copy link
Collaborator Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @dt, @itsbilal, and @RaduBerinde)


pkg/storage/pebble.go line 165 at r4 (raw file):

Previously, RaduBerinde wrote…

"24.1 version be finalized" would suggest V24_1 not V24_1Start. I'd rephrase to "We require that the cluster version is 24.1 or an upgrade version towards 24.1 which implies that all nodes are running 24.1 code".

Done with a tweak to clarify that we're relying on the invariant that we will never run < 24.1 code again.

@jbowens jbowens requested a review from a team as a code owner March 21, 2024 16:20
@jbowens jbowens requested review from msbutler and removed request for a team March 21, 2024 16:20
Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @dt, @itsbilal, and @msbutler)


pkg/storage/pebble.go line 165 at r4 (raw file):

Previously, jbowens (Jackson Owens) wrote…

Done with a tweak to clarify that we're relying on the invariant that we will never run < 24.1 code again.

👍 👍

pkg/storage/pebble.go Outdated Show resolved Hide resolved
Introduce a new cluster setting that allows the operator to configure the
compression algorithm used when compressing sstable blocks. This allows
operators to opt into use of zstd (as opposed to the previous setting of
snappy). ZSTD typically achieves better compression ratios than snappy, and
operators may find that they can achieve higher node densities through enabling
zstd. Future releases may change the default compression algorithm.

In a side-by-side comparison of a 10000-warehouse tpcc import, the zstd cluster
achieved a higher import speed of 146 MiB/s versus snappy's 135 MiB/s. The zstd
cluster's physical database size was significantly less.

Epic: none
Release note (ops change): Add `storage.sstable.compression_algorithm` cluster
setting that configures the compression algorithm to use when compressing
sstable blocks.
@jbowens
Copy link
Collaborator Author

jbowens commented Mar 21, 2024

TFTRs!

bors r+

@craig
Copy link
Contributor

craig bot commented Mar 21, 2024

@craig craig bot merged commit 8dcbdc4 into cockroachdb:master Mar 21, 2024
22 checks passed
@jbowens jbowens deleted the compression-cs branch March 22, 2024 02:08
nicktrav added a commit to nicktrav/cockroach that referenced this pull request May 10, 2024
In cockroachdb#120784, we now allow for the `zstd` compression algorithm to be used
as the codec for SSTables. Initial testing has shown this algorithm to
provide a better compression ratio than snappy, at the cost of a minor
increase in CPU. This unlocks benefits such as an improved node density,
as more data can reside in each store.

Alter the default SStable compression algorithm to `zstd`.

Closes cockroachdb#123953.

Release note (performance improvement): The default SSTable compression
algorithm was changed from snappy to zstd. The latter has been shown to
improve a store's compression ratio at the cost of a minor CPU increase.
Existing clusters can opt into this new compression algorithm by setting
`storage.sstable.compression_algorithm` to `zstd`. Existing SSTables
compressed with the snappy compression algorithm will NOT be actively
re-written. Instead, as SSTables are re-written over time with the new
compression algorithm as existing SSTables are compacted.
nicktrav added a commit to nicktrav/cockroach that referenced this pull request May 10, 2024
In cockroachdb#120784, we now allow for the `zstd` compression algorithm to be used
as the codec for SSTables. Initial testing has shown this algorithm to
provide a better compression ratio than snappy, at the cost of a minor
increase in CPU. This unlocks benefits such as an improved node density,
as more data can reside in each store.

Alter the default SStable compression algorithm to `zstd`.

Closes cockroachdb#123953.

Release note (performance improvement): The default SSTable compression
algorithm was changed from snappy to zstd. The latter has been shown to
improve a store's compression ratio at the cost of a minor CPU increase.
Existing clusters can opt into this new compression algorithm by setting
`storage.sstable.compression_algorithm` to `zstd`. Existing SSTables
compressed with the snappy compression algorithm will NOT be actively
re-written. Instead, as SSTables are re-written over time with the new
compression algorithm as existing SSTables are compacted.
nicktrav added a commit to nicktrav/cockroach that referenced this pull request May 15, 2024
In cockroachdb#120784, we now allow for the `zstd` compression algorithm to be used
as the codec for SSTables. Initial testing has shown this algorithm to
provide a better compression ratio than snappy, at the cost of a minor
increase in CPU. This unlocks benefits such as an improved node density,
as more data can reside in each store.

Alter the default SSTable compression algorithm to `zstd`. Note that
this change only applies to SSTs written directly into a local Pebble
store, or generated to send over the wire for ingestion into a remote
store (e.g. `AddSSTable`). The defaults for backup-related SSTs are left
unchanged.

Closes cockroachdb#123953.

Release note (performance improvement): The default SSTable compression
algorithm was changed from snappy to zstd. The latter has been shown to
improve a store's compression ratio at the cost of a minor CPU increase.
Existing clusters can opt into this new compression algorithm by setting
`storage.sstable.compression_algorithm` to `zstd`. Existing SSTables
compressed with the snappy compression algorithm will NOT be actively
re-written. Instead, as SSTables are re-written over time with the new
compression algorithm as existing SSTables are compacted.
nicktrav added a commit to nicktrav/cockroach that referenced this pull request May 15, 2024
In cockroachdb#120784, we now allow for the `zstd` compression algorithm to be used
as the codec for SSTables. Initial testing has shown this algorithm to
provide a better compression ratio than snappy, at the cost of a minor
increase in CPU. This unlocks benefits such as an improved node density,
as more data can reside in each store.

Alter the default SSTable compression algorithm to `zstd`. Note that
this change only applies to SSTs written directly into a local Pebble
store, or generated to send over the wire for ingestion into a remote
store (e.g. `AddSSTable`). The defaults for backup-related SSTs are left
unchanged.

Closes cockroachdb#123953.

Release note (performance improvement): The default SSTable compression
algorithm was changed from snappy to zstd. The latter has been shown to
improve a store's compression ratio at the cost of a minor CPU increase.
Existing clusters can opt into this new compression algorithm by setting
`storage.sstable.compression_algorithm` to `zstd`. Existing SSTables
compressed with the snappy compression algorithm will NOT be actively
re-written. Instead, as SSTables are re-written over time with the new
compression algorithm as existing SSTables are compacted.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants