storage: alter default cluster setting for sstable compression algorithm #123953
In cockroachdb#120784, we now allow the `zstd` compression algorithm to be used as the codec for SSTables. Initial testing has shown this algorithm to provide a better compression ratio than snappy, at the cost of a minor increase in CPU. This unlocks benefits such as improved node density, as more data can reside in each store. Alter the default SSTable compression algorithm to `zstd`. Closes cockroachdb#123953. Release note (performance improvement): The default SSTable compression algorithm was changed from snappy to zstd. The latter has been shown to improve a store's compression ratio at the cost of a minor CPU increase. Existing clusters can opt into this new compression algorithm by setting `storage.sstable.compression_algorithm` to `zstd`. Existing SSTables compressed with the snappy compression algorithm will NOT be actively re-written. Instead, SSTables are re-written over time with the new compression algorithm as they are compacted.
Currently, there exists a single cluster setting, `storage.sstable.compression_algorithm`, that controls the compression algorithm used by the cluster when writing SSTables. This cluster setting currently defaults to `snappy`. There are various end destinations and use cases for SSTables: those that reside in a Pebble store (the most common), those sent over the wire via `AddSSTable`, and those generated as part of a backup for storage in S3/GCS. Each of these destinations and use cases benefits from being able to independently alter the compression algorithm. In addition to the existing cluster setting, add two backup-specific cluster settings:

- `storage.sstable.compression_algorithm_backup_storage`: applies to SSTs generated as part of a backup, where SSTs will reside in blob storage. These require compression, but not at the cost of additional CPU at generation time. Defaults to `snappy`, the existing algorithm used.
- `storage.sstable.compression_algorithm_backup_transport`: applies to SSTs generated as part of a backup that are sent, immediately iterated, and then discarded (i.e. they are never persisted). Such SSTs typically have larger block sizes and benefit from compression. Defaults to `snappy`, the existing algorithm used.

While this change introduces new cluster settings that allow independently altering the compression algorithm for different types of SSTs, the existing compression behavior is unchanged. Touches cockroachdb#123953. Release note (general change): Adds two new cluster settings, `storage.sstable.compression_algorithm_backup_storage` and `storage.sstable.compression_algorithm_backup_transport`, that, in addition to the existing cluster setting `storage.sstable.compression_algorithm`, can be used to alter the compression algorithm used for various types of SSTs.
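The three settings above map one setting name to each SST destination. A minimal Go sketch of that dispatch (the setting names are the real ones from the change; the types, registry map, and `algorithmFor` helper are invented for illustration, not CockroachDB's actual code):

```go
package main

import "fmt"

// CompressionAlgorithm names the two codecs discussed in the issue.
type CompressionAlgorithm string

const (
	Snappy CompressionAlgorithm = "snappy"
	Zstd   CompressionAlgorithm = "zstd"
)

// SSTDestination identifies where a generated SSTable ends up.
type SSTDestination int

const (
	PebbleStore     SSTDestination = iota // written to a local Pebble store
	BackupStorage                         // persisted to blob storage (S3/GCS)
	BackupTransport                       // sent, iterated once, then discarded
)

// settings is a hypothetical stand-in for the cluster settings registry;
// all three settings default to snappy, matching the change's defaults.
var settings = map[string]CompressionAlgorithm{
	"storage.sstable.compression_algorithm":                  Snappy,
	"storage.sstable.compression_algorithm_backup_storage":   Snappy,
	"storage.sstable.compression_algorithm_backup_transport": Snappy,
}

// algorithmFor picks the codec for a given SST destination.
func algorithmFor(d SSTDestination) CompressionAlgorithm {
	switch d {
	case BackupStorage:
		return settings["storage.sstable.compression_algorithm_backup_storage"]
	case BackupTransport:
		return settings["storage.sstable.compression_algorithm_backup_transport"]
	default:
		return settings["storage.sstable.compression_algorithm"]
	}
}

func main() {
	// Flipping one setting changes only its own destination.
	settings["storage.sstable.compression_algorithm"] = Zstd
	fmt.Println(algorithmFor(PebbleStore))   // zstd
	fmt.Println(algorithmFor(BackupStorage)) // snappy
}
```

This is the point of splitting the settings: store-bound SSTs can move to zstd while backup SSTs keep their existing snappy behavior.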
In cockroachdb#120784, we now allow the `zstd` compression algorithm to be used as the codec for SSTables. Initial testing has shown this algorithm to provide a better compression ratio than snappy, at the cost of a minor increase in CPU. This unlocks benefits such as improved node density, as more data can reside in each store. Alter the default SSTable compression algorithm to `zstd`. Note that this change only applies to SSTs written directly into a local Pebble store, or generated to send over the wire for ingestion into a remote store (e.g. `AddSSTable`). The defaults for backup-related SSTs are left unchanged. Closes cockroachdb#123953. Release note (performance improvement): The default SSTable compression algorithm was changed from snappy to zstd. The latter has been shown to improve a store's compression ratio at the cost of a minor CPU increase. Existing clusters can opt into this new compression algorithm by setting `storage.sstable.compression_algorithm` to `zstd`. Existing SSTables compressed with the snappy compression algorithm will NOT be actively re-written. Instead, SSTables are re-written over time with the new compression algorithm as they are compacted.
Fix an issue where the compression cluster setting was applied to a copy of the per-level configuration, rather than to the configuration that is ultimately passed to Pebble. Touches cockroachdb#123953. Release note: None.
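The description suggests a classic Go value-copy bug: mutating the loop variable of a `range` over a slice of structs leaves the slice untouched. A generic illustration (the `Options`/`LevelOptions` types here are invented stand-ins for Pebble's per-level configuration, not the actual code):

```go
package main

import "fmt"

// LevelOptions is a stand-in for a per-level configuration entry.
type LevelOptions struct {
	Compression string
}

// Options is a stand-in for the configuration handed to the storage engine.
type Options struct {
	Levels []LevelOptions
}

// buggySet writes to `l`, which is a copy of each slice element, so the
// new algorithm never reaches opts.Levels — the setting is silently dropped.
func buggySet(opts *Options, algo string) {
	for _, l := range opts.Levels {
		l.Compression = algo // mutates the copy, not the slice element
	}
}

// fixedSet indexes into the slice, mutating the configuration that is
// actually passed on to the engine.
func fixedSet(opts *Options, algo string) {
	for i := range opts.Levels {
		opts.Levels[i].Compression = algo
	}
}

func main() {
	opts := &Options{Levels: make([]LevelOptions, 7)}
	buggySet(opts, "zstd")
	fmt.Printf("after buggySet: %q\n", opts.Levels[0].Compression) // ""
	fixedSet(opts, "zstd")
	fmt.Printf("after fixedSet: %q\n", opts.Levels[0].Compression) // "zstd"
}
```

A bug of this shape is easy to miss because the code compiles and runs cleanly; only the effective configuration is wrong.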
124388: storage: fix setting of compression algorithm r=aadityasondhi a=nicktrav Fix an issue where the compression cluster setting is being set on a copy of the per-level configuration, rather than the configuration that is ultimately passed to Pebble. Touches #123953. Release note: None. Epic: CRDB-37583 Co-authored-by: Nick Travers <travers@cockroachlabs.com>
124245: storage,backupccl: add cluster settings for backup-related sst compression r=RaduBerinde a=nicktrav **storage,backupccl: allow for configuring compression of backup ssts** Touches #123953. Release note (general change): Adds two new cluster settings, `storage.sstable.compression_algorithm_backup_storage` and `storage.sstable.compression_algorithm_backup_transport`, that, in addition to the existing cluster setting `storage.sstable.compression_algorithm`, can be used to alter the compression algorithm used for various types of SSTs. Epic: None.
125424: pkg/obs/logstream: increase timeout used in TestBlockedFlush r=dhartunian a=abarganier Addresses: #125290 TestBlockedFlush only provided 100 milliseconds of tolerance for a goroutine to be created, make a call to `(*asyncProcessorRouter).Process()`, and then signal on a channel. Under nightly stress, the test was timing out because the goroutine failed to signal on the channel within 100 milliseconds, which certainly feels possible if the test is running under nightly stress. This patch simply updates the test with a longer timeout (10ms -> 10s) to make the timeout less likely when run under stress. Release note: none Epic: none Co-authored-by: Nick Travers <travers@cockroachlabs.com> Co-authored-by: Alex Barganier <abarganier@cockroachlabs.com>
I'm noticing a material increase in CPU time when using zstd as the default (#125463). This was causing tests to time out. The following shows a CPU profile with and without the default change, focused in on darwin-aarch64.

Snappy: CPU time 30ms.
ZSTD: CPU time 610ms.

@jbowens - I recall you saying that you had some tuning params up your sleeve that you thought might be worth looking into? I put the profiles here (internal).
In cockroachdb#120784, we now allow for the `zstd` compression algorithm to be used as the codec for SSTables. Initial testing has shown this algorithm to provide a better compression ratio than snappy, at the cost of a minor increase in CPU. This unlocks benefits such as an improved node density, as more data can reside in each store. Alter the default SSTable compression algorithm to `zstd`. Note that this change only applies to SSTs written directly into a local Pebble store, or generated to send over the wire for ingestion into a remote store (e.g. `AddSSTable`). The defaults for backup-related SSTs are left unchanged. Closes cockroachdb#123953. Release note (performance improvement): The default SSTable compression algorithm was changed from snappy to zstd. The latter has been shown to improve a store's compression ratio at the cost of a minor CPU increase. Cluster operators can alter the compression algorithm by setting `storage.sstable.compression_algorithm` to either `zstd` (the new default) or `snappy`. Existing SSTables compressed with the snappy compression algorithm will NOT be _actively_ re-written. Rather, SSTables are gradually re-written over time with the new compression algorithm as existing SSTables are compacted.
Set `storage.sstable.compression_algorithm` to `zstd`. There are likely some tests that will need to be altered to take into account SSTable size after being compressed with zstd, rather than snappy.
Jira issue: CRDB-38625