Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/storage: enable EFOS, excise on snapshot, ingest splitting #117116

Merged
merged 1 commit into from
Dec 27, 2023

Conversation

itsbilal
Copy link
Member

The following three cluster settings were introduced in the 23.2 cycle, and are disabled by default in that release:

  • storage.experimental.eventually_file_only_snapshots.enabled,
  • kv.snapshot_receiver.excise.enabled,
  • storage.ingest_split.enabled.

Enable all three by default, in preparation for 24.1.

Fixes #115432.

Release note: None.

The following three cluster settings were introduced in the 23.2 cycle,
and are disabled by default in that release:

- `storage.experimental.eventually_file_only_snapshots.enabled`,
- `kv.snapshot_receiver.excise.enabled`,
- `storage.ingest_split.enabled`.

Enable all three by default, in preparation for 24.1.

Fixes cockroachdb#115432.

Release note: None.
@itsbilal itsbilal requested a review from a team as a code owner December 27, 2023 17:27
Copy link

blathers-crl bot commented Dec 27, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@itsbilal
Copy link
Member Author

This PR supersedes #116330.

Copy link
Collaborator

@nicktrav nicktrav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@itsbilal
Copy link
Member Author

TFTR!

bors r=nicktrav

@craig
Copy link
Contributor

craig bot commented Dec 27, 2023

Build succeeded:

@craig craig bot merged commit 57bd0a9 into cockroachdb:master Dec 27, 2023
8 of 9 checks passed
@RaduBerinde
Copy link
Member

Looks like this change affected BenchmarkFKInsert, BenchmarkResolveFunction, BenchmarkSequenceIncrement significantly (#117494). Are any of these regressions expected?

@jbowens
Copy link
Collaborator

jbowens commented Jan 8, 2024

That sounds very unexpected. Are reads of virtual sstables slower?

@itsbilal
Copy link
Member Author

itsbilal commented Jan 8, 2024

It does sound unexpected on the surface, but I notice that all these benchmarks do a defer cluster.Stopper().Stop() that is getting timed by the benchmark. Is it possible that we're waiting for the consistency queue to empty out before we close the cluster? In that case the move to EFOS would explain the difference. However it seems non-ideal to have that timed by the benchmark to begin with.

@jbowens
Copy link
Collaborator

jbowens commented Jan 8, 2024

It does sound unexpected on the surface, but I notice that all these benchmarks do a defer cluster.Stopper().Stop() that is getting timed by the benchmark. Is it possible that we're waiting for the consistency queue to empty out before we close the cluster? In that case the move to EFOS would explain the difference. However it seems non-ideal to have that timed by the benchmark to begin with.

Good catch.

mgartner added a commit to mgartner/cockroach that referenced this pull request Jan 9, 2024
Several benchmarks incorrectly included cluster shutdown as part of the
benchmark timing. This caused major regressions in benchmarks
when cockroachdb#117116 was merged because it made cluster shutdown slower. The
benchmarks now stop the timer before initiating cluster shutdown to more
accurately measure the code in question.

Fixes cockroachdb#117494

Release note: None
mgartner added a commit to mgartner/cockroach that referenced this pull request Jan 9, 2024
Several benchmarks incorrectly included cluster shutdown as part of the
benchmark timing. This caused major regressions in benchmarks
when cockroachdb#117116 was merged because it made cluster shutdown slower. The
benchmarks now stop the timer before initiating cluster shutdown to more
accurately measure the code in question.

Fixes cockroachdb#117494

Release note: None
craig bot pushed a commit that referenced this pull request Jan 11, 2024
113893: cli: remove cockroach connect r=nvanbenschoten a=andrewbaptist

`connect` was implemented as an experiment to allow bootstrapping nodes from other nodes CA's (#60632). The details are described here: https://github.com/aaron-crl/toy-secure-init-handshake/tree/n-way-join

This implementation was never completed, and the visibility of this code can cause confusion. This PR removes all the code with the idea that we can bring it back later if necessary.

Epic: none

Release note (cli change): Removal of the `cockroach connect` functionality.

117542: sql: fix erroneous benchmark regressions r=mgartner a=mgartner

Several benchmarks incorrectly included cluster shutdown as part of the
benchmark timing. This caused major regressions in benchmarks
when #117116 was merged because it made cluster shutdown slower. The
benchmarks now stop the timer before initiating cluster shutdown to more
accurately measure the code in question.

Fixes #117494

Release note: None


117558: sql: add telemetry block regexps and fix flake r=mgartner a=mgartner

#### sqltestutils: allow blocking regexps in telemetry feature list

The `feature-allowlist` directive for telemetry tests has been renamed
`feature-list` and it now supports blocking regexps. Any line in the
`feature-list` that starts with "!" is a block regexp that prevents any
features matching the regexp from being included in the output of the
`feature-usage` and `feature-counters` directives. This is helpful
because Go's regexps do not support negative look-aheads, so adding
features that should not be matched is difficult and tedious.

Release note: None

#### sql: ignore "sql.schema.create_stats" in schema telemetry test

This commit ignores the `sql.schema.create_stats` in the `schema`
telemetry tests because it causes sporadic failures.

Fixes #117309

Release note: None


117569: authors: add homa31 to authors r=homa31 a=homa31

Release note: None
Epic: None

117572: *: Prep work for supporting CONFIGURE ZONE in declarative schema changer r=Xiang-Gu a=Xiang-Gu

This PR consists several preparation work that will be needed for properly supporting CONFIGURE ZONE stmts in declarative schema changer. They're separated out for easier review and bc I expect those to be a lot less controversial than the main work. See each commit for details.

Informs #117574
Epic: CRDB-31473

Co-authored-by: Andrew Baptist <baptist@cockroachlabs.com>
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Co-authored-by: Howard Ma <howard.ma@cockroachlabs.com>
Co-authored-by: Xiang Gu <xiang@cockroachlabs.com>
yuzefovich added a commit to yuzefovich/cockroach that referenced this pull request Jan 17, 2024
This commit adjusts our benchmark harness to ignore the overhead of
starting up and shutting down the test cluster. This allows us to avoid
spurious "regressions" like that we've seen due to cockroachdb#117116.

Epic: None

Release note: None
craig bot pushed a commit that referenced this pull request Jan 18, 2024
117872: bench: ignore cluster startup and shutdown r=yuzefovich a=yuzefovich

This commit adjusts our benchmark harness to ignore the overhead of starting up and shutting down the test cluster. This allows us to avoid spurious "regressions" like that we've seen due to #117116.

Epic: None

Release note: None

117894: tests: skip some tests under `race` r=rail a=rickystewart

Some of these are skipped only under `StressRace` but not simply under `race` as they should have been.

Additionally, two tests: `spanconfigsqltranslatorccl.TestDataDriven` and `kvserver.TestStoreMetrics` are so long running that they stall or fail to complete under `race`, which has gone unnoticed before.

Epic: CRDB-8308
Release note: None

117898: testcluster: re-enable tenant in multi-node clusters under race r=yuzefovich a=yuzefovich

This commit is a partial revert of adfc98a. In that commit we disabled tenant randomization in multi-node-clusters under race because EngFlow executors were too overloaded in such a setup. We now use beefier executors, so let's see how they fare.

Epic: None

Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Ricky Stewart <ricky@cockroachlabs.com>
blathers-crl bot pushed a commit that referenced this pull request Jan 18, 2024
This commit adjusts our benchmark harness to ignore the overhead of
starting up and shutting down the test cluster. This allows us to avoid
spurious "regressions" like that we've seen due to #117116.

Epic: None

Release note: None
blathers-crl bot pushed a commit that referenced this pull request Jan 18, 2024
This commit adjusts our benchmark harness to ignore the overhead of
starting up and shutting down the test cluster. This allows us to avoid
spurious "regressions" like that we've seen due to #117116.

Epic: None

Release note: None
stevendanna added a commit to stevendanna/cockroach that referenced this pull request Jan 18, 2024
As of cockroachdb#117116 this is now the default and the plan is to keep it the
default in 24.1.

Release note: None
craig bot pushed a commit that referenced this pull request Jan 26, 2024
116488: backupccl: wrap a few errors r=msbutler a=stevendanna

Errors from the underlying parser calls used by rewrite functions can return confusing errors that make it appear as if the user's backup statement was invalid SQL.

Epic: none

Release note: None

117922: roachtest: remove unnecessary cluster setting change r=msbutler a=stevendanna

As of #117116 this is now the default and the plan is to keep it the default in 24.1.

Epic: none

Release note: None

117974: bulk: add tracing spans to addSStable in SSTBatcher r=msbutler a=stevendanna

Epic: None

Release note: None

117983: roachtest: update tpce tests with new connection options r=herkolategan,srosenberg a=DarrylWong

The tpc-e driver has been updated to accept options for a fixture bucket, port, username and password. This change adds those options accordingly to the tpce tests.

Previously, tpce hardcoded the sql port as 26257. Now that tpce can accept a port option, we can instead discover the port and pass it.

This change also includes a new SQLPort function that returns the sql port of the given nodes, as well as fixes miscellaneous function signatures to correctly say nodes instead of node.

Release note: None
Epic: None
Fixes: #117567

Co-authored-by: Steven Danna <danna@cockroachlabs.com>
Co-authored-by: DarrylWong <darryl@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

storage: enable excise-on-snapshot and EFOS by default
5 participants