Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: enable space management by default #15996

Merged
merged 1 commit into from
Jan 10, 2024

Conversation

dotnwat
Copy link
Member

@dotnwat dotnwat commented Jan 8, 2024

Make the following configuration options have these default values regardless of the upgrade history of the cluster

space_management_enable=true
retention_local_strict=false

Additionally these options used to work around an issue in v23.2 which prevented changes to the above options from being sticky in an upgraded cluster no longer have any affect.

space_management_enable_override
retention_local_strict_override

They'll be fully deprecated at a later time.

Fixes: #15995

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x
  • v23.1.x

Release Notes

Improvements

Space management is now enabled by default for upgraded clusters starting in v23.3, unless the feature has been explicitly disabled.

Copy link
Contributor

@andrwng andrwng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Space management is now enabled by default for upgraded clusters starting in v23.3, unless the feature has been explicitly disabled.

I'm a bit confused by the mention of v23.3 here. Does this mean "clusters that were bootstrapped on versions higher than v23.3 will enable space management automatically" or "all clusters that exist in v23.3 will enable space management automatically, regardless of upgrades"? If the latter, perhaps remove the mention of v23.3; the release note would be applicable to any version this is backported to.

Overall looks non-controversial, just wanna clarify the versioning expectations. Perhaps an upgrade test would help clarify the intent?

src/v/config/configuration.cc Show resolved Hide resolved
@vbotbuildovich
Copy link
Collaborator

@piyushredpanda piyushredpanda added this to the v23.3.2 milestone Jan 9, 2024
oleiman
oleiman previously approved these changes Jan 9, 2024
Copy link
Member

@oleiman oleiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code changes lgtm.

Some specifics about legacy defaults in the commit message and/or PR description might be helpful for posterity.

andrwng
andrwng previously approved these changes Jan 9, 2024
Copy link
Contributor

@andrwng andrwng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We chatted about this earlier and to clarify. The mention of v23.3 make sense but could be clearer for dev. Doesn't feel like a blocker.

The following configuration options have these default values
regardless of the upgrade history of the cluster

  space_management_enable=true
  retention_local_strict=false

Additionally these options used to work around an issue in v23.2 which
prevented changes to the above options from being sticky in an upgraded
cluster no longer have any affect.

  space_management_enable_override
  retention_local_strict_override

They'll be fully deprecated at a later time.

Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Copy link
Member

@oleiman oleiman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Comment on lines +1940 to +1956
# in 23.2 space management exists, but is disabled by default for
# upgraded clusters.
self._upgrade(wipe_cache, self.intermediate_version)
self._check_value_everywhere("space_management_enable", False)
self._check_value_everywhere("retention_local_strict", True)

# in >=23.3 (using upstream build) space management should be enabled by
# default provided that it wasn't explicitly disabled in 23.2. in this
# case no configs were changed so it should be enabled now.
self._upgrade(wipe_cache)
self._check_value_everywhere("space_management_enable", True)
self._check_value_everywhere("retention_local_strict", False)

# survives a restart
self.redpanda.restart_nodes(self.redpanda.nodes)
self._check_value_everywhere("space_management_enable", True)
self._check_value_everywhere("retention_local_strict", False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Comment on lines +1858 to +1859
# version where space_management legacy defaults are introduced
self.intermediate_version = (23, 2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this PR, for some reason I was under the impression Redpanda would complain/fail if we skipped any logical versions. I guess not though, given these other tests are already jumping from 23.1 to 23.3-soon-to-be-24.1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeh good call out i think we should probably revisit the upgrade test infra if anything just to say that it still looks good

@vbotbuildovich
Copy link
Collaborator

vbotbuildovich commented Jan 10, 2024

new failures in https://buildkite.com/redpanda/redpanda/builds/43620#018cf102-0352-4bee-80f5-6a33416de007:

"rptest.tests.cluster_features_test.FeaturesMultiNodeUpgradeTest.test_rollback"

new failures in https://buildkite.com/redpanda/redpanda/builds/43620#018cf102-0348-43fd-80af-1c73b55a38be:

"rptest.tests.cluster_features_test.FeaturesMultiNodeUpgradeTest.test_upgrade"

new failures in https://buildkite.com/redpanda/redpanda/builds/43620#018cf102-034b-41f3-9921-109315fbafaf:

"rptest.tests.cluster_features_test.FeaturesSingleNodeUpgradeTest.test_upgrade"
"rptest.tests.cluster_features_test.FeaturesNodeJoinTest.test_old_node_join"

new failures in https://buildkite.com/redpanda/redpanda/builds/43620#018cf112-1371-4dd5-91dd-7c47a5603614:

"rptest.tests.cluster_features_test.FeaturesSingleNodeUpgradeTest.test_upgrade"
"rptest.tests.cluster_features_test.FeaturesNodeJoinTest.test_old_node_join"

new failures in https://buildkite.com/redpanda/redpanda/builds/43620#018cf112-136e-40fd-a13e-1621ecf8971a:

"rptest.tests.cluster_features_test.FeaturesMultiNodeUpgradeTest.test_upgrade"

new failures in https://buildkite.com/redpanda/redpanda/builds/43620#018cf112-136a-4d1d-8fe7-54e5c5d6859a:

"rptest.tests.cluster_features_test.FeaturesMultiNodeUpgradeTest.test_rollback"

@dotnwat
Copy link
Member Author

dotnwat commented Jan 10, 2024

Those failures are all labeled as new, but it looks like some sort of race or failure to find it because they are both here

#16040
#16037
#16039

@piyushredpanda piyushredpanda merged commit 2c6a872 into redpanda-data:dev Jan 10, 2024
17 of 20 checks passed
@vbotbuildovich
Copy link
Collaborator

/backport v23.3.x

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

storage: default enable space management in v23.3 for upgraded clusters
5 participants