-
Notifications
You must be signed in to change notification settings - Fork 552
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
config/configuration: set cloud_storage_spillover_manifest_size #16172
config/configuration: set cloud_storage_spillover_manifest_size #16172
Conversation
this enable splitting the partition_manifest into spillovers once it reaches 2*cloud_storage_spillover_manifest_size
/backport v23.3.x |
The pull request is not merged yet. Cancelling backport... |
new failures in https://buildkite.com/redpanda/redpanda/builds/43945#018d2233-63b6-463c-99fc-f7d1e07dd017:
new failures in https://buildkite.com/redpanda/redpanda/builds/43945#018d2239-58c9-46ac-a705-c9a449b638a3:
new failures in https://buildkite.com/redpanda/redpanda/builds/43945#018d2233-63bf-4f5b-bf67-d6c2ef3d1bf7:
new failures in https://buildkite.com/redpanda/redpanda/builds/43945#018d2233-63ba-45db-bf0e-4418021975eb:
new failures in https://buildkite.com/redpanda/redpanda/builds/43945#018d2239-58c3-4560-8ee0-93675544e565:
new failures in https://buildkite.com/redpanda/redpanda/builds/43945#018d2233-63bc-46b6-ac69-2001f948edbb:
new failures in https://buildkite.com/redpanda/redpanda/builds/43979#018d2348-208d-4f36-96d4-e7f9237a8caa:
new failures in https://buildkite.com/redpanda/redpanda/builds/43979#018d2348-2093-42e6-a8b6-9388c0b2896e:
new failures in https://buildkite.com/redpanda/redpanda/builds/43979#018d2355-f163-464f-a4ef-c7a7e12f2dad:
new failures in https://buildkite.com/redpanda/redpanda/builds/43979#018d2355-f16d-4828-a365-9b0ffd7d7f32:
new failures in https://buildkite.com/redpanda/redpanda/builds/43979#018d2355-f167-4548-8f35-c9905e4cef1b:
new failures in https://buildkite.com/redpanda/redpanda/builds/43979#018d2355-f16a-4695-bc5c-74c4e59f91a0:
new failures in https://buildkite.com/redpanda/redpanda/builds/44001#018d23b3-d181-40a3-b3fb-56bd84e9b92d:
new failures in https://buildkite.com/redpanda/redpanda/builds/44001#018d23b3-d17a-4977-83c4-040227117fe3:
new failures in https://buildkite.com/redpanda/redpanda/builds/44001#018d23b3-d188-4efb-8a40-4e807c88b5f6:
new failures in https://buildkite.com/redpanda/redpanda/builds/44001#018d23b3-d192-4fad-b373-a722fc0e05ee:
new failures in https://buildkite.com/redpanda/redpanda/builds/44001#018d23b3-d18b-4a33-81d5-48315a2c6a74:
new failures in https://buildkite.com/redpanda/redpanda/builds/44001#018d23b3-d18f-4533-821b-6e05d8bd3516:
new failures in https://buildkite.com/redpanda/redpanda/builds/44018#018d24c0-775d-4cb9-af67-b6fe3a1d8550:
new failures in https://buildkite.com/redpanda/redpanda/builds/44018#018d24c0-7768-4c15-9fc2-71e8ffceff44:
new failures in https://buildkite.com/redpanda/redpanda/builds/44018#018d24c5-48b8-49d7-9bbe-38749f65d8cd:
new failures in https://buildkite.com/redpanda/redpanda/builds/44018#018d24c0-7761-4ffa-aad5-ad7ce010596f:
new failures in https://buildkite.com/redpanda/redpanda/builds/44018#018d24c5-48b2-4ec2-8a25-19cfcdb16bc3:
new failures in https://buildkite.com/redpanda/redpanda/builds/44018#018d24c0-7765-41cc-927a-28291d7a9719:
|
fix use of cloud_storage_spillover_manifest_max_segments
explicitly disable cloud_storage_spillover_manifest_size if the tests bootstraps with cloud_storage_spillover_manifest_max_segments. this is done because inside redpanda, size takes precendece over max_segments and the tests will work with wrong assumptions
test fixes: explicitly disable cloud_storage_spillover_manifest_size if the test uses cloud_storage_spillover_manifest_max_segments. this is because inside redpanda size takes precedence over max_segments and the test fails due to wrong assumptions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks okay, but I think this approach to ducktape is too heavyhanded and will be error prone moving forward
# in redpanda, cloud_storage_spillover_manifest_size takes preference over cloud_storage_spillover_manifest_max_segments. | ||
# disable the former to use the latter | ||
assert conf.get('cloud_storage_spillover_manifest_size', None) is None or \ | ||
conf.get('cloud_storage_spillover_manifest_max_segments', None) is None, \ | ||
"cannot set cloud_storage_spillover_manifest_max_segments if cloud_storage_spillover_manifest_size is already set, it will not be used by redpanda" | ||
if 'cloud_storage_spillover_manifest_max_segments' in conf: | ||
conf['cloud_storage_spillover_manifest_size'] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is surprising, and is something I can imagine us tripping over in the future.
Why isn't it sufficient to just set cloud_storage_spillover_manifest_size
to None in the tests that set ..max_segments
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some tests set the value in extra_rp_conf, some other set it in SISettings, i thought to just fix it in one place
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some other related tests are failing on the backport, I'll fix what there but just skip the redpanda build in the ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also curious about this. How many tests does this affect can we just update a couple tests explicitly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. It is a little annoying, but I think it's preferable since it's less surprising than having this business logic encoded in the service itself. For instance if we wanted to test these two configs together, which isn't an unreasonable thing for an unsuspecting user to do, we wouldn't be able to now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That said, since we're trying to get this out the door, happy to approve once tests pass, and we can change this in a follow-up.
redpanda ignores cloud_storage_spillover_manifest_max_segments when cloud_storage_spillover_manifest_size is set. explicitly set the latter to None
/ci-repeat 1 skip-unit skip-redpanda-build |
Agreed if we create tickets to track the follow up work. |
@@ -262,7 +262,8 @@ def __init__(self, test_context): | |||
cloud_storage_segment_size_min=2 * self.log_segment_size, | |||
retention_local_target_bytes_default=2 * self.log_segment_size, | |||
cloud_storage_enable_segment_merging=True, | |||
cloud_storage_cache_chunk_size=self.chunk_size) | |||
cloud_storage_cache_chunk_size=self.chunk_size, | |||
cloud_storage_spillover_manifest_size=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect setting this to none will actually just not write the config, rather than setting it to nullopt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind. I was looking at the wrong test results :)
440a9ef
to
b03f5af
Compare
These tests either weren't designed to use spillover, or were manually configuring a max segment count. In either case, the new default value is problematic for them since the size-based config takes precedence. I went through all tests I could find that configured cloud_storage_spillover_manifest_max_segments and unset cloud_storage_spillover_manifest_size as appropriate. Previous attempts to alleviate this attempted to set these at runtime with the incorrect ducktape methods (set_cluster_config() instead of set_cluster_config_to_null()). It's more foolproof to just set them at bootstrap time, as they were set before the default was introduced.
b03f5af
to
70199a8
Compare
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/44018#018d24c0-7761-4ffa-aad5-ad7ce010596f |
/backport v23.3.x |
Failed to create a backport PR to v23.3.x branch. I tried:
|
this enable splitting the partition_manifest into spillovers once it reaches 2*cloud_storage_spillover_manifest_size
Fixes https://github.com/redpanda-data/core-internal/issues/1012
Backports Required
Release Notes
Features