CORE-526 k/configs: dont override topic-level cleanup policy #18284
Conversation
/ci-repeat 1
new failures in https://buildkite.com/redpanda/redpanda/builds/48786#018f53af-adaa-43bf-9249-2318001540b4:
new failures in https://buildkite.com/redpanda/redpanda/builds/48786#018f53af-ada3-4529-887b-824945435606:
new failures in https://buildkite.com/redpanda/redpanda/builds/48786#018f53af-ada5-4198-9eee-4a91dd776426:
new failures in https://buildkite.com/redpanda/redpanda/builds/48786#018f53dd-c573-4afe-b060-4d7d62feb2fe:
new failures in https://buildkite.com/redpanda/redpanda/builds/48786#018f53dd-c570-4f11-9233-a90a3aa6bbfe:
new failures in https://buildkite.com/redpanda/redpanda/builds/48846#018f5e08-f5fe-4944-8f66-e3380c646138:
new failures in https://buildkite.com/redpanda/redpanda/builds/49009#018f7273-f84a-4495-8fb9-29fa17a76e04:
new failures in https://buildkite.com/redpanda/redpanda/builds/49009#018f727b-9e58-4f89-b413-172dbb379d3e:
new failures in https://buildkite.com/redpanda/redpanda/builds/49009#018f727b-9e5f-4753-8ff8-bf4d1baf5bd1:
new failures in https://buildkite.com/redpanda/redpanda/builds/51257#01909746-e7ec-4fa7-a50d-e40b9bbaa4b5:
/ci-repeat 1
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/48846#018f5e08-f600-4b46-9df2-0f80c8e34da5
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/48846#018f5e08-f5fe-4944-8f66-e3380c646138
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/49031#018f7329-9758-4e48-be0d-b1e321f93440
/ci-repeat 1
/ci-repeat 1
Force-pushed from bac0867 to 6c7d2ca
/ci-repeat 1
You also need to update the place where we generate the topic_manifest in the ntp_config.
Also, when we update the cleanup policy flag in the topic config we have to restart the archiver (in partition::update_configuration, where the flag is set in this case). But with this change we also need to do the same when the cluster configuration parameter changes.
static retention
get_retention_policy(const storage::ntp_config::default_overrides& prop) {
    auto flags = prop.cleanup_policy_bitflags;
static retention get_retention_policy(const storage::ntp_config& prop) {
Why is this change needed?
The first attempt at fixing the way we report cleanup.policy was incorrect, because it meant that we reported the config as DEFAULT_CONFIG even if the config was explicitly set, just to the cluster-default value. Instead, in this case, we want to report it as a DYNAMIC_TOPIC_CONFIG. We can do this now since we stopped setting the cleanup.policy value explicitly for each topic.
Furthermore, the compat test was simplified: there are now 2 separate fix attempts deployed across multiple redpanda versions, so to keep the test sane, we just ignore the diff until a version (24.2.1) that we know contains the fixes and where all following versions we might compare against also contain the fixes.
Lastly, the change to the DescribeTopicConfigs ducktape test was also reverted so that it correctly expects cleanup.policy to be reported as DYNAMIC_TOPIC_CONFIG when it was explicitly specified in the topic creation request.
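As a rough illustration of the reporting rule described above (a minimal sketch with illustrative names, not the actual kafka/configs code), the decision depends only on whether a topic-level override exists, not on whether its value happens to equal the cluster default:

#include <optional>

// Illustrative only: stands in for the Kafka describe-configs source values.
enum class config_source { DEFAULT_CONFIG, DYNAMIC_TOPIC_CONFIG };

// Report DYNAMIC_TOPIC_CONFIG whenever the topic carries an explicit
// cleanup.policy override, even if that override equals the cluster default;
// report DEFAULT_CONFIG only when no override is set at all.
template<typename T>
config_source cleanup_policy_source(const std::optional<T>& topic_override) {
    return topic_override.has_value() ? config_source::DYNAMIC_TOPIC_CONFIG
                                      : config_source::DEFAULT_CONFIG;
}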
This reverts commit 261dc93.
Force-pushed from 6c7d2ca to c76b6c6
Force-push: rebased to dev and addressed the feedback. (It's best to re-review the PR as a whole rather than looking at the force-push diff.)
I've done this now.
As far as I can see the (v2) topic_manifest serializes the
By "update the place where we generate topic_manifest in the ntp_config", did you mean that the
_log_cleanup_policy.watch([this]() {
    if (_as.abort_requested()) {
        return ss::now();
    }
    auto changed = _raft->log()->notify_compaction_update();
    if (changed) {
        vlog(
          clusterlog.debug,
          "[{}] updating archiver for cluster config change in "
          "log_cleanup_policy",
          _raft->ntp());
        return restart_archiver(false);
    }
    return ss::now();
});
@Lazin We're doing the archiver restart synchronously on topic config updates, so I also went with making the restart on config updates synchronous here. But I wonder if that's safe or if I should offload it to a fibre? What do you think?
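(For comparison, a hedged sketch of the offload-to-a-fibre alternative being asked about, assuming a _gate member and redpanda's ssx::spawn_with_gate helper are usable in this scope; illustrative only, not what the PR currently does:)

_log_cleanup_policy.watch([this]() {
    if (_as.abort_requested()) {
        return ss::now();
    }
    if (_raft->log()->notify_compaction_update()) {
        // Run the restart in a background fibre tracked by the gate instead
        // of blocking the config watcher on the restart future.
        ssx::spawn_with_gate(
          _gate, [this] { return restart_archiver(false); });
    }
    return ss::now();
});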
/dt
Force-pushed from c76b6c6 to c7faff3
Force-pushed to fix test compilation errors.
/dt
src/v/cluster/partition.cc
Outdated
@@ -742,7 +742,7 @@ ss::future<> partition::update_configuration(topic_properties properties) {
    if (
      old_ntp_config.is_archival_enabled() != new_archival
      || old_ntp_config.is_read_replica_mode_enabled()
        != new_ntp_config.read_replica
        != new_ntp_config.read_replica.value_or(false)
It should be impossible to enable or disable read-replica mode; a normal topic can't be converted to a read-replica and vice versa.
Should I remove this check then, instead of fixing it up? As I wrote in the commit message (aff7085), it was causing archival restarts on unrelated changes when the config was set to read_replica = std::optional(false).
Fixed
src/v/cluster/partition.cc
Outdated
bool new_archival = new_ntp_config.shadow_indexing_mode
                    && model::is_archival_enabled(
                      new_ntp_config.shadow_indexing_mode.value());

bool old_read_replica = old_ntp_config.is_read_replica_mode_enabled();
bool new_read_replica = new_ntp_config.read_replica.value_or(false);
ditto
Fixed
src/v/cluster/partition.cc
Outdated
      "read_replica",
      _raft->ntp());
    cloud_storage_changed = true;
} else if (old_compaction_status != new_compaction_status) {
Why else if? Should it be possible to change both compaction and archival properties in one call?
It could be just an if. else if ensures that we only log once for restarts and we log the first reason that caused a restart. It's possible to have multiple config changes in a single topic update that cause a restart, so I could make it an if to provide more context.
Fixed
Force-pushed from c7faff3 to e186399
src/v/cluster/partition.cc
Outdated
if (old_archival != new_archival) {
    vlog(
      clusterlog.debug,
      "[{}] updating archiver for topic config change in "
      "archival_enabled",
      _raft->ntp());
    cloud_storage_changed = true;
}
if (old_retention_ms != new_retention_ms) {
    vlog(
      clusterlog.debug,
      "[{}] updating archiver for topic config change in "
      "retention_ms",
      _raft->ntp());
    cloud_storage_changed = true;
}
if (old_retention_bytes != new_retention_bytes) {
    vlog(
      clusterlog.debug,
      "[{}] updating archiver for topic config change in "
      "retention_bytes",
      _raft->ntp());
    cloud_storage_changed = true;
}
if (old_compaction_status != new_compaction_status) {
    vlog(
      clusterlog.debug,
      "[{}] updating archiver for topic config change in compaction",
      _raft->ntp());
Nitpick: I guess it's possible that multiple of these could change at the same time, and a single log message might be nicer.
That's what I did initially but then @Lazin recommended splitting it: #18284 (comment). I think I'm going to stay with the current approach for now.
That's what I did initially but then @Lazin recommended splitting it: #18284 (comment). I think I'm going to stay with the current approach for now.
It looks like initially you had it printing one, not all? Anyway, I'm happy to take @Lazin's suggestion.
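For reference, a minimal sketch of the single-log-message variant being discussed (illustrative only; the PR keeps one vlog call per property, and this assumes fmt::join is usable inside vlog here):

std::vector<std::string_view> reasons;
if (old_archival != new_archival) {
    reasons.push_back("archival_enabled");
}
if (old_retention_ms != new_retention_ms) {
    reasons.push_back("retention_ms");
}
if (old_retention_bytes != new_retention_bytes) {
    reasons.push_back("retention_bytes");
}
if (old_compaction_status != new_compaction_status) {
    reasons.push_back("compaction");
}
if (!reasons.empty()) {
    // One message listing every property that triggered the restart.
    vlog(
      clusterlog.debug,
      "[{}] updating archiver for topic config changes in: {}",
      _raft->ntp(),
      fmt::join(reasons, ", "));
    cloud_storage_changed = true;
}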
@@ -380,6 +381,7 @@ class disk_log_impl final : public log {
    std::optional<model::offset> _cloud_gc_offset;
    std::optional<model::offset> _last_compaction_window_start_offset;
    size_t _reclaimable_size_bytes{0};
    bool _compaction_enabled;
Does _compaction_enabled need to exist? Does it ever diverge from config().is_compacted()?
Yes, this stores the current state of the log's compaction status and it's used to compare against config().is_compacted() for changes in disk_log_impl::notify_compaction_update. We can't just use config().is_compacted() because when the log_cleanup_policy cluster config changes and we try to determine whether we need to (un)mark the segments as compacted, we need a way to remember what the old compaction status was.
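A self-contained sketch of that idea (illustrative names, not the actual disk_log_impl code): by the time the notification arrives, config().is_compacted() already reflects the new effective policy, so the log keeps the previous state around to detect the transition.

struct log_sketch {
    // Stands in for config().is_compacted(): the *current* effective policy,
    // already reflecting the topic override or the new cluster default.
    bool effective_is_compacted() const { return _is_compacted_now; }

    // Returns true only when the compaction state actually flipped since the
    // last notification, i.e. when segments need (un)marking and the archiver
    // needs a restart.
    bool notify_compaction_update() {
        bool now = effective_is_compacted();
        if (now == _compaction_enabled) {
            return false;
        }
        _compaction_enabled = now; // remember for the next notification
        // ... (un)mark segments as compacted here ...
        return true;
    }

    bool _is_compacted_now{false};
    bool _compaction_enabled{false};
};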
void set_overrides(ntp_config::default_overrides) final;
bool notify_compaction_update() final;
I'm not sure why this was split; it seems easier to misuse now.
These two manage independent behaviours. When the cluster-level config log_cleanup_policy changes there is no ntp_config::default_overrides to update, but the compaction status still needs to update. I was following the pattern used in the consensus layer (sitting between partition and log), which has a similar notify_config_update method for write caching config changes:
redpanda/src/v/raft/consensus.h, lines 516 to 517 in 1e15575:
// hook called on desired cluster/topic configuration updates.
void notify_config_update();
It's only used in partition.cc and partition+consensus+log are already highly coupled, so I think this is fine. (Perhaps the ideal way would be to have a method with a signature ss::future<change_result> apply_config_change(std::variant<topic_config_change, cluster_config_change>); but I wanted to keep the refactoring minimal and consistent with the existing pattern.)
(Perhaps the ideal way would be to have a method with a signature ss::future<change_result> apply_config_change(std::variant<topic_config_change, cluster_config_change>); but I wanted to keep the refactoring minimal and consistent with the existing pattern.)
The concern here is that splitting it up means that the two calls must both be made or there's a bug. Within the log, it's not clear whether to read _compaction_enabled or config().is_compacted().
It might be worth changing the name, or making it an optional that's only engaged during the gap between the two calls.
I think I'd still prefer updating configuration to also do the notification.
Pure refactor. Adds helpers to check if compaction or deletion are set in the cleanup policy flags.
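A minimal sketch of the kind of helpers this describes (names and flag values are illustrative, not the actual model/storage definitions):

#include <cstdint>

enum class cleanup_policy_bitflags : uint8_t {
    none = 0,
    deletion = 1U << 0U,
    compaction = 1U << 1U,
};

// True if the compaction bit is set in the cleanup policy.
inline bool has_compaction(cleanup_policy_bitflags flags) {
    return (static_cast<uint8_t>(flags)
            & static_cast<uint8_t>(cleanup_policy_bitflags::compaction)) != 0;
}

// True if the deletion bit is set in the cleanup policy.
inline bool has_deletion(cleanup_policy_bitflags flags) {
    return (static_cast<uint8_t>(flags)
            & static_cast<uint8_t>(cleanup_policy_bitflags::deletion)) != 0;
}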
This changes all uses of `ntp_config::is_compacted` and `ntp_config::is_collectible` to fall back to the cluster-wide default config `log_cleanup_policy`.
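Roughly, the fallback described here looks like this (a sketch reusing the illustrative types from the previous snippet; the real code lives in storage::ntp_config and reads the log_cleanup_policy cluster property):

#include <optional>

// Effective compaction setting: prefer the topic-level override when present,
// otherwise fall back to the cluster-wide log_cleanup_policy default.
inline bool is_compacted(
  const std::optional<cleanup_policy_bitflags>& topic_override,
  cleanup_policy_bitflags cluster_default) {
    return has_compaction(topic_override.value_or(cluster_default));
}

// Same fallback for the deletion ("collectible") check.
inline bool is_collectible(
  const std::optional<cleanup_policy_bitflags>& topic_override,
  cleanup_policy_bitflags cluster_default) {
    return has_deletion(topic_override.value_or(cluster_default));
}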
Since it is no longer the case that all topics have a topic-level override for the cleanup policy, we fall back to using the cluster-level default cleanup policy when the topic-level override is not set during the partition recovery process.
The read replica config is immutable, so there's no need to check it for changes. (This fixes a bug in the handling of archiver restarts when the read replica topic config changes. Previously the archiver would incorrectly restart on every topic config change if "read_replica" was set to `std::optional<bool>(false)`.) It also adds a check for changes to `retention.bytes` and `retention.ms` as these configs are used by archiver. `CloudArchiveRetentionTest.test_delete` fails without restarting archiver whenever these configs change.
Pure refactor, except that the log messages emitted when cloud storage needs to be restarted change. The message emitted now states the reason for restarting cloud storage instead of printing the old and new ntp configs.
Pure refactor, no behaviour change intended. This changes the partition-raft-log interface to decouple topic-level config overrides from changes to compaction. This allows calling `log::notify_compaction_update` independently from `log::set_overrides` whenever `log_cleanup_policy` changes in the follow-up commit.
Now that the cleanup policy is not always set on the topic-level overrides, we need to watch for changes to the `log_cleanup_policy` config and react to that by:
* Restarting the archiver if compaction changes
* (Un)marking the segments for compaction if compaction changes
This adds tests to verify that archiver is restarted whenever there are changes to the compaction configs, either at the cluster-level defaults or at the topic-level overrides.
Force-pushed from e186399 to 800d4e7
Force-pushed to address some nitpicks (using the new helpers in more places).
Should this behavior change be backported, or should it be enabled only in new clusters with an appropriate announcement in the release notes?
Good point. I am going to hold off on backporting it for now and talk to devex/product to see if there is a strong motivation to backport it. That being said, I consider this a bug fix. Existing topics will have their
I put an announcement in the release notes up in the PR cover letter. Or did you have anything more in mind?
@dotnwat I had a chat with @michael-redpanda and we think of this change as a bug fix that we should backport. Backporting this resolves a recent bug report: #21360
/backport v24.1.x
/backport v23.3.x
Failed to create a backport PR to v24.1.x branch. I tried:
Failed to create a backport PR to v23.3.x branch. I tried:
This fixes a bug in the kafka layer where redpanda would set the cleanup.policy config for every new topic even if it was not explicitly specified by the topic creation request. The expected behaviour is that this topic-level config is left unset and that, when cleanup.policy is unset, redpanda falls back to the cluster-level default without hardcoding that default at the topic level.
Fixes https://redpandadata.atlassian.net/browse/CORE-526
Fixes https://redpandadata.atlassian.net/browse/CORE-2807
Backports Required
Release Notes
Bug Fixes
* The cleanup.policy config was always set at the topic level to the cluster default even when the topic creation request did not specify it.