CORE-526 k/configs: dont override topic-level cleanup policy #18284

Merged
pgellert merged 11 commits into redpanda-data:dev from configs/fix-cleanup-policy-source on Jul 12, 2024

Conversation

@pgellert (Contributor) commented May 7, 2024

This fixes a bug in the kafka layer where redpanda would set the cleanup.policy config for every new topic even when it was not explicitly specified in the topic creation request. The expected behaviour is that the topic-level config is left unset and, when cleanup.policy is unset, redpanda falls back to the cluster-level default without hardcoding that default at the topic level.
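As a rough illustration of the intended behaviour (a minimal sketch with simplified, hypothetical names — not the actual Redpanda code), the effective policy is resolved at read time rather than materialized into the topic configuration at creation time:

#include <optional>

enum class cleanup_policy { deletion, compaction, compact_delete };

// Hypothetical helper: the topic-level override is honoured only when the
// CreateTopics request explicitly set cleanup.policy; otherwise the
// cluster-level default (log_cleanup_policy) applies, without ever being
// written back into the topic's own config.
cleanup_policy effective_cleanup_policy(
  const std::optional<cleanup_policy>& topic_override,
  cleanup_policy cluster_default) {
    return topic_override.value_or(cluster_default);
}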

Fixes https://redpandadata.atlassian.net/browse/CORE-526
Fixes https://redpandadata.atlassian.net/browse/CORE-2807

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.1.x
  • v23.3.x
  • v23.2.x

Release Notes

Bug Fixes

  • Fixes a bug in kafka topic configs where the cleanup.policy config was always set at the topic level to the cluster default, even when the topic creation request did not specify it.

@pgellert pgellert self-assigned this May 7, 2024
@pgellert (Contributor, Author) commented May 7, 2024

/ci-repeat 1

@vbotbuildovich (Collaborator) commented May 7, 2024

new failures in https://buildkite.com/redpanda/redpanda/builds/48786#018f53af-adaa-43bf-9249-2318001540b4:

"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_size_based_retention.cloud_storage_type=CloudStorageType.ABS"

new failures in https://buildkite.com/redpanda/redpanda/builds/48786#018f53af-ada3-4529-887b-824945435606:

"rptest.tests.e2e_shadow_indexing_test.EndToEndThrottlingTest.test_throttling.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_size_based_retention.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/48786#018f53af-ada5-4198-9eee-4a91dd776426:

"rptest.tests.e2e_shadow_indexing_test.EndToEndThrottlingTest.test_throttling.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/48786#018f53dd-c573-4afe-b060-4d7d62feb2fe:

"rptest.tests.e2e_shadow_indexing_test.EndToEndThrottlingTest.test_throttling.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_size_based_retention.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/48786#018f53dd-c570-4f11-9233-a90a3aa6bbfe:

"rptest.tests.e2e_shadow_indexing_test.EndToEndThrottlingTest.test_throttling.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_size_based_retention.cloud_storage_type=CloudStorageType.ABS"

new failures in https://buildkite.com/redpanda/redpanda/builds/48846#018f5e08-f5fe-4944-8f66-e3380c646138:

"rptest.tests.write_caching_fi_e2e_test.WriteCachingFailureInjectionE2ETest.test_crash_all_with_consumer_group"

new failures in https://buildkite.com/redpanda/redpanda/builds/49009#018f7273-f84a-4495-8fb9-29fa17a76e04:

"rptest.tests.topic_creation_test.CreateSITopicsTest.topic_alter_config_test"
"rptest.tests.describe_topics_test.DescribeTopicsTest.test_describe_topics_with_documentation_and_types"

new failures in https://buildkite.com/redpanda/redpanda/builds/49009#018f727b-9e58-4f89-b413-172dbb379d3e:

"rptest.tests.topic_creation_test.CreateSITopicsTest.topic_alter_config_test"

new failures in https://buildkite.com/redpanda/redpanda/builds/49009#018f727b-9e5f-4753-8ff8-bf4d1baf5bd1:

"rptest.tests.describe_topics_test.DescribeTopicsTest.test_describe_topics_with_documentation_and_types"

new failures in https://buildkite.com/redpanda/redpanda/builds/51257#01909746-e7ec-4fa7-a50d-e40b9bbaa4b5:

"rptest.tests.audit_log_test.AuditLogTestsAppLifecycle.test_app_lifecycle"

@pgellert (Contributor, Author) commented May 8, 2024

/ci-repeat 1

@pgellert (Contributor, Author):

/ci-repeat 1

1 similar comment
@pgellert (Contributor, Author):

/ci-repeat 1

@pgellert changed the title from "[WIP] k/configs: dont override topic-level cleanup policy" to "k/configs: dont override topic-level cleanup policy" on May 14, 2024
@pgellert requested a review from Lazin on May 14, 2024 11:30
@pgellert changed the title from "k/configs: dont override topic-level cleanup policy" to "CORE-526 k/configs: dont override topic-level cleanup policy" on May 14, 2024
@pgellert force-pushed the configs/fix-cleanup-policy-source branch from bac0867 to 6c7d2ca on May 14, 2024 18:37
@pgellert (Contributor, Author):

/ci-repeat 1

@Lazin (Contributor) left a comment:

You also need to update the place where we generate the topic_manifest in the ntp_config.
Also, when we update the cleanup policy flag in the topic config we have to restart the archiver (in partition::update_configuration the flag is set in this case). With this change we also need to do the same when the cluster configuration parameter changes.

static retention
get_retention_policy(const storage::ntp_config::default_overrides& prop) {
    auto flags = prop.cleanup_policy_bitflags;

static retention get_retention_policy(const storage::ntp_config& prop) {
Contributor:

Why is this change needed?

The first attempt at fixing the way we report cleanup.policy was incorrect, because it meant that we reported the config as DEFAULT_CONFIG even if the config was explicitly set, but to the cluster-default value.

Instead, in this case, we want to report it as a DYNAMIC_TOPIC_CONFIG. We can do this now since we stopped setting the cleanup.policy value explicitly for each topic.

Furthermore, the compat test was simplified: there are now two separate fix attempts deployed across multiple redpanda versions, so to keep the test sane we simply ignore the diff before version 24.2.1, the first version known to contain the fixes (all following versions we might compare against will also contain them).

Lastly, the change to the DescribeTopicConfigs ducktape test was reverted so that cleanup.policy is correctly reported as DYNAMIC_TOPIC_CONFIG when it was explicitly specified in the topic creation request.
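A minimal sketch of the reporting rule described above, with hypothetical names (the real DescribeConfigs handling lives in the kafka layer and its enum values may differ): cleanup.policy is reported as DYNAMIC_TOPIC_CONFIG whenever a topic-level override exists, even if its value equals the cluster default, and as DEFAULT_CONFIG only when no override is set.

enum class describe_configs_source { DEFAULT_CONFIG, DYNAMIC_TOPIC_CONFIG };

// Hypothetical helper mirroring the rule above; not the actual kafka-layer code.
describe_configs_source cleanup_policy_source(bool has_topic_override) {
    return has_topic_override ? describe_configs_source::DYNAMIC_TOPIC_CONFIG
                              : describe_configs_source::DEFAULT_CONFIG;
}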
@pgellert force-pushed the configs/fix-cleanup-policy-source branch from 6c7d2ca to c76b6c6 on July 3, 2024 15:56
@pgellert (Contributor, Author) commented Jul 3, 2024

Force-push: rebased to dev and addressed the feedback. (It's best to re-review the PR as a whole rather than looking at the force-push diff.)

@pgellert (Contributor, Author) commented Jul 3, 2024

@Lazin

Also, when we update the cleanup policy flag in the topic config we have to restart archiver (in the partition::update_configuration the flag is set in this case). But with this change we also need to do the same when the configuration parameter changes.

I've done this now.

You also need to update the place where we generate topic_manifest in the ntp_config.

As far as I can see, the (v2) topic_manifest serializes the cluster::topic_configuration into the topic_manifest, i.e. it only serializes the topic-specific overrides into the manifest rather than the "resolved value" (= the topic-specific override or else the cluster-config-based callback).

By "update the place where we generate topic_manifest in the ntp_config", did you mean that the topic_manifest should contain the resolved value of the cleanup_policy config instead?

Comment on lines +87 to +104
_log_cleanup_policy.watch([this]() {
    if (_as.abort_requested()) {
        return ss::now();
    }
    auto changed = _raft->log()->notify_compaction_update();
    if (changed) {
        vlog(
          clusterlog.debug,
          "[{}] updating archiver for cluster config change in "
          "log_cleanup_policy",
          _raft->ntp());

        return restart_archiver(false);
    }
    return ss::now();
});
Contributor Author:

@Lazin We're doing the archiver restart synchronously on topic config updates, so I also went with making the restart on config updates synchronous here. But I wonder if that's safe or if I should offload it to a fibre? What do you think?

@pgellert pgellert requested a review from Lazin July 3, 2024 16:02
@pgellert (Contributor, Author) commented Jul 5, 2024

/dt

@pgellert force-pushed the configs/fix-cleanup-policy-source branch from c76b6c6 to c7faff3 on July 5, 2024 12:08
@pgellert (Contributor, Author) commented Jul 5, 2024

Force-pushed to fix test compilation errors.

@pgellert (Contributor, Author) commented Jul 5, 2024

/dt

@vbotbuildovich (Collaborator) commented Jul 5, 2024

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51145#01908324-653b-4590-8818-137cef15b65e:
pandatriage cache was not found

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51145#01908324-653d-403c-8b81-e0a66a8a00c4:
pandatriage cache was not found

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51145#01908324-6539-46b7-8e32-81fe23921338:
pandatriage cache was not found

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51145#01908324-653e-4efe-8f94-3044f42df30b:
pandatriage cache was not found

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51145#01908326-00c1-4632-b1bc-bfff3e4ce0b9:
pandatriage cache was not found

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51145#01908326-00c2-414b-8955-e28b08e8f73a:
pandatriage cache was not found

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51145#01908326-00bd-4af7-949c-4cb233a542f6:
pandatriage cache was not found

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51145#01908326-00bf-4ed8-9600-21e1698c0d4a:
pandatriage cache was not found

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51406#0190a78d-050f-4a41-9cc6-47230ba51f1a:
pandatriage cache was not found

@@ -742,7 +742,7 @@ ss::future<> partition::update_configuration(topic_properties properties) {
     if (
       old_ntp_config.is_archival_enabled() != new_archival
       || old_ntp_config.is_read_replica_mode_enabled()
-          != new_ntp_config.read_replica
+          != new_ntp_config.read_replica.value_or(false)
Contributor:

It should be impossible to enable or disable read-replica mode; a normal topic can't be converted to a read replica, and vice versa.

Contributor Author:

Should I remove this check then instead of fixing it up? As I wrote in the commit message (aff7085), it was causing archival restarts on unrelated changes when the config was set to read_replica = std::optional(false).

Contributor Author:

Fixed

bool new_archival = new_ntp_config.shadow_indexing_mode
                    && model::is_archival_enabled(
                      new_ntp_config.shadow_indexing_mode.value());

bool old_read_replica = old_ntp_config.is_read_replica_mode_enabled();
bool new_read_replica = new_ntp_config.read_replica.value_or(false);
Contributor:

ditto

Contributor Author:

Fixed

"read_replica",
_raft->ntp());
cloud_storage_changed = true;
} else if (old_compaction_status != new_compaction_status) {
Contributor:

Why else if? Should it be possible to change both compaction and archival properties in one call?

Contributor Author:

It could be just an if. The else if ensures that we only log once per restart, logging the first reason that caused it. It's possible for multiple config changes in a single topic update to cause a restart, so I could make it an if to provide more context.

Contributor Author:

Fixed

@pgellert force-pushed the configs/fix-cleanup-policy-source branch from c7faff3 to e186399 on July 9, 2024 10:17
@pgellert pgellert marked this pull request as ready for review July 9, 2024 10:21
Lazin previously approved these changes Jul 11, 2024
@pgellert pgellert requested a review from BenPope July 12, 2024 07:44
src/v/model/fundamental.h (resolved)
Comment on lines 759 to 804
if (old_archival != new_archival) {
    vlog(
      clusterlog.debug,
      "[{}] updating archiver for topic config change in "
      "archival_enabled",
      _raft->ntp());
    cloud_storage_changed = true;
}
if (old_retention_ms != new_retention_ms) {
    vlog(
      clusterlog.debug,
      "[{}] updating archiver for topic config change in "
      "retention_ms",
      _raft->ntp());
    cloud_storage_changed = true;
}
if (old_retention_bytes != new_retention_bytes) {
    vlog(
      clusterlog.debug,
      "[{}] updating archiver for topic config change in "
      "retention_bytes",
      _raft->ntp());
    cloud_storage_changed = true;
}
if (old_compaction_status != new_compaction_status) {
    vlog(
      clusterlog.debug,
      "[{}] updating archiver for topic config change in compaction",
      _raft->ntp());
Member:

Nitpick: I guess it's possible that multiple of these could change at the same time, and a single log message might be nicer.

Contributor Author:

That's what I did initially but then @Lazin recommended splitting it: #18284 (comment). I think I'm going to stay with the current approach for now.

Member:

That's what I did initially but then @Lazin recommended splitting it: #18284 (comment). I think I'm going to stay with the current approach for now.

It looks like initially you had it printing one, not all? Anyway, I'm happy to take @Lazin's suggestion.

@@ -380,6 +381,7 @@ class disk_log_impl final : public log {
     std::optional<model::offset> _cloud_gc_offset;
     std::optional<model::offset> _last_compaction_window_start_offset;
     size_t _reclaimable_size_bytes{0};
+    bool _compaction_enabled;
Member:

Does _compaction_enabled need to exist? Does it ever diverge from config().is_compacted()?

Contributor Author:

Yes, this stores the current state of the log's compaction status and it's used to compare against config().is_compacted() for changes in disk_log_impl::notify_compaction_update. We can't just use config().is_compacted() because when the log_cleanup_policy cluster config changes and we try to determine whether we need to (un)mark the segments as compacted, we need a way to remember what the old compaction status was.
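For illustration, a minimal sketch of that comparison under the assumption that the remembered flag is simply diffed against the freshly resolved config (a hypothetical stand-in, not the real disk_log_impl, which also handles the segment (un)marking and more):

// Hypothetical, simplified stand-in for the real disk_log_impl (assumed names).
struct log_sketch {
    struct cfg {
        bool compacted;
        bool is_compacted() const { return compacted; } // override or cluster default
    };
    cfg _cfg;
    bool _compaction_enabled{false}; // remembered compaction state

    const cfg& config() const { return _cfg; }

    // Returns true when the resolved compaction setting changed since the last
    // call, so the caller knows to (un)mark segments and restart the archiver.
    bool notify_compaction_update() {
        const bool now_compacted = config().is_compacted();
        if (now_compacted == _compaction_enabled) {
            return false;
        }
        _compaction_enabled = now_compacted;
        return true;
    }
};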

Comment on lines +149 to +150
void set_overrides(ntp_config::default_overrides) final;
bool notify_compaction_update() final;
Member:

I'm not sure why this was split, it seems easier to misuse now.

Contributor Author:

These two manage independent behaviours. When the cluster-level config log_cleanup_policy changes there is no ntp_config::default_overrides to update but the compaction status still needs to update. I was following the pattern used in the consensus layer (sitting between partition and log) which has a similar notify_config_update method for write caching config changes:

// hook called on desired cluster/topic configuration updates.
void notify_config_update();

It's only used in partition.cc, and partition+consensus+log are already highly coupled, so I think this is fine. (Perhaps the ideal way would be to have a method with a signature ss::future<change_result> apply_config_change(std::variant<topic_config_change, cluster_config_change>); but I wanted to keep the refactoring minimal and consistent with the existing pattern.)

@BenPope (Member) commented Jul 12, 2024:

(Perhaps the ideal way would be to have a method with a signature ss::future<change_result> apply_config_change(std::variant<topic_config_change, cluster_config_change>); but I wanted to keep the refactoring minimal and consistent with the existing pattern.)

The concern here is that splitting it up means that the two calls must both be made or there's a bug. Within the log, it's not clear whether to read _compaction_enabled or config().is_compacted().

It might be worth changing the name, or making it an optional that's only engaged during the gap between the two calls.

I think I'd still prefer updating configuration to also do the notification.
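Purely for illustration, a minimal sketch of the alternative signature floated in this thread. Every name here is hypothetical (change_result, topic_config_change, cluster_config_change are not real Redpanda types), and the ss::future wrapper from the comment is dropped to keep the sketch self-contained.

#include <variant>

struct topic_config_change {};   // e.g. new ntp_config::default_overrides
struct cluster_config_change {}; // e.g. new log_cleanup_policy value
enum class change_result { no_restart_needed, restart_archiver };

// A single entry point would make it impossible to call set_overrides()
// without the matching notify_compaction_update(), at the cost of a wider
// interface change than this PR aims for.
change_result apply_config_change(
  const std::variant<topic_config_change, cluster_config_change>& change) {
    return std::visit(
      [](const auto& c) {
          // Dispatch on the change kind: update overrides for a topic-level
          // change, or re-resolve the compaction flag for a cluster-level
          // change, then decide whether the archiver needs a restart.
          (void)c;
          return change_result::restart_archiver;
      },
      change);
}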

src/v/cluster/partition.cc (resolved)
Pure refactor. Adds helpers to check if compaction or deletion are set
in the cleanup policy flags.
This changes all uses of `ntp_config::is_compacted` and
`ntp_config::is_collectible` to fall back to the cluster-wide default
config `log_cleanup_policy`.
Since it is no longer the case that all topics have a topic-level
override for the cleanup policy, we fall back to using the cluster-level
default cleanup policy when the topic-level override is not set during
the partition recovery process.
The read replica config is immutable, so there's no need to check it for
changes.

(This fixes a bug in the handling of archiver restarts when the read
replica topic config changes. Previously the archiver would incorrectly
restart on every topic config change if "read_replica" was set to
`std::optional<bool>(false)`.)

It also adds a check for changes to `retention.bytes` and `retention.ms`
as these configs are used by archiver.
`CloudArchiveRetentionTest.test_delete` fails without restarting
archiver whenever these configs change.
Pure refactor, except that the log messages emitted when cloud storage needs to be restarted change: the message now states the reason for restarting cloud storage instead of printing the old and new ntp configs.
Pure refactor, no behaviour change intended.

This changes the partition-raft-log interface to decouple topic-level
config overrides and changes to compaction. This allows calling
`log::notify_compaction_update` independently from `log::set_overrides`
whenever `log_cleanup_policy` changes in the follow up commit.
Now that the cleanup policy is not always set on the topic-level
overrides, we need to watch for changes to the `log_cleanup_policy`
config and react to that by:
 * Restarting the archiver if compaction changes
 * (Un)marking the segments for compaction if compaction changes
This adds tests to verify that archiver is restarted whenever there are
changes to the compaction configs, either at the cluster-level defaults
or at the topic-level overrides.
@pgellert (Contributor, Author):

Force-pushed to address some nitpicks (using the new helpers in more places).

@pgellert pgellert requested review from BenPope and Lazin July 12, 2024 11:01
@dotnwat (Member) commented Jul 12, 2024

Should this behavior change be backported, or should it be enabled only in new clusters, with an appropriate announcement in the release notes?

@pgellert (Contributor, Author):

Good point. I am going to hold off on backporting it for now and talk to devex/product to see if there is a strong motivation to backport it.

That being said, I consider this a bug fix. Existing topics will keep their cleanup.policy set as an override, and a topic's cleanup policy will only change when a new topic is created or an AlterConfigs request is issued.

I put an announcement in the release notes up in the PR cover letter. Or did you have anything more in mind?

@pgellert pgellert merged commit d4e2c5f into redpanda-data:dev Jul 12, 2024
21 checks passed
@pgellert pgellert deleted the configs/fix-cleanup-policy-source branch July 12, 2024 18:33
@pgellert (Contributor, Author):

@dotnwat I had a chat with @michael-redpanda and we think of this change as a bug fix that we should backport. Backporting this resolves a recent bug report: #21360

@pgellert (Contributor, Author):

/backport v24.1.x

@pgellert (Contributor, Author):

/backport v23.3.x

@vbotbuildovich (Collaborator):

Failed to create a backport PR to v24.1.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-18284-v24.1.x-970 remotes/upstream/v24.1.x
git cherry-pick -x 9a3f84695eaee59f096b7eeb8fbba8caf4d82523 6a26dba9b73f3f59fbe5d25a42bf62a3446426a1 f2bea7fbc7488670733afe54ec2f71a93632f4a3 c3e2be3042a362ca5a12eeed63242732914619c5 b4ac3d76c37ec1d5eb4ee93374881ac09068503f 35fb002268d75a51490b70d003072c72d64a6741 7517614b708e7d645ea0993c60b51b1160a161a0 811a5edd60d4c3c30f1efa343d6e07e58f4f68d8 2504b4c32c6c9cd70e7ee8a1a1149f78d7ea8eb2 f24256a98c0142ae460fd1e18a94a78e39ffe6b0 800d4e7a8e9412b85fff6fa77075be668673df69

Workflow run logs.

@vbotbuildovich (Collaborator):

Failed to create a backport PR to v23.3.x branch. I tried:

git remote add upstream https://github.com/redpanda-data/redpanda.git
git fetch --all
git checkout -b backport-pr-18284-v23.3.x-521 remotes/upstream/v23.3.x
git cherry-pick -x 9a3f84695eaee59f096b7eeb8fbba8caf4d82523 6a26dba9b73f3f59fbe5d25a42bf62a3446426a1 f2bea7fbc7488670733afe54ec2f71a93632f4a3 c3e2be3042a362ca5a12eeed63242732914619c5 b4ac3d76c37ec1d5eb4ee93374881ac09068503f 35fb002268d75a51490b70d003072c72d64a6741 7517614b708e7d645ea0993c60b51b1160a161a0 811a5edd60d4c3c30f1efa343d6e07e58f4f68d8 2504b4c32c6c9cd70e7ee8a1a1149f78d7ea8eb2 f24256a98c0142ae460fd1e18a94a78e39ffe6b0 800d4e7a8e9412b85fff6fa77075be668673df69

Workflow run logs.
