[v24.1.x] controller_backend: prevent busy-looping when removing partitions #18216

Merged: 4 commits, May 3, 2024

@ztlpn ztlpn commented May 2, 2024

Backport of PR #18157

@ztlpn ztlpn added this to the v24.1.x-next milestone May 2, 2024
@ztlpn ztlpn added the kind/backport PRs targeting a stable branch label May 2, 2024

ztlpn commented May 2, 2024

/ci-repeat

ztlpn added 4 commits May 2, 2024 14:21
To better mimic the concurrency structure of the real shard_balancer,
introduce a shard_assigner component to the stress test that is
responsible for modifying assignments in the shard placement table.

(cherry picked from commit e969333)

Conflicts:
	src/v/cluster/tests/shard_placement_table_test.cc
Previously, in the interval between the topic_table update and the
shard_placement_table update, controller_backend busy-looped by
"successfully removing" a partition that was already gone (the only
thing left was the assignment marker, which stays in place until the
shard_placement_table update fully completes). Return a
wait_for_target_update error instead, which puts the reconciliation
fiber to sleep.

(cherry picked from commit 7bae9ef)
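
The busy-loop fix described in this commit message can be illustrated with a small model. This is only a sketch under stated assumptions: `errc`, `placement_state`, `reconcile_removal`, and `run_until_sleep` are hypothetical, simplified stand-ins and not the actual Redpanda types or functions. Only the name `wait_for_target_update` comes from the commit message.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the real controller_backend code.
enum class errc { success, wait_for_target_update };

struct placement_state {
    bool partition_present; // partition instance still exists on this shard
    bool assignment_marker; // kept until the shard_placement_table update completes
};

// One reconciliation step for a partition that should be removed.
errc reconcile_removal(placement_state& st) {
    if (st.partition_present) {
        st.partition_present = false; // the actual removal work would happen here
        return errc::success;
    }
    if (st.assignment_marker) {
        // Nothing is left to remove, but the placement table has not caught
        // up yet: ask the caller to wait instead of reporting success again.
        return errc::wait_for_target_update;
    }
    return errc::success;
}

// Runs steps until the fiber would go to sleep or reconciliation is complete.
// Returns the number of steps executed.
int run_until_sleep(placement_state& st, int max_steps) {
    int steps = 0;
    while (steps < max_steps) {
        ++steps;
        errc ec = reconcile_removal(st);
        if (ec == errc::wait_for_target_update) {
            break; // caller sleeps until the target is updated
        }
        if (!st.assignment_marker) {
            break; // fully reconciled
        }
    }
    return steps;
}
```

In this model, a state that starts with both the partition and the marker present stops after two steps: one to remove the partition and one to detect that only the marker remains. If the second branch returned `errc::success` instead, the loop would spin until `max_steps`.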
…ping

(cherry picked from commit dbf88ca)

Conflicts:
	src/v/cluster/tests/shard_placement_table_test.cc
If reconciliation was successful but another notification arrived in
the meantime, we don't have to exit the try_reconcile_ntp loop (and
therefore sleep before the next attempt); we can proceed to reconciling
again right away.

(cherry picked from commit e312753)

Conflicts:
	src/v/cluster/controller_backend.cc
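
The loop behavior described in this commit message might look roughly like the following. This is a sketch only: `ntp_state`, `outcome`, and `try_reconcile_sketch` are hypothetical names made up for illustration, and the real `try_reconcile_ntp` in controller_backend.cc is asynchronous and considerably more involved.

```cpp
#include <cassert>
#include <functional>

// Hypothetical per-partition state; "notified" models a new notification
// arriving while a reconciliation pass is in progress.
struct ntp_state {
    bool notified = false;
};

enum class outcome { done, needs_sleep };

// reconcile_once returns true on success and may set st.notified.
// passes counts how many reconciliation passes ran.
outcome try_reconcile_sketch(
  ntp_state& st,
  const std::function<bool(ntp_state&)>& reconcile_once,
  int& passes) {
    while (true) {
        st.notified = false; // consume any pending notification
        ++passes;
        if (!reconcile_once(st)) {
            return outcome::needs_sleep; // failure: back off before retrying
        }
        if (!st.notified) {
            return outcome::done; // success and nothing new arrived
        }
        // Success, but a notification arrived in the meantime:
        // loop and reconcile again right away, without sleeping.
    }
}
```

For example, if a notification arrives during the first pass and both passes succeed, the function runs two passes back to back and returns `outcome::done` without going through the sleep path.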

ztlpn commented May 2, 2024

/ci-repeat

@ztlpn ztlpn marked this pull request as ready for review May 2, 2024 13:34

ztlpn commented May 2, 2024

Test errors are unrelated (#18032, and something similar to #14062 caused by sending a reconfiguration request to a node that was being restarted).

@piyushredpanda piyushredpanda merged commit 21d540c into redpanda-data:v24.1.x May 3, 2024
17 checks passed
@piyushredpanda piyushredpanda modified the milestones: v24.1.x-next, v24.1.2 May 3, 2024
Labels
area/redpanda kind/backport PRs targeting a stable branch