[v23.3.x] Refactored cluster::partition_leaders_table
to use hierarchical structure of metadata
#16709
Changed how leader information is stored in `cluster::partition_leaders_table`. Previously, each partition was represented as a separate entry in the leaders map, so every access, even to partitions of the same topic, required a full `model::ntp`-based lookup. The new implementation uses a top-level map keyed by topic, whose value is a map storing the leader of each partition. This way, all partitions of a topic can be updated without repeating the topic lookup for each partition. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 2410d67)
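To illustrate the layout described in this commit message, here is a minimal sketch of a two-level leaders map. It is not the Redpanda implementation: `leaders_table_sketch`, its methods, and the `topic_name`, `partition_id`, and `node_id` aliases are hypothetical stand-ins built on standard containers.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical, simplified stand-ins; these are not the Redpanda types.
using topic_name = std::string;
using partition_id = std::int32_t;
using node_id = std::int32_t;

class leaders_table_sketch {
public:
    // Record the leader of one partition (std::nullopt means "no leader").
    void update(
      const topic_name& topic, partition_id p, std::optional<node_id> leader) {
        _topics[topic][p] = leader;
    }

    // Single-partition lookup: one topic lookup, then one partition lookup.
    std::optional<node_id>
    get_leader(const topic_name& topic, partition_id p) const {
        auto t_it = _topics.find(topic);
        if (t_it == _topics.end()) {
            return std::nullopt;
        }
        auto p_it = t_it->second.find(p);
        if (p_it == t_it->second.end()) {
            return std::nullopt;
        }
        return p_it->second;
    }

    // Visit every partition of a topic after a single topic lookup.
    template<typename Func>
    void for_each_partition(const topic_name& topic, Func f) const {
        auto t_it = _topics.find(topic);
        if (t_it == _topics.end()) {
            return;
        }
        for (const auto& entry : t_it->second) {
            f(entry.first, entry.second);
        }
    }

    // Drop all partitions of a topic in one operation.
    bool remove_topic(const topic_name& topic) {
        return _topics.erase(topic) > 0;
    }

private:
    using partitions_t = std::map<partition_id, std::optional<node_id>>;
    std::unordered_map<topic_name, partitions_t> _topics;
};
```

With a flat map keyed by the full topic-partition identifier, `for_each_partition` and `remove_topic` would each need one keyed lookup per partition; in the nested layout they need one lookup per topic.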
c65d65c
to
25353dd
Signed-off-by: Michal Maslanka <michal@redpadna.com>
Added tracking of the number of leaderless partitions in the leaders table. This avoids iterating over the whole list of leaders when generating cluster metrics. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 0d946e0)
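One way such a counter could be maintained is to adjust it on every update and removal, so that reading it is O(1). The sketch below assumes that approach; `leaderless_counter_sketch` and its members are hypothetical names, not the ones used in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>

using partition_id = std::int32_t;
using node_id = std::int32_t;

// Hypothetical sketch: keeps a running count of partitions without a leader.
class leaderless_counter_sketch {
public:
    void update(partition_id p, std::optional<node_id> leader) {
        auto [it, inserted] = _leaders.try_emplace(p, leader);
        const bool is_leaderless = !leader.has_value();
        if (inserted) {
            // New partition: count it only if it starts without a leader.
            if (is_leaderless) {
                ++_leaderless;
            }
            return;
        }
        // Existing partition: adjust the counter only on a state transition.
        const bool was_leaderless = !it->second.has_value();
        if (was_leaderless && !is_leaderless) {
            --_leaderless;
        } else if (!was_leaderless && is_leaderless) {
            ++_leaderless;
        }
        it->second = leader;
    }

    void remove(partition_id p) {
        auto it = _leaders.find(p);
        if (it == _leaders.end()) {
            return;
        }
        if (!it->second.has_value()) {
            --_leaderless;
        }
        _leaders.erase(it);
    }

    // O(1): no scan over the leaders map is needed for metrics.
    std::size_t leaderless_count() const { return _leaderless; }

private:
    std::map<partition_id, std::optional<node_id>> _leaders;
    std::size_t _leaderless = 0;
};
```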
Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit bd3cc30)
Added version tracking to the partition leaders table so that concurrent modification can be detected. This allows yielding while iterating over the leaders table. If the table is modified during an operation, an exception is thrown and the operation can be retried. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit b397956)
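The idea can be sketched without any asynchronous machinery: bump a version on every mutation, remember the version when iteration starts, and throw if it has changed. In the sketch below the callback stands in for a point where the real code could yield; `versioned_table_sketch` and `concurrent_modification_error` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>

// Hypothetical exception type signalling that the table changed mid-iteration.
struct concurrent_modification_error : std::runtime_error {
    concurrent_modification_error()
      : std::runtime_error("leaders table modified during iteration") {}
};

class versioned_table_sketch {
public:
    void update(int partition, int leader) {
        _leaders[partition] = leader;
        ++_version; // every mutation bumps the version
    }

    std::uint64_t version() const { return _version; }

    // Calls f(partition, leader) for each entry. If the table is modified
    // while iterating (here: from inside f), the version check fails and the
    // iteration is aborted before the iterator is advanced.
    template<typename Func>
    void for_each(Func f) const {
        const std::uint64_t start = _version;
        for (const auto& entry : _leaders) {
            f(entry.first, entry.second);
            if (_version != start) {
                throw concurrent_modification_error();
            }
        }
    }

private:
    std::map<int, int> _leaders;
    std::uint64_t _version = 0;
};
```

A caller that catches `concurrent_modification_error` can simply restart the iteration, which matches the retry behavior the commit message describes.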
Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit abe6173)
Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit fdeb6ac)
Leveraging the hierarchical structure of node health report and internals of partition leaders table to minimize the number of lookups in leaders map. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 9a255f0)
Previously, `get_leadership_reply` did not contain any information about the state of the operation, so it was impossible to propagate a service error to the client. Added a field indicating whether the response is successful. The field allows us to explicitly handle errors such as concurrent modification of the partition leaders table. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit aeab006)
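As a rough illustration of why a status field helps, the sketch below shows a handler that converts an exception into an explicit failure flag in the reply. The struct layout, the field name `success`, and the function `handle_get_leadership` are assumptions made for this example; the actual definition of `get_leadership_reply` is not shown in this PR description.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical reply shape; the real get_leadership_reply may differ.
struct leadership_reply_sketch {
    std::vector<std::pair<int, int>> leaders; // (partition, leader) pairs
    bool success = true;
};

// Runs `collect` to gather leaders. Any exception (for example one raised on
// concurrent modification of the table) is reported through the reply
// instead of leaving the client with an indistinguishable empty result.
template<typename Collect>
leadership_reply_sketch handle_get_leadership(Collect collect) {
    leadership_reply_sketch reply;
    try {
        reply.leaders = collect();
    } catch (const std::exception&) {
        reply.leaders.clear();
        reply.success = false;
    }
    return reply;
}
```

A client can then check `success` and retry the request, rather than treating an empty list as an authoritative answer.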
Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 2db09f9)
Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit cddc003)
Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit e37df0c)
Using `ntp_callbacks` to wait for the leaders without additional promises map. When caller requests to wait for a leader we register the notification which sets the promise value when called. This way we do not need a separate mechanism to keep track of leadership notifications. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 4eed31f)
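The pattern described here, where a registered notification completes the waiter, can be sketched with plain standard-library types. Everything below is hypothetical: `callbacks_sketch` only mimics the role of `ntp_callbacks`, and a shared `std::optional<int>` stands in for the promise/future pair that the real code uses.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <utility>

// Hypothetical registry of per-partition leadership callbacks.
class callbacks_sketch {
public:
    using callback = std::function<void(int)>;
    using id_t = int;

    id_t register_notify(const std::string& ntp, callback cb) {
        const id_t id = _next_id++;
        _callbacks[ntp].emplace(id, std::move(cb));
        return id;
    }

    void unregister_notify(const std::string& ntp, id_t id) {
        auto it = _callbacks.find(ntp);
        if (it == _callbacks.end()) {
            return;
        }
        it->second.erase(id);
        if (it->second.empty()) {
            _callbacks.erase(it);
        }
    }

    void notify(const std::string& ntp, int leader) {
        auto it = _callbacks.find(ntp);
        if (it == _callbacks.end()) {
            return;
        }
        // Iterate over a copy so callbacks may unregister themselves.
        const auto snapshot = it->second;
        for (const auto& entry : snapshot) {
            entry.second(leader);
        }
    }

    std::size_t registered(const std::string& ntp) const {
        auto it = _callbacks.find(ntp);
        return it == _callbacks.end() ? 0 : it->second.size();
    }

private:
    std::map<std::string, std::map<id_t, callback>> _callbacks;
    id_t _next_id = 0;
};

// Stand-in for a promise: empty until a leader has been reported.
using leader_slot = std::shared_ptr<std::optional<int>>;

// Registers a one-shot notification that fills the slot and then removes
// itself, so no separate map of pending waiters is required.
inline leader_slot
wait_for_leader(callbacks_sketch& cbs, const std::string& ntp) {
    auto slot = std::make_shared<std::optional<int>>();
    auto id = std::make_shared<callbacks_sketch::id_t>(0);
    *id = cbs.register_notify(ntp, [&cbs, ntp, slot, id](int leader) {
        *slot = leader;
        cbs.unregister_notify(ntp, *id);
    });
    return slot;
}
```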
Replaced the previously used `chunked_fifo` with dynamically sized fragmented vectors. A fragmented vector provides a random access iterator and automatically controls the size of allocated chunks. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 44e6cd7)
Added methods allowing `fragmented_vector::iter` to satisfy the `std::random_access_iterator` concept. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit f4e6bce)
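For reference, the operations a random access iterator has to provide can be seen in the following miniature chunked container. It is a hypothetical sketch (`frag_vec_sketch` is not `fragmented_vector`), written to compile as C++17; under C++20 the same iterator is intended to also pass `static_assert(std::random_access_iterator<frag_vec_sketch::iter>)`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical miniature of a chunked vector of ints.
class frag_vec_sketch {
public:
    static constexpr std::size_t chunk_size = 4;

    void push_back(int v) {
        if (_size % chunk_size == 0) {
            _chunks.emplace_back();
            _chunks.back().reserve(chunk_size);
        }
        _chunks.back().push_back(v);
        ++_size;
    }

    const int& operator[](std::size_t i) const {
        return _chunks[i / chunk_size][i % chunk_size];
    }

    class iter {
    public:
        using iterator_category = std::random_access_iterator_tag;
        using value_type = int;
        using difference_type = std::ptrdiff_t;
        using pointer = const int*;
        using reference = const int&;

        iter() = default;
        iter(const frag_vec_sketch* v, difference_type i)
          : _vec(v), _index(i) {}

        reference operator*() const {
            return (*_vec)[static_cast<std::size_t>(_index)];
        }
        reference operator[](difference_type n) const { return *(*this + n); }

        iter& operator++() { ++_index; return *this; }
        iter operator++(int) { iter t = *this; ++_index; return t; }
        iter& operator--() { --_index; return *this; }
        iter operator--(int) { iter t = *this; --_index; return t; }
        iter& operator+=(difference_type n) { _index += n; return *this; }
        iter& operator-=(difference_type n) { _index -= n; return *this; }

        friend iter operator+(iter it, difference_type n) { return it += n; }
        friend iter operator+(difference_type n, iter it) { return it += n; }
        friend iter operator-(iter it, difference_type n) { return it -= n; }
        friend difference_type operator-(const iter& a, const iter& b) {
            return a._index - b._index;
        }

        friend bool operator==(const iter& a, const iter& b) { return a._index == b._index; }
        friend bool operator!=(const iter& a, const iter& b) { return a._index != b._index; }
        friend bool operator<(const iter& a, const iter& b) { return a._index < b._index; }
        friend bool operator>(const iter& a, const iter& b) { return a._index > b._index; }
        friend bool operator<=(const iter& a, const iter& b) { return a._index <= b._index; }
        friend bool operator>=(const iter& a, const iter& b) { return a._index >= b._index; }

    private:
        const frag_vec_sketch* _vec = nullptr;
        difference_type _index = 0;
    };

    iter begin() const { return iter(this, 0); }
    iter end() const { return iter(this, static_cast<std::ptrdiff_t>(_size)); }

private:
    std::vector<std::vector<int>> _chunks;
    std::size_t _size = 0;
};

static_assert(std::is_same_v<
  std::iterator_traits<frag_vec_sketch::iter>::iterator_category,
  std::random_access_iterator_tag>);

// Binary search over sorted contents; efficient only with random access.
inline bool contains_sorted(const frag_vec_sketch& v, int x) {
    auto it = std::lower_bound(v.begin(), v.end(), x);
    return it != v.end() && *it == x;
}
```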
Using the async algorithm calls `ss::coroutine::maybe_yield()` every 100 operations, while iteration within a chunk remains synchronous and lightweight. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 9dbeb81)
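The batching behavior can be sketched independently of Seastar. In the hypothetical helper below, `yield_point` is a plain callable standing in for the place where the real code would `co_await ss::coroutine::maybe_yield()`; the function name and its parameters are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>

// Applies f to every element in [first, last) and invokes yield_point once
// after every `interval` elements. Returns how many times it yielded.
// Between yield points the loop is an ordinary synchronous iteration.
template<typename It, typename Func, typename Yield>
std::size_t for_each_with_yield(
  It first, It last, Func f, Yield yield_point, std::size_t interval = 100) {
    std::size_t ops = 0;
    std::size_t yields = 0;
    for (; first != last; ++first) {
        f(*first);
        if (++ops % interval == 0) {
            yield_point();
            ++yields;
        }
    }
    return yields;
}
```

Because the table may change while the task is suspended at a yield point, this kind of loop is typically paired with a version check on the underlying container.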
Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 8948fe5)
25353dd
to
45466d5
Michal Maslanka seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it.
Since the leaders table might have been modified during the asynchronous iteration over the topic partitions, we must check the version of the topics table after the iteration has finished. Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 625fb44)
Signed-off-by: Michal Maslanka <michal@redpanda.com> (cherry picked from commit 6eaa7b8)
Backport of PR #16512
Backport of PR #16711