[v23.2.x] c/topics_dispatcher: do not guesstimate leader ids #16239

mmaslankaprv
Member

@mmaslankaprv mmaslankaprv commented Jan 23, 2024

Backport of PR #15678
Backport of PR #15981
Fixes: #16237

Updating leadership metadata before the topic is ready to serve traffic
is undesirable. It prevents callers from waiting for the Raft group to
report a leader once the group is actually ready.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
(cherry picked from commit 3e86234)
Signed-off-by: Michal Maslanka <michal@redpanda.com>
(cherry picked from commit 8f81843)
Instead of returning `-1` to indicate an ongoing leader election, we
return a randomly selected node as the partition leader. This is much
less disruptive for the client, as it simply forces a metadata refresh.
Some clients do not tolerate `-1` returned from the Metadata handler
and simply stop working.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
(cherry picked from commit 86d583b)
@ztlpn
Contributor

ztlpn commented Jan 23, 2024

I guess we should also backport #15981 along with this one

@piyushredpanda piyushredpanda added this to the v23.2.24 milestone Jan 24, 2024
Signed-off-by: Michal Maslanka <michal@redpanda.com>
(cherry picked from commit 763c0ac)
Signed-off-by: Michal Maslanka <michal@redpanda.com>
(cherry picked from commit 5bf2a27)
The metadata dissemination test must account for exponential backoff,
which may last longer than the 10-second timeout the test used. Now
that we do not update the leaders table after the topic create command
is applied, we need to wait for the request to be delivered to the
joining node.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
(cherry picked from commit fea4f6a)
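
To illustrate why a fixed 10-second timeout can be too short, here is a minimal sketch. It is not the test's actual code: the backoff parameters (`base_s`, `factor`, `cap_s`) and both helper names are hypothetical, chosen only to show how the total retry delay adds up and how a polling wait with a configurable timeout might look.

```python
import time


def worst_case_backoff_s(retries, base_s=0.5, factor=2.0, cap_s=30.0):
    """Total sleep time over `retries` attempts of capped exponential backoff.

    All parameters are illustrative assumptions, not Redpanda's real values.
    """
    return sum(min(base_s * factor**i, cap_s) for i in range(retries))


def wait_until(condition, timeout_s, poll_interval_s=0.1):
    """Poll `condition` until it returns True or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(poll_interval_s)
    # One final check so a condition that became true at the deadline counts.
    return bool(condition())
```

With these assumed parameters, six retries add up to 31.5 seconds of backoff, so a wait capped at 10 seconds could expire before the request reaches the joining node.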
The consumer group recovery test does not validate whether offset
listing was successful. Disabling the leader balancer makes the test
more stable until we introduce validation in the recovery tool (for now
the tool assumes that the user will review the offsets listing).

Signed-off-by: Michal Maslanka <michal@redpanda.com>
(cherry picked from commit 3b2eac5)
Wait for at least some of the messages to be successfully written to
the topic before starting consumers, to prevent them from trying to
read an empty partition.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
(cherry picked from commit 6823527)
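
The ordering described above could be sketched as follows. This is an assumption-laden illustration, not the test's real code: `wait_for_min_records` and the `get_produced_count` callback are hypothetical names standing in for whatever the test harness exposes to report acknowledged writes.

```python
import time


def wait_for_min_records(get_produced_count, min_records,
                         timeout_s=30.0, poll_interval_s=0.1):
    """Block until at least `min_records` writes have been acknowledged.

    `get_produced_count` is a hypothetical callback returning the number of
    messages the producer has successfully written so far.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_produced_count() >= min_records:
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"fewer than {min_records} records produced in {timeout_s}s")
        time.sleep(poll_interval_s)
```

A test would call this helper after starting the producer and only then start its consumers, so that the first fetch does not hit an empty partition.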
When a partition is first created in Redpanda, some cluster nodes that
do not host partition replicas may not yet have leadership metadata. In
this case Redpanda still has to return partition metadata. To avoid
disturbing the client (returning -1 as a leader id may cause some
clients to stop), Redpanda has to return a leader id. If the
information is not present, we always return the first node from the
replica set with a leader epoch equal to 0. This way the client will
either communicate with the actual leader or issue a metadata request
to another node that may have up-to-date information.

Fixes: redpanda-data#15949

Signed-off-by: Michal Maslanka <michal@redpanda.com>
(cherry picked from commit 7a60b75)
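
The fallback rule from the commit message above can be sketched in a few lines. This is only an illustration in Python, not the Redpanda implementation: `PartitionLeader` and `leader_for_metadata` are hypothetical names, and the handling of an empty replica set is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PartitionLeader:
    node_id: int
    leader_epoch: int


def leader_for_metadata(replicas: Sequence[int],
                        known_leader: Optional[PartitionLeader]) -> PartitionLeader:
    """Pick the leader to report in a metadata response.

    If leadership is known, report it unchanged. Otherwise report the first
    replica with leader epoch 0 instead of -1, as the commit message describes.
    """
    if known_leader is not None:
        return known_leader
    if not replicas:
        # Assumption: an empty replica set is treated as a caller error.
        raise ValueError("replica set must not be empty")
    return PartitionLeader(node_id=replicas[0], leader_epoch=0)
```

For example, `leader_for_metadata([3, 1, 2], None)` reports node 3 with epoch 0. If node 3 is not the real leader, the client gets an error on its next request and refreshes its metadata, possibly from a node with up-to-date information.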
@piyushredpanda piyushredpanda merged commit 0f3cac2 into redpanda-data:v23.2.x Jan 24, 2024
24 checks passed