[v21.11.x] kafka: probe to send group offset to prometheus #3440
Merged
ZeDRoman
merged 9 commits into
redpanda-data:v21.11.x
from
ZeDRoman:backport-issue-1275
Jan 17, 2022
To monitor group status, we need to push the offset for each topic partition to Prometheus. Added a probe to send the topic partition offset. (cherry picked from commit 9c4ae20)
Added group seek command support to the rpk ducktape wrapper. (cherry picked from commit 378ebf7)
The ducktape redpanda service wrapper's metrics_sample now returns None if none of the samples matches the pattern. (cherry picked from commit 5f9725c)
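A minimal sketch of the "return None when nothing matches" behaviour described in this commit. The `Sample` type, the function name `metrics_sample_sketch`, and the substring-style matching are assumptions made for illustration; the actual wrapper code is not shown in this PR page.

```python
from typing import Dict, List, NamedTuple, Optional


class Sample(NamedTuple):
    """Hypothetical shape of one scraped metric sample."""
    name: str
    labels: Dict[str, str]
    value: float


def metrics_sample_sketch(pattern: str,
                          samples: List[Sample]) -> Optional[List[Sample]]:
    """Return the samples whose name contains `pattern`, or None if none do.

    Substring matching is an assumption; the real wrapper may match differently.
    """
    matched = [s for s in samples if pattern in s.name]
    # An empty result is reported as None so callers can test `is None`
    # and keep polling instead of handling an empty container.
    return matched or None
```

Returning None rather than an empty result lets a test poll with a simple `is not None` check until the metric shows up.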
Added ducktape tests that check group offsets in metrics. (cherry picked from commit 66c2131)
ZeDRoman
changed the title
kafka: probe to send group offset to prometheus
[v21.11.x] kafka: probe to send group offset to prometheus
Jan 11, 2022
Would be good to investigate test failures on |
It seems that leadership rebalancing was causing some of the rpk operations to be a bit flaky. There may be an opportunity here to add some retrying into rpk/franz-go.

rptest.clients.rpk.RpkException: RpkException<command /var/lib/buildkite-agent/builds/buildkite-amd64-builders-i-07ad15334ccea297e-1/vectorized/redpanda/vbuild/debug/clang/dist/local/redpanda/bin/rpk group --brokers docker_n_14:9092,docker_n_6:9092,docker_n_5:9092 seek g2 --to start returned 1, output: , error: unable to list all offsets successfully: NOT_LEADER_FOR_PARTITION: This server is not the leader for that topic-partition.

Fixes: redpanda-data#3443
Signed-off-by: Noah Watkins <noah@vectorized.io>
(cherry picked from commit 54a8589)
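One way to tolerate a transient NOT_LEADER_FOR_PARTITION error on the test side is a small retry wrapper around the rpk call. This is only a sketch: `RpkExceptionSketch` is a hypothetical stand-in for `rptest.clients.rpk.RpkException`, and `retry_on_not_leader` and its parameters are invented for illustration, not taken from the commit.

```python
import time


class RpkExceptionSketch(Exception):
    """Hypothetical stand-in for rptest.clients.rpk.RpkException."""


def retry_on_not_leader(fn, attempts=5, backoff_sec=1.0):
    """Call fn(), retrying only when the error mentions NOT_LEADER_FOR_PARTITION."""
    last_err = None
    for _ in range(attempts):
        try:
            return fn()
        except RpkExceptionSketch as e:
            # Any other rpk failure is a real error and is re-raised immediately.
            if "NOT_LEADER_FOR_PARTITION" not in str(e):
                raise
            last_err = e
            time.sleep(backoff_sec)
    # Every attempt hit the transient error; surface the last one.
    raise last_err
```

A caller would wrap the seek invocation in a zero-argument callable and pass it as `fn`, so the retry policy stays separate from the command itself.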
After deleting a topic there may be a delay before the group manager on a broker removes the in-memory state. Add a retry so that we can tolerate these windows. Signed-off-by: Noah Watkins <noah@vectorized.io> (cherry picked from commit 101445e)
Signed-off-by: Noah Watkins <noah@vectorized.io> (cherry picked from commit 74fcd0b)
Signed-off-by: Noah Watkins <noah@vectorized.io> (cherry picked from commit d2f6779)
Some timeouts have been occurring in the group metrics leadership transfer test. The timeout was occurring on a conjunction condition, so it wasn't clear which part was timing out. This patch splits the condition into two separate waits so we get finer-grained information about what is failing. It also adds some logging for additional context. Signed-off-by: Noah Watkins <noah@vectorized.io> (cherry picked from commit 27349be)
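The idea of splitting a combined wait can be sketched as follows. The `wait_until` helper here is a minimal self-contained stand-in for a polling wait, and the two conditions (`leader_changed`, `metrics_visible`) and their error messages are hypothetical names chosen for illustration; the actual conditions in the test are not shown on this page.

```python
import time


def wait_until(condition, timeout_sec, backoff_sec=0.1, err_msg=""):
    """Poll condition() until it is truthy; raise TimeoutError(err_msg) otherwise."""
    deadline = time.time() + timeout_sec
    while True:
        if condition():
            return
        if time.time() >= deadline:
            raise TimeoutError(err_msg)
        time.sleep(backoff_sec)


def wait_for_transfer(leader_changed, metrics_visible, timeout_sec=30):
    # A single wait on `leader_changed() and metrics_visible()` would time out
    # with one message regardless of which half failed. Two sequential waits
    # give each half its own error message.
    wait_until(leader_changed, timeout_sec,
               err_msg="leadership did not move")
    wait_until(metrics_visible, timeout_sec,
               err_msg="group metrics not visible after leadership transfer")
```

With the split, a timeout message identifies which of the two conditions never became true.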
jcsp
approved these changes
Jan 17, 2022
LGTM
Cover letter
Backport of #3181 , #3462 , #3482
To monitor group status, we need to push the offset for each topic partition to Prometheus.
From the topic partition offsets we can then calculate group lag.
Fixes: #1275
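As a rough illustration of the lag calculation mentioned in the cover letter: consumer group lag per partition is commonly taken as the partition's log end offset minus the group's committed offset. The function name and the dictionary shapes below are assumptions for this sketch, not part of the PR.

```python
from typing import Dict, Tuple

TopicPartition = Tuple[str, int]  # (topic, partition), hypothetical key shape


def group_lag(committed: Dict[TopicPartition, int],
              log_end_offsets: Dict[TopicPartition, int]) -> Dict[TopicPartition, int]:
    """Per-partition lag = log end offset - committed group offset."""
    lag = {}
    for tp, offset in committed.items():
        # Clamp at zero in case the two values were sampled at slightly
        # different times and the committed offset appears ahead.
        lag[tp] = max(log_end_offsets[tp] - offset, 0)
    return lag
```

For example, a group that has committed offset 7 on a partition whose log end offset is 10 has a lag of 3 on that partition.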
Release notes
The Kafka server now sends a group topic partition offset metric to Prometheus.