CORE-1743 rptest: fix flaky metric check by disabling leader balancer #17833
new failures in https://buildkite.com/redpanda/redpanda/builds/47735#018ed344-70ed-493a-8653-dec48f325166:
new failures in https://buildkite.com/redpanda/redpanda/builds/47735#018ed34b-3e86-4bdd-8ae4-ddf947ee733f:
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/47735#018ed344-70ed-493a-8653-dec48f325166
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/47793#018ee1e8-321a-4ef8-8722-a55bfdd8215a
cloud_storage_cache_chunk_size=self.default_chunk_size,
# Disable leader balancer to have stable node to fetch metrics from.
enable_leader_balancer=False)
would making the metric check more flexible be an alternative?
There will always be a chance of a race condition between querying who the leader is and reading the metric value.
So you have ideas? For this particular test it would probably be easier to just start one replica and not bother about leadership at all.
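To illustrate the race described above, here is a minimal sketch assuming two hypothetical callables, `get_leader()` and `get_metric(node)`; neither is an rptest API. It re-checks leadership after the read and retries on a change, which narrows the window but does not close it, since leadership could still move away and back between the two checks.

```python
from typing import Callable


def read_metric_from_stable_leader(
    get_leader: Callable[[], str],
    get_metric: Callable[[str], float],
    max_attempts: int = 5,
) -> float:
    """Read a metric from the leader, retrying if leadership changed mid-read.

    Both callables are hypothetical stand-ins for whatever the test uses
    to look up the leader and scrape a node's metrics.
    """
    for _ in range(max_attempts):
        leader_before = get_leader()
        value = get_metric(leader_before)
        # Same leader before and after: the value most likely came from the leader.
        if get_leader() == leader_before:
            return value
    raise RuntimeError(f"leadership did not settle in {max_attempts} attempts")
```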
i approved the pr, i'm just asking if it's possible/reasonable.
> So you have ideas?
if there is a metrics query, potentially involving aggregation across endpoints, that correctly asserts the property being tested for even in the face of a leadership transfer, then that would be the idea. not that it's easy or possible. just wondering, because the less we assume about the system (e.g. balancer enabled/disabled), the more robust the tests are.
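One way the aggregation idea might look, as a sketch only: it assumes the metric is an additive counter that a node increments only while it is leader, so the sum over all nodes does not depend on where leadership currently sits. `get_metric` is a hypothetical per-node reader, not an rptest API, and the per-node reads are not atomic with respect to each other.

```python
from typing import Callable, Iterable


def total_across_nodes(
    nodes: Iterable[str],
    get_metric: Callable[[str], float],
) -> float:
    """Sum a per-node counter over every node in the cluster.

    Assumes the counter is additive, so a leadership transfer only changes
    which node contributes future increments, not the cluster-wide total.
    """
    return sum(get_metric(node) for node in nodes)
```

With such a helper, the test would assert on the cluster-wide total instead of on a single node's value.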
/dt
Fixes #16342
Backports Required
Release Notes