-
Notifications
You must be signed in to change notification settings - Fork 553
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
metrics: fix kafka_max_offset on read replicas #16263
metrics: fix kafka_max_offset on read replicas #16263
Conversation
6fb81bb
to
7a09ddd
Compare
new failures in https://buildkite.com/redpanda/redpanda/builds/44191#018d392c-7c97-49bd-8552-274b3f1dc481:
new failures in https://buildkite.com/redpanda/redpanda/builds/44191#018d392c-7c9e-4bae-83be-9987280a72a2:
new failures in https://buildkite.com/redpanda/redpanda/builds/44191#018d392c-7c94-4fa9-b4b1-50954e766635:
new failures in https://buildkite.com/redpanda/redpanda/builds/44191#018d393d-6bea-43fc-91fa-3ddbced39eb6:
new failures in https://buildkite.com/redpanda/redpanda/builds/44191#018d393d-6bf5-4df2-a9ab-97276f8bd230:
new failures in https://buildkite.com/redpanda/redpanda/builds/44191#018d393d-6bf2-4a15-83df-4575771b5098:
|
@andrwng Needs some test updates, it looks like. |
Updates the filter used when collecting metrics samples. In some cases, the samples take the form: Sample(name='redpanda_kafka_max_offset', labels={'redpanda_namespace': 'kafka', 'redpanda_partition': '5', 'redpanda_topic': 'panda-topic'}, value=0.0, ...
Read replicas were previously translating their own local log offsets to return `redpanda_kafka_max_offset` to the metrics endpoint. This is different from how read replicas calculate the HWM when returning to the Kafka endpoint, which just goes directly to cloud storage. This adds the same read replica check that we have in the Kafka layer.
7a09ddd
to
7e33cc7
Compare
labels = sample.labels | ||
if ns: | ||
if "redpanda_namespace" in labels: | ||
if labels["redpanda_namespace"] != ns: | ||
continue | ||
elif "namespace" in labels: | ||
if labels["namespace"] != ns: | ||
continue | ||
else: | ||
assert False, f"Missing namespace label: {sample}" | ||
if topic: | ||
if "redpanda_topic" in labels: | ||
if labels["redpanda_topic"] != topic: | ||
continue | ||
elif "topic" in labels: | ||
if labels["topic"] != topic: | ||
continue | ||
else: | ||
assert False, f"Missing topic label: {sample}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: these checks are a little hard to read, it looks like they could be simplified a bit and collapsed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed it's a bit ugly. I spent some time earlier trying to clean it up, but couldn't come to anything simpler. Open to suggestions on how to improve it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually maybe I'll do this in a follow up, if it works:
if ns:
assert "kafka_namespace" in labels or "namespace" in labels, f"Missing namespace"
if labels.get("kafka_namespace", labels.get("namespace")) != ns
continue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done here #16277
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/44199#018d3a42-e5d6-4e16-b602-29b4fb055df7 ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/44199#018d3a42-e5dd-4cde-a82b-bc83e4b703c6 ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/44212#018d3b97-0506-43d2-b2d6-e350e21003ba |
/backport v23.3.x |
/backport v23.2.x |
/backport v23.1.x |
Failed to create a backport PR to v23.1.x branch. I tried:
|
Failed to create a backport PR to v23.2.x branch. I tried:
|
Follow-up to e567d63 (redpanda-data#16263) to simplify the metrics_sum filtering.
Follow-up to e567d63 (redpanda-data#16263) to simplify the metrics_sum filtering.
Also fixes the
metric_sum
filtering in ducktape, as this was required for the included test.Fixes #16259
Backports Required
Release Notes
Bug Fixes
redpand_kafka_max_offset
metric.