New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Metrics: Reduce pubsub cardinality #13378
Conversation
When the cardinality of the pubsub metrics get too large it causes juju to crash with lots of data. Instead we swap out the topic name from a UUID to a name that we can track easily.
Tom confirmed that this looks like what they are seeing on PS5: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is problematic to change the labels for the responses, then this is ok. But if we can clean that up while we're here, we might as well.
@@ -75,51 +81,56 @@ func (m *PubsubMetrics) Unsubscribed() { | |||
m.subscriptions.Dec() | |||
} | |||
|
|||
var leaseRequestRegex = regexp.MustCompile("lease.request.[0-9a-f]+.[0-9]+") | |||
var ( | |||
leaseRequestRegex = regexp.MustCompile("lease.request.[0-9a-f]+.[0-9]+") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the prefix in the one but not the other? I guess we are doing responses as UUIDs, but that makes me a bit sad that we aren't using any sort of identifier in the response topic for anything more than just an opaque UUID.
Is there any reason why we can't change the response to:
lease.request.callback.$UUID
and then only filter those instead of arbitrary UUIDS?
I don't think we care what the keys are, and prefixing them rather than just UUIDs is cleaner anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I've namespaced the response topic. These of course will all go away once the raft over API lands.
The lease callback topics conformed to UUID, but the problem with them, is that they weren't namespaced, so it was hard to know where they came from. The change here, was to ensure that they are namespaced and that we can identify their origin.
|
#13385 Merge 2.9 #13363 Use a mock for upgradedatabase test clock #13347 Update oracle api and fix bootstrap #13343 Bootstrap test refactor #13372 Pass HTTP Client through #13364 Make upgrade smoke test workflow more robust #13374 For older machine agents, there needs to be a symlink for the series based tools #13371 Stick the secret revision in the URL path not as a query parameter #13375 CLI: NO_COLOR support #13365 Pin kubeflow test to a fixed sha; #13376 Rename secret status pending to staged #13380 Add secret grant/revoke hook command CLI #13350 Implement elasticContainerRegistry; #13378 Metrics: Reduce pubsub cardinality #13377 State: Logs already exist #13381 Update Pebble to add replan, wait-change, and one-shot commands #13383 LXD network retrieval efficiency ``` # Conflicts: # caas/kubernetes/provider/bootstrap_test.go # go.mod # go.sum # ``` ## QA steps See PRs
When the cardinality of the pubsub metrics get too large it causes juju
to crash with lots of data. Instead we swap out the topic name from a
UUID to a name that we can track easily.
QA steps
The output should include
lease.request.callback
instead of a UUID.The new:
The old:
Fixes:
https://bugs.launchpad.net/juju/+bug/1945644