Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support customizing bucket boundaries for status metrics bundle_loading_duration_ns #6950

Open
catstail1 opened this issue Aug 21, 2024 · 2 comments

Comments

@catstail1
Copy link

What is the underlying problem you're trying to solve?

We're using status metrics bundle_loading_duration_ns to monitor opa server performance on bundle loading. The metrics are used to debug why it takes several minutes for new version of bundles to be loaded by our opa server. The current metrics indicates at least 50% of bundle_loading_duration_ns metrics fall into bucket +Inf for all 4 stages, and most small duration buckets are not used. Found currently the buckets are set with prometheus.ExponentialBuckets(1000, 2, 20), which is 1 microsecond to 1 second, which is too small for our case, fetching the bundle remotely.
image

We would like to get more buckets with long durations to know more how long it spend on each stages. Thanks!

Describe the ideal solution

it would be great if the configuration support customized bucket for metrics bundle_loading_duration_ns, as the configuraiton of server.metrics.prom.http_request_duration_seconds.buckets, so that we can make most usage of configured buckets.

Describe a "Good Enough" solution

if prometheus.ExponentialBuckets is still preferably used, it would be enough to support the config of the parameters start (probably also factor) in prometheus.ExponentialBuckets

Additional Context

I found a similar resolved issue for http_request_duration_seconds: #3196

@ashutosh-narkar
Copy link
Member

Making this configurable via the server config seems like a good approach. We'll have to push that new config over through the runtime into the code that handles the plugin instantiation. Then use the provided config or fallback to the default bucket values in the status plugin. If you'd like to contribute this feature, please feel free to do so. Happy to help if you have any questions. Thanks.

@jwu730-1
Copy link

Making this configurable via the server config seems like a good approach. We'll have to push that new config over through the runtime into the code that handles the plugin instantiation. Then use the provided config or fallback to the default bucket values in the status plugin. If you'd like to contribute this feature, please feel free to do so. Happy to help if you have any questions. Thanks.

After reading more code, it seems better to set the custom buckets under status configuration (currently there's a boolean prometheus). And it makes sense to allow customize the bucket only when status.prometheus is true.
However, during my local test, it seems not easy to fix/update here. The original test tries to checks if the histogram collector for status.bundle_loading_duration_ns is registered as bundleLoadDuration, which is a global variable created by NewHistogramVec. From prometheus code, I didnt find a accessible field can be used to check the buckets. Do you have any suggestion on how to check if registered histogram use the default / custom bucket? Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants