Additional host_weight metric for per_endpoint_stats #33006
Comments
@ggreenway You've fortunately implemented the |
I don't think it would be difficult. We may want to add a config knob for additional host stats, given that enabling them already creates A LOT more total published metrics. Then you can add another chunk of code like this one to publish: envoy/source/common/upstream/host_utility.cc Line 228 in c3da130
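To make the suggestion above concrete, here is a minimal sketch of the idea, written in Python rather than Envoy's C++ and not taken from `host_utility.cc`. The `Host` type, the `emit_additional_host_stats` knob, and the metric naming scheme are all hypothetical and only illustrate gating the extra per-endpoint metric behind a config option.

```python
from dataclasses import dataclass


@dataclass
class Host:
    # Hypothetical stand-in for an upstream host; not Envoy's Host interface.
    address: str
    weight: int
    rq_total: int


def endpoint_metrics(cluster, hosts, emit_additional_host_stats=False):
    """Build a flat name -> value map of per-endpoint metrics.

    The naming scheme and the knob name are assumptions for illustration.
    """
    metrics = {}
    for host in hosts:
        prefix = f"cluster.{cluster}.endpoint.{host.address}"
        # Metric that is published whenever per-endpoint stats are on.
        metrics[f"{prefix}.rq_total"] = host.rq_total
        # Extra metric, only published when the (hypothetical) knob is set,
        # so the default metric count does not grow.
        if emit_additional_host_stats:
            metrics[f"{prefix}.host_weight"] = host.weight
    return metrics


hosts = [Host("10.0.0.1_8080", 1, 120), Host("10.0.0.2_8080", 1, 95)]
default_metrics = endpoint_metrics("backend", hosts)
extended_metrics = endpoint_metrics("backend", hosts, emit_additional_host_stats=True)
```

With the knob off, the output is unchanged; with it on, each endpoint gains one additional gauge, which is where the extra cardinality comes from.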
|
I know that this can lead to cardinality issues within a TSDB as more and more metrics are exposed, so it is definitely not a feature for a production environment. It's more about load tests in a non-production environment with a pre-defined set of Envoy instances and upstream members. |
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions. |
This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions. |
Title: Additional host_weight metric for per_endpoint_stats
Description: For analyzing load balancing behavior, it would be good to have a host_weight metric per endpoint. It's already possible to enable detailed endpoint metrics using track_cluster_stats.per_endpoint_stats. Only the calculated weight per endpoint is missing.
Use case: In my scenario I observe a drop in throughput when a new host is added to the upstream cluster (scale-up). Despite following all best practices (active health checks, round robin LB with slow-start, pre-warmed HTTP handler, k8s readiness probes), the drops still appear. It seems that traffic to the old cluster members is already decreased, but the difference is not picked up by the new host.
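To illustrate why the calculated weight would be useful to see, here is a rough Python sketch of how a slow-start ramp could scale a new host's weight and what that does to traffic shares. The formula follows my reading of Envoy's slow-start documentation (`weight * max(min_weight_percent, time_factor ^ (1 / aggression))`), and the 10% minimum and the 60 second window are assumptions chosen for the example, not values from this issue.

```python
def slow_start_weight(weight, seconds_since_start, window_seconds,
                      aggression=1.0, min_weight_percent=0.10):
    """Effective load-balancing weight of a host during slow start (sketch)."""
    if seconds_since_start >= window_seconds:
        # Outside the slow-start window the host gets its full weight.
        return float(weight)
    time_factor = max(seconds_since_start, 1.0) / window_seconds
    scale = max(min_weight_percent, time_factor ** (1.0 / aggression))
    return weight * scale


def traffic_shares(weights):
    """Fraction of requests each host would receive for the given weights."""
    total = sum(weights.values())
    return {host: w / total for host, w in weights.items()}


# Three established hosts plus one new host, 30 s into a 60 s window.
weights = {"old-1": 1.0, "old-2": 1.0, "old-3": 1.0,
           "new": slow_start_weight(1, 30, 60)}
shares = traffic_shares(weights)
```

A per-endpoint host_weight metric would expose values like `weights["new"]` directly, so the ramp could be compared against the observed request rate on each host during a scale-up.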