I am using Kong API Gateway Enterprise in Konnect hybrid mode with self-managed data plane servers.
My goal is to monitor individual targets within an upstream to diagnose performance bottlenecks. However, I have found that there is no native way in Kong to capture detailed metrics for each specific target within an upstream. In both Datadog and Prometheus I can see aggregated upstream metrics, but I cannot obtain per-target granularity.
Currently, I can access metrics such as kong.http.requests.count, kong.upstream.latency.ms.bucket, kong.upstream.latency.ms.count, and kong.upstream.latency.ms.sum, as well as general latency metrics like kongdd.upstream_latency.avg and kongdd.upstream_latency.max.
However, all these metrics refer to the upstream as a whole, not to each individual target. The only per-target information available in Prometheus is the kong_upstream_target_health metric, which only indicates whether the target is healthy or unhealthy, without any visibility into the number of requests received or the individual response time.
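For reference, that series looks roughly like this in the /metrics output (label names from memory, so they may differ slightly between versions), and it only carries health state, not request counts or latency:

```
kong_upstream_target_health{upstream="my-upstream",target="10.0.0.5:8080",address="10.0.0.5:8080",state="healthy"} 1
kong_upstream_target_health{upstream="my-upstream",target="10.0.0.5:8080",address="10.0.0.5:8080",state="unhealthy"} 0
```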
In Kong’s access logs, both upstream_addr and upstream_response_time appear correctly, confirming that Kong knows which target was used and how long it took to respond. However, there is no native way to turn this information into metrics consumable by Datadog or Prometheus. I have attempted multiple approaches: modifying the Datadog plugin to include upstream_addr as a tag, creating a Kong Post-Function plugin that adds an X-Upstream-Addr header to the response, and storing the information in kong.ctx.shared from a Pre-Function plugin; in all cases, the values were not available in the logging phase.
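For context, the kind of workaround I experimented with looks roughly like this: a serverless (post-function) snippet attached to the log phase that reads the nginx variables directly. This is only a sketch; getting these values reliably into the Datadog or Prometheus pipeline is exactly the part I could not solve.

```lua
-- Sketch of a post-function body configured for the log phase.
-- upstream_addr / upstream_response_time are standard nginx variables;
-- upstream_response_time may be a comma-separated list when Kong retries.
local addr = ngx.var.upstream_addr
local rt   = ngx.var.upstream_response_time

if addr and rt then
  -- Only logs the values; there is still no native path to turn this
  -- into a per-target Datadog or Prometheus metric.
  kong.log.notice("upstream target=", addr, " response_time=", rt)
end
```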
Since Kong already has internal access to upstream_addr and upstream_response_time, why is it so difficult to expose them as metrics? The lack of this granularity makes it challenging to monitor individual targets within an upstream, preventing precise identification of specific instances that may be causing performance issues. Is there a technical limitation preventing this feature from being implemented, or is there a recommended approach to efficiently work around this problem within Kong?
It is technically possible for Kong to expose these metrics. The main effort would involve:
Developing New Metric Collection Logic: Implementing new code within Kong to specifically track and expose upstream_addr and upstream_response_time on a per-target basis.
Integrating with Metric Reporting Plugins: Ensuring that existing or new metric reporting plugins (like Prometheus or Datadog) can access and format this granular data (a rough sketch of this follows below).
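As a very rough illustration (not a committed design, and the exact API of the bundled Prometheus plugin may differ), per-target collection could look something like this, using the nginx-lua-prometheus style interface the plugin is built on. The metric name, labels, and helper function here are purely hypothetical:

```lua
-- Hypothetical per-target histogram (metric name and labels are illustrative).
local prometheus = require("kong.plugins.prometheus.prometheus").init("prometheus_metrics")

local target_latency = prometheus:histogram(
  "kong_upstream_target_latency_seconds",
  "Upstream response time per individual target",
  { "upstream", "target" })

-- Would be called from the plugin's log handler for every proxied request.
local function observe_target(upstream_name)
  local addr = ngx.var.upstream_addr
  -- upstream_response_time can be a list when there are retries;
  -- a real implementation would have to split and sum it.
  local rt = tonumber(ngx.var.upstream_response_time)
  if addr and rt then
    target_latency:observe(rt, { upstream_name, addr })
  end
end
```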
Developing this feature would require some effort. Since you're already an Enterprise user, I suggest opening a support ticket with the Enterprise team if this functionality is critical for you.
While it might seem simple, this small feature involves intricate work and would change the log format, which could be a breaking change for other users.