CFP: HTTP request and response size in HTTP L7 flows/metrics #21253

chancez · 2022-09-08T22:51:15Z

Cilium Feature Proposal

Is your feature request related to a problem?

No

Describe the feature you'd like

Same idea as #21252 but for HTTP.

Knowing how big a request or response is can be very useful when debugging issues with web services. A large request could cause a service to time out, and a large response could cause a client to time out or simply increase the tail latency of the service when looking at request durations.

(Optional) Describe your proposed solution

Add request/response size fields to the L7 HTTP protobufs and configure the Cilium proxy (Envoy) to send this information back to Cilium/Hubble so it can be associated with the Flow.

Once this information is in the flow, I can easily make a PR to add metrics for this. The metrics would be a histogram or summary so we can track the distribution of the request/response sizes.
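To make the metric idea concrete, here is a minimal sketch of a bucketed size histogram. It is a stdlib-only stand-in, not Hubble's metrics code: `SizeHistogram`, `NewSizeHistogram`, `Observe`, and the bucket bounds are all hypothetical names and values chosen for illustration, and a real implementation would more likely use an existing Prometheus-style histogram type.

```go
package main

import (
	"fmt"
	"sort"
)

// SizeHistogram is a hypothetical stand-in for a Prometheus-style histogram.
// counts[i] holds observations that fall at or below bounds[i] (and above the
// previous bound); the final slot holds everything above the last bound.
type SizeHistogram struct {
	bounds []uint64 // inclusive upper bounds in bytes, ascending
	counts []uint64 // len(bounds)+1 entries; the last is the overflow bucket
	sum    uint64   // total bytes observed
}

// NewSizeHistogram builds a histogram from ascending upper bounds.
func NewSizeHistogram(bounds ...uint64) *SizeHistogram {
	return &SizeHistogram{
		bounds: bounds,
		counts: make([]uint64, len(bounds)+1),
	}
}

// Observe records one request or response size in bytes.
func (h *SizeHistogram) Observe(size uint64) {
	// Find the first bucket whose upper bound is >= size; if none matches,
	// sort.Search returns len(bounds), which is the overflow bucket.
	i := sort.Search(len(h.bounds), func(i int) bool { return h.bounds[i] >= size })
	h.counts[i]++
	h.sum += size
}

func main() {
	h := NewSizeHistogram(1<<10, 64<<10, 1<<20) // 1 KiB, 64 KiB, 1 MiB (assumed bounds)
	for _, size := range []uint64{200, 900, 5000, 2 << 20} {
		h.Observe(size)
	}
	fmt.Println(h.counts, h.sum) // prints [2 1 0 1] 2103252
}
```

A histogram keeps the distribution rather than a single average, which is what makes outliers such as one very large response visible.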

jrajahalme · 2022-09-14T09:56:09Z

Would the value of the Content-Length header be enough? If so, we might already have it in the logged headers. This is a bit harder for chunked encoding, or if you want the size of the headers to be included. Currently we emit access logs once the headers have been processed and stream the body right after.

chancez · 2022-09-14T09:59:08Z

@jrajahalme I thought of that, but it can be incorrect. A client can send a large body while setting the request's Content-Length to a small value, and a server can do the same for a response. Most well-behaved clients and servers shouldn't do this, but it's possible, and it can lead to misleading data when dealing with less-trusted or poorly implemented clients and servers.
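The difference between a declared and an observed size can be sketched as follows. This is a plain Go illustration of counting bytes as they pass through a reader, not how Envoy or the Cilium proxy measures bodies; `countingReader`, `observedSize`, and `checkDeclared` are hypothetical helpers, and the declared value stands in for whatever a Content-Length header claims.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// countingReader wraps an io.Reader and tallies the bytes that pass through it.
type countingReader struct {
	r io.Reader
	n int64
}

func (c *countingReader) Read(p []byte) (int, error) {
	n, err := c.r.Read(p)
	c.n += int64(n)
	return n, err
}

// observedSize drains body and returns the number of bytes actually seen.
func observedSize(body io.Reader) (int64, error) {
	cr := &countingReader{r: body}
	_, err := io.Copy(io.Discard, cr)
	return cr.n, err
}

// checkDeclared is a demo helper: it compares a declared length (what a
// Content-Length header would claim) against the bytes counted in payload.
func checkDeclared(declared int64, payload string) (actual int64, match bool) {
	actual, err := observedSize(strings.NewReader(payload))
	if err != nil {
		return actual, false
	}
	return actual, actual == declared
}

func main() {
	actual, match := checkDeclared(5, "a much longer payload")
	fmt.Printf("declared=5 actual=%d match=%t\n", actual, match)
}
```

Counting has a cost that reading the header does not: the size is only known once the body has been fully consumed.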

jrajahalme · 2022-09-14T10:12:25Z

OK, so the request here is to actually count the bytes of the payload and delay access logging until the payload is done and report the actual payload length with the log record? What if the payload is streamed and it never ends?

jrajahalme · 2022-09-14T10:15:19Z

The stated main benefit is debugging, and for that, IMO, the value of the Content-Length header is likely enough. I would start with that and do more only if and when evidence of the need to dig deeper surfaces.

chancez · 2022-09-14T10:18:49Z

@jrajahalme I think the problem is that the request/response flow model is misaligned with how requests actually work. Istio/Envoy has native TCP metrics, and I believe it simply increments a counter each time a packet flows through Envoy. In Cilium + Envoy, we send request/response access logs and calculate metrics from them, which doesn't work as well for streaming use cases. Perhaps we should have a new access log type for in-progress (streaming) requests.
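One way to picture an "in-progress" record is a reader that reports a running byte count at intervals and a final record at end of stream, so a stream that never ends still produces data. The sketch below is a plain Go model under that assumption; `progressReader`, `streamProgress`, and the record format are hypothetical and do not correspond to an existing Envoy or Cilium access log type.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// progressReader counts bytes read and calls report roughly every `every`
// bytes, plus once more with done=true when the stream ends.
type progressReader struct {
	r        io.Reader
	every    int64
	total    int64
	lastSent int64
	finished bool
	report   func(total int64, done bool)
}

func (p *progressReader) Read(buf []byte) (int, error) {
	n, err := p.r.Read(buf)
	p.total += int64(n)
	if p.total-p.lastSent >= p.every {
		p.lastSent = p.total
		p.report(p.total, false) // interim record while the stream is open
	}
	if err == io.EOF && !p.finished {
		p.finished = true
		p.report(p.total, true) // final record with the full size
	}
	return n, err
}

// streamProgress reads payload in chunk-sized reads and returns the records
// a hypothetical in-progress log would have emitted.
func streamProgress(payload string, chunk int, every int64) []string {
	var records []string
	pr := &progressReader{
		r:     strings.NewReader(payload),
		every: every,
		report: func(total int64, done bool) {
			records = append(records, fmt.Sprintf("bytes=%d done=%t", total, done))
		},
	}
	buf := make([]byte, chunk)
	for {
		if _, err := pr.Read(buf); err != nil {
			break
		}
	}
	return records
}

func main() {
	for _, rec := range streamProgress("0123456789", 4, 4) {
		fmt.Println(rec)
	}
}
```

A metrics consumer could add the delta between successive records to a counter, which is closer to the per-packet counting described above than a single record per request.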

github-actions · 2023-01-09T02:00:01Z

This issue has been automatically marked as stale because it has not
had recent activity. It will be closed if no further activity occurs.

github-actions · 2023-03-12T02:03:06Z

This issue has been automatically marked as stale because it has not
had recent activity. It will be closed if no further activity occurs.

chancez · 2023-03-13T15:44:59Z

Not stale. As an aside, I think this is important for parity with other service meshes, which already offer this as a metric.

chancez added the kind/feature label (This introduces new functionality.) on Sep 8, 2022

aanm added the sig/agent label (Cilium agent related.) on Nov 9, 2022

github-actions bot added the stale label (The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale.) on Jan 9, 2023

rolinh added the sig/hubble label (Impacts hubble server or relay) on Jan 9, 2023

github-actions bot removed the stale label on Jan 10, 2023

github-actions bot added the stale label on Mar 12, 2023

chancez removed the stale label on Mar 13, 2023

rolinh added the pinned label (These issues are not marked stale by our issue bot.) on Mar 15, 2023

chancez mentioned this issue Nov 30, 2023

Ability to keep track of traffic generated within a namespace or per Pod #12913

Open
