CFP: HTTP request and response size in HTTP L7 flows/metrics #21253
Comments
Would the value of the Content-Length header be enough? If so, we might already have it in the logged headers. This is a bit harder for chunked encoding, and if you'd want the size of the headers to be included. Currently we emit access logs once headers have been processed, streaming the body right after.
@jrajahalme I thought of that, but it can be incorrect. A client can send a large body but set the Content-Length of the request to a small value, and a server can do the same for a response. Most well-behaved clients and servers shouldn't do this, but it's possible, and it can lead to misleading data when dealing with less-trusted or poorly implemented clients and servers.
OK, so the request here is to actually count the bytes of the payload, delay access logging until the payload is done, and report the actual payload length in the log record? What if the payload is streamed and never ends?
The stated main benefit is debugging, and for that IMO the value of the Content-Length header is likely enough. I would start with that and do more only if/when evidence surfaces that we need to dig deeper.
@jrajahalme I think the problem is that the request/response flow model is misaligned with how requests actually work. Istio/Envoy have native TCP metrics, which I believe just increment a counter each time a packet flows through Envoy. In Cilium + Envoy, we send request/response access logs and calculate metrics from them, but that doesn't work as well for streaming use cases. Perhaps we should have a new access log type for in-progress (streaming) requests.
This issue has been automatically marked as stale because it has not had recent activity.
Not stale. As an aside, I think this would be important to have to be comparable to other service meshes which offer this as a metric already. |
Cilium Feature Proposal
Is your feature request related to a problem?
No
Describe the feature you'd like
Same idea as #21252 but for HTTP.
Knowing how big a request or response is can be very useful when debugging issues with web services. A large request could cause a service to timeout, and a large response could cause a client to timeout, or simply increase the tail latency of the service when looking at request durations.
(Optional) Describe your proposed solution
Add request/response size fields to the L7 HTTP protobufs and configure the Cilium proxy (Envoy) to send this information back to Cilium/Hubble so it can be associated with the Flow.
Once this information is in the flow, I can easily make a PR to add metrics for this. The metrics would be a histogram or summary so we can track the distribution of the request/response sizes.
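The proposed metric could be a histogram with exponentially spaced byte buckets, like Prometheus uses for size distributions. A minimal self-contained sketch of that bucketing (illustrative only, not the actual Hubble metrics code):

```go
package main

import (
	"fmt"
	"sort"
)

// sizeHistogram sketches the proposed request/response size metric
// with exponential buckets, as a Prometheus histogram would use.
type sizeHistogram struct {
	bounds []float64 // upper bounds in bytes, ascending
	counts []uint64  // counts[i] = observations in bucket i
	// (size <= bounds[i]); the extra last slot is the +Inf bucket
}

func newSizeHistogram(start, factor float64, n int) *sizeHistogram {
	b := make([]float64, n)
	for i := range b {
		b[i] = start
		start *= factor
	}
	return &sizeHistogram{bounds: b, counts: make([]uint64, n+1)}
}

func (h *sizeHistogram) Observe(sizeBytes float64) {
	// Index of the first bucket whose upper bound is >= the size.
	h.counts[sort.SearchFloat64s(h.bounds, sizeBytes)]++
}

func main() {
	h := newSizeHistogram(1024, 4, 5) // 1KiB, 4KiB, ..., 256KiB, +Inf
	for _, s := range []float64{300, 5_000, 2 << 20} {
		h.Observe(s)
	}
	fmt.Println(h.counts) // [1 0 1 0 0 1]
}
```

In practice this would likely just be a `prometheus` client histogram labeled like the existing HTTP flow metrics; the sketch only shows the bucketing behavior.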