
HTTP logs for Kubernetes services #11221

Closed
pz07 opened this issue Jul 14, 2015 · 15 comments
Labels
priority/backlog Higher priority than priority/awaiting-more-evidence. sig/network Categorizes an issue or PR as relevant to SIG Network.

Comments

@pz07

pz07 commented Jul 14, 2015

I wonder if there is a way to get HTTP (access) logs for Kubernetes services. Ideally it would look similar to logs provided by HAProxy: http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#8.2.3

I guess it would be helpful in investigating issues in a microservices environment.

@erictune
Member

Do you mean logs for the kubernetes apiserver? We plan to do that but haven't gotten around to it. It is sort of covered by #2203.

Or do you mean logs for web servers in user-created pods? We don't have a general way to do that at the moment. There are multiple ways to reach a kubernetes service:

  • pod to service to pod: uses kube-proxy, which does L3 load balancing, so the kubernetes infrastructure does not currently have a way to extract Layer 7 (HTTP) information.
  • internal user to apiserver /proxy URL to service to pod: this access path does understand L7 and does log all requests to the apiserver application logs (see the example after this list).
  • internet to service to pod: this access path does not understand L7. We have talked about doing L7 load balancing (HTTP L7 load balancer / reverse proxy #561), and if we did, then we could do as you suggest.
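
A minimal sketch of the /proxy access path, assuming a service named my-svc in the default namespace (the exact proxy URL form depends on the Kubernetes version):

```sh
# Hedged sketch: reaching a service through the apiserver proxy, the one
# access path above that produces L7 logs (in the apiserver's own log).
# "my-svc" and the namespace are placeholders.
kubectl proxy --port=8001 &
curl http://localhost:8001/api/v1/namespaces/default/services/my-svc/proxy/
# Older releases used the /api/v1/proxy/namespaces/... URL form instead.
```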

Related:

@erictune
Member

Happy to hear a longer description of your requirements.

@erictune
Member

@lavalamp to fact check above.

@lavalamp
Member

I think the easiest way to get this is to actually just use HAProxy in place of a load balancer (some assembly required). If you're already using a load balancer, your cloud provider should be keeping the logs you desire.
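
For reference, a minimal HAProxy sketch along those lines with HTTP access logging enabled; all names, ports, and addresses below are placeholders:

```
# Hedged sketch: a bare-bones HAProxy config that emits HTTP access logs.
global
    log /dev/log local0

defaults
    mode http
    log global
    option httplog            # the HTTP log format from section 8.2.3
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend my_service_in
    bind *:8080
    default_backend my_service_pods

backend my_service_pods
    # placeholder pod endpoints; in practice these would need to be kept
    # in sync with the service's endpoints
    server pod-a 10.244.1.5:8080 check
    server pod-b 10.244.2.7:8080 check
```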

At the moment, we do store access logs for requests that go through the apiserver's proxy; however, it's not in HAProxy's format, and it's mixed in with all the API calls.

I don't think it makes sense to make the apiserver proxy fancy, since for production work you almost certainly want to use a load balancer/HAProxy/nginx for getting users to your services.

It might make sense to do something fancier with apiserver's log if there's a compelling reason for people to want to scrape it for data; it shouldn't be too hard to make it conform to any desired format, and I don't think we have strong opinions about what the format should be (other than useful).

@pz07
Author

pz07 commented Jul 14, 2015

@erictune, I meant logs for web servers in user-created pods. API server logs would be useful as well, but less critical, I guess.

Usually, when an application consists of many services with service-to-service communication, I try to put each service behind HAProxy (other L7 proxies would probably work as well). This configuration works very well for me in production. For example, with HAProxy HTTP logs I can get information such as (a rough sketch of extracting these from the logs follows the list):

  • average response time (in general or for a given REST call),
  • top 10 longest response times today - very useful for finding performance bottlenecks in the application,
  • error response code rate - one can find out about a production issue before somebody calls support,
  • and many more.
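
As a rough sketch, this is how such numbers can be pulled out of the default "option httplog" output (field positions depend on the syslog prefix, so $10/$11 may need adjusting):

```sh
# Hedged sketch: average server response time and 5xx rate from an
# HAProxy HTTP log. With the default syslog prefix, $10 holds the
# Tq/Tw/Tc/Tr/Tt timers and $11 the HTTP status code.
awk '{
    split($10, t, "/");          # timers in milliseconds
    sum += t[4]; n++;            # t[4] = Tr, the server response time
    if ($11 ~ /^5/) err++;       # count 5xx responses
}
END {
    if (n > 0)
        printf "requests=%d avg_Tr_ms=%.1f err_5xx_pct=%.2f\n",
               n, sum / n, 100 * err / n;
}' /var/log/haproxy.log
```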

So far I haven't found a way to get such information for service-to-service communication inside Kubernetes, and I'm afraid I will have too little information to investigate production issues (my application is not in production yet).

That said, I'm looking forward to #561 being implemented.

@lavalamp, I will certainly use a load balancer to get users to my services, but that will only allow me to log external traffic. I won't be able to get logs for service-to-service communication.

@dchen1107
Member

subscribe me

@lavalamp
Member

I will certainly use a load balancer to get users to my services, but that will only allow me to log external traffic. I won't be able to get logs for service-to-service communication.

That's a fair point. Your options for this are: a) scrape kube-proxy logs on every machine (kube-proxy does TCP proxying, not HTTP, so you won't get much useful data this way), or b) aggregate logs from each pod running the service.
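
A hedged sketch of option b), assuming the pods carry an app=my-service label and write their access logs to stdout:

```sh
# Hedged sketch: collect logs from every pod backing the service via a
# label selector, then filter for HTTP access-log lines.
# "app=my-service" and the grep pattern are placeholders.
kubectl get pods -l app=my-service -o name \
  | xargs -I{} kubectl logs {} \
  | grep -E '"(GET|POST|PUT|DELETE) ' > /tmp/my-service-access.log
```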

@erictune erictune added this to the v1.0-post milestone Jul 14, 2015
@erictune
Member

@thockin we just talked about this. You said that you thought doing L7 in kube-proxy might degrade network performance. But if someone is already using HAProxy between their microservices, maybe there isn't much change when moving to kubernetes with an L7-enabled kube-proxy.

@erictune
Member

@thockin @lavalamp maybe doing L7 proxying in kube-proxy could be optional: an introspection vs. performance tradeoff the user can make.

@thockin
Member

thockin commented Jul 14, 2015

L7 in userspace may also defeat some micro-segmentation plans which operate below L7. I am not against making this more visible, but we have to consider the intersection of a lot of desirable features.


@erictune erictune added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Jul 14, 2015
@pz07
Author

pz07 commented Jul 15, 2015

b) aggregate logs from each pod running the service

@lavalamp, do you mean aggregating application logs? That's a must-have, but I still find HTTP logs extremely useful. Application logs are of varying quality, and it's often hard to reason from them.

@erictune
Member

This is clearly useful. I'm not sure whether it is a core feature of kubernetes or a feature of a PaaS/framework that runs on top of it. @smarterclayton does OpenShift provide a standard way to log intra-cluster HTTP requests along with latency stats for them?

@aronchick
Contributor

Wouldn't the best practice be to run the web server as a foreground process and output logs to stdout? Or is this for multiple logs?


@smarterclayton
Contributor

It (OpenShift) does not do this automatically (it doesn't change the kube-proxy internally to be L7-aware). I know a few folks have replaced the service proxy with HAProxy, and we plan on exposing some metrics data from our edge HAProxy to heapster / influx for use in autoscaling; HAProxy, Apache, and others could easily generate the latency numbers. I don't think we have a short-term plan to track latency, but I would be interested to see what folks come up with. When the service proxy moves to iptables, stats will be harder.

Pods can definitely log and aggregate their own HTTP metrics - it would be nice to have a way to roll those up and process them to extract latency and other metrics.
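
One hedged way to do the per-pod logging is an nginx sidecar in front of the app container; a minimal config sketch (names and ports are placeholders):

```
# Hedged sketch: nginx sidecar that access-logs every request to stdout
# (so it shows up in "kubectl logs") before proxying to the app
# container listening on port 8080 in the same pod.
server {
    listen 80;
    access_log /dev/stdout combined;
    location / {
        proxy_pass http://127.0.0.1:8080;
    }
}
```

The Service would then target the sidecar's port 80 instead of the app container's port.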

@bgrant0607 bgrant0607 removed this from the v1.0-post milestone Jul 24, 2015
@thockin thockin added the sig/network Categorizes an issue or PR as relevant to SIG Network. label May 19, 2017
@thockin thockin closed this as completed May 19, 2017
@luarx

luarx commented Feb 17, 2020

Hi guys!
Do you know if there is an option to debug this nowadays? It would be very useful to debug internal HTTP requests between services in a Kubernetes cluster.

As you said, it is easy to debug external requests that go into the cluster through an HAProxy, but for internal requests I don't know how to do it...
