Instrumentation needed for kubelet, apiserver, etc #1625

We do not report much in the way of stats (memory used, latency, counters, etc.) for our core components.

Comments
Let's use http://golang.org/pkg/expvar/ to publish metrics?
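For concreteness, a minimal sketch of what that could look like; the metric name, handler, and port here are made up for illustration, not anything proposed in this issue:

```go
package main

import (
	"expvar"
	"net/http"
)

// Vars published through expvar are exposed as JSON by the handler that
// the expvar package registers on http.DefaultServeMux at /debug/vars.
var podSyncs = expvar.NewInt("pod_syncs") // hypothetical counter name

func main() {
	http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		podSyncs.Add(1) // bump the counter just to show the API
		w.Write([]byte("ok"))
	})
	http.ListenAndServe(":8080", nil)
}
```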
I've proposed the same thing on #2675 @lavalamp
Most of the existing tools try to daemonize the stats/metrics process, which is not what we are looking for. I'd like a simple package that lets us expose metrics per component/server; it's probably easy enough to implement our own. Speaking of which, running the stats as a per-node daemon is also an option, but I don't think we are there yet.
This is redundant with #621, but I'll close that one, as this has more discussion.
/cc @satnam6502
#415 is related.
#490 is also somewhat related.
/cc @nikhiljindal
cc @rsokolowski
I assume our goal is to add an HTTP endpoint to these jobs so that it's easy for third parties to do what they will with the metrics, right? That's a much more extensible approach than having to add new plugins or libraries to kubernetes to support exporting to different metric aggregators. Assuming that's the case, has anyone looked much into or worked with any of the existing libraries we could use for this? I see the following from a quick search:
AFAIK nobody has looked at anything here. I want to see things like, for
@ddysher expvar does not use a fixed URL. @thockin @a-robinson For publishing metrics over HTTP, etcd uses https://github.com/codahale/metrics. We have looked at other options too, but they are not suitable for a simple HTTP metrics endpoint.
I don't have a preference up front, I just want something a)
@xiang90, you do? All I see in etcd's repo is this custom wrapper around expvar. Maybe it was inspired by codahale's library (minus the histogram stuff), but it doesn't look like you're using codahale's stuff directly.
@a-robinson Actually, we first went with go-metrics and found it is overkill for an HTTP endpoint.
Ah, I'm sorry. GitHub hadn't indexed your change yet, so my searches for codahale and for metrics didn't turn that up. Other than its support for distributions, the codahale library looks worse than directly using expvar. It uses global mutexes for all updates on counters, gauges, and histograms (one mutex for each category), and it doesn't seem to make the interfaces any nicer than expvar's. I'd propose just using expvar directly for everything other than histograms, for which we could try using the implementation in rcrowley's larger library or codahale's hdrhistogram, with a bit of added logic to export them through expvar. I'll throw a few simple metrics into the apiserver as a proof of concept.
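To make the histogram part concrete, here's a rough stdlib-only sketch of exporting a percentile through expvar; this is my own illustration with a naive bounded sample and a made-up metric name, whereas a real version would presumably plug in rcrowley's or codahale's histogram instead:

```go
package metrics

import (
	"expvar"
	"sort"
	"sync"
	"time"
)

// latencySample keeps a bounded window of observed latencies so that
// expvar can report percentiles. Purely illustrative, not tuned.
type latencySample struct {
	mu      sync.Mutex
	values  []time.Duration
	maxSize int
}

func (s *latencySample) Observe(d time.Duration) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if len(s.values) >= s.maxSize {
		s.values = s.values[1:] // drop the oldest observation
	}
	s.values = append(s.values, d)
}

// quantile returns the q-th quantile (0..1) of the window in milliseconds.
func (s *latencySample) quantile(q float64) float64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	if len(s.values) == 0 {
		return 0
	}
	sorted := append([]time.Duration(nil), s.values...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	idx := int(q * float64(len(sorted)-1))
	return float64(sorted[idx]) / float64(time.Millisecond)
}

var requestLatency = &latencySample{maxSize: 1024} // hypothetical metric

func init() {
	// expvar.Func re-evaluates on every read of /debug/vars.
	expvar.Publish("request_latency_ms_p99", expvar.Func(func() interface{} {
		return requestLatency.quantile(0.99)
	}))
}
```

The same expvar.Func pattern would work for whichever percentiles and components we end up caring about.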
Sure. We are not worried about the performance of the stats at this moment (since it is low frequency and low contention).
Here are a few other ideas. Maybe we should put this stuff in a doc...

All components:
Note that because the master does not register itself as a node, the usage of the pods there is not tracked. I think someone was working on fixing that.
We measure package pull times in the Kubelet (and latency of Docker operations), but we don't have restarts, crashes, etc. We should add those. Prometheus also exports some metrics from the Go runtime by default.
+1 on etcd metrics; since it seems to be a common pain point, more visibility would be great. As for what Prometheus automatically exports: resident and virtual memory usage, CPU usage, number of goroutines, number (and max number) of open file descriptors, and process start time, although I don't know how accurate it all is.
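For anyone who wants to poke at those process/runtime metrics directly, a minimal sketch using the Prometheus Go client; this assumes the default registry still auto-registers the Go and process collectors, and uses the current promhttp handler (the handler name has changed across client_golang versions):

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	// The client library's default registry includes the Go collector
	// (goroutines, GC, memory stats) and the process collector (CPU,
	// resident/virtual memory, open FDs, start time) on supported platforms.
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":9090", nil)
}
```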
I'm not sure what this means; we definitely do have etcd do CAS for us. Every CAS is sent to etcd.
Good thing to track, but the scheduler has no idea about the latter, so it may not be the best place to track it. Instead, how about "rejected by apiserver", keeping a histogram of rejection reasons? This would let us see the double-scheduling rate (to verify that it's very low).
This should already be gettable from the status the kubelet writes about the pod that the component runs in. I think we should only talk about things that aren't applicable to all components in this bug, because those things should be implemented by Kubernetes on behalf of all pods in the system. Other ideas:
(Stats on X means: rolling median, 99th percentile, maybe average.) All of that can be added to the workqueue object; the interface allows it to collect all that info.
Thanks for the feedback. Yeah, sorry about my confusion about etcd index vs. apiserver resourceVersion.
Oh that's cool, I thought the only stuff we were exporting was the HTTP handlers you explicitly instrumented (PRs listed earlier in this issue). How do I access the Prometheus stats that are exported? And is every k8s component (api server, scheduler, controller manager, kubelet, etc.) linked with Prometheus, so we're getting these stats for every component?
Yeah, sorry, I didn't mean to imply this would be counted by the scheduler (I should have labeled that section "scheduling", not "scheduler"). What I had in mind was something like exporting a count of the number of pods (cluster-wide) that have gone to phase podFailed with one of the message strings listed in handleNotFittingPods(). Does that sound reasonable? BTW, I wasn't clear on what you meant by "rejected by apiserver".
I guess it depends on whether you view this bug as being about "requirements" or "implementation." I was hoping this issue could basically be "additional instrumentation needed for 1.0 to make us feel comfortable that users will have enough information to debug problems." In that sense I think it doesn't matter whether it's something we only do in one component or in all components.
@davidopp All of our components (i.e. apiserver, scheduler, controller manager, kubelet) are exporting Prometheus metrics on the /metrics handler. I believe it's enabled everywhere now.
It sounds like this issue might be resolved. @davidopp / @a-robinson, can you verify that we have metrics available? Then we can file individual issues if we believe any further metrics are necessary for v1.0.
+1 for @roberthbailey. I think we already have a bunch of different metrics. If we find something is missing, we can file another issue.
IMO this issue isn't finished. Maybe we can move it out of 1.0, but there is a long list of metrics that we don't have yet that will be useful for debugging production clusters (for example, see my earlier comment in this issue). I'd rather not file individual issues right now, as that will just explode the number of open issues.
I'd definitely be interested in scheduling latency as well as more visibility into etcd, such as errors broken down by type and number of open watches, but after thinking a little more I'm not convinced anything more is absolutely needed for 1.0.
Thanks @a-robinson. Moving this to the v1.0-post milestone so that we can follow up with all of the great ideas for metrics discussed herein.
We did an experiment today and discovered that it actually is possible to access the /metrics endpoint on the master components (previously we had thought not, because the master node does not register as a cluster node). One way to do this (for the apiserver; the same should work for other master components, but you need to know their port number instead of 8080; see pkg/master/ports/ports.go for port numbers):
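As a rough sketch of that kind of check (my own illustration, not the exact steps from the experiment), something along these lines should work from the master host itself, assuming the apiserver is serving its insecure port on localhost:8080:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Fetch the apiserver's Prometheus text output directly; substitute the
	// scheduler's or controller manager's port (see pkg/master/ports) to
	// check those components instead.
	resp, err := http.Get("http://localhost:8080/metrics")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Print(string(body))
}
```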
/cc @dchen1107
I think that is because today the master kubelet is registered with the cluster. It'll be interesting to see if this still works after I explicitly detach them in #6949.
Do we have any visibility into or tracking of master component crashes/restarts?
I'm sure we could do more here, but closing in favor of more specific issues.