Add Kubelet metrics for pod and container counts. #4792

Merged on Feb 26, 2015 (1 commit)
2 changes: 2 additions & 0 deletions pkg/kubelet/kubelet.go
@@ -135,6 +135,8 @@ func NewMainKubelet(
 	}
 	klet.dockerCache = dockerCache

+	metrics.Register(dockerCache)
+
 	if err = klet.setupDataDirs(); err != nil {
 		return nil, err
 	}
86 changes: 66 additions & 20 deletions pkg/kubelet/metrics/metrics.go
@@ -17,6 +17,11 @@ limitations under the License.
 package metrics

 import (
+	"sync"
+
+	"github.com/GoogleCloudPlatform/kubernetes/pkg/kubelet/dockertools"
+	"github.com/GoogleCloudPlatform/kubernetes/pkg/types"
+	"github.com/golang/glog"
 	"github.com/prometheus/client_golang/prometheus"
 )

@@ -30,30 +35,71 @@ var (
 			Help:      "Image pull latency in microseconds.",
 		},
 	)
-	// TODO(vmarmol): Implement.
-	// TODO(vmarmol): Split by source?
-	PodCount = prometheus.NewGauge(
-		prometheus.GaugeOpts{
-			Subsystem: kubeletSubsystem,
-			Name:      "pod_count",
-			Help:      "Number of pods currently running.",
-		},
-	)
-	// TODO(vmarmol): Implement.
-	// TODO(vmarmol): Split by source?
-	ContainerCount = prometheus.NewGauge(
-		prometheus.GaugeOpts{
-			Subsystem: kubeletSubsystem,
-			Name:      "container_count",
-			Help:      "Number of containers currently running.",
-		},
-	)
 	// TODO(vmarmol): Containers per pod
 	// TODO(vmarmol): Latency of pod startup
 	// TODO(vmarmol): Latency of SyncPods
 )

-func init() {
+var registerMetrics sync.Once
+
+// Register all metrics.
+func Register(containerCache dockertools.DockerCache) {
 	// Register the metrics.
-	prometheus.MustRegister(ImagePullLatency)
+	registerMetrics.Do(func() {
+		prometheus.MustRegister(ImagePullLatency)
+		prometheus.MustRegister(newPodAndContainerCollector(containerCache))
+	})
 }
+
+func newPodAndContainerCollector(containerCache dockertools.DockerCache) *podAndContainerCollector {
+	return &podAndContainerCollector{
+		containerCache: containerCache,
+	}
+}
+
+// Custom collector for current pod and container counts.
+type podAndContainerCollector struct {
+	// Cache for accessing information about running containers.
+	containerCache dockertools.DockerCache
+}
+
+// TODO(vmarmol): Split by source?
Contributor

what do you mean by source here?

Contributor Author

The 4 Kubelet sources: http, file, etcd, (4th I forgot)

Contributor

ok 👍
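
Splitting by source, as discussed in this thread, would mean reporting one count per Kubelet config source rather than a single total. The sketch below only illustrates the counting step. It is not code from this PR: `PodInfo`, its `Source` field, and `CountBySource` are hypothetical names, and how a running pod would be mapped back to its source is not covered here.

```go
package main

import (
	"fmt"
	"sort"
)

// PodInfo is a hypothetical record pairing a running pod with the
// config source that declared it (for example "file" or "http").
type PodInfo struct {
	UID    string
	Source string
}

// CountBySource returns the number of pods seen for each source.
func CountBySource(pods []PodInfo) map[string]int {
	counts := make(map[string]int)
	for _, p := range pods {
		counts[p.Source]++
	}
	return counts
}

func main() {
	counts := CountBySource([]PodInfo{
		{UID: "a", Source: "file"},
		{UID: "b", Source: "http"},
		{UID: "c", Source: "file"},
	})

	// Print in a stable order, since Go map iteration order is not defined.
	sources := make([]string, 0, len(counts))
	for s := range counts {
		sources = append(sources, s)
	}
	sort.Strings(sources)
	for _, s := range sources {
		fmt.Printf("%s=%d\n", s, counts[s])
	}
}
```

With the Prometheus client, each entry could then be emitted as one sample of a single metric whose descriptor declares a `source` variable label (the third argument of `prometheus.NewDesc`), passing the source as the label value to `prometheus.MustNewConstMetric`.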

+var (
+	runningPodCountDesc = prometheus.NewDesc(
+		prometheus.BuildFQName("", kubeletSubsystem, "running_pod_count"),
+		"Number of pods currently running",
+		nil, nil)
+	runningContainerCountDesc = prometheus.NewDesc(
+		prometheus.BuildFQName("", kubeletSubsystem, "running_container_count"),
+		"Number of containers currently running",
+		nil, nil)
+)
+
+func (self *podAndContainerCollector) Describe(ch chan<- *prometheus.Desc) {
+	ch <- runningPodCountDesc
+	ch <- runningContainerCountDesc
+}
+
+func (self *podAndContainerCollector) Collect(ch chan<- prometheus.Metric) {
+	runningContainers, err := self.containerCache.RunningContainers()
Contributor

Why does the cache not return pods instead of containers?

Contributor Author

This is equivalent to "docker ps"

Contributor

Ok. Would it be possible to have a wrapper around this in the Kubelet core that provides pods instead of containers? As of now, I see quite a bit of duplicated code in the Kubelet around Docker container handling. This is not required for this PR, but having one API for getting pods might help in the long run.

Contributor Author

That's interesting, I'll take a look. A lot of the Kubelet does special things, like checking whether containers are not associated with a known pod, so it may or may not work to go that route.
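
The wrapper suggested in this thread could look roughly like the sketch below, which groups a flat "docker ps"-style container list into pods and sets aside containers that do not belong to a known pod. It is not code from this PR: `Container`, `Pod`, `parsePodUID`, and `GroupByPod` are hypothetical names, and the `<podUID>/<containerName>` name layout is an assumption made only so the example runs on the standard library. Real code would presumably rely on `dockertools.ParseDockerName` instead.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Container is a hypothetical stand-in for one running container.
type Container struct {
	Name string // assumed layout: "<podUID>/<containerName>"
}

// Pod groups the running containers that share a pod UID.
type Pod struct {
	UID        string
	Containers []Container
}

// parsePodUID extracts the pod UID from a container name. It reports
// false for names that do not follow the assumed layout, for example
// containers that were not started by the Kubelet.
func parsePodUID(name string) (string, bool) {
	i := strings.Index(name, "/")
	if i <= 0 {
		return "", false
	}
	return name[:i], true
}

// GroupByPod turns a flat container list into pods sorted by UID.
// Containers without a recognizable pod UID are returned separately.
func GroupByPod(containers []Container) (pods []Pod, unknown []Container) {
	byUID := make(map[string][]Container)
	for _, c := range containers {
		uid, ok := parsePodUID(c.Name)
		if !ok {
			unknown = append(unknown, c)
			continue
		}
		byUID[uid] = append(byUID[uid], c)
	}
	for uid, cs := range byUID {
		pods = append(pods, Pod{UID: uid, Containers: cs})
	}
	sort.Slice(pods, func(i, j int) bool { return pods[i].UID < pods[j].UID })
	return pods, unknown
}

func main() {
	pods, unknown := GroupByPod([]Container{
		{Name: "uid-a/web"},
		{Name: "uid-a/sidecar"},
		{Name: "uid-b/db"},
		{Name: "stray"},
	})
	fmt.Println("pods:", len(pods), "unassociated containers:", len(unknown))
}
```

Returning the unassociated containers separately keeps the special case mentioned in the reply visible to callers instead of silently dropping it. A collector like the one in this PR could then report `len(pods)` as the pod count.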

+	if err != nil {
+		glog.Warningf("Failed to get running container information while collecting metrics: %v", err)
+		return
+	}
+
+	// Collect the set of pod UIDs that have at least one running container.
+	runningPods := make(map[types.UID]struct{})
+	for _, cont := range runningContainers {
+		_, uid, _, _ := dockertools.ParseDockerName(cont.Names[0])
+		runningPods[uid] = struct{}{}
+	}
+
+	ch <- prometheus.MustNewConstMetric(
+		runningPodCountDesc,
+		prometheus.GaugeValue,
+		float64(len(runningPods)))
+	ch <- prometheus.MustNewConstMetric(
+		runningContainerCountDesc,
+		prometheus.GaugeValue,
+		float64(len(runningContainers)))
+}