This repository has been archived by the owner on Dec 1, 2018. It is now read-only.

Expose network stats for pods #852

Merged (1 commit) on Jan 12, 2016

jimmidyson
Contributor

Fixes #368

Dirty: this uses the infra container name from the leaky package and a regex to map container names to pods, but it seems to work OK...

/cc @vishh @akash010 @smarterclayton @simon3z

@k8s-bot

k8s-bot commented Jan 4, 2016

Jenkins GCE e2e

Build/test failed for commit b5c136e.

@k8s-bot

k8s-bot commented Jan 5, 2016

Jenkins GCE e2e

Build/test failed for commit 5552fce.

@k8s-bot

k8s-bot commented Jan 5, 2016

Jenkins GCE e2e

Build/test failed for commit f1bc44b.

@smarterclayton

If leaky is imported just to get pod infra container name, it's not worth it. I'd just have it be a constant or parameter in this code.

@jimmidyson
Contributor Author

retest this please

@jimmidyson
Contributor Author

Removed leaky package dependency.
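
For readers following along, here is a minimal sketch of the idea being discussed: a local constant for the infra container name plus a regex that maps a container name to its pod. It is not the code from this PR. It assumes the kubelet's Docker naming scheme of the time, `k8s_<container>.<hash>_<pod>_<namespace>_<uid>_<attempt>`, and assumes the infra container is named `POD`. The names `parseContainerName`, `isInfraContainer`, and `podRef` are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// infraContainerName is an assumption: a local constant standing in for the
// value that was previously imported from the leaky package.
const infraContainerName = "POD"

// dockerNameRe matches the assumed naming scheme
// k8s_<container>.<hash>_<pod>_<namespace>_<uid>_<attempt>,
// with an optional leading slash as Docker reports it.
var dockerNameRe = regexp.MustCompile(`^/?k8s_([^_.]+)\.[^_]+_([^_]+)_([^_]+)_[^_]+_[^_]+$`)

// podRef (hypothetical) holds the parts recovered from a container name.
type podRef struct {
	Container string
	Pod       string
	Namespace string
}

// parseContainerName returns the container, pod, and namespace encoded in
// name, or false when name does not follow the assumed scheme.
func parseContainerName(name string) (podRef, bool) {
	m := dockerNameRe.FindStringSubmatch(name)
	if m == nil {
		return podRef{}, false
	}
	return podRef{Container: m[1], Pod: m[2], Namespace: m[3]}, true
}

// isInfraContainer reports whether ref points at the pod infra container.
func isInfraContainer(ref podRef) bool {
	return ref.Container == infraContainerName
}

func main() {
	names := []string{
		"/k8s_POD.e4cc795_nginx-1_default_0f5b8c3a-b3a1_0",
		"/k8s_nginx.1a2b3c4_nginx-1_default_0f5b8c3a-b3a1_0",
		"/some-other-container",
	}
	for _, n := range names {
		ref, ok := parseContainerName(n)
		fmt.Println(n, ok, ref, ok && isInfraContainer(ref))
	}
}
```

Keeping the name as a constant (or a parameter) avoids pulling in a whole package for a single string, which is the trade-off raised above.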

@k8s-bot

k8s-bot commented Jan 5, 2016

Jenkins GCE e2e

Build/test failed for commit cf6f29b.

jimmidyson changed the title from "Expose network stats for pods." to "[WIP] Expose network stats for pods." on Jan 5, 2016
@jimmidyson
Contributor Author

Got more work to do on this to expose via the API so changing to WIP - please don't merge.

@vishh @mwielgus Do you know if the Jenkins GCE e2e should pass? Seeing unrelated flake in logs:

No error is expected but got expected [gcm] sinks, found []

@mwielgus
Contributor

mwielgus commented Jan 5, 2016

Yeah, the problem seems to be unrelated. BTW, we will have to redo this work in heapster-scalability branch.

@jimmidyson
Contributor Author

@mwielgus Are you targeting the new metrics APIs in the scalability branch? Sorry I've not kept up to speed with it.

@mwielgus
Contributor

mwielgus commented Jan 5, 2016

If the new API is delivered on time then yes, we will switch; if not, we will keep using the old one.

jimmidyson changed the title from "[WIP] Expose network stats for pods." to "Expose network stats for pods" on Jan 5, 2016
@k8s-bot

k8s-bot commented Jan 5, 2016

Jenkins GCE e2e

Build/test failed for commit 42d3c59.

@jimmidyson
Contributor Author

Ready for review please. Jenkins GCE e2e is flaky & needs to be either disabled or fixed separately from this PR.

// A model entity can be a Pod, a Container, a Namespace or a Node.
type ExternalEntityListEntry struct {
	Name     string `json:"name"`
	CPUUsage uint64 `json:"cpuUsage"`
	MemUsage uint64 `json:"memUsage"`
	RxBytes  uint64 `json:"rx_bytes"`
Contributor

Do we need network metrics in the model or is it necessary only for monitoring purposes?
The cost of adding metrics to the model is kinda high for now.

Contributor Author

So are you saying it shouldn't be returned by API queries, only passed to sinks? I'd prefer it to be exposed via the REST API & that means using the model, doesn't it?

Contributor

Yeah. Is it required to be exposed via APIs as well?

Contributor Author

Point me to the requirements :) Don't we make it up as we go along? ;)

Honestly, if it's too expensive to store a couple of extra values per pod then I don't mind dropping it from there & just leave it as passed to sinks.

Contributor

I'd personally prefer exposing this and filesystem stats as well. It's just that the default resource limits in Kube have to change once these metrics are added. So as long as we can go fix the limits, then adding these metrics is OK by me :)

Contributor Author

OK, how about I remove it from the model for this version & revisit it for @mwielgus' rewritten version?

Contributor

SGTM. So is monitoring the primary use case?

Contributor Author

Monitoring & accounting, yes.

Contributor Author

Can't think of a use for autoscaling on network traffic right now.

jimmidyson force-pushed the network-stats branch 2 times, most recently from 44cd513 to b445c4f, on January 5, 2016 14:58
@k8s-bot

k8s-bot commented Jan 5, 2016

Jenkins GCE e2e

Build/test failed for commit 44cd513.

@k8s-bot

k8s-bot commented Jan 5, 2016

Jenkins GCE e2e

Build/test failed for commit b445c4f.

pod := &sd.data.Pods[podIndex]
// If we find a matching pod then add the container to the pod's containers slice.
if pod.Name == podName && pod.Namespace == podNamespace {
	cont.Hostname = pod.Hostname
Contributor

Ideally, the pod infra container should be hidden inside heapster. That way if we collect metrics from rocket for example, which does not need an infra container, we will not break users of heapster.

Contributor Author

Suggestions on how to do that? Right now I can't think of one tbh.

Contributor

We need stats at the pod level in addition to container level.

Contributor Author

Bit confused by this. We're adding the infra container to the appropriate pod so this is done.

@vishh
Contributor

vishh commented Jan 5, 2016

General structure LGTM. Thanks @jimmidyson !!

@k8s-bot

k8s-bot commented Jan 5, 2016

Jenkins GCE e2e

Build/test failed for commit e412d1a.

@jimmidyson
Contributor Author

Network stats are now only sent to sinks, not retrievable via Heapster API as discussed with @vishh.
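
To make the split concrete, the sketch below routes every metric to sinks while keeping metrics flagged as sink-only out of the in-memory model that backs the API. It is only an illustration of the decision reached in the review, not Heapster's actual code: the `Metric` type, the `SinkOnly` flag, the `route` function, and the metric names are all hypothetical.

```go
package main

import "fmt"

// Metric (hypothetical) is a single sample. SinkOnly marks metrics that
// should be exported to sinks but not stored in the API model.
type Metric struct {
	Name     string
	Value    uint64
	SinkOnly bool
}

// route sends every metric to the sinks and only the metrics that are not
// marked SinkOnly to the model, so the model's memory cost stays unchanged.
func route(batch []Metric) (toSinks, toModel []Metric) {
	for _, m := range batch {
		toSinks = append(toSinks, m)
		if !m.SinkOnly {
			toModel = append(toModel, m)
		}
	}
	return toSinks, toModel
}

func main() {
	// Illustrative metric names; the network metrics are flagged sink-only.
	batch := []Metric{
		{Name: "cpu/usage", Value: 10},
		{Name: "memory/usage", Value: 20},
		{Name: "network/rx", Value: 30, SinkOnly: true},
		{Name: "network/tx", Value: 40, SinkOnly: true},
	}
	toSinks, toModel := route(batch)
	fmt.Println("sinks:", len(toSinks), "model:", len(toModel))
}
```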

@jimmidyson
Contributor Author

@vishh Please can you let me know what is outstanding for this PR to be merged?

@vishh
Contributor

vishh commented Jan 12, 2016

The only issue with this PR is that of exposing the infra container. I'd prefer adding pod-level metrics and exposing network stats as a pod-level metric.
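
As a rough illustration of that suggestion, the sketch below copies the infra container's network counters up to the pod and drops the infra container from the list that consumers see. With the Docker runtime, the containers of a pod share the infra container's network namespace, which is why the counters live there. The types and the `liftNetworkStats` function are hypothetical, and the infra container name `POD` is an assumption; none of this is taken from the PR or from the heapster-scalability branch.

```go
package main

import "fmt"

// infraContainerName is an assumption about how the infra container is named.
const infraContainerName = "POD"

// NetworkStats (hypothetical) holds cumulative byte counters.
type NetworkStats struct {
	RxBytes uint64
	TxBytes uint64
}

// Container (hypothetical). Network is nil when no stats were collected.
type Container struct {
	Name    string
	Network *NetworkStats
}

// Pod (hypothetical) carries pod-level network stats next to its containers.
type Pod struct {
	Name       string
	Namespace  string
	Containers []Container
	Network    *NetworkStats
}

// liftNetworkStats returns a copy of p in which the infra container's network
// stats are attached to the pod and the infra container itself is hidden.
func liftNetworkStats(p Pod) Pod {
	out := Pod{Name: p.Name, Namespace: p.Namespace, Network: p.Network}
	for _, c := range p.Containers {
		if c.Name == infraContainerName {
			out.Network = c.Network
			continue // do not expose the infra container to consumers
		}
		out.Containers = append(out.Containers, c)
	}
	return out
}

func main() {
	pod := Pod{
		Name:      "web",
		Namespace: "default",
		Containers: []Container{
			{Name: infraContainerName, Network: &NetworkStats{RxBytes: 100, TxBytes: 50}},
			{Name: "nginx"},
		},
	}
	lifted := liftNetworkStats(pod)
	fmt.Println(len(lifted.Containers), *lifted.Network)
}
```

A runtime that has no infra container would simply populate the pod-level field directly, so consumers would not need to change.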

@jimmidyson
Contributor Author

This is how it's been done in the heapster-scalability branch & I'm not going to duplicate that functionality as this is just a quick fix until the heapster-scalability branch becomes master.

@vishh
Contributor

vishh commented Jan 12, 2016

Ok then. No issues in that case.

@jimmidyson
Contributor Author

Thanks @vishh.

Merging.

jimmidyson added a commit that referenced this pull request Jan 12, 2016
jimmidyson merged commit 78ff89c into kubernetes-retired:master on Jan 12, 2016