Volume metrics exposed in /stats/summary not available in /metrics #34137

Closed
vjsamuel opened this issue Oct 5, 2016 · 30 comments
Assignees
Labels
area/kubelet lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. sig/storage Categorizes an issue or PR as relevant to SIG Storage.

Comments

@vjsamuel
Contributor

vjsamuel commented Oct 5, 2016

Kubernetes Version: 1.3

On hitting http://node:10255/stats/summary I am able to see volume metrics like:

"volume": [
     {
      "availableBytes": 92852543488,
      "capacityBytes": 92852555776,
      "usedBytes": 12288,
      "name": "default-token-eb5uv"
     }
    ]

but the same metrics are not available when I hit http://node:10255/metrics.

I'm under the assumption that all metrics are equally available on both endpoints. We use Prometheus today for our time-series metric storage, and because of this gap we cannot see volume usage in Prometheus.
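
To make the gap concrete, here is a minimal sketch of what translating the Summary API volume entries into Prometheus text format could look like. The metric names (`example_volume_*_bytes`) and the `volume` label are hypothetical and chosen only for illustration; they are not metrics the kubelet is known to expose.

```python
# Hypothetical translation of /stats/summary volume entries into
# Prometheus text exposition lines. Metric and label names are made up.

# Maps a field of the summary JSON to an illustrative metric name.
FIELD_TO_METRIC = {
    "availableBytes": "example_volume_available_bytes",
    "capacityBytes": "example_volume_capacity_bytes",
    "usedBytes": "example_volume_used_bytes",
}


def volumes_to_prometheus(volumes):
    """Return one exposition line per (volume, field) pair."""
    lines = []
    for vol in volumes:
        for field, metric in FIELD_TO_METRIC.items():
            if field in vol:
                lines.append(f'{metric}{{volume="{vol["name"]}"}} {vol[field]}')
    return lines


# The volume entry from the report above.
sample = [
    {
        "availableBytes": 92852543488,
        "capacityBytes": 92852555776,
        "usedBytes": 12288,
        "name": "default-token-eb5uv",
    }
]

for line in volumes_to_prometheus(sample):
    print(line)
```

A real exporter would also need pod and namespace labels to keep series unique across pods; those are left out here because the snippet above shows only the volume entry.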

@micahhausler
Member

@timothysc I'd be happy to make a PR that adds this, but I'm not sure where I'd need to add this. Would this be a cadvisor patch or a k8s patch? What package might that take place in?

@timothysc
Member

/cc @timstclair

@timstclair

Volume stats collection is currently only in the Kubelet, so this would be a Kubernetes patch. I believe we don't currently translate the Summary metrics (including volume) into prometheus metrics. Most stats are translated in cAdvisor directly, and some additional Kubelet specific stats are translated here. I'm not sure whether it makes sense to plumb the Summary through to that metrics collector or not.

@dashpole may have more thoughts.

@micahhausler
Member

I don't have a great way of monitoring PVC device usage without adding it to the kubelet /metrics. I already have prometheus set up, and cAdvisor doesn't pick up mounts from an EBS volume. Is there a better way to monitor PVC device usage?

@micahhausler
Member

Is this an accepted feature?

@javefang

Up vote for this feature! We also have prometheus set up but it can't monitor PV usage currently due to the missing metrics under the /metrics path. Since kubelet seems to have the data already (/stats/summary), it might be a good idea to expose them under /metrics?

@msau42
Member

msau42 commented May 22, 2017

/sig storage

@k8s-ci-robot k8s-ci-robot added the sig/storage Categorizes an issue or PR as relevant to SIG Storage. label May 22, 2017
@jingxu97
Contributor

I am working on this feature and have assigned the issue to myself. I will send out a proposal soon.

@jingxu97 jingxu97 self-assigned this Jul 28, 2017
@cofyc
Member

cofyc commented Aug 16, 2017

@jingxu97
Any progress on this?

@jingxu97
Contributor

I have a proposal in the community kubernetes/community#855
@vkamra is currently working on the feature.

@vkamra

vkamra commented Aug 16, 2017

@cofyc @javefang - Take a look at the proposal Jing linked above. I'm currently working on the first part of this which is adding PVC information to the kubelet stats summary. After that, we should be able to expose that via prometheus metrics.

There was some discussion in the last sig meeting whether the metric exposed needs to be indexed by PVC name or if PodName + Pod Volume Name is sufficient. Can you provide some input on that for your use case?

@cofyc
Member

cofyc commented Aug 17, 2017

@vkamra

Hi, here is what we plan to do in our production cluster:

  • monitor PVC status and capacity requested (we use kube-state-metrics to monitor PVC objects)
  • monitor volume capacity (in bytes) (total and per volume) (1)
  • monitor volume used (in bytes) (total and per volume) (2)
  • monitor volume available (in bytes) (total and per volume) (3)
  • monitor volume inodes/used/free (same as above) (4)

For (1), (2), (3), and (4) we need volume metrics to be associated with PVC/PV objects, because users create PVC objects to request a PV. The pod name is often generated by a controller (e.g. a Deployment), and the pod volume name is meaningless outside of the Pod. For monitoring and alerting, these metrics are better associated with the PVC, so that users can tell, for example, which of the PVCs they created is nearly full.

I think VolumeStat indexed by PodName + Pod Volume Name is sufficient as long as VolumeStat includes PVC/PV references. But in the /metrics endpoint, which is consumed by Prometheus directly, volume metrics need to include PVC and PV labels.

@jingxu97
Contributor

Hi @cofyc, we added the PVC ref to the volume metrics in 1.8 (#51448).
You should now be able to see the information through the /metrics endpoint. Please let us know if you have any issues with it. Thanks!
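
For the "which PVC is nearly full" use case raised earlier in the thread, a minimal sketch of consuming such metrics might look like the following. It assumes the kubelet exposes series named `kubelet_volume_stats_used_bytes` and `kubelet_volume_stats_capacity_bytes` with `namespace` and `persistentvolumeclaim` labels; check the actual /metrics output of your kubelet version before relying on these names. The sample text and PVC names are invented for illustration.

```python
import re

# Matches labelled samples such as: name{k="v",k2="v2"} 123
LINE_RE = re.compile(r'^(\w+)\{([^}]*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')

USED = "kubelet_volume_stats_used_bytes"          # assumed metric name
CAPACITY = "kubelet_volume_stats_capacity_bytes"  # assumed metric name


def parse_volume_stats(text):
    """Collect used/capacity bytes keyed by (namespace, PVC name)."""
    stats = {}
    for raw in text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # comments, blank lines, unlabelled samples
        name, labels, value = match.groups()
        if name not in (USED, CAPACITY):
            continue
        lab = dict(LABEL_RE.findall(labels))
        key = (lab.get("namespace"), lab.get("persistentvolumeclaim"))
        stats.setdefault(key, {})[name] = float(value)
    return stats


def nearly_full(stats, threshold=0.9):
    """Return the PVC keys whose used/capacity ratio meets the threshold."""
    result = []
    for key, values in sorted(stats.items()):
        used, cap = values.get(USED), values.get(CAPACITY)
        if used is not None and cap and used / cap >= threshold:
            result.append(key)
    return result


# Invented sample of /metrics output.
SAMPLE = """\
# HELP kubelet_volume_stats_used_bytes illustrative help text
kubelet_volume_stats_used_bytes{namespace="default",persistentvolumeclaim="data-db-0"} 9500000000
kubelet_volume_stats_capacity_bytes{namespace="default",persistentvolumeclaim="data-db-0"} 10000000000
kubelet_volume_stats_used_bytes{namespace="default",persistentvolumeclaim="logs"} 1000000000
kubelet_volume_stats_capacity_bytes{namespace="default",persistentvolumeclaim="logs"} 10000000000
"""

print(nearly_full(parse_volume_stats(SAMPLE)))
```

In practice the same ratio would normally be computed in a Prometheus alerting rule rather than by parsing the text format by hand; the parser here only shows which labels the calculation depends on.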

@cofyc
Member

cofyc commented Oct 14, 2017

@jingxu97 Awesome! I will try it in our clusters.

@jingxu97
Contributor

@cofyc, @vjsamuel, @javefang, @micahhausler: we are continuing to work on exposing volume metrics in Kubernetes and want to collect more feedback. One question that came up is whether users prefer to get volume usage information:

  1. Inside Pod stats, where the volume stats contain a PVC reference (namespace, PVC name).
  2. As PVC stats exposed separately from the Pod, so that the metric carries the PVC name (possibly the PV name) and the usage/capacity information.

If we choose option 2, will users want to know which pod is using the PVC (that is, to get from the PVC name to the Pod name)? Thanks!
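
The relationship between the two options can be sketched as a re-keying step: starting from pod-scoped stats that carry a PVC reference (option 1), derive a PVC-keyed view (option 2) plus a reverse index from PVC to the pods using it. The field names `podRef`, `volume`, and `pvcRef` are assumptions about the summary shape, and the pod and claim names are invented.

```python
from collections import defaultdict


def pvc_view(pods):
    """Re-key pod-scoped volume stats by (namespace, PVC name).

    Returns (usage, users): usage maps each PVC to its byte counts,
    users maps each PVC to the names of the pods mounting it.
    """
    usage = {}
    users = defaultdict(list)
    for pod in pods:
        pod_name = pod["podRef"]["name"]
        for vol in pod.get("volume", []):
            ref = vol.get("pvcRef")
            if ref is None:
                continue  # not PVC-backed, so it has no place in option 2
            key = (ref["namespace"], ref["name"])
            usage[key] = {
                "usedBytes": vol["usedBytes"],
                "capacityBytes": vol["capacityBytes"],
            }
            users[key].append(pod_name)
    return usage, dict(users)


# Invented sample: two pods share one claim; one pod also has a scratch volume.
shared = {"name": "shared", "namespace": "default"}
pods = [
    {
        "podRef": {"name": "web-1", "namespace": "default"},
        "volume": [
            {"name": "content", "usedBytes": 2048, "capacityBytes": 8192, "pvcRef": shared},
            {"name": "tmp", "usedBytes": 10, "capacityBytes": 100},
        ],
    },
    {
        "podRef": {"name": "web-2", "namespace": "default"},
        "volume": [
            {"name": "content", "usedBytes": 2048, "capacityBytes": 8192, "pvcRef": shared},
        ],
    },
]

usage, users = pvc_view(pods)
print(usage)
print(users)
```

Under these assumptions, option 2 can be derived from option 1 on one node, while the PVC-to-pod mapping is lost unless it is kept as a separate index or label.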

@javefang

javefang commented Oct 24, 2017 via email

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 22, 2018
@dashpole
Contributor

Was this completed in #51553? Or is this different?

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 21, 2018
@dylanzr

dylanzr commented Mar 14, 2018

Hello,
I can confirm that on 1.9.2 the volume stats from /stats/summary are still not available in the /metrics endpoint. Are there still plans to implement this?

Thanks!

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Mar 14, 2018
@HaveFun83

I have the same issue on 1.10.0

@jingxu97
Contributor

I think after this PR #59170, you need to set up kube-state-metrics to get the metrics

@ghost

ghost commented Jun 20, 2018

I have one requirement: I need to run this without using the port number (10255). That port is blocked, and we do not have access to open it. Is there any way to do this without using the port?

@seleznev

seleznev commented Jul 3, 2018

I think after this PR #59170, you need to set up kube-state-metrics to get the metrics

It looks like PR #59170 is about PVCs only. But with /stats/summary you can get stats for all mounted volumes (including emptyDir, secrets, etc.).
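
A small sketch of that difference: partitioning the volume entries of one pod into PVC-backed volumes and everything else shows what a PVC-only metric would leave out. It assumes a PVC-backed entry carries a `pvcRef` field; the volume and claim names other than `default-token-eb5uv` are invented.

```python
def split_by_pvc(volumes):
    """Partition volume names into (PVC-backed, all others)."""
    pvc_backed, other = [], []
    for vol in volumes:
        # Assumption: only PVC-backed volumes carry a "pvcRef" entry.
        (pvc_backed if "pvcRef" in vol else other).append(vol["name"])
    return pvc_backed, other


volumes = [
    {"name": "data", "usedBytes": 4096,
     "pvcRef": {"name": "data-claim", "namespace": "default"}},
    {"name": "scratch", "usedBytes": 8192},               # e.g. an emptyDir
    {"name": "default-token-eb5uv", "usedBytes": 12288},  # a secret volume
]

pvc_backed, other = split_by_pvc(volumes)
print("covered by PVC-only metrics:", pvc_backed)
print("visible only in /stats/summary:", other)
```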

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 1, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 31, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@novneet03

Currently I am using k8s version 1.22.X with containerd as the runtime and systemd as the cgroup driver.
I am still facing the same issue: nodeip:10255/stats/summary contains the volume data, but I am unable to see it in /metrics.

Whereas when I use EKS with the same runtime and cgroup driver, I can see all the volume_stats metrics on the /metrics route [http://localhost:8001/api/v1/nodes/ap-south-1.compute.internal/proxy/metrics]
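
The proxied route quoted above can be built for any node and kubelet path, which also avoids connecting to port 10255 directly. This is a minimal sketch, assuming `kubectl proxy` is listening on localhost:8001 as in that URL; the node name `node-1` is hypothetical, and `fetch` is an illustrative helper that is defined but not called here.

```python
from urllib.parse import quote
from urllib.request import urlopen


def node_proxy_url(node, kubelet_path, base="http://localhost:8001"):
    """Build the API-server proxy URL for a kubelet endpoint on a node."""
    return f"{base}/api/v1/nodes/{quote(node, safe='')}/proxy/{kubelet_path.lstrip('/')}"


def fetch(node, kubelet_path, timeout=5):
    """Fetch a kubelet endpoint through the proxy (needs a running kubectl proxy)."""
    with urlopen(node_proxy_url(node, kubelet_path), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Comparing these two responses for the same node shows whether the
# volume data in the summary is also present in the metrics output.
print(node_proxy_url("node-1", "/stats/summary"))
print(node_proxy_url("node-1", "/metrics"))
```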

@k8s-ci-robot
Contributor

@novneet03: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
