[WIP] KEP 5105: kubectl top pvc #5106
base: master
Conversation
tsmetana commented on Jan 30, 2025:
- One-line PR description: Add a new `kubectl top persistentvolumeclaim` subcommand that would allow displaying the volume usage of PVCs
- Issue link: kubectl top persistentvolumeclaim #5105
- Other comments:
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: tsmetana. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files.
/cc @ardaguclu
@tsmetana: GitHub didn't allow me to request PR reviews from the following users: gmeghnag. Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Force-pushed from 5c3233a to 15f2fec.
There are the `kubelet_volume_stats_used_bytes` and `kubelet_volume_stats_capacity_bytes` Prometheus metrics which can be used to compute the volume used space percentage.
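As a sketch of the computation the KEP text describes, the used-space percentage could be derived with a single PromQL expression. This assumes a Prometheus instance that scrapes the kubelet's volume stats metrics; the expression is illustrative, not part of the proposal:

```promql
# Per-PVC used-space percentage; the two metrics share the same
# namespace/persistentvolumeclaim labels, so plain division joins them.
100 * kubelet_volume_stats_used_bytes
    / kubelet_volume_stats_capacity_bytes
```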
IIUC, `kubectl top pod` and `kubectl top node` get their data through the Metrics API served by metrics-server, not directly from Prometheus. If we want to support `kubectl top pvc`, do we want to get it directly from Prometheus? 🤔
> IIUC, `kubectl top pod` and `kubectl top node` are obtained by the metrics server definition API, not directly from Prometheus. If we want to support `kubectl top pvc`, what we want is to get it directly from Prometheus? 🤔
By using a Prometheus query we can get the "most interesting" value (the usage percentage) with just one call -- it is simple. Though I understand the Prometheus dependency is a disadvantage.
As @googs1025 mentioned, the implementation needs to be the same as top node and top pod. A direct Prometheus query would be very non-standard (and does not work in some cases, like exec credential plugins, etc.).
The good part is that following the same patterns as these two commands would make the implementation easier.
@tsmetana We have discussed similar questions in other issues; you can also refer to: 🤔
FYI: kubernetes/kubectl#1696
FYI: kubernetes-sigs/metrics-server#1623
After going through the documentation, I wonder what the benefit of using the Metrics API would be. The metrics-server README states:

> Metrics Server is meant only for autoscaling purposes. For example, don't use it to forward metrics to monitoring solutions, or as a source of monitoring solution metrics. In such cases please collect metrics from Kubelet /metrics/resource endpoint directly.

And what we're trying to achieve sounds more like the "for example": just monitoring. Reading the metrics directly from the kubelet would remove the necessity of running the metrics-server altogether. Yes, it's a lighter dependency than Prometheus, but still an additional dependency that doesn't look necessary. In light of the above quote I even wonder whether the existing `kubectl top` implementation is doing the right thing by depending on the Metrics API for simple resource usage querying, and whether adding APIs for volume usage metrics would not be too heavyweight a solution -- autoscalers have no use for them at all.
A similar KEP had been proposed in the past at #497, but it was closed without being implemented. Eventually, someone created a kubectl plugin (yashbhutwala/kubectl-df-pv) instead. Implementing it as a kubectl plugin might be one option.
If this feature depends on the Metrics API (where the metrics-server retrieves metrics from the kubelet and exposes them via the Metrics API), I think we need to discuss it with the SIG-Instrumentation (owner of the Metrics API and metrics-server subprojects) and get their agreement. See kubernetes-sigs/metrics-server#1318.
Here are some points I’m personally curious about:
- What are the required permissions to execute `kubectl top pvc`? Read access to Pods and PVCs?
- Not all PVs are mounted as file systems on Nodes. In the case of PVs using a FUSE CSI driver, what would be the expected result of `kubectl top pvc`?
> Not all PVs are mounted as file systems on Nodes. In the case of PVs using a FUSE CSI driver, what would be the expected result of `kubectl top pvc`?

This is a good point. 🤔 There may be multiple types of PVs, and support for third parties to add their own, which may make this implementation more complicated. In addition: are `kubelet_volume_stats_used_bytes` and `kubelet_volume_stats_capacity_bytes` both compatible and feasible for the current mainstream PVs?
> What are the required permissions to execute `kubectl top pvc`? Read access to Pods and PVCs?

I guess that would depend on the implementation. The user would have to be able to get to the kubelet stats somehow, either directly or potentially through the metrics-server. But I would really just keep the implementation confined to presenting statistics that are already present in the cluster.
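If the command were to read the kubelet stats through the API server proxy, the required RBAC might look roughly like the following hypothetical sketch. The `nodes/stats` subresource is how kubelet stats are exposed via the proxy; the role name and exact resource list are assumptions, since the implementation is undecided:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: top-pvc-reader   # hypothetical name
rules:
- apiGroups: [""]
  resources: ["pods", "persistentvolumeclaims"]
  verbs: ["get", "list"]
- apiGroups: [""]
  resources: ["nodes/stats"]   # kubelet stats via the API server proxy
  verbs: ["get"]
```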
> Not all PVs are mounted as file systems on Nodes. In the case of PVs using a FUSE CSI driver, what would be the expected result of `kubectl top pvc`?
>
> This is a good point. 🤔 There may be multiple types of PVs, and support for third parties to add their own, which may make this implementation more complicated. In addition: are `kubelet_volume_stats_used_bytes` and `kubelet_volume_stats_capacity_bytes` both compatible and feasible for the current mainstream PVs?
The volume stats are being collected by cAdvisor AFAIK, so whatever it supports would work. That's also where support for potential new volume types should come from, which is another reason I'd like to stick to using the existing metrics. I assume they'd cover the vast majority of use cases (I have some data from an OpenShift cluster, so I'm reasonably certain about it).
> The volume stats are being collected by cAdvisor AFAIK

As summarized in the table here, it seems that volume stats are exposed by the kubelet.

> I assume (I have some data from OpenShift cluster so I'm reasonably certain about it) they'd cover vast majority of use cases.

Basically, that's the case, but for CSI drivers that mount object storage via FUSE (e.g., awslabs/mountpoint-s3-csi-driver), the `kubelet_volume_stats_used_bytes` metric did not contain any values for such PVCs. If usage cannot be retrieved from `kubelet_volume_stats_used_bytes`, I think we could simply ignore their usage.
Force-pushed from 15f2fec to e863b37.
- [ ] (R) Production readiness review approved
- [ ] "Implementation History" section is up-to-date for milestone
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
Nit: make sure to check all the appropriate boxes.
[documentation style guide]: https://github.com/kubernetes/community/blob/master/contributors/guide/style-guide.md
-->

Provide a simple command to display a PersistentVolumeClaim capacity usage.
Suggested change:
- Provide a simple command to display a PersistentVolumeClaim capacity usage.
+ Provide a simple command to display a PersistentVolumeClaim capacity usage, similar to `kubectl top node`.
This might be a good place to talk about core concepts and how they relate.
-->

The feature depends on the metric endpoints to be accessible by the user. That
Are you talking about `metrics.k8s.io`? If so, let's make it clear.
extending the production code to implement this enhancement.
-->

- `<package>`: `<date>` - `<test coverage>`
This is needed.
https://storage.googleapis.com/k8s-triage/index.html
-->

- `<test>`: `<link to test coverage>`
This is required.
- [ ] Feature gate (also fill in values in `kep.yaml`)
  - Feature gate name:
  - Components depending on the feature gate:
- [X] Other
Like most kubectl enhancements, we'll introduce an env var, for example `KUBECTL_TOP_PVC`, to guard access to this functionality. It will go through the usual cycles: alpha (off by default), beta (on by default), GA (locked).
###### Are there any tests for feature enablement/disablement?

No, there's no way to disable the feature in a single `kubectl` version.
See my comment about env var.
status: provisional|implementable|implemented|deferred|rejected|withdrawn|replaced
creation-date: 2025-01-30
reviewers:
  - TBD
Put Arda and my name as reviewers and approvers here.
owning-sig: sig-cli
participating-sigs:
  - sig-storage
status: provisional|implementable|implemented|deferred|rejected|withdrawn|replaced
Status should be implementable.
# The target maturity stage in the current dev cycle for this KEP.
# If the purpose of this KEP is to deprecate a user-visible feature
# and a Deprecated feature gates are added, they should be deprecated|disabled|removed.
stage: alpha
Are you still targeting 1.33? Because this feature wasn't opted in yet.
I almost forgot: you're going to need PRR, in other words https://github.com/kubernetes/enhancements/blob/master/keps/prod-readiness/template/nnnn.yaml filled in with the matching number. We can ask John to handle that one.
in their cluster.

## Alternatives
Could this be a custom plugin to `kubectl`? If so, write an alternative here for that.
Yes, and I believe it already exists: https://github.com/yashbhutwala/kubectl-df-pv. I will have to basically rewrite the KEP, so I'll make sure to include this in the alternatives as well as previous work.
Since the volume usage values are read from the metrics endpoints, the feature is only useful to users who have the metrics server configured and running in their cluster.
Does this report the metrics for all storage, or only for storage in-use within the cluster?
PVCs can exist without being mounted.
Since cAdvisor gets the metrics from the `statfs` syscall, it would only be able to get data about mounted filesystems, i.e. in the case of persistent volumes, the ones in use by pods.
That sounds like a big limitation; to be honest, I think we should aim to cover all PVCs and report at least capacity. If we can't also report space used across all PVCs, maybe we should leave this as a plugin.
I don't know of any (usable) way to get the usage of an unmounted filesystem -- it requires knowledge of the filesystem internals. Even the capacity of unmounted PVCs is rather difficult to report: the numbers in API objects might be quite far from reality. There are volume types where the capacity reported in the PVC status is only guaranteed to be lower than the actual volume size and nothing much else (e.g. NFS, which is quite a popular one). I'm not sure how having this as a plugin would help.
If we (Kubernetes) put something in-tree, it has to be kind of "right". For example, people might expect `kubectl top pvc` to do something that's not feasible. We (Kubernetes) would have to manage expectations there.
With such a large user base, it's actually really likely that end users would complain about the missing behavior, or that the subcommand is confusing.
However, if it's instead a community plugin and you install it and it's not as good as you hoped, this project doesn't have to handle the feedback there. A community plugin can provide half a solution and still be useful.
Perhaps providing a plugin example would be less risky than adding it directly, considering that the current solution may not be universal, or that making it universal will take longer (after all, the metrics-server needs to be changed).
As a plugin example, it can provide users (third parties) with their own command-line tools. Of course, it is possible to prepare both: provide a plugin example in the short term, and then follow up in-tree if there is a long-term plan. 🤔 Just my opinion.
I think it's clear that this feature is out of scope for 1.33. I was overly optimistic. If I'm to go with the implementation using the Metrics API, it would need a new feature in metrics-server as well. And I would have to get more familiar with it, as well as with all the previous work, since I seem to be repeating some well-known mistakes. And of course: we already have the plugin, so perhaps this whole effort is not as useful as I thought it would be.