Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding metrics to nfs driver #75805

Merged
merged 1 commit into from Jul 18, 2019

Conversation

@brahmaroutu
Copy link
Contributor

commented Mar 28, 2019

Fixes #62778

/kind bug

What this PR does / why we need it:
This PR adds default metrics to nfs driver. Unlike many other drivers nfs driver is missing this behavior.

/sig storage

NFS Drivers are now enabled to collect metrics, StatFS metrics provider is used to collect the metrics. (#75805 , @brahmaroutu)
@jsafrane

This comment has been minimized.

Copy link
Member

commented Mar 29, 2019

The code itself looks OK, however, it will report total used / free space on a NFS share that may be shared among several volumes. At least that's what our nfs provisioner does.

@gnufied @jingxu97 @msau42, @wongma7, is that what we want? Maybe it's better than no stats.

mounter: mounter,
pod: &v1.Pod{ObjectMeta: metav1.ObjectMeta{UID: podUID}},
plugin: plugin,
MetricsProvider: volume.NewMetricsStatFS(getPath(podUID, volName, plugin.host)),

This comment has been minimized.

Copy link
@rootfs

rootfs Mar 29, 2019

Member

NewMetricsStatFS calls FsInfo. According to NFS spec, FSINFO [1] call doesn't return capacity, inode, free inodes, etc. Additionally, NFS FSSTAT call [2] doesn't return these info either.

For reference, there was an early attempt to get NFS metrics through du [3], but there was a performance concern

  1. https://www.freesoft.org/CIE/RFC/1813/39.htm
  2. https://www.freesoft.org/CIE/RFC/1813/38.htm
  3. https://github.com/kubernetes/kubernetes/pull/33468/files

This comment has been minimized.

Copy link
@gnufied

gnufied Mar 29, 2019

Member

Can the performance problem mitigated by collecting this metric less often (or caching it..)?

This comment has been minimized.

Copy link
@jsafrane

jsafrane Mar 29, 2019

Member

du on NFS can take really long time (hours) on big volumes...

statfs() returns something useful, at least with a Linux NFS server.

This comment has been minimized.

Copy link
@brahmaroutu

brahmaroutu May 8, 2019

Author Contributor

There is a thought that I can use a switch like 'EnableMetrics' to conditionally pull statfs data but then I need to add a bool flag someplace like nfs VolumeSource. If that is acceptable then I can work on it.

@wongma7

This comment has been minimized.

Copy link
Contributor

commented Mar 29, 2019

I think it's fine to have. I'm sure there are nfs servers/appliances where quota is reflected in df/statfs, so the metric could show if pvcs are approaching their quota value not the whole share value. In other cases, the metric may be a little misleading but could still be useful for admin to know the server is filling up and x pvcs are using it.

@brahmaroutu

This comment has been minimized.

Copy link
Contributor Author

commented May 15, 2019

/test pull-kubernetes-e2e-gce
/test pull-kubernetes-integration

@tejaswiniVadlamudi

This comment has been minimized.

Copy link

commented May 28, 2019

It would be good to see NFS storage-type metrics associated with a container. Looking forward to the solution.

@brahmaroutu

This comment has been minimized.

Copy link
Contributor Author

commented May 29, 2019

Can we please review @jsafrane @wongma7 @msau42

@tejaswiniVadlamudi

This comment has been minimized.

Copy link

commented Jun 10, 2019

Is anything blocking this merge activity?

@brahmaroutu

This comment has been minimized.

Copy link
Contributor Author

commented Jun 11, 2019

Please review or comment. I feel this should be merged?

@jsafrane

This comment has been minimized.

Copy link
Member

commented Jun 13, 2019

Is anything blocking this merge activity?

Yes, it has couple of labels that prevent merge:

@brahmaroutu

This comment has been minimized.

Copy link
Contributor Author

commented Jun 13, 2019

/release-note-none

@brahmaroutu

This comment has been minimized.

Copy link
Contributor Author

commented Jun 13, 2019

@jsafrane I have removed those tags now. Can we see if it is mergeable now?

@jsafrane

This comment has been minimized.

Copy link
Member

commented Jun 14, 2019

@brahmaroutu, please do fill the release note. This is user facing change that can potentially break things.

@jsafrane

This comment has been minimized.

Copy link
Member

commented Jul 11, 2019

@brahmaroutu, can you please fill the release note in the first comment of this PR? We can't merge it without it. You can find examples here: https://github.com/kubernetes/community/blob/master/contributors/guide/release-notes.md

@brahmaroutu

This comment has been minimized.

Copy link
Contributor Author

commented Jul 17, 2019

@jsafrane I used the following test to release note, please let me know if this is proper.
NFS Drivers are now enabled to collect metrics, StatFS metrics provider is used to collect the metrics. (#75805 , @brahmaroutu)

@jsafrane

This comment has been minimized.

Copy link
Member

commented Jul 18, 2019

Thanks! It looks OK.

/lgtm
/approve

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented Jul 18, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: brahmaroutu, jsafrane

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@fejta-bot

This comment has been minimized.

Copy link

commented Jul 18, 2019

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit 748849b into kubernetes:master Jul 18, 2019

23 checks passed

cla/linuxfoundation brahmaroutu authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-conformance-image-test Skipped.
pull-kubernetes-cross Skipped.
pull-kubernetes-dependencies Job succeeded.
Details
pull-kubernetes-e2e-gce Job succeeded.
Details
pull-kubernetes-e2e-gce-100-performance Job succeeded.
Details
pull-kubernetes-e2e-gce-csi-serial Job succeeded.
Details
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
Details
pull-kubernetes-e2e-gce-iscsi Skipped.
pull-kubernetes-e2e-gce-iscsi-serial Skipped.
pull-kubernetes-e2e-gce-storage-slow Job succeeded.
Details
pull-kubernetes-godeps Skipped.
pull-kubernetes-integration Job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce-big Job succeeded.
Details
pull-kubernetes-local-e2e Skipped.
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-node-e2e-containerd Job succeeded.
Details
pull-kubernetes-typecheck Job succeeded.
Details
pull-kubernetes-verify Job succeeded.
Details
pull-publishing-bot-validate Skipped.
tide In merge pool.
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.