add number measurement for bound/unbound pv/pvc #57872

Merged
merged 2 commits into from
Feb 6, 2018

Conversation

mlmhl
Contributor

@mlmhl mlmhl commented Jan 5, 2018

What this PR does / why we need it:

Implement number measurement for bound/unbound pv/pvc defined in the Metrics Spec
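For readers unfamiliar with the metric, here is a minimal sketch of the kind of counting it describes: the number of bound and unbound volumes grouped by storage class. The `volume` type and `countByStorageClass` function are hypothetical stand-ins for illustration only, not the types or code in this PR.

```go
package main

import "fmt"

// volume is a hypothetical, simplified stand-in for a PersistentVolume.
// Only the fields needed for counting are modeled.
type volume struct {
	name             string
	storageClassName string
	bound            bool // true if the volume is bound to a claim
}

// countByStorageClass returns the number of bound and unbound volumes
// per storage class name.
func countByStorageClass(volumes []volume) (bound, unbound map[string]int) {
	bound = make(map[string]int)
	unbound = make(map[string]int)
	for _, v := range volumes {
		if v.bound {
			bound[v.storageClassName]++
		} else {
			unbound[v.storageClassName]++
		}
	}
	return bound, unbound
}

func main() {
	volumes := []volume{
		{name: "pv-a", storageClassName: "fast", bound: true},
		{name: "pv-b", storageClassName: "fast", bound: false},
		{name: "pv-c", storageClassName: "slow", bound: true},
	}
	bound, unbound := countByStorageClass(volumes)
	fmt.Println("bound:", bound)
	fmt.Println("unbound:", unbound)
}
```

The same grouping would apply to claims; only the volume case is shown to keep the sketch short.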

ref feature: kubernetes/features#496

Release note:

Intended for post-1.9

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 5, 2018
@mlmhl
Contributor Author

mlmhl commented Jan 5, 2018

/sig storage

@k8s-ci-robot k8s-ci-robot added the sig/storage Categorizes an issue or PR as relevant to SIG Storage. label Jan 5, 2018
@mlmhl
Contributor Author

mlmhl commented Jan 5, 2018

/cc @gnufied

@mlmhl
Contributor Author

mlmhl commented Jan 5, 2018

/assign @thockin

@spiffxp
Member

spiffxp commented Jan 7, 2018

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jan 7, 2018
@mlmhl mlmhl force-pushed the volume_metric_bound_pvc branch 3 times, most recently from f53c56d to 5e03ae9 Compare January 7, 2018 03:26
pvControllerSubsystem = "pv_collector"

// Metric names.
boundPvKey = "bound_pv_count"
Member

boundPVKey

(and everywhere else Pv -> PV, Pvc -> PVC in identifiers)

Contributor Author

Done

@jsafrane
Member

jsafrane commented Jan 9, 2018

/assign @gnufied

I have only annoying comments about variable and function names.

@mlmhl mlmhl force-pushed the volume_metric_bound_pvc branch 2 times, most recently from d983821 to 9992112 Compare January 10, 2018 03:00
@mlmhl
Contributor Author

mlmhl commented Jan 10, 2018

/retest

storageClassName)
}
for storageClassName, number := range unboundNumberByStorageClass {
ch <- prometheus.MustNewConstMetric(
Member

Is there any particular reason you chose to use MustNewConstMetric? The documentation of this function implies that this type is most useful for "throwaway" metrics.

Contributor Author

I used MustNewConstMetric here only to keep the code simple, since it avoids handling the error returned by NewConstMetric. But that puts the volume controller in an unstable state, so I changed all of these calls to NewConstMetric, PTAL.
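To illustrate the trade-off discussed above, here is a minimal, library-free sketch of the two constructor styles. A Must-style constructor panics on a bad sample, while the error-returning form lets the collector log and skip it. The `desc`, `metric`, `newConstMetric`, `mustNewConstMetric` and `collect` names are hypothetical stand-ins, not the Prometheus client API, and the `storage_class` label name is an assumption.

```go
package main

import "fmt"

// desc is a hypothetical metric descriptor: a name plus its label names.
type desc struct {
	name       string
	labelNames []string
}

// metric is a hypothetical constant sample.
type metric struct {
	name        string
	value       float64
	labelValues []string
}

// newConstMetric returns an error when the number of label values does not
// match the descriptor, instead of panicking.
func newConstMetric(d desc, value float64, labelValues ...string) (metric, error) {
	if len(labelValues) != len(d.labelNames) {
		return metric{}, fmt.Errorf("metric %q: got %d label values, want %d",
			d.name, len(labelValues), len(d.labelNames))
	}
	return metric{name: d.name, value: value, labelValues: labelValues}, nil
}

// mustNewConstMetric is the Must-style wrapper: simpler to call, but any
// error becomes a panic in the caller.
func mustNewConstMetric(d desc, value float64, labelValues ...string) metric {
	m, err := newConstMetric(d, value, labelValues...)
	if err != nil {
		panic(err)
	}
	return m
}

// collect sends one sample per storage class and skips, rather than panics
// on, any sample that cannot be built. It returns the number skipped.
func collect(d desc, counts map[string]int, ch chan<- metric) (skipped int) {
	for class, n := range counts {
		m, err := newConstMetric(d, float64(n), class)
		if err != nil {
			fmt.Println("skipping sample:", err)
			skipped++
			continue
		}
		ch <- m
	}
	return skipped
}

func main() {
	d := desc{name: "bound_pv_count", labelNames: []string{"storage_class"}}
	counts := map[string]int{"fast": 2, "slow": 1}
	ch := make(chan metric, len(counts))
	skipped := collect(d, counts, ch)
	close(ch)
	for m := range ch {
		fmt.Println(m.name, m.labelValues, m.value)
	}
	fmt.Println("skipped:", skipped)
}
```

The error-returning form costs a few extra lines per call site, but a malformed sample then affects only that sample rather than the whole process.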

@mlmhl
Contributor Author

mlmhl commented Jan 11, 2018

/retest

@mlmhl
Contributor Author

mlmhl commented Jan 11, 2018

/retest

@gnufied
Member

gnufied commented Jan 11, 2018

@mlmhl can you update existing e2e tests to cover these metrics https://github.com/kubernetes/kubernetes/blob/master/test/e2e/storage/volume_metrics.go ?

@mlmhl
Contributor Author

mlmhl commented Jan 11, 2018

@gnufied OK, I will update e2e tests to cover these metrics.

@mlmhl
Contributor Author

mlmhl commented Jan 12, 2018

@gnufied e2e tests have been added for these metrics, PTAL.

By the way, I intend to add total provision/deletion time metrics after this PR, but I'm not sure of the exact definition of total provision/deletion time. As I understand it, the total provision time runs from PVC creation to PV creation, and the total deletion time runs from PVC deletion to PV deletion. Please let me know if I have misunderstood something.

pv, err = framework.CreatePV(c, pv)
Expect(err).NotTo(HaveOccurred(), "Error creating pv: %v", err)
waitForPVControllerSync(metricsGrabber, unboundPVKey, classKey)
validator([]map[string]int64{nil, {className: 1}, nil, nil})
Member

These metrics appear to be checking absolute number of bound or unbound PVs. Will this not fail when some other PV might exist in the cluster while this test is running?

Contributor Author

All volume metric e2e tests are labeled [Serial] (see here), so we can assume that no other PVs exist while this test is running.

Member

Yes, I know, but reality is stranger than that. I have fixed a number of flakes in this test suite because something else caused metrics to jump around. We have to be careful and only observe increments in metric values rather than absolute values, because asserting on absolute values is almost sure to be error-prone.

Contributor Author

ACK. This is indeed a problem, as some other tests may create PVs/PVCs and forget to clean up. I will change the tests to use increments instead of absolute values.

Contributor Author

@gnufied The e2e tests are updated to validate the relative increment value instead of absolute value, PTAL, thanks.
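As a rough illustration of the increment-based check agreed on in this thread, the sketch below takes a baseline snapshot before the test action, a second snapshot afterwards, and asserts only on the difference. The `snapshot` type and the `delta` and `equal` helpers are hypothetical and are not the helpers used in the e2e test itself.

```go
package main

import "fmt"

// snapshot maps a storage class name to a metric value at one point in time.
type snapshot map[string]int64

// delta returns after minus before for every storage class, omitting
// classes whose value did not change.
func delta(before, after snapshot) snapshot {
	d := snapshot{}
	for class, v := range after {
		if diff := v - before[class]; diff != 0 {
			d[class] = diff
		}
	}
	// Classes that disappeared entirely count as a negative change.
	for class, v := range before {
		if _, ok := after[class]; !ok && v != 0 {
			d[class] = -v
		}
	}
	return d
}

// equal reports whether two snapshots hold exactly the same entries.
func equal(a, b snapshot) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if w, ok := b[k]; !ok || w != v {
			return false
		}
	}
	return true
}

func main() {
	// A PV left behind by some other test is already counted in the baseline.
	before := snapshot{"leftover": 3}
	// After the test creates one unbound PV in its own storage class.
	after := snapshot{"leftover": 3, "e2e-class": 1}

	got := delta(before, after)
	want := snapshot{"e2e-class": 1}
	fmt.Println("increment matches:", equal(got, want))
}
```

Because the leftover volumes appear in both snapshots, they cancel out and the assertion depends only on what the test itself created.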

@mlmhl
Contributor Author

mlmhl commented Jan 24, 2018

/retest

@mlmhl
Contributor Author

mlmhl commented Jan 24, 2018

/retest

@gnufied
Member

gnufied commented Feb 2, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 2, 2018
@mlmhl
Contributor Author

mlmhl commented Feb 3, 2018

/retest

@thockin
Member

thockin commented Feb 6, 2018

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gnufied, mlmhl, thockin

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 6, 2018
@k8s-github-robot

Automatic merge from submit-queue (batch tested with PRs 58317, 58687, 57872, 59063, 59328). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 997fe31 into kubernetes:master Feb 6, 2018
@mlmhl mlmhl deleted the volume_metric_bound_pvc branch February 7, 2018 01:53
k8s-github-robot pushed a commit that referenced this pull request Feb 14, 2018
…nd_pvc

Automatic merge from submit-queue (batch tested with PRs 57445, 59523). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

Revert "add number measurement for bound/unbound pv/pvc"

Reverts #57872

Fixes : #59517
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
8 participants