Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSI storage capacity check #92387

Merged
merged 9 commits into from
Jul 9, 2020
Merged

Conversation

pohly
Copy link
Contributor

@pohly pohly commented Jun 22, 2020

What type of PR is this?
/kind feature

What this PR does / why we need it:

This is the part of kubernetes/enhancements#1472 where the scheduler uses the information for improving pod scheduling of pods with unbound volumes.

Special notes for your reviewer:

The API change itself is getting reviewed in PR #91939 and should be merged first. This PR then just adds one commit with the scheduler change.

Does this PR introduce a user-facing change?:

scheduler: optionally check for available storage capacity before scheduling pods which have unbound volumes (alpha feature with the new `CSIStorageCapacity` feature gate, only works for CSI drivers and depends on support for the feature in a CSI driver deployment)

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/1472-storage-capacity-tracking
- [Usage]: https://github.com/kubernetes/website/pull/21634

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jun 22, 2020
@pohly
Copy link
Contributor Author

pohly commented Jun 22, 2020

/hold

Let's merge #91939 first, then I'll rebase to resolve the merge conflict in the new commit.

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. area/apiserver area/test kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jun 22, 2020
@pohly
Copy link
Contributor Author

pohly commented Jun 22, 2020

/assign @msau42

@pohly pohly changed the title CSI storage capacity CSI storage capacity check Jun 22, 2020
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 22, 2020
@pohly
Copy link
Contributor Author

pohly commented Jun 22, 2020

The merge conflict was trivial and only affected a function comment. I've rebased this code here, but will do that one more time to get rid of the API change commits once #91939 is merged.

@fedebongio
Copy link
Contributor

/remove-sig api-machinery

@k8s-ci-robot k8s-ci-robot removed the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Jun 23, 2020
driver, err := b.capacityCheck.CSIDriverInformer.Lister().Get(provisioner)
if err != nil {
if errors.IsNotFound(err) {
// Either the provisioner is no CSI driver or the driver does not
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: "not a CSI driver"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both sound fine to me, but what do I know about English? 😅

Changed.

@@ -161,17 +169,31 @@ type volumeBinder struct {
bindTimeout time.Duration

translator InTreeToCSITranslator

capacityCheck *CapacityCheck
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually when this goes GA, I think this structure should be folded into the main struct

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding an E2E test showed that this code was actually broken in a very subtle (read: took me two hours to figure out) way: because I wasn't calling the Lister() functions during NewVolumeBinder, the corresponding listeners were never started. That didn't matter with the mock objects used during unit testing, but it did for the real ones: presumably, the factory and thus the listeners are started sometime after NewVolumeBinder and before the volume binder is invoked. Asking for listers later works, but they never return any data...

The revised code now stores the listers and a boolean in this struct instead of CapacityCheck. I kept it to bundle the related parameters together.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe @cofyc has a fix for this in #92684

if scenario.shouldFail && err == nil {
t.Error("returned success but expected error")
}
checkReasons(t, reasons, scenario.reasons)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to validate Pod cache in the success case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's useful as an additional check, so I've added it (for both positive and negative cases).

var capacityCheck *scheduling.CapacityCheck
if utilfeature.DefaultFeatureGate.Enabled(features.CSIStorageCapacity) {
capacityCheck = &scheduling.CapacityCheck{
CSIDriverInformer: fh.SharedInformerFactory().Storage().V1().CSIDrivers(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think rbac rules for the scheduler need to be updated with these permissions as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. I guess proper E2E testing would have found this. Let me think about how I can add that already now, without depending on the external-provisioner changes and a CSI driver deployment that enables the feature. Probably something involving the mock driver and manually created CSIStorageCapacity objects...

Yes, that works. For the negative case (= volume cannot be created) it's currently waiting to ensure that the pod doesn't start, which makes the test slow. We have other cases like that and there decided against relying on error events to shorten the test runtime. Would it perhaps make sense here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the integration test have caught the rbac issue?

The reason to not rely on events wasn't strictly about shortening test time, it was because events are unreliable and could be dropped/garbage collected under high load

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the integration test have caught the rbac issue?

Not sure... let's test it... no. The test completes fine even with the new RBAC permission missing.

The reason to not rely on events wasn't strictly about shortening test time, it was because events are unreliable and could be dropped/garbage collected under high load.

So one cannot rely on them for the test (i.e. not encountering the event must not be a test failure). But if the event is seen, the test could be stopped early. If slow test execution isn't an issue, then this is probably not worth the complexity.


// TestCapacity covers different scenarios involving CSIStorageCapacity objects.
// Scenarios without those are covered by TestFindPodVolumesWithProvisioning.
func TestCapacity(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add some integration test cases so we test the full interaction with etcd?

https://github.com/kubernetes/kubernetes/blob/master/test/integration/volumescheduling/volume_binding_test.go

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, done.

@msau42
Copy link
Member

msau42 commented Jun 23, 2020

/assign @cofyc @ahg-g

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pohly, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 7, 2020
@pohly
Copy link
Contributor Author

pohly commented Jul 7, 2020

/hold cancel

We (@msau42, @thockin, myself) discussed namespacing of CSIStorageCapacity and decided to go with it for now, with a TODO item in kubernetes/enhancements#1472 (comment) to re-evaluate that choice.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 7, 2020
@pohly
Copy link
Contributor Author

pohly commented Jul 7, 2020

/hold

Waiting for CSIStorageCapacity tests in
pull-kubernetes-e2e-gce-alpha-features to pass.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 7, 2020
@@ -944,8 +945,194 @@ var _ = utils.SIGDescribe("CSI mock volume", func() {
})
}
})

// These tests *only* work on a cluster which has the CSIStorageCapacity feature enabled.
ginkgo.Context("CSIStorageCapacity [Feature: CSIStorageCapacity]", func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There shouldn't be a whitespace after "Feature".

Temporarily adding a whitespace in the test job so that we can get a run in today. kubernetes/test-infra#18201

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@msau42
Copy link
Member

msau42 commented Jul 7, 2020

/retest

@msau42
Copy link
Member

msau42 commented Jul 8, 2020

/lgtm
/priority important-soon

/hold
Can you update the release note to clarify that this is a new alpha feature with the feature gate name, and that it's only for csi volumes?

@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jul 8, 2020
pohly added 4 commits July 8, 2020 08:02
We can create CSIStorageCapacity objects manually, therefore we don't
need the updated external-provisioner for these tests.
This is similar to the E2E test, it just doesn't need a real cluster.
Setting testParameters.scName had no effect because
StorageClassTest.StorageClassName isn't used anywhere. Instead, the
storage class name is generated dynamically.
By creating CSIStorageCapacity objects in advance, we get the
FailedScheduling pod event if (and only if!) the test is expected to
fail because of insufficient or missing capacity. We can use that as
indicator that waiting for pod start can be stopped early. However,
because we might not get to see the event under load, we still need
the timeout.
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 8, 2020
@pohly
Copy link
Contributor Author

pohly commented Jul 8, 2020

Can you update the release note to clarify that this is a new alpha feature with the feature gate name, and that it's only for csi volumes?

Done.

@pohly
Copy link
Contributor Author

pohly commented Jul 8, 2020

/retest

@pohly
Copy link
Contributor Author

pohly commented Jul 8, 2020

/retest

TestSubresourcePatch in pull-kubernetes-integration seems to be flaky.

@pohly
Copy link
Contributor Author

pohly commented Jul 8, 2020

Waiting for CSIStorageCapacity tests in pull-kubernetes-e2e-gce-alpha-features to pass.

Which they have done in https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/92387/pull-kubernetes-e2e-gce-alpha-features/1280744225868091392/

@msau42
Copy link
Member

msau42 commented Jul 8, 2020

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 8, 2020
@msau42
Copy link
Member

msau42 commented Jul 8, 2020

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 8, 2020
@fejta-bot
Copy link

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

@k8s-ci-robot k8s-ci-robot merged commit 94a08e1 into kubernetes:master Jul 9, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.19 milestone Jul 9, 2020
michaelkolber pushed a commit to michaelkolber/test-infra that referenced this pull request Jul 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/storage Categorizes an issue or PR as relevant to SIG Storage. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants