Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add multi PVC versions of e2e tests for WaitForFirstConsumer #70362

Merged
merged 1 commit into from Nov 12, 2018
Merged

Add multi PVC versions of e2e tests for WaitForFirstConsumer #70362

merged 1 commit into from Nov 12, 2018

Conversation

ddebroy
Copy link
Member

@ddebroy ddebroy commented Oct 29, 2018

What type of PR is this?
/kind bug

What this PR does / why we need it:
This PR adds e2e tests that exercise creation of pods with multiple PVCs associated with storage classes configured with volume binding mode WaitForFirstConsumer . The tests ensure all PVs get provisioned in the same zone as the pod due to delayed binding.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #68517

Special notes for your reviewer:
RePD versions of these tests will be added to this PR. This is currently read to review for PD and EBS.

Does this PR introduce a user-facing change?:
"NONE"

NONE

/sig storage

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/bug Categorizes issue or PR as related to a bug. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. sig/storage Categorizes an issue or PR as relevant to SIG Storage. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Oct 29, 2018
@ddebroy
Copy link
Member Author

ddebroy commented Oct 29, 2018

/assign @msau42

@ddebroy
Copy link
Member Author

ddebroy commented Oct 29, 2018

/assign @marun

@msau42
Copy link
Member

msau42 commented Oct 29, 2018

/assign @verult

@ddebroy ddebroy changed the title [WIP] Add multi PVC versions of e2e tests for WaitForFirstConsumer Add multi PVC versions of e2e tests for WaitForFirstConsumer Oct 31, 2018
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 31, 2018
@ddebroy
Copy link
Member Author

ddebroy commented Oct 31, 2018

Multiple PVC tests for regional PDs with delayed binding is now in. Tests for multi-PVCs with EBS and GCE PD are passing. So the multi PVC tests and utils is ready to be reviewed.

The remaining portion is the tests with node selectors and taints which will build on this.

return WaitForPersistentVolumeClaimsPhase(phase, c, ns, []string{pvcName}, Poll, timeout, true)
}

// WaitForPersistentVolumeClaimPhase waits for any or all of the PersistentVolumeClaims to be in a specific phase or until timeout occurs, whichever comes first.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "waits for any (if matchAny is true) or all (if matchAny is false)..."

}
}
if phaseFoundInAllClaims && len(pvcNames) > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have a separate error for len(pvcNames) == 0?

)

func testBindingWaitForFirstConsumer(client clientset.Interface, claim *v1.PersistentVolumeClaim, class *storage.StorageClass) (*v1.PersistentVolume, *v1.Node) {
pvs, node := testBindingWaitForFirstConsumerMultiPVC(client, []*v1.PersistentVolumeClaim{claim}, class)
return pvs[0], node
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expect(len(pvs)).ToNot(Equal(0))

Expect(err).NotTo(HaveOccurred())
By("creating claims")
var createdClaims []*v1.PersistentVolumeClaim
var claimNames []string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: combine the two slices into a map

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The functions used below like WaitForPersistentVolumeClaimsPhase, CreatePod need access to the slices createdClaims and claimNames. So putting them into a map will require extracting the keys and values as slices multiple times down below. To avoid doing that and given we don't need to look up claim through names, I am keeping it as two separate slices that can be directly passed to the functions used below.

defer func() {
framework.ExpectNoError(framework.DeletePersistentVolumeClaim(client, claim.Name, claim.Namespace), "Failed to delete PVC ", claim.Name)
for _, claim := range createdClaims {
framework.ExpectNoError(framework.DeletePersistentVolumeClaim(client, claim.Name, claim.Namespace), "Failed to delete PVC ", claim.Name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would a single deletion error stop other deletions? Should we issue all deletes first and check errors later?

Expect(err).To(HaveOccurred())
for _, claim := range createdClaims {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could this be simplified to framework.WaitForPersistentVolumeClaimsPhase(v1.ClaimPending) and a very short timeout? If there's another function that just checks the instantaneous phase that'd be even better

Copy link
Member Author

@ddebroy ddebroy Nov 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waiting and checking that the claim did not bind for a while is necessary to verify that the WaitForFirstConsumer is indeed taking effect. If we wait for an extremely short period (through something like framework.WaitForPersistentVolumeClaimsPhase(v1.ClaimPending)), the verification that WaitForFirstConsumer is indeed taking effect will not be effective as a PVC with Immediate binding mode also stay Pending for a little while (as the PV is created, attached, mounted etc.) before switching to bound.


return pv, node
zone, ok := node.Labels[kubeletapis.LabelZoneFailureDomain]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes the function unusable for CSI drivers. IMO we should leave label/affinity checking outside the function for now. Maybe in the future we could have each volume plugin under test implement a callback check for this.

Copy link
Member Author

@ddebroy ddebroy Nov 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In volume_provisioning.go, the zone checks have been moved out of testBindingWaitForFirstConsumerMultiPVC. So I think this should be good already.

node, err := client.CoreV1().Nodes().Get(pod.Spec.NodeName, metav1.GetOptions{})
Expect(err).NotTo(HaveOccurred())

pv, err := client.CoreV1().PersistentVolumes().Get(claim.Spec.VolumeName, metav1.GetOptions{})
Expect(err).NotTo(HaveOccurred())
By("re-checking the claims to see they binded")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the label and node affinity check is left here, update the By() message

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary as the node affinity/label checks have been moved out.

@@ -885,6 +911,52 @@ var _ = utils.SIGDescribe("Dynamic Provisioning", func() {
}
})
})
Describe("DynamicProvisioner delayed binding with multiple PVCs [Slow]", func() {
It("should create multiple persistent volumes in the same zone as node after a pod mounting the claims is started", func() {
tests := []testsuites.StorageClassTest{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this test be pulled into a common function instead?

@@ -965,7 +1037,57 @@ var _ = utils.SIGDescribe("Dynamic Provisioning", func() {
}
})
})

Describe("DynamicProvisioner delayed binding with allowedTopologies with multiple PVCs [Slow]", func() {
It("should create multiple persistent volumes in the same zone as specified in allowedTopologies after a pod mounting the claims is started", func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto (common function)

@@ -74,15 +74,17 @@ var _ = utils.SIGDescribe("Regional PD", func() {
})

It("should provision storage with delayed binding [Slow]", func() {
testRegionalDelayedBinding(c, ns)
testRegionalDelayedBinding(c, ns, 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: 1 /* pvcCount */

@ddebroy
Copy link
Member Author

ddebroy commented Nov 2, 2018

Thanks for reviewing @verult. Code review comments so far addressed.

@ddebroy
Copy link
Member Author

ddebroy commented Nov 3, 2018

/retest

@@ -251,6 +261,101 @@ func checkGCEPD(volume *v1.PersistentVolume, volumeType string) error {
return nil
}

func testZonalDelayedBinding(c clientset.Interface, ns string, pvcCount int) {
tests := []testsuites.StorageClassTest{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testZonalDelayedBinding() and testZonalDelayedBindingWithAllowedTopology() share a lot of code, and their StorageClassTests are the same too. I was thinking we could combine these two functions

@verult
Copy link
Contributor

verult commented Nov 5, 2018

mostly LGTM, some minor comments!

@ddebroy
Copy link
Member Author

ddebroy commented Nov 8, 2018

@verult latest CR comments addressed.

Copy link
Contributor

@verult verult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

just a minor comment

@@ -261,17 +261,22 @@ func checkGCEPD(volume *v1.PersistentVolume, volumeType string) error {
return nil
}

func testZonalDelayedBinding(c clientset.Interface, ns string, pvcCount int) {
func testZonalDelayedBinding(c clientset.Interface, ns string, specifyAllowedTopology bool, pvcCount int) {
storageClassNameFmt := "Delayed binding %s storage class test %s"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

storageClassTestNameFmt

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 9, 2018
Signed-off-by: Deep Debroy <ddebroy@docker.com>
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 9, 2018
@ddebroy
Copy link
Member Author

ddebroy commented Nov 9, 2018

Thanks for reviewing @verult . Addressed the last review comment above and squashed all commits so far. PTAL and add back the LGTM if all looks good.

@verult
Copy link
Contributor

verult commented Nov 9, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 9, 2018
@saad-ali
Copy link
Member

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ddebroy, saad-ali

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 10, 2018
@AishSundar
Copy link
Contributor

/milestone v1.13
/priority important-soon

@k8s-ci-robot k8s-ci-robot added this to the v1.13 milestone Nov 12, 2018
@k8s-ci-robot k8s-ci-robot added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Nov 12, 2018
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit 0ca67bd into kubernetes:master Nov 12, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note-none Denotes a PR that doesn't merit a release note. sig/storage Categorizes an issue or PR as relevant to SIG Storage. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add more e2es for topology aware dynamic provisioning
8 participants