
Provision vsphere volume as per zone #72731

Merged: 1 commit merged into kubernetes:master from skarthiksrinivas:vsphere_volume_zone on Feb 18, 2019

Conversation

@skarthiksrinivas (Contributor) commented Jan 9, 2019

What type of PR is this?
/kind bug

What this PR does / why we need it:
Currently, the vSphere cloud provider (VCP) insists on provisioning a volume only on a globally shared datastore. Hence, in a zoned environment, volume provisioning can fail even when a shared datastore exists within a specific zone, if that datastore is not shared across all the zones hosting Kubernetes nodes. This change fixes the issue by considering the zone information provided in allowedTopologies when selecting the datastore. If allowedTopologies is not provided, the current behaviour is retained as-is.
This PR addresses one part of issue #67703. The other part, attaching zone labels to the created volumes, is handled in #72687.
Which issue(s) this PR fixes:
Fixes #

Does this PR introduce a user-facing change?:
Yes

This change ensures that volumes get provisioned based on the zone information provided in allowedTopologies.

Storage class spec:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fastpolicy1
provisioner: kubernetes.io/vsphere-volume
parameters:
  diskformat: zeroedthick
  storagePolicyName: "vSAN Default Storage Policy"
allowedTopologies:
- matchLabelExpressions:
  - key: failure-domain.beta.kubernetes.io/zone
    values:
    - zone1

PV creation Logs:
I0109 11:17:52.321372       1 vsphere.go:1147] Starting to create a vSphere volume with volumeOptions: &{CapacityKB:1048576 Tags:map[kubernetes.io/created-for/pvc/namespace:default kubernetes.io/created-for/pvc/name:pvcsc-1-policy kubernetes.io/created-for/pv/name:pvc-34650c12-1400-11e9-aef4-005056804cc9] Name:kubernetes-dynamic-pvc-34650c12-1400-11e9-aef4-005056804cc9 DiskFormat:zeroedthick Datastore: VSANStorageProfileData: StoragePolicyName:vSAN Default Storage Policy StoragePolicyID: SCSIControllerType: Zone:[zone1]}
...
I0109 11:17:59.430113       1 vsphere.go:1334] The canonical volume path for the newly created vSphere volume is "[vsanDatastore] 98db185c-6683-d8c7-bc55-0200435ec5da/kubernetes-dynamic-pvc-34650c12-1400-11e9-aef4-005056804cc9.vmdk"

Ran the regression tests (no zones configured) and they passed.
@k8s-ci-robot (Contributor) commented Jan 9, 2019

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot (Contributor) commented Jan 9, 2019

Hi @skarthiksrinivas. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@skarthiksrinivas (Contributor, Author) commented Jan 9, 2019

@frapposelli

@frapposelli (Member) commented Jan 9, 2019

/ok-to-test

@frapposelli (Member) commented Jan 9, 2019

/sig vmware

@@ -104,6 +104,7 @@ func (util *VsphereDiskUtil) CreateVolume(v *vsphereVolumeProvisioner) (volSpec
Name: name,
}

volumeOptions.Zone = selectedZone

@gnufied (Member) commented Jan 9, 2019

So if I understand this correctly, this feature does not use the NodeAffinity field of PVs and hence does not use topology-aware provisioning?

@skarthiksrinivas (Author, Contributor) commented Jan 10, 2019

Yes. Your understanding is correct. The scope of this fix is limited to honouring the allowedTopologies zones during volume provisioning.

@gnufied (Member) commented Jan 17, 2019

So if I don't specify allowedTopologies in my default StorageClass and my nodes are spread across zones, then it will basically result in PVCs which cannot be used by pods. Previously we blocked/errored out on volume provisioning altogether if the datastore being used was not shared with all VMs (#72497). How does this interact with that bug?

@skarthiksrinivas (Author, Contributor) commented Jan 18, 2019

The issue in #72497 is that volume provisioning fails if there is no shared datastore available across all Kubernetes node VMs. Now, with this change, by specifying allowedTopologies in the StorageClass, volume provisioning can succeed as long as there is a datastore shared by the nodes within the zone.
There is no change in behaviour when allowedTopologies is not specified: provisioning continues to work as it does today, i.e. by looking for a datastore shared across all nodes, and succeeds or fails accordingly.
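To make the selection rule concrete, here is a hypothetical, self-contained Go sketch. All names (node, sharedDatastores, allowedZones) are illustrative only, not the actual VCP code: with no allowedTopologies the candidate set is the datastores reachable from every node; with allowedTopologies, only nodes in the allowed zones are considered.

package main

import "fmt"

// node pairs a node's zone label with the datastores it can reach.
// This is a simplified stand-in for the provider's node metadata.
type node struct {
	name       string
	zone       string
	datastores map[string]bool
}

// sharedDatastores returns datastores reachable from every considered
// node. A nil allowedZones means "no allowedTopologies": all nodes are
// considered, matching the pre-existing behaviour.
func sharedDatastores(nodes []node, allowedZones map[string]bool) []string {
	counts := map[string]int{}
	total := 0
	for _, n := range nodes {
		if allowedZones != nil && !allowedZones[n.zone] {
			continue // node is outside the allowed topology
		}
		total++
		for ds := range n.datastores {
			counts[ds]++
		}
	}
	var shared []string
	for ds, c := range counts {
		if c == total {
			shared = append(shared, ds)
		}
	}
	return shared
}

func main() {
	nodes := []node{
		{"n1", "zone1", map[string]bool{"vsanDatastore": true}},
		{"n2", "zone1", map[string]bool{"vsanDatastore": true}},
		{"n3", "zone2", map[string]bool{"localDS": true}},
	}
	// Globally there is no datastore shared by all three nodes...
	fmt.Println(sharedDatastores(nodes, nil)) // []
	// ...but restricted to zone1 via allowedTopologies, there is one.
	fmt.Println(sharedDatastores(nodes, map[string]bool{"zone1": true})) // [vsanDatastore]
}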

nm.zoneInfoLock.Lock()
nm.zoneInfoMap[nodeName] = zone
nm.zoneInfoLock.Unlock()
}

@gnufied (Member) commented Jan 9, 2019

Should this use the defer pattern?

@skarthiksrinivas (Author, Contributor) commented Jan 10, 2019

Makes sense. Done.


nm.zoneInfoLock.Lock()
delete(nm.zoneInfoMap, node.ObjectMeta.Name)
nm.zoneInfoLock.Unlock()
}

@gnufied (Member) commented Jan 9, 2019

Let's use defer if possible.

@skarthiksrinivas (Author, Contributor) commented Jan 10, 2019

Done.

@gnufied (Member) commented Jan 11, 2019

Still not fixed.

@skarthiksrinivas (Author, Contributor) commented Jan 17, 2019

My bad. Fixed the code to use the defer pattern for all access under zoneInfoLock.
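For reference, a minimal self-contained sketch of the defer pattern applied to the snippets above (the struct shape and method names are assumed, not the actual NodeManager code; the zone value is simplified to a string):

package main

import (
	"fmt"
	"sync"
)

// nodeManager mirrors the two fields used in the snippets above
// (zoneInfoLock, zoneInfoMap); everything else is hypothetical.
type nodeManager struct {
	zoneInfoLock sync.Mutex
	zoneInfoMap  map[string]string
}

// setZone records the zone for a node. The deferred Unlock runs on
// every return path, so the lock cannot be leaked even if early
// returns are added later.
func (nm *nodeManager) setZone(nodeName, zone string) {
	nm.zoneInfoLock.Lock()
	defer nm.zoneInfoLock.Unlock()
	nm.zoneInfoMap[nodeName] = zone
}

// removeZone deletes a node's zone entry under the same lock.
func (nm *nodeManager) removeZone(nodeName string) {
	nm.zoneInfoLock.Lock()
	defer nm.zoneInfoLock.Unlock()
	delete(nm.zoneInfoMap, nodeName)
}

func main() {
	nm := &nodeManager{zoneInfoMap: map[string]string{}}
	nm.setZone("node-1", "zone1")
	fmt.Println(nm.zoneInfoMap) // map[node-1:zone1]
	nm.removeZone("node-1")
	fmt.Println(nm.zoneInfoMap) // map[]
}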

@skarthiksrinivas (Contributor, Author) commented Jan 10, 2019

Addressed comments.

k8s-ci-robot removed the lgtm label on Feb 13, 2019

@k8s-ci-robot (Contributor) commented Feb 13, 2019

@skarthiksrinivas: The following test failed, say /retest to rerun them all:

Test name: pull-kubernetes-e2e-kops-aws
Commit: a7f8f66
Rerun command: /test pull-kubernetes-e2e-kops-aws

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.


@skarthiksrinivas (Contributor, Author) commented Feb 13, 2019

/retest

@SandeepPissay (Contributor) commented Feb 15, 2019

/approve

@k8s-ci-robot (Contributor) commented Feb 15, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull request has been approved by: frapposelli, SandeepPissay, skarthiksrinivas

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.

@vladimirvivien (Member) commented Feb 15, 2019

@SandeepPissay will this be cherry-picked to earlier k8s versions?

@frapposelli (Member) commented Feb 16, 2019

@vladimirvivien I believe this will not be backported.

/lgtm

k8s-ci-robot added the lgtm label on Feb 16, 2019

@rhockenbury commented Feb 16, 2019

Does anyone know if there are plans to implement the WaitForFirstConsumer volume binding mode for vSphere?

skarthiksrinivas force-pushed the skarthiksrinivas:vsphere_volume_zone branch from 64c82ec to a309d8a on Feb 18, 2019

@frapposelli (Member) commented Feb 18, 2019

/lgtm

k8s-ci-robot added the lgtm and kind/bug labels and removed the needs-kind label on Feb 18, 2019

k8s-ci-robot merged commit 701f914 into kubernetes:master on Feb 18, 2019

17 checks passed

cla/linuxfoundation: skarthiksrinivas authorized
pull-kubernetes-bazel-build: Job succeeded
pull-kubernetes-bazel-test: Job succeeded
pull-kubernetes-cross: Skipped
pull-kubernetes-e2e-gce: Job succeeded
pull-kubernetes-e2e-gce-100-performance: Job succeeded
pull-kubernetes-e2e-gce-device-plugin-gpu: Job succeeded
pull-kubernetes-godeps: Skipped
pull-kubernetes-integration: Job succeeded
pull-kubernetes-kubemark-e2e-gce-big: Job succeeded
pull-kubernetes-local-e2e: Skipped
pull-kubernetes-local-e2e-containerized: Skipped
pull-kubernetes-node-e2e: Job succeeded
pull-kubernetes-typecheck: Job succeeded
pull-kubernetes-verify: Job succeeded
pull-publishing-bot-validate: Skipped
tide: In merge pool
@frapposelli (Member) commented Feb 18, 2019

🎉

return "", err
}

if err != nil {

@tedyu (Contributor) commented Feb 18, 2019

Shouldn't this be ahead of the if block starting at line 1260?

@skarthiksrinivas (Author, Contributor) commented Feb 19, 2019

Yes, that's correct. Thanks for pointing it out. The current sequence makes this check a no-op. Fixed it and created a PR for it: #74263.
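In general terms, the bug pattern and its fix look like this (a minimal sketch with hypothetical names; the actual fix is in #74263):

package main

import (
	"errors"
	"fmt"
)

// fetchDatastore stands in for the call whose error was mis-ordered.
func fetchDatastore(ok bool) (string, error) {
	if !ok {
		return "", errors.New("datastore lookup failed")
	}
	return "vsanDatastore", nil
}

func main() {
	// Buggy shape (as in the snippet above): every earlier branch
	// returns on its own, so the trailing "if err != nil" is never
	// reached with a meaningful err value; the check is a no-op.
	//
	// Fixed shape: test err immediately after the call that produced
	// it, before any other branching uses the result.
	ds, err := fetchDatastore(true)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("selected:", ds)
}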


for _, host := range hosts {
var hostSystemMo mo.HostSystem
host.Properties(ctx, host.Reference(), []string{"datastore"}, &hostSystemMo)

@tedyu (Contributor) commented Feb 18, 2019

What's the effect of this call (I don't see an assignment)?

@skarthiksrinivas (Author, Contributor) commented Feb 19, 2019

The last parameter in this method call is an out parameter, which is how hostSystemMo gets assigned. However, the method also returns an error, which is not handled at this step. I have fixed that in the PR mentioned above.
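A sketch of the corrected shape, using the govmomi types from the snippet above (the surrounding function and package are hypothetical; the actual fix is in #74263):

package example

import (
	"context"

	"github.com/vmware/govmomi/object"
	"github.com/vmware/govmomi/vim25/mo"
	"github.com/vmware/govmomi/vim25/types"
)

// datastoreRefs collects the datastore references of each host.
// host.Properties fills hostSystemMo through its last argument (an
// out parameter), so its error must be checked before the result is
// trusted.
func datastoreRefs(ctx context.Context, hosts []*object.HostSystem) ([]types.ManagedObjectReference, error) {
	var refs []types.ManagedObjectReference
	for _, host := range hosts {
		var hostSystemMo mo.HostSystem
		if err := host.Properties(ctx, host.Reference(), []string{"datastore"}, &hostSystemMo); err != nil {
			return nil, err // previously dropped; now propagated
		}
		refs = append(refs, hostSystemMo.Datastore...)
	}
	return refs, nil
}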

skarthiksrinivas deleted the skarthiksrinivas:vsphere_volume_zone branch on Feb 19, 2019

skarthiksrinivas restored the skarthiksrinivas:vsphere_volume_zone branch on Feb 19, 2019

@skarthiksrinivas (Contributor, Author) commented Feb 19, 2019

> Does anyone know if there are plans to implement the WaitForFirstConsumer volume binding mode for vSphere?

Yes. We do have that task in the pipeline.
