This repository has been archived by the owner on Sep 7, 2022. It is now read-only.

Too many requests to VC when multiple volumes are attached to kubernetes nodes #175

Closed
BaluDontu opened this issue Jun 23, 2017 · 8 comments

Comments

@BaluDontu

BaluDontu commented Jun 23, 2017

The current DisksAreAttached([]volPaths, nodeName) implementation makes multiple calls to VC: for every volume attached to a node, it issues an individual query to VC to check whether that volume is attached.
On top of that, Kubernetes calls the DisksAreAttached() API every minute to check whether the volumes are still attached to the nodes. This puts significant, unnecessary load on VC.

Ideally, we want to query VC once to check if all the volumes are attached to the node.
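A simplified sketch of the two call patterns, for illustration only (the helper wiring is hypothetical, not the actual cloud provider code):

// Current pattern: one VC round trip per volume on the node.
// checkOnVC stands in for the per-volume VC query.
func disksAreAttachedPerVolume(volPaths []string, nodeName string,
    checkOnVC func(volPath, node string) (bool, error)) (map[string]bool, error) {
    attached := make(map[string]bool)
    for _, volPath := range volPaths {
        isAttached, err := checkOnVC(volPath, nodeName) // one VC round trip per volume
        if err != nil {
            return nil, err
        }
        attached[volPath] = isAttached
    }
    return attached, nil
}

With Kubernetes invoking this every minute for every node, VC traffic grows with nodes × volumes per minute; the goal is to collapse that to one VC query per node (or per cluster).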

@divyenpatel @tusharnt @rohitjogvmw @luomiao

@BaluDontu
Author

This is a very high priority task that needs to be done.

@tusharnt tusharnt added the P0 label Jun 27, 2017
@tusharnt tusharnt added this to the July milestone Jun 29, 2017
@tusharnt

@BaluDontu to add data elaborating on what the current performance is.

@tusharnt tusharnt removed this from the July milestone Jul 18, 2017
@BaluDontu BaluDontu added this to the Sprint - Nemo milestone Jul 20, 2017
@BaluDontu
Author

AWS has a similar issue. There is a PR for fixing it on the AWS side - kubernetes#41306. There, Kubernetes introduces a new interface, BulkVerifyVolumes, which hands the cloud provider all of the nodes and the volumes attached to them in a single call. The cloud provider can then use this information to verify whether the volumes are attached to those nodes.

So there are 2 approaches.

  1. Use the VolumesAreAttached(specs []*volume.Spec, nodeName types.NodeName) interface to verify whether volumes are attached on each node. The number of Kubernetes calls per minute will be proportional to the number of nodes.
  2. Use the BulkVerifyVolumes(volumesByNode map[types.NodeName][]*volume.Spec) interface to verify whether volumes are attached on each node. The number of Kubernetes calls per minute will be 1.

We need to evaluate these approaches to see whether they can be implemented in the vSphere cloud provider.
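For reference, a minimal Go sketch of the two interface shapes discussed above (the local interface names here are placeholders; the signatures follow the ones quoted in this thread):

import (
    "k8s.io/apimachinery/pkg/types"
    "k8s.io/kubernetes/pkg/volume"
)

// Approach 1: per-node verification. Kubernetes calls this once per node,
// so the number of calls grows with the number of nodes.
type attacher interface {
    VolumesAreAttached(specs []*volume.Spec, nodeName types.NodeName) (map[*volume.Spec]bool, error)
}

// Approach 2: bulk verification. A single call carries every node's volume
// list, so the cloud provider can batch its VC queries.
type bulkVolumeVerifier interface {
    BulkVerifyVolumes(volumesByNode map[types.NodeName][]*volume.Spec) (map[types.NodeName]map[*volume.Spec]bool, error)
}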

@BaluDontu
Author

In the current implementation, we verify whether the UUID of the virtual disk matches the UUID of any of the devices present on the virtual machine to determine whether the disk is attached to the VM.

Instead, we would use a different approach to verify whether a list of volumes is present on the nodes: compare each volume path with the virtual device backing path to determine whether the volume is attached to the node (virtual machine). This reduces the number of calls the vSphere cloud provider makes to VC.

The problem is that the virtual device backing path is a canonical path of the form "[datastore1] 47bb5e59-a61d-f252-e2a1-02001ad1b044/volume1.vmdk" rather than "[datastore1] kubevols/volume1.vmdk".

Now let's see how we can implement this. Consider the following scenarios:

  • Dynamic volume provisioning

    • After a successful persistent volume creation, instead of returning the volume path in the form "[datastore1] kubevols/volume1.vmdk", we would return "[datastore1] 47bb5e59-a61d-f252-e2a1-02001ad1b044/volume1.vmdk".
    • Kubernetes would use this new volume path to check whether the volume is attached to the node. We would then query the VM device backing paths and check whether one of them matches the volume path. The same logic applies when multiple volumes are attached.
  • Static volume provisioning

    • Here we would ask the user to specify the canonical path instead of the actual volume path. The new DisksAreAttached() implementation can then use this canonical path directly to verify that the volume is attached to the node. Without the canonical path, we would have no way to verify that the disk is attached to the node in a single VC call. For example:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: redis-master-pv
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  vsphereVolume:
    volumePath: "[datastore1] 47bb5e59-a61d-f252-e2a1-02001ad1b044/volume1.vmdk"
    fsType: ext4 
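A rough govmomi-based sketch of the backing-path comparison described above (the function name and wiring are illustrative, not the actual provider code; it assumes the supplied volPaths are already canonical paths as discussed):

import (
    "context"

    "github.com/vmware/govmomi/object"
    "github.com/vmware/govmomi/vim25/mo"
    "github.com/vmware/govmomi/vim25/types"
)

// disksAttachedByBackingPath checks all volumes on a node VM with a single
// property retrieval instead of one VC query per volume.
func disksAttachedByBackingPath(ctx context.Context, vm *object.VirtualMachine, volPaths []string) (map[string]bool, error) {
    // One property collector call fetches the whole virtual device list for the VM.
    var vmMo mo.VirtualMachine
    if err := vm.Properties(ctx, vm.Reference(), []string{"config.hardware.device"}, &vmMo); err != nil {
        return nil, err
    }

    // Collect the backing file names of every virtual disk on the VM. These are
    // canonical paths such as "[datastore1] 47bb5e59-.../volume1.vmdk".
    backingPaths := make(map[string]bool)
    if vmMo.Config != nil {
        for _, device := range vmMo.Config.Hardware.Device {
            if disk, ok := device.(*types.VirtualDisk); ok {
                if backing, ok := disk.Backing.(*types.VirtualDiskFlatVer2BackingInfo); ok {
                    backingPaths[backing.FileName] = true
                }
            }
        }
    }

    // Compare each requested volume path against the disk backing paths.
    attached := make(map[string]bool)
    for _, volPath := range volPaths {
        attached[volPath] = backingPaths[volPath]
    }
    return attached, nil
}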

@BaluDontu BaluDontu changed the title DisksAreAttached() implementation is very slow in vSphere Cloud Provider vSphere Cloud Provider implementation to check if the volumes are attached to nodes is slow Jul 31, 2017
@BaluDontu BaluDontu changed the title vSphere Cloud Provider implementation to check if the volumes are attached to nodes is slow vSphere Cloud Provider implementation to check if the volumes are attached to nodes is making too many VC calls Jul 31, 2017
@BaluDontu BaluDontu changed the title vSphere Cloud Provider implementation to check if the volumes are attached to nodes is making too many VC calls Too many requests to VC when multiple volumes are attached to kubernetes nodes Jul 31, 2017
@BaluDontu
Author

BaluDontu commented Aug 14, 2017

I see that the BulkVerifyVolumes API returns map[types.NodeName]map[*volume.Spec]bool and an error. While trying to implement this functionality for vSphere, I ran into the following roadblock.

In vSphere, when a node is powered off, the volumes are not detached automatically, unlike GCE and AWS. The existing BulkVerifyVolumes API returns an error for the entire call, not for individual nodes. Consider a setup with 2 nodes and 2 volumes attached to each node. If node2 is down, BulkVerifyVolumes should return an error for that particular node VM (node2), so that Kubernetes attempts to detach the disks only for that node. For node1, it should return vol1 - true and vol2 - true, showing that those volumes are still attached. With the existing API, I cannot return an error for an individual node; I have to fail the entire call, which is not helpful here, because only node2 is down and needs its volumes detached automatically, not node1.

Overall, with the existing API we cannot report status per node, only for the entire call. That works fine in environments like AWS and GCE, where volumes are detached automatically when a node is powered off.

I am working with @jingxu and @gnufied for clarification on this. I have also provided my comments on kubernetes#41306 and reached out to them via the Slack channel.
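A sketch of the roadblock under the signature above (verifyVolumesOnNode is a hypothetical helper; this is not the merged implementation):

import (
    "k8s.io/apimachinery/pkg/types"
    "k8s.io/kubernetes/pkg/volume"
)

func (vs *VSphere) BulkVerifyVolumes(
    volumesByNode map[types.NodeName][]*volume.Spec) (map[types.NodeName]map[*volume.Spec]bool, error) {

    attached := make(map[types.NodeName]map[*volume.Spec]bool)
    for nodeName, specs := range volumesByNode {
        perNode, err := vs.verifyVolumesOnNode(nodeName, specs) // hypothetical per-node check
        if err != nil {
            // e.g. the node VM is powered off. Returning the error here fails the
            // whole call for every node; swallowing it hides the failure. The
            // signature has no slot for a per-node error.
            return nil, err
        }
        attached[nodeName] = perNode
    }
    return attached, nil
}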

@tusharnt

As of today, the changes are almost ready. They are undergoing rigorous testing before they can be posted for review against k8s master.

@BaluDontu
Author

The PR has been out for review for about a week - kubernetes#52131. Waiting for review from the Kubernetes reviewers.

@BaluDontu
Author

This change has been merged into master. Closing this now.
