ControllerCreateVolume fails as the volume already exists #32

Closed
Madhan-SWE opened this issue Dec 6, 2021 · 3 comments · Fixed by #83

@Madhan-SWE

While running multiple workloads with kube-burner, the ControllerCreateVolume method is called
and fails for a few volumes in the batch with the following error:

{"description":"bad request: pvc-d5d0d6f7-1c1e-4c13-8a2a-ebabd31fa9b0 volume name already exists for cloud instance 7031b049297e4588a3eafb21335d6a2b; duplicate names are not allowed","error":"bad request"}

Log:


I1206 11:10:37.257617       1 controller.go:78] CreateVolume: called with args {Name:pvc-e79fa4ea-e1bb-484d-9a7a-6539f23635f5 CapacityRange:required_bytes:1073741824  VolumeCapabilities:[mount:<fs_type:"xfs" > access_mode:<mode:SINGLE_NODE_WRITER > ] Parameters:map[type:tier3] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:requisite:<segments:<key:"topology.powervs.csi.ibm.com/disk-type" value:"tier1" > > preferred:<segments:<key:"topology.powervs.csi.ibm.com/disk-type" value:"tier1" > >  XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
2021/12/06 11:10:37 calling the PowerVolume Create Method
2021/12/06 11:10:37 Calling the New Auth Method in the IBMPower Session Code
2021/12/06 11:10:37 Calling the crn constructor that is to be passed back to the caller  65b64c1f1c29460e8c2e4bbfbd893c2c
2021/12/06 11:10:37 the region is lon and the zone is  lon04
2021/12/06 11:10:37 the crndata is ... crn:v1:bluemix:public:power-iaas:lon04:a/65b64c1f1c29460e8c2e4bbfbd893c2c:7845d372-d4e1-46b8-91fc-41051c984601:: 
POST /pcloud/v1/cloud-instances/7845d372-d4e1-46b8-91fc-41051c984601/volumes HTTP/1.1
Host: lon.power-iaas.cloud.ibm.com
User-Agent: Go-http-client/1.1
Content-Length: 98
Accept: application/json
Authorization: Bearer
Content-Type: application/json
Crn: crn:v1:bluemix:public:power-iaas:lon04:a/65b64c1f1c29460e8c2e4bbfbd893c2c:7845d372-d4e1-46b8-91fc-41051c984601::
Accept-Encoding: gzip

{"diskType":"tier3","name":"pvc-e79fa4ea-e1bb-484d-9a7a-6539f23635f5","shareable":false,"size":1}

HTTP/1.1 400 Bad Request
Content-Length: 206
Cf-Cache-Status: DYNAMIC
Cf-Ray: 6b95118b3e204e97-FRA
Connection: keep-alive
Content-Type: application/json
Date: Mon, 06 Dec 2021 11:10:39 GMT
Expect-Ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
Server: cloudflare
Strict-Transport-Security: max-age=15724800; includeSubDomains

{"description":"bad request: pvc-e00e231d-f97b-44ab-8a9b-23cd677ad612 volume name already exists for cloud instance 7031b049297e4588a3eafb21335d6a2b; duplicate names are not allowed","error":"bad request"}

E1206 11:10:39.924560       1 driver.go:116] GRPC error: rpc error: code = Internal desc = Could not create volume "pvc-e00e231d-f97b-44ab-8a9b-23cd677ad612": {"description":"bad request: pvc-e00e231d-f97b-44ab-8a9b-23cd677ad612 volume name already exists for cloud instance 7031b049297e4588a3eafb21335d6a2b; duplicate names are not allowed","error":"bad request"}

@Madhan-SWE
Author

/assign

@Madhan-SWE changed the title from "ControllerCreate Volumes fails as the volume already exists" to "ControllerCreateVolume fails as the volume already exists" on Dec 6, 2021
@Madhan-SWE
Author

Because many volumes are created at the same time, IBM Cloud takes longer to return each response.
In the meantime, the controller plugin issues a new request to create the volume and ignores the response to the previous request.

I looked into the external-provisioner sidecar component and set its timeout to 100s (--timeout=100s) in the deployment.
With this change, the create volume requests pass.

Fix:

[root@madhan-1-kube-1-22-2 powervs-csi-driver]# cat deploy/kubernetes/base/controller.yaml 
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: powervs-csi-controller
  namespace: kube-system
  labels:
    app.kubernetes.io/name: powervs-csi-driver
spec:
      ....
      ....
      containers:
        ....
        ....
        - name: csi-provisioner
          image: k8s.gcr.io/sig-storage/csi-provisioner:v2.0.4
          args:
            - --csi-address=$(ADDRESS)
            - --v=5
            - --feature-gates=Topology=true
            - --leader-election
            - --timeout=100s
            #- --leader-election-type=leases
          env:
            - name: ADDRESS
              value: /var/lib/csi/sockets/pluginproxy/csi.sock
          volumeMounts:
            - name: socket-dir
              mountPath: /var/lib/csi/sockets/pluginproxy/
        ....
        ....
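
The failure mode described in this comment can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the driver's code: `FakeVolumeBackend`, `provision`, the 30s latency, and the 10s short timeout are all hypothetical, and time is simulated rather than actually waited. It assumes the retry happens because the caller gives up after its timeout while the backend still completes the first create.

```python
class FakeVolumeBackend:
    """Hypothetical stand-in for the cloud volume API; it rejects duplicate names."""

    def __init__(self, latency_s):
        self.latency_s = latency_s  # simulated time the backend needs per create
        self.volumes = set()

    def create(self, name):
        # The backend completes the create even if the caller has stopped waiting.
        if name in self.volumes:
            raise ValueError(
                f"{name} volume name already exists; duplicate names are not allowed"
            )
        self.volumes.add(name)
        return self.latency_s


def provision(backend, name, timeout_s, max_attempts=3):
    """Retry create() whenever the simulated latency exceeds the caller's timeout."""
    for attempt in range(1, max_attempts + 1):
        try:
            elapsed = backend.create(name)
        except ValueError as err:
            return f"attempt {attempt} failed: {err}"
        if elapsed <= timeout_s:
            return f"attempt {attempt} succeeded"
        # Timed out: the response is discarded, but the volume now exists.
    return "gave up after timeouts"


# Short timeout: the retry collides with the volume created by the first attempt.
print(provision(FakeVolumeBackend(latency_s=30), "pvc-demo", timeout_s=10))
# Longer timeout: the first response arrives in time, so no retry is sent.
print(provision(FakeVolumeBackend(latency_s=30), "pvc-demo", timeout_s=100))
```

In this model the longer timeout only hides the problem while backend latency stays below it; a request slower than the timeout would still be retried and hit the duplicate-name error.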

@Madhan-SWE
Author

Even though increasing the timeout fixes the issue, we need to benchmark the driver with several hundred workloads, then settle on a constant timeout and make the change in the code.

This issue shouldn't be closed until the constant is decided and the change is made in the repo.
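
One possible way to turn benchmark results into a constant is to take a high percentile of the observed create latencies and add headroom. The sketch below assumes that approach; `suggest_timeout`, the percentile, the headroom factor, and the sample values are all hypothetical placeholders, not measurements from this driver.

```python
def suggest_timeout(latencies_s, percentile=99, headroom=1.5):
    """Nearest-rank percentile of observed create latencies, scaled by a safety factor."""
    if not latencies_s:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_s)
    # 1-based nearest rank, computed with integer ceiling division.
    rank = max(1, (percentile * len(ordered) + 99) // 100)
    return ordered[rank - 1] * headroom


# Placeholder samples in seconds, NOT real benchmark data.
samples = [4.0, 6.5, 8.0, 12.0, 20.0, 35.0, 40.0, 42.0, 55.0, 60.0]
print(suggest_timeout(samples, percentile=90))  # 55.0 * 1.5 = 82.5
```

The percentile and headroom are judgment calls; the real benchmark would show whether latency grows with batch size, which a single constant cannot capture.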
