
K8SSAND-1180 ⁃ How do we gracefully increase storage capacity via cass-operator while Cass Datacenter, Statefulset etc are in service with incoming workloads #263

Open
mparikhcloudbeds opened this issue Jan 20, 2022 · 26 comments
Assignees
Labels
assess (Issues in the state 'assess'), question (Further information is requested), zh:Assess/Investigate

Comments

@mparikhcloudbeds

mparikhcloudbeds commented Jan 20, 2022

Following up on the thread below, I wanted to get an update:
https://community.datastax.com/questions/12269/index.html

Environment:

  • AWS EKS and AWS EBS
  • Cass-Operator : 1.9
  • Server Image : DSE 6.8.18 and/or OSS 3.11.11

┆Issue is synchronized with this Jira Task by Unito
┆friendlyId: K8SSAND-1180
┆priority: Medium

@mparikhcloudbeds mparikhcloudbeds added the question Further information is requested label Jan 20, 2022
@sync-by-unito sync-by-unito bot changed the title How do we gracefully increase storage capacity via cass-operator while Cass Datacenter, Statefulset etc are in service with incoming workloads K8SSAND-1180 ⁃ How do we gracefully increase storage capacity via cass-operator while Cass Datacenter, Statefulset etc are in service with incoming workloads Jan 20, 2022
@burmanm
Contributor

burmanm commented Jan 21, 2022

Hi, does your PV provider support PVC volume expansion?

@mparikhcloudbeds
Author

Hi, does your PV provider support PVC volume expansion?

@burmanm - Yes, the storage class that we are using has the following property.
allowVolumeExpansion: true
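(For reference, one way to confirm this is to query the flag directly on the StorageClass; the class name `gp2` below is just a placeholder for your actual class:)

```shell
# Print the allowVolumeExpansion flag of a StorageClass
# (replace "gp2" with your actual StorageClass name).
kubectl get storageclass gp2 -o jsonpath='{.allowVolumeExpansion}'
# PVCs using this class can only be grown if this prints "true".
```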

@mparikhcloudbeds
Author

@burmanm - Following up to see if there's an update on this?

@burmanm
Contributor

burmanm commented Jan 28, 2022

Hey, sorry. The process of expanding a PVC used by a StatefulSet is a bit tricky and involves manual operations (a restriction of Kubernetes). Sadly, my local instance did not support the feature, but I'll try to create an example with documented steps shortly.

@mparikhcloudbeds
Author

Thanks @burmanm.

Is this something on the roadmap of cass-operator project?

@adejanovski adejanovski added zh:To Do Issues in the ZenHub pipeline 'To Do' zh:To-Do and removed zh:To Do Issues in the ZenHub pipeline 'To Do' labels Mar 30, 2022
@bradfordcp
Member

It's a feature we would like to see, but unfortunately it has not been scheduled yet. We have identified the steps to resolve the issue, but it will require a bit of time to implement.

@counter2015

counter2015 commented May 6, 2022

Hey, sorry. The process of expanding a PVC used by a StatefulSet is a bit tricky and involves manual operations (a restriction of Kubernetes). Sadly, my local instance did not support the feature, but I'll try to create an example with documented steps shortly.

@burmanm Could you provide more details about this?
I have a 4-node cluster and its disk usage is almost full.
A workaround is to add nodes to the cluster, let the data rebalance, and then run cleanup, but that wastes CPU and memory resources.

@discostur

@counter2015 you can easily increase your storage manually:

  • set the new storage capacity in your PVCs
  • restart the Cassandra pods one by one

Then your PVCs should automatically be resized by your storage CSI driver.
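Those two steps might look roughly like this (the PVC and pod names below are invented placeholders, and this assumes the StorageClass has allowVolumeExpansion: true and the CSI driver supports expansion):

```shell
# 1. Set the new capacity on the PVC (placeholder name).
kubectl patch pvc server-data-cluster1-dc1-default-sts-0 \
  --patch '{"spec": {"resources": {"requests": {"storage": "100Gi"}}}}'

# 2. Restart the Cassandra pods one at a time so the filesystem
#    resize is picked up (some CSI drivers resize online and do
#    not even need the restart).
kubectl delete pod cluster1-dc1-default-sts-0
```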

@counter2015

counter2015 commented Jun 9, 2022

@discostur I am not sure whether the PVC capacity will be changed by the operator after I edit the datacenter YAML file.
In the end, I increased storage capacity by creating a new datacenter and migrating the data from the old dc1 to the new dc2.

@discostur

@counter2015 No, it does not! I edited my datacenter YAML file and nothing was changed in the PVC/PV. So I edited the PVC manually and the storage was resized.

@jsanda
Contributor

jsanda commented Jun 17, 2022

The process is actually a bit more involved to do it safely.

First, delete the StatefulSet without deleting its pods. This can be done, for example, with kubectl delete --cascade=false.

Next, make sure that persistentVolumeReclaimPolicy on the PV is set to Retain. Remove the claim reference, then delete the PVC.

Now go ahead and expand the volume, and update the capacity in the PV spec.

Create a new PVC that will bind to the PV. The name of the PVC needs to be the same as the name of the old one.

Lastly, recreate the StatefulSet. The StatefulSet controller finds the existing PVCs and pods, and the StatefulSet will immediately move into the ready state (assuming the pods are ready).
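The steps above can be sketched as shell commands. All resource names here are invented placeholders, and this is an untested outline rather than a verified runbook:

```shell
# 1. Delete the StatefulSet but keep its pods running
#    (--cascade=orphan is the modern spelling of --cascade=false).
kubectl delete statefulset cluster1-dc1-default-sts --cascade=orphan

# 2. Make sure the PV will survive PVC deletion.
kubectl patch pv pvc-0000-placeholder \
  --patch '{"spec": {"persistentVolumeReclaimPolicy": "Retain"}}'

# 3. Remove the claim reference so the PV becomes Available again,
#    then delete the PVC.
kubectl patch pv pvc-0000-placeholder --type json \
  --patch '[{"op": "remove", "path": "/spec/claimRef"}]'
kubectl delete pvc server-data-cluster1-dc1-default-sts-0

# 4. Expand the volume at the storage provider, then record the
#    new capacity in the PV spec.
kubectl patch pv pvc-0000-placeholder \
  --patch '{"spec": {"capacity": {"storage": "100Gi"}}}'

# 5. Recreate a PVC with the SAME name as the old one so it binds
#    to the retained PV, then recreate the StatefulSet (e.g. by
#    re-applying its original manifest).
```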

@counter2015

@jsanda Is there any risk in editing the PVC size directly?

@jsanda
Contributor

jsanda commented Jun 17, 2022

That may work and might be easier than what I prescribed. I would need to do some testing/investigation to be certain.

@okgolove

okgolove commented Jan 5, 2023

prometheus-operator (which also uses a StatefulSet for Prometheus pods) offers this approach. But it does not work for k8ssandra because of the admission webhook:

admission webhook "vcassandradatacenter.kb.io" denied the request: CassandraDatacenter write rejected, attempted to change storageConfig

My example:

k8ssandracluster:

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: demo
spec:
  cassandra:
    serverVersion: "4.0.3"
    serverImage: k8ssandra/cass-management-api:4.0.3
    telemetry:
      prometheus:
        enabled: true
    storageConfig:
      cassandraDataVolumeClaimSpec:
        storageClassName: gp3-multizone
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi
    config:
      jvmOptions:
        heapSize: 512M
    datacenters:
      - metadata:
          name: dc1
        size: 9
        racks:
          - name: r1
            nodeAffinityLabels:
              onairent.live/node-type: cassandra-node
              topology.kubernetes.io/zone: eu-north-1a
          - name: r2
            nodeAffinityLabels:
              onairent.live/node-type: cassandra-node
              topology.kubernetes.io/zone: eu-north-1b
          - name: r3
            nodeAffinityLabels:
              onairent.live/node-type: cassandra-node
              topology.kubernetes.io/zone: eu-north-1c
  • Change storage to 150Gi
  • Apply changed manifest
  • Patch PVCs
for p in $(kubectl get pvc -l cassandra.datastax.com/datacenter=dc1 -o jsonpath='{range .items[*]}{.metadata.name} {end}'); do \
  kubectl patch pvc/${p} --patch '{"spec": {"resources": {"requests": {"storage":"150Gi"}}}}'; \
done
  • Delete statefulsets
kubectl delete statefulset -l cassandra.datastax.com/datacenter=dc1 --cascade=orphan

After that, no changes are applied to the Cassandra cluster due to the error mentioned above. Even if I try to resize my cluster, I get the error and nothing happens.

@adziura-ledger

I'm having the same issue described in the previous comment.
What is the procedure to increase storage capacity in this case?

@chandapukiran

I directly edited the PVCs and restarted the pods in my test environment. Nothing is broken; I can see the new size reflected in the PVCs and can still access the test data.

@okgolove

okgolove commented May 31, 2023

Check the prometheus-operator resizing manual. It works fine for k8ssandra as well.

@chandapukiran

@okgolove I tried the steps from the prometheus-operator resizing manual and they worked for me, but when I deleted the cluster and tried again on a new cluster, I got the error you mentioned above. Is it still working for you?

Error from server (CassandraDatacenter write rejected, attempted to change storageConfig.CassandraDataVolumeClaimSpec): admission webhook "vcassandradatacenter.kb.io" denied the request: CassandraDatacenter write rejected, attempted to change storageConfig.CassandraDataVolumeClaimSpec

@adejanovski adejanovski added the assess Issues in the state 'assess' label Jun 7, 2023
@okgolove

okgolove commented Jun 7, 2023

@chandapukiran did you change the storage size in the cluster manifest before recreating?

@chandapukiran

@okgolove No. Basically, I created a cluster with a default size and later tried to change the size by modifying the Cassandra object.

@okgolove

okgolove commented Jun 7, 2023

@chandapukiran Ah, yes. The admission webhook won't let you make this change. I disabled it temporarily, then made the modification.

@chandapukiran

@okgolove Oh, OK. Could you share the commands to disable/enable the admission webhook?

@okgolove

okgolove commented Jun 7, 2023

@chandapukiran how did you install the operator? If via the Helm chart, then just set

cass-operator:
  admissionWebhooks:
    enabled: false

Or just delete the admission webhook via kubectl.
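For the kubectl route, something like the following should work. The configuration name below is an assumption based on typical cass-operator deployments, so list the configurations first and use the actual name:

```shell
# List validating webhook configurations to find the cass-operator one.
kubectl get validatingwebhookconfigurations

# Delete it (the name here is an assumption; use the one listed above).
kubectl delete validatingwebhookconfiguration \
  cass-operator-validating-webhook-configuration
```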

@chandapukiran

Thanks @okgolove. I see it is already disabled in my Helm chart, but I now understand why it worked for me before and not now: I was playing with k8ssandra-operator in another namespace, and that was causing the issue. Now I am good.

@surajk94

Adding the exact steps to follow, for quick reference:

  • disable admissionWebhooks in the operator and re-deploy it: cass-operator: admissionWebhooks: enabled: false
  • stop the required datacenters and set the new volume size in the K8ssandraCluster: set the stopped: true flag in each of the required datacenters in the datacenters list and apply the YAML file using kubectl apply -f <file>
  • manually edit the PVC for each node in the cluster to the required size: use kubectl edit pvc <pvc-name> -n <namespace> and change the size in the spec section
  • delete the underlying StatefulSets using the orphan deletion strategy: kubectl delete statefulset <sts-name> -n <namespace> --cascade=orphan
  • remove the stopped flag in the K8ssandraCluster YAML file and apply the changes to restart the stopped datacenters
  • re-enable admissionWebhooks in the operator and re-deploy it
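Put together, the middle steps might look roughly like this (the datacenter label value dc1, the size 150Gi, and the manifest file name are placeholders; treat this as a sketch under the assumptions above, not a verified runbook):

```shell
# 1. With the webhook disabled and "stopped: true" plus the new
#    storage size set in the K8ssandraCluster manifest, apply it.
kubectl apply -f k8ssandra-cluster.yaml

# 2. Patch every PVC of the datacenter to the new size.
for p in $(kubectl get pvc -l cassandra.datastax.com/datacenter=dc1 \
    -o jsonpath='{range .items[*]}{.metadata.name} {end}'); do
  kubectl patch pvc "${p}" \
    --patch '{"spec": {"resources": {"requests": {"storage": "150Gi"}}}}'
done

# 3. Delete the StatefulSets but keep their pods and PVCs.
kubectl delete statefulset \
  -l cassandra.datastax.com/datacenter=dc1 --cascade=orphan

# 4. Remove "stopped: true" from the manifest and re-apply it to
#    restart the datacenter, then re-enable the webhook.
kubectl apply -f k8ssandra-cluster.yaml
```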

@burmanm
Contributor

burmanm commented Dec 19, 2023

Implementation ticket: #602
