
Update the storagePoolClaim to add/remove disks #2258

Open
jacquelotjeff opened this issue Nov 6, 2018 · 8 comments

@jacquelotjeff

commented Nov 6, 2018

Description

Currently we can create a custom storagePoolClaim (https://docs.openebs.io/docs/next/deploycstor.html) to use specific disks for a pool. In a production environment, a disk can go down (and has to be removed), or we may need to add new disks to our pool.

Context

I create the storage pool (3 disks) with: kubectl apply -f openebs-config.yaml
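
For reference, the openebs-config.yaml contains an SPC of roughly this shape (a minimal sketch; the disk names below are placeholders, not the real ones from my cluster):

apiVersion: openebs.io/v1alpha1
kind: StoragePoolClaim
metadata:
  name: cstor-disk
spec:
  name: cstor-disk
  type: disk
  poolSpec:
    poolType: striped
  disks:
    diskList:
    # three disks initially; a fourth entry is appended later
    - disk-aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
    - disk-bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
    - disk-cccccccccccccccccccccccccccccccc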

If I run kubectl get sp, I see my 3 disks:

NAME                     AGE
cstor-disk-mfq7          23m
cstor-disk-rdlw          23m
cstor-disk-rddw          23m

Later, I decide to add a new disk, so I edit openebs-config.yaml to add the new disk; then I apply the configuration: kubectl apply -f openebs-config.yaml

If I run kubectl get sp, the disk I just added does not appear:

NAME                     AGE
cstor-disk-mfq7          23m
cstor-disk-rdlw          23m
cstor-disk-rddw          23m

Possible Solution

  • Document an alternative/workaround procedure for this case
  • Handle updates to the storage pool configuration (adding/removing disks) when the SPC is re-applied

@jacquelotjeff jacquelotjeff changed the title Update the storagePoolClaim to add new/remove disks Update the storagePoolClaim to add/remove disks Nov 6, 2018

@kmova kmova added this to the Backlog milestone Nov 20, 2018

@kmova

Member

commented Nov 20, 2018

This document provides CLI commands for expanding a cStor pool (type=striped) with additional disks.

An SPC can result in one or more cStor pools depending on the disks provided in the SPC spec. A cStor Pool comprises:

  • Storage Pool CR (SP) - used for specifying the Disk CRs used by the pool.
  • cStor Storage Pool CR (CSP) - used for specifying the unique disk path used by the pool.
  • cStor Storage Pool Deployment and associated Pod.

When the SPC spec is created with a set of disks, the cstor-operator segregates the disks by node. On each node, a cStor Pool is created using the disks from that node. After the pool is provisioned, it can be expanded only with disks already discovered on the same node.

The following steps are for expanding a single cStor Storage Pool and will need to be repeated on each of the cStor Pools corresponding to an SPC.
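
As a quick sanity check before starting, all three object kinds can be listed together; this assumes the usual short names spc, csp and sp are registered and that the CSPs and SPs created from the claim carry the openebs.io/storage-pool-claim label:

kubectl get spc cstor-disk
kubectl get csp,sp -l openebs.io/storage-pool-claim=cstor-disk --show-labels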

Step 1: Identify the cStor Pool (CSP) and Storage Pool (SP) associated with the SPC.

kubectl get sp -l openebs.io/storage-pool-claim=cstor-disk --show-labels

Storage Pools sample output:

NAME              AGE       LABELS
cstor-disk-i4xj   53m       kubernetes.io/hostname=gke-kmova-helm-default-pool-2c01cdf6-9mxq,openebs.io/cas-type=cstor,openebs.io/cstor-pool=cstor-disk-i4xj,openebs.io/storage-pool-claim=cstor-disk
cstor-disk-vt1u   53m       kubernetes.io/hostname=gke-kmova-helm-default-pool-2c01cdf6-dxbf,openebs.io/cas-type=cstor,openebs.io/cstor-pool=cstor-disk-vt1u,openebs.io/storage-pool-claim=cstor-disk
cstor-disk-ys0r   53m       kubernetes.io/hostname=gke-kmova-helm-default-pool-2c01cdf6-nh6w,openebs.io/cas-type=cstor,openebs.io/cstor-pool=cstor-disk-ys0r,openebs.io/storage-pool-claim=cstor-disk

From the above list, pick the cStor Pool that needs to be expanded. The names of the CSP and the SP will be the same. The rest of the steps assume that cstor-disk-vt1u needs to be expanded.
From the above output, also note down the node on which the pool is running. In this case the node is gke-kmova-helm-default-pool-2c01cdf6-dxbf.
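
To double-check, the CSP with the same name can also be fetched directly; it should carry the same kubernetes.io/hostname label as the SP:

kubectl get csp cstor-disk-vt1u --show-labels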

Step 2: Identify the new disk that needs to be attached to the cStor Pool.

The following command can be used to list the disks on a given node:
kubectl get disks -l kubernetes.io/hostname=gke-kmova-helm-default-pool-2c01cdf6-dxbf

Sample disks output:

NAME                                      AGE
disk-b407e5862d253e666636f2fe5a01355d     46m
disk-ffca7a8731976830057238c5dc25e94c     46m
sparse-ed5a5183d2dba23782d641df61a1d869   52m

The following command can be used to see the disks already used on the node gke-kmova-helm-default-pool-2c01cdf6-dxbf:

kubectl get sp -l kubernetes.io/hostname=gke-kmova-helm-default-pool-2c01cdf6-dxbf -o jsonpath="{range .items[*]}{@.spec.disks.diskList};{end}" | tr ";" "\n"

Sample Output:

[disk-b407e5862d253e666636f2fe5a01355d]
[sparse-ed5a5183d2dba23782d641df61a1d869]

In this case, disk-ffca7a8731976830057238c5dc25e94c is unused.
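
As a rough convenience sketch, the two lists above can be compared in one shot to find unused disks; the tr/comm plumbing is just shell cleanup of the jsonpath output:

NODE=gke-kmova-helm-default-pool-2c01cdf6-dxbf
# all disks discovered on the node
kubectl get disks -l kubernetes.io/hostname="$NODE" -o jsonpath='{range .items[*]}{@.metadata.name}{"\n"}{end}' | sort > /tmp/all-disks
# disks already claimed by pools on the node (strip the [ ] around each diskList)
kubectl get sp -l kubernetes.io/hostname="$NODE" -o jsonpath='{range .items[*]}{@.spec.disks.diskList};{end}' | tr ";" "\n" | tr -d '[]' | tr ' ' '\n' | sed '/^$/d' | sort > /tmp/used-disks
# disks present on the node but not used by any pool
comm -23 /tmp/all-disks /tmp/used-disks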

Step 3: Patch CSP with the disk path details

Get the disk path listed under the unique by-id path in devlinks:

kubectl get disk disk-ffca7a8731976830057238c5dc25e94c -o jsonpath="{range .spec.devlinks[0]}{@.links[0]};{end}" | tr ";" "\n"

Sample Output:

/dev/disk/by-id/scsi-0Google_PersistentDisk_kmova-n2-d1

Patch the above disk path into the CSP:

kubectl patch csp cstor-disk-vt1u --type json -p '[{ "op": "add", "path": "/spec/disks/diskList/-", "value": "/dev/disk/by-id/scsi-0Google_PersistentDisk_kmova-n2-d1" }]'

Verify that the disk is patched by executing kubectl get csp cstor-disk-vt1u -o yaml and checking that the new disk is added under diskList.
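
Alternatively, print just the disk list, which is easier to eyeball than the full YAML:

kubectl get csp cstor-disk-vt1u -o jsonpath='{.spec.disks.diskList}'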

Step 4: Patch SP with disk name

The following command patches the SP (cstor-disk-vt1u) with the disk (disk-ffca7a8731976830057238c5dc25e94c):

kubectl patch sp cstor-disk-vt1u --type json -p '[{ "op": "add", "path": "/spec/disks/diskList/-", "value": "disk-ffca7a8731976830057238c5dc25e94c" }]'

Verify that the disk is patched by executing kubectl get sp cstor-disk-vt1u -o yaml and checking that the new disk is added under diskList.
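
As with the CSP, the patched list can be printed directly:

kubectl get sp cstor-disk-vt1u -o jsonpath='{.spec.disks.diskList}'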

Step 5: Expand the pool.

The last step is to update the cStor pool pod (cstor-disk-vt1u) with the disk path (/dev/disk/by-id/scsi-0Google_PersistentDisk_kmova-n2-d1).

Identify the cstor pool pod associated with CSP cstor-disk-vt1u.
kubectl get pods -n openebs | grep cstor-disk-vt1u

Sample Output:

cstor-disk-vt1u-65b659d574-8f6fp            2/2       Running   0          1h        10.44.1.8    gke-kmova-helm-default-pool-2c01cdf6-dxbf

Check the pool name: kubectl exec -it -n openebs cstor-disk-vt1u-65b659d574-8f6fp -- zpool list

Sample Output:

NAME                                         SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
cstor-deaf87e6-ec78-11e8-893b-42010a80003a   496G   202K   496G         -     0%     0%  1.00x  ONLINE  -

Extract the pool name from the above output; in this case it is cstor-deaf87e6-ec78-11e8-893b-42010a80003a.

Expand the pool with the additional disk:

kubectl exec -it -n openebs cstor-disk-vt1u-65b659d574-8f6fp -- zpool add cstor-deaf87e6-ec78-11e8-893b-42010a80003a /dev/disk/by-id/scsi-0Google_PersistentDisk_kmova-n2-d1

You can execute the list command again to see the increase in capacity.
kubectl exec -it -n openebs cstor-disk-vt1u-65b659d574-8f6fp -- zpool list

Sample Output:

NAME                                         SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
cstor-deaf87e6-ec78-11e8-893b-42010a80003a   992G   124K   992G         -     0%     0%  1.00x  ONLINE  -
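
Optionally, zpool status lists the newly added device explicitly, which is a stronger check than just the capacity change:

kubectl exec -it -n openebs cstor-disk-vt1u-65b659d574-8f6fp -- zpool status cstor-deaf87e6-ec78-11e8-893b-42010a80003a
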
@stpabhi


commented Nov 20, 2018

@kmova How will this work with dynamic provisioning? I have added new storage nodes and don't have a CSP for those nodes to patch.

@chtardif


commented Nov 21, 2018

I can confirm it works. That being said, there should be, sooner rather than later, a simpler way to do this, ideally by just editing the SPC.

@kmova kmova modified the milestones: Backlog, 0.9 Nov 21, 2018

@kmova

Member

commented Nov 21, 2018

Absolutely @chtardif - Marking this for 0.9.

@stpabhi - For dynamic provisioning, changing the number of maxPools in the SPC spec will create a new pool on the newly added storage node. However, to effectively use the storage on this new pool, I think we will also need to make sure the volume replicas are balanced across the available pools. (#2290)
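
For example (assuming the SPC is named cstor-disk and currently has maxPools: 3 in its spec), the count could be bumped with something like:

kubectl patch spc cstor-disk --type merge -p '{"spec":{"maxPools":4}}'

Once the new pool comes up on the new node, the expansion steps above still apply if more disks need to be added to it later.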

@muratkars muratkars modified the milestones: 0.9, 0.10 Apr 5, 2019

@tbe


commented May 22, 2019

I just wonder why this is not a blocker, or at least mentioned in the docs: "WARNING! You cannot remove a broken disk from a custom pool."

Disks break. "Leading Open Source" software should be able to handle this.

We lost one of our disks on one node yesterday. After replacing it, the corresponding pool was still broken, so we restarted it. At that point the whole thing fell apart: many volumes stuck in state "Recreating", the pool itself didn't start anymore, and all iSCSI targets for volumes that had a replica on that pool stalled because of missing quorum (in a two-replica setup?!).

Yeah, this went a little off-topic, sorry for that. I just wanted to illustrate what happens when you put a "not so fancy, but essential" feature at the end of the roadmap.

@vishnuitta

Member

commented May 22, 2019

Sorry for the issues you faced, @tbe. We can recover the data by recreating the pool on which the disk went bad.
Would you be able to join our Slack channel (http://openebs-community.slack.com/)? We can fix it over a Zoom session if that helps you.

@tbe


commented May 22, 2019

Sorry for the issues you faced, @tbe. We can recover the data by recreating the pool on which the disk went bad.
Would you be able to join our Slack channel (http://openebs-community.slack.com/)? We can fix it over a Zoom session if that helps you.

Thanks @vishnuitta for the offer, but we have decided to move on with Rook, as we know our way around Ceph. But it would be great if you could document the process of restoring a broken pool; that may help others who are not so lucky to have backed up everything twice.

@vishnuitta

Member

commented May 22, 2019

Sure @tbe. Thanks for the feedback 👍 We will document it. cc: @ranjithwingrider
