ceph: auto grow OSDs size on PVCs
When OSDs reach the OSD_NEARFULL state, the volume claim has to be increased manually.
Added a new script, increase-size-of-osd, which automatically increases the volume claim according to the specified growth rate percentage.

Closes: #6101
Signed-off-by: parth-gr <paarora@redhat.com>
parth-gr committed Jun 11, 2021
1 parent 331aaea commit 0a12d91
Showing 3 changed files with 156 additions and 1 deletion.
40 changes: 40 additions & 0 deletions Documentation/ceph-advanced-configuration.md
@@ -22,6 +22,7 @@ storage cluster.
* [OSD Dedicated Network](#osd-dedicated-network)
* [Phantom OSD Removal](#phantom-osd-removal)
* [Change Failure Domain](#change-failure-domain)
* [Auto Expansion of OSDs](#auto-expansion-of-osds)

## Prerequisites

@@ -590,3 +591,42 @@ ceph osd pool get replicapool crush_rule
If the cluster's health was `HEALTH_OK` when we performed this change, the new rule is applied to the cluster immediately and transparently, without service disruption.

Exactly the same approach can be used to change from `host` back to `osd`.

## Auto Expansion of OSDs
### To scale OSDs Vertically

If you need to automatically grow the size of the OSDs on a PVC-based Rook-Ceph cluster whenever they reach the storage threshold, you can use the `auto-grow-storage.sh` script. It watches the Prometheus alerts `PersistentVolumeUsageNearFull` and `PersistentVolumeUsageCritical` and enlarges the affected PVCs.

Run the script with the `size` argument:
```console
./tests/scripts/auto-grow-storage.sh size
```

The script then prompts for the growth rate percentage, which is the percentage by which you want the OSD capacity to increase.

For example, to increase the size of the OSDs by 30%:
```console
Enter the growth rate percentage of OSD:
30
```
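
As a rough sketch of the arithmetic behind the growth rate, the helper below mirrors the calculation in `growVertically`: the increase is computed with integer division, and a claim too small for the percentage to register is bumped by one unit instead. The function name `compute_new_size` is hypothetical, and Bash arithmetic stands in for the `bc` call used by the script.

```bash
#!/usr/bin/env bash

# Hypothetical helper: compute_new_size <current-size> <growth-percent>
# Both arguments are plain integers; the unit (Mi, Gi, Ti) is handled separately.
compute_new_size() {
    local raw=$1 rate=$2
    # Integer arithmetic: any fractional part of the increase is dropped.
    local new=$(( raw + raw * rate / 100 ))
    # Guarantee at least one unit of growth when the increase rounds down to zero.
    if (( new == raw )); then
        new=$(( raw + 1 ))
    fi
    echo "${new}"
}

compute_new_size 10 30   # prints 13: a 10Gi claim grown by 30%
compute_new_size 1 30    # prints 2: 30% of 1 rounds to 0, so one unit is added
```

Because of the integer division, the effective growth can be slightly below the requested percentage for small claims.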

### To scale OSDs Horizontally

If you need to automatically grow the number of OSDs on a PVC-based Rook-Ceph cluster whenever the OSDs reach the storage threshold, you can use the same `auto-grow-storage.sh` script, which reacts to the Prometheus alerts `PersistentVolumeUsageNearFull` and `PersistentVolumeUsageCritical`.

Run the script with the `count` argument:
```console
./tests/scripts/auto-grow-storage.sh count
```

The script then prompts for the count of OSDs, which is the number of OSDs you want to add.

For example, to increase the number of OSDs by 3:
```console
Enter the count of OSD you need to add:
3
```
> NOTE: The minimum count you should specify is 3.
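
In this commit the `growHorizontally` function contains only commented-out lines, so the following is a sketch of one way the count could be raised, not the script's behavior. It assumes the OSDs come from a `storageClassDeviceSets` entry in the CephCluster resource, whose `count` field sets the number of OSDs. The helper name `build_count_patch`, the device set index, and the cluster name and namespace in the usage comment are all placeholders.

```bash
#!/usr/bin/env bash

# Hypothetical helper: build_count_patch <device-set-index> <new-count>
# Prints a JSON patch that replaces the count of one storageClassDeviceSets entry.
build_count_patch() {
    local index=$1 count=$2
    printf '[{"op":"replace","path":"/spec/storage/storageClassDeviceSets/%s/count","value":%s}]\n' \
        "${index}" "${count}"
}

build_count_patch 0 6

# Assumed usage (cluster name and namespace are placeholders):
# oc patch cephcluster rook-ceph -n rook-ceph --type json --patch "$(build_count_patch 0 6)"
```

The patch sets an absolute count, so a caller would first read the current value and add the requested number of OSDs to it.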
3 changes: 2 additions & 1 deletion Documentation/ceph-monitoring.md
@@ -196,4 +196,5 @@ spec:
labels:
monitoring:
prometheus: k8s
[...]
[...]
```
114 changes: 114 additions & 0 deletions tests/scripts/auto-grow-storage.sh
@@ -0,0 +1,114 @@
#!/usr/bin/env bash

#############
# FUNCTIONS #
#############

function growVertically() {
    growRate=$1
    pvc=$2
    ns=$3
    currentsize=$(oc get pvc "${pvc}" -n "${ns}" -o json | jq -r '.spec.resources.requests.storage')
    echo "PVC current size is ${currentsize}. It will be increased by ${growRate}%."
    # Split the requested size into its numeric value and its unit.
    if [[ "$currentsize" == *"Mi" ]]
    then
        rawsize=$(echo "$currentsize" | sed -e 's/Mi//g')
        unitsize="Mi"
    elif [[ "$currentsize" == *"Gi" ]]
    then
        rawsize=$(echo "$currentsize" | sed -e 's/Gi//g')
        unitsize="Gi"
    elif [[ "$currentsize" == *"Ti" ]]
    then
        rawsize=$(echo "$currentsize" | sed -e 's/Ti//g')
        unitsize="Ti"
    else
        # Without a known unit there is nothing safe to patch, so stop here.
        echo "Unknown unit for this PVC: ${currentsize}"
        return 1
    fi
    # bc truncates the division, so the increase is rounded down to a whole unit.
    newsize=$(echo "${rawsize}+(${rawsize} * ${growRate})/100" | bc | cut -f1 -d'.')
    if [ "${newsize}" = "${rawsize}" ]
    then
        # The increase rounded down to zero: grow by one unit instead.
        newsize=$(( rawsize + 1 ))
        echo "New adjusted calculated size for the PVC is ${newsize}${unitsize}"
    else
        echo "New calculated size for the PVC is ${newsize}${unitsize}"
    fi
    # TODO: cap the new size at the available disk capacity
    # if [ "${newsize}" -gt sum(node_filesystem_size_bytes) ]
    # then
    #   newsize = sum(node_filesystem_size_bytes)
    #   echo "Disk has reached its MAX capacity, add a new disk to it"
    # The inner quotes are escaped so that the patch is valid JSON.
    result=$(oc patch pvc "${pvc}" -n "${ns}" --type json --patch "[{ \"op\": \"replace\", \"path\": \"/spec/resources/requests/storage\", \"value\": \"${newsize}${unitsize}\" }]")
    echo "${result}"
}

function growHorizontally() {
    count=$1
    pvc=$2
    ns=$3
    # TODO: update the CephCluster CRD
    # deviceSet=$(oc get pvc ${pvc} -n ${ns} -o json | jq -r '.spec.labels.ceph.rook.io/DeviceSet')
    # finalCount=
    # result=$(oc patch pvc ${pvc} -n ${ns} --type json --patch "[{ "op": "replace", "path": "/spec/resources/requests/storage", "value": "${newsize}${unitsize}" }]")
    # echo "OSDs count increased by 3,Total count for device set ${deviceSet} is ${finalCount}"
    # A function body made only of comments is a Bash syntax error, so report the gap explicitly.
    echo "Horizontal scaling is not implemented yet; PVC ${pvc} in namespace ${ns} was left unchanged." >&2
}

function growOSD(){
    i=0
    alertmanagerroute=$(oc get route -n openshift-monitoring | grep alertmanager-main | awk '{ print $2 }')
    curl -sk -H "Authorization: Bearer $(oc sa get-token prometheus-k8s -n openshift-monitoring)" "https://${alertmanagerroute}/api/v1/alerts" | jq -r '.' >./tt.txt
    total_alerts=$(jq '.data | length' ./tt.txt)
    echo "Looping at $(date +"%Y-%m-%d %H:%M:%S")"

    while true
    do
        entry=$(jq ".data[$i]" ./tt.txt)
        thename=$(echo "$entry" | jq -r '.labels.alertname')
        if [ x"${thename}" = "xPersistentVolumeUsageNearFull" ] || [ x"${thename}" = "xPersistentVolumeUsageCritical" ]
        then
            echo "$entry"
            ns=$(echo "$entry" | jq -r '.labels.namespace')
            pvc=$(echo "$entry" | jq -r '.labels.persistentvolumeclaim')
            echo "Processing NearFull or Full alert for PVC ${pvc} in namespace ${ns}"
            if [[ $1 == "count" ]]
            then
                growHorizontally "$2" "${pvc}" "${ns}"
            else
                growVertically "$2" "${pvc}" "${ns}"
            fi
        fi

        # Advance on every entry, matching or not; otherwise a non-matching alert stalls the loop.
        (( i = i + 1 ))
        # ">=" also covers an empty alert list, where i would never equal total_alerts.
        if (( i >= total_alerts ))
        then
            # All alerts processed: wait five minutes, then fetch a fresh list.
            sleep 300
            rm -f ./tt.txt
            alertmanagerroute=$(oc get route -n openshift-monitoring | grep alertmanager-main | awk '{ print $2 }')
            curl -sk -H "Authorization: Bearer $(oc sa get-token prometheus-k8s -n openshift-monitoring)" "https://${alertmanagerroute}/api/v1/alerts" | jq -r '.' >./tt.txt
            total_alerts=$(jq '.data | length' ./tt.txt)
            i=0
            echo "Looping at $(date +"%Y-%m-%d %H:%M:%S")"
        fi
    done
}

case "${1:-}" in
count)
    echo "Enter the count of OSD you need to add: "
    read -r count
    echo "Adding OSDs on nearfull and full alerts; number of OSDs to add is ${count}"
    growOSD count "${count}"
    ;;
size)
    echo "Enter the growth rate percentage of OSD: "
    read -r growRate
    echo "Resizing on nearfull and full alerts; expansion percentage set to ${growRate}%"
    growOSD growRate "${growRate}"
    ;;
*)
    echo " $0 [command]
Available Commands:
  count   Scale horizontally by adding more OSD disks to the cluster
  size    Scale vertically by claiming more storage for the already present disks
" >&2
    ;;
esac
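
The unit handling in `growVertically` could also be written with Bash parameter expansion instead of one `sed` call per unit. The sketch below is an alternative under that assumption, not the code from this commit. The name `split_size` is hypothetical, and it deliberately rejects anything that is not a whole number followed by `Mi`, `Gi`, or `Ti`.

```bash
#!/usr/bin/env bash

# Hypothetical helper: split_size <quantity>
# Prints "<number> <unit>" for sizes such as 10Gi, or returns 1 on anything else.
split_size() {
    local size=$1
    local unit=${size: -2}   # last two characters, for example "Gi"
    local raw=${size%??}     # everything before the last two characters
    case "${unit}" in
        Mi|Gi|Ti)
            # Only whole numbers are accepted, since the later arithmetic is integer-based.
            if [[ ! ${raw} =~ ^[0-9]+$ ]]; then
                echo "Unsupported size: ${size}" >&2
                return 1
            fi
            echo "${raw} ${unit}"
            ;;
        *)
            echo "Unknown unit in size: ${size}" >&2
            return 1
            ;;
    esac
}

# Example: read the two parts into separate variables.
read -r rawsize unitsize <<< "$(split_size 10Gi)"
echo "value=${rawsize} unit=${unitsize}"
```

Returning a non-zero status on an unknown unit lets the caller skip the patch step instead of sending an empty size to the API server.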
