ceph: auto grow OSDs size on PVCs #8078

Merged
merged 1 commit into from Sep 1, 2021
Member

@parth-gr parth-gr commented Jun 8, 2021

When an OSD reaches the OSD_NEARFULL state, we have to manually either increase the volume claim to provision more storage or add more OSDs to the cluster to keep it healthy.

  1. To increase the provisioned volume automatically (scale OSDs vertically):
    Added a script, auto-grow-storage.sh, which automatically increases the claim volume by the specified percentage growth rate.

  2. To increase the number of OSDs automatically (scale OSDs horizontally):
    Added a script, auto-grow-storage.sh, which automatically adds OSDs to a device set by the specified count.
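
As a rough illustration of the vertical case, growing a claim by a percentage is integer arithmetic with rounding up. This is a minimal sketch, not the code in auto-grow-storage.sh: the function name `grow_by_percent` is hypothetical, and it assumes sizes are whole numbers in a single unit.

```shell
# Hypothetical helper, not taken from auto-grow-storage.sh.
# Grows a whole-number size by a percentage, rounding up.
# Usage: grow_by_percent CURRENT_SIZE PERCENT
grow_by_percent() (
  current=$1
  percent=$2
  # Integer ceiling of current * (100 + percent) / 100.
  echo $(( (current * (100 + percent) + 99) / 100 ))
)

grow_by_percent 10 30   # prints 13
```

Rounding up keeps the requested size from staying unchanged when the percentage of a small volume is less than one unit.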

Closes: #6101
Signed-off-by: parth-gr <paarora@redhat.com>

Tested in the following scenarios:
For scaling vertically:
i) When the incremented PVC size exceeds the max size.
ii) When the max size is given in different units (Mi, Gi, Ti).
For scaling horizontally:
i) When the incremented count exceeds the disk limit.
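
The scenarios above come down to two checks: normalizing a size with a unit suffix, and capping a proposed value at a limit. The sketch below shows one way to do both. The helper names `to_mib` and `cap` are hypothetical, only the Mi, Gi, and Ti suffixes are handled, and it is not taken from the script under review.

```shell
# Hypothetical helpers, not taken from auto-grow-storage.sh.

# Convert a quantity such as 512Mi, 10Gi, or 2Ti to whole MiB.
to_mib() (
  value=$1
  case "$value" in
    *Mi) echo "${value%Mi}" ;;
    *Gi) echo $(( ${value%Gi} * 1024 )) ;;
    *Ti) echo $(( ${value%Ti} * 1024 * 1024 )) ;;
    *)   echo "unsupported unit: $value" >&2; exit 1 ;;
  esac
)

# Print the smaller of a proposed value and a limit.
# Works for sizes (in MiB) and for OSD counts alike.
cap() (
  if [ "$1" -gt "$2" ]; then echo "$2"; else echo "$1"; fi
)

# A proposed 13Gi claim with a 12Gi maximum is capped at 12Gi.
cap "$(to_mib 13Gi)" "$(to_mib 12Gi)"   # prints 12288
```

Comparing in one common unit avoids treating `2Ti` as smaller than `500Gi` because of the leading digits.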

Description of your changes:

Which issue is resolved by this Pull Request:
Resolves #

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Skip Tests for Docs: Add the flag for skipping the build if this is only a documentation change. See here for the flag.
  • Skip Unrelated Tests: Add a flag to run tests for a specific storage provider. See test options.
  • Reviewed the developer guide on Submitting a Pull Request
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.
  • Pending release notes updated with breaking and/or notable changes, if necessary.
  • Upgrade from previous release is tested and upgrade user guide is updated, if necessary.
  • Code generation (make codegen) has been run to update object specifications, if necessary.

@mergify mergify bot added the ceph main ceph tag label Jun 8, 2021
@parth-gr parth-gr added ceph-osd WIP Work in Progress and removed ceph main ceph tag labels Jun 8, 2021
@parth-gr parth-gr marked this pull request as draft June 8, 2021 15:11
@parth-gr parth-gr force-pushed the osd-size branch 6 times, most recently from 0a12d91 to b61fb7c Compare June 15, 2021 15:09
@parth-gr parth-gr marked this pull request as ready for review June 15, 2021 15:09
Documentation/ceph-advanced-configuration.md Outdated
Comment on lines 609 to 616
For ex: If you need to increase the size of OSD by 30%
```console
Enter the growth rate percentage of OSD:
30
```
Member

Why does the script take interactive input instead of flags or arguments, which would allow it to be used in an automated setup? With prompts, Prometheus won't be able to take actions automatically.

Member Author

The script prompts for input only once, at startup (to get the growth percentage or count). After that, it triggers actions automatically based on the Prometheus alert, checking every 5 minutes.
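
The "read once, then poll" pattern described in this reply could look roughly like the sketch below. It is an illustration only: `run_every` and `check_alerts_and_grow` are hypothetical names, and the actual loop in the script may be structured differently.

```shell
# Hypothetical helper, not taken from auto-grow-storage.sh.
# Usage: run_every INTERVAL_SECONDS ITERATIONS COMMAND [ARGS...]
# Runs COMMAND, sleeps INTERVAL_SECONDS, and repeats.
# ITERATIONS=0 means loop until interrupted.
run_every() (
  interval=$1
  iterations=$2
  shift 2
  n=0
  while [ "$iterations" -eq 0 ] || [ "$n" -lt "$iterations" ]; do
    # A failed check is reported but does not stop the loop.
    "$@" || echo "check failed; retrying after ${interval}s" >&2
    n=$((n + 1))
    sleep "$interval"
  done
)

# Example (not executed here): poll every 5 minutes, forever.
# check_alerts_and_grow is a placeholder for the alert-handling step.
#   run_every 300 0 check_alerts_and_grow
```

Taking the interval and command as arguments keeps the loop testable with a short interval and a fixed iteration count.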

Documentation/ceph-advanced-configuration.md Outdated
```console
./increase-size-of-osd.sh size
```

After running the script it will ask for the `growth rate percentage`.
Member

Instead of prompting for the arguments after starting the script, we should read the arguments to the script. For example:

./increase-size-of-osd.sh count --max 12 --growth-rate 3
OR
./increase-size-of-osd.sh size --max 2Ti --growth-rate 10%
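
A minimal sketch of how the script could read these arguments is shown below. The mode names and the `--max` and `--growth-rate` flags follow the suggestion above; the function name `parse_args` and the variables `MODE`, `MAX`, and `GROWTH_RATE` are hypothetical, and the interface that was eventually merged may differ.

```shell
# Hypothetical argument parser, following the flags suggested in this review.
# Sets MODE (size|count), MAX, and GROWTH_RATE; returns non-zero on bad input.
parse_args() {
  MODE="" MAX="" GROWTH_RATE=""
  if [ $# -lt 1 ]; then
    echo "usage: <size|count> --max VALUE --growth-rate VALUE" >&2
    return 1
  fi
  case "$1" in
    size|count) MODE=$1 ;;
    *) echo "unknown mode: $1" >&2; return 1 ;;
  esac
  shift
  while [ $# -gt 0 ]; do
    case "$1" in
      --max)
        [ $# -ge 2 ] || { echo "--max needs a value" >&2; return 1; }
        MAX=$2
        shift 2 ;;
      --growth-rate)
        [ $# -ge 2 ] || { echo "--growth-rate needs a value" >&2; return 1; }
        # Accept both "10" and "10%" by dropping a trailing percent sign.
        GROWTH_RATE=${2%"%"}
        shift 2 ;;
      *) echo "unknown option: $1" >&2; return 1 ;;
    esac
  done
  [ -n "$MAX" ] && [ -n "$GROWTH_RATE" ] || {
    echo "both --max and --growth-rate are required" >&2
    return 1
  }
}

parse_args size --max 2Ti --growth-rate 10%
echo "$MODE $MAX $GROWTH_RATE"   # prints: size 2Ti 10
```

Reading everything from the command line lets the script be started from a unit file or a job without anyone answering prompts.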

tests/scripts/auto-grow-storage.sh Outdated
i=0
alertmanagerroute=$(oc get route -n openshift-monitoring | grep alertmanager-main | awk '{ print $2 }')
curl -sk -H "Authorization: Bearer $(oc sa get-token prometheus-k8s -n openshift-monitoring)" https://${alertmanagerroute}/api/v1/alerts | jq -r '.' >./tt.txt
export total_alerts=$(cat ./tt.txt | jq '.data | length')
Member

Suggested change
- export total_alerts=$(cat ./tt.txt | jq '.data | length')
+ export total_alerts=$(jq '.data | length' < tt.txt )

cat's sole purpose is to concatenate files :)

Member Author

okay

@parth-gr parth-gr force-pushed the osd-size branch 3 times, most recently from 6793874 to a4ef97c Compare June 23, 2021 08:48
@parth-gr parth-gr added feature and removed WIP Work in Progress labels Jun 23, 2021
@parth-gr parth-gr requested review from travisn and leseb June 23, 2021 08:49
Member

@travisn travisn left a comment

A few doc suggestions...

@parth-gr parth-gr force-pushed the osd-size branch 2 times, most recently from 60b4670 to a195977 Compare August 19, 2021 12:37
@parth-gr parth-gr force-pushed the osd-size branch 2 times, most recently from 6653a9e to c2f3613 Compare August 19, 2021 13:31
Contributor

@subhamkrai subhamkrai left a comment

LGTM

When an OSD reaches OSD_NEARFULL state,
we have to manually increase the PVC volume claim
or manually increase the count of OSDs in the device set

Added a script auto-grow-storage.sh which will
i) automatically increase claim volume
ii) automatically add number of OSDs

Closes: rook#6101
Signed-off-by: parth-gr <paarora@redhat.com>
@travisn travisn merged commit 1632af9 into rook:master Sep 1, 2021
mergify bot added a commit that referenced this pull request Sep 1, 2021
ceph: auto grow OSDs size on PVCs (backport #8078)
Development

Successfully merging this pull request may close these issues.

Allow to auto grow OSDs on PVCs
6 participants