osd: Prepare job needs significant more memory for provisioning #11103

Merged
merged 1 commit into from
Oct 5, 2022

Conversation

travisn
Member

@travisn travisn commented Oct 5, 2022

Description of your changes:
OSD creation may need to burst its memory usage during provisioning, depending on the size of the OSD or similar factors. If the OSD prepare job is OOM-killed, provisioning fails, with side effects that make it difficult to troubleshoot and get the OSD created successfully. So we significantly increase the recommended memory to avoid the OOM kill.

Which issue is resolved by this Pull Request:
Resolves #10219

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Skip Tests for Docs: If this is only a documentation change, add the label skip-ci on the PR.
  • Reviewed the developer guide on Submitting a Pull Request
  • Pending release notes updated with breaking and/or notable changes for the next minor release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

The OSD creation may need to burst during OSD provisioning depending
on the size of the OSD or similar factors. If the OSD prepare job is
OOM killed it will cause OSD provisioning to fail and have various
side effects that are difficult to troubleshoot to get the OSD
to succeed. So we increase the recommendation significantly to avoid
the OOM kill.

Signed-off-by: Travis Nielsen <tnielsen@redhat.com>
@satoru-takeuchi satoru-takeuchi merged commit 259cf39 into rook:master Oct 5, 2022
mergify bot added a commit that referenced this pull request Oct 5, 2022
osd: Prepare job needs significant more memory for provisioning (backport #11103)
@travisn travisn deleted the osdprepare-resources branch October 5, 2022 19:46
@rajha-korithrien

rajha-korithrien commented Oct 5, 2022

All,

Apologies for arriving late, after this has already been merged. I just tested the value of 1200Mi (which is what this PR sets as the default resource limit for the OSD prepare job), and it is not sufficient for the bluestore formatting process to succeed on a 22Ti volume. It fails with this error:

RuntimeError: Command failed with exit code 250: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 2 --monmap /var/lib/ceph/osd/ceph-2/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-2/ --osd-uuid ac7ed11d-f2bd-4020-aaad-c3e9f58e52e2 --setuser ceph --setgroup ceph

And the Job is killed by the OOMKiller.

Further testing suggests that 1200Mi allows OSDs of 15Ti to be prepared correctly, but not much larger: I tested 18Ti and it fails. Given that 20Ti drives are just starting to become commonplace, is 1200Mi enough? I think it is certainly a "sane default".

The note/hint about OSD prepare being potentially killed is useful. Perhaps a sentence could be added that users may need to increase this value if they have "large" volumes to prepare and they don't get OSD pods as expected.

@travisn
Member Author

travisn commented Oct 5, 2022

@rajha-korithrien Thanks for your observations! We could keep raising the limit to something like 2Gi. But I'm wondering if we should just not specify the memory limits for the osd prepare. It's a one-time action and we don't want memory to prevent the creation. Is there really any reason to apply limits to it? @satoru-takeuchi @kfox1111 thoughts?
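To put the two values discussed so far side by side, here is a minimal sketch that converts Kubernetes-style memory quantities to bytes. The helper name `quantity_to_bytes` is hypothetical, and the sketch only handles whole numbers with binary suffixes (`Ki`, `Mi`, `Gi`, `Ti`), not the full Kubernetes quantity grammar.

```python
# Binary suffixes used by Kubernetes resource quantities (powers of 1024).
# Assumption: only integer values with these suffixes are supported here.
_BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}


def quantity_to_bytes(quantity: str) -> int:
    """Convert a quantity such as '1200Mi' or '2Gi' to a byte count."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    # No recognized suffix: treat the value as a plain byte count.
    return int(quantity)


merged_limit = quantity_to_bytes("1200Mi")   # the value merged in this PR
proposed_limit = quantity_to_bytes("2Gi")    # the value floated above
increase_ratio = proposed_limit / merged_limit

print(f"{merged_limit} -> {proposed_limit} bytes ({increase_ratio:.2f}x)")
```

The ratio only compares the two limits; it says nothing about how much memory a given OSD size actually needs, which is the open question in this thread.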

@satoru-takeuchi
Member

@travisn Now we know it's difficult to estimate the proper memory limit. So not setting a memory limit by default, and describing this behavior in ceph-common-issues.md, is a reasonable solution for now.

travisn added a commit that referenced this pull request Oct 5, 2022
osd: Prepare job needs significant more memory for provisioning (backport #11103)
@kfox1111

kfox1111 commented Oct 6, 2022

Yeah, maybe no limit would be better... though I think someone said on Slack that they had a prepare push over a running OSD... so maybe a limit does help. It's kind of unclear. It's messy to clean up after a failed one, so reserving more than enough memory, rather than limiting it, may be a good default, and then if it's too much, they can always tweak it down?

@travisn
Member Author

travisn commented Oct 6, 2022

Like you said, it's messy to clean up if it gets in a failed state. So allowing it to run unconstrained to completion will be best. If it causes other pods to fall over, there would be a hiccup, but they should recover after the pod comes back up again.

It's still possible to set limits or different requests if desired; for the defaults, we just need to leave it unconstrained.
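For readers who want to see what "requests without limits" could look like in a cluster spec, here is a rough sketch that builds such a fragment as a Python dict and prints it as JSON (which is also valid YAML). It assumes the prepare job's resources live under `spec.resources.prepareosd` in the CephCluster resource; check that key against the Rook documentation for your version. The function name and the `2Gi` value are illustrative placeholders, not tested recommendations.

```python
import json
from typing import Optional


def prepareosd_resources(memory_request: str,
                         memory_limit: Optional[str] = None) -> dict:
    """Build a CephCluster spec fragment for the OSD prepare job.

    Assumption: the key path spec.resources.prepareosd matches the
    CephCluster schema in use. When memory_limit is None, no 'limits'
    section is emitted, so the job reserves memory but is not capped.
    """
    resources = {"requests": {"memory": memory_request}}
    if memory_limit is not None:
        resources["limits"] = {"memory": memory_limit}
    return {"spec": {"resources": {"prepareosd": resources}}}


# Placeholder value: reserve memory for the prepare job, set no limit.
unconstrained = prepareosd_resources("2Gi")
print(json.dumps(unconstrained, indent=2))
```

Passing a second argument, for example `prepareosd_resources("2Gi", "4Gi")`, adds an explicit limit for clusters where a runaway prepare job must not starve running OSDs.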

HoKim98 added a commit to SmartX-Team/OpenARK that referenced this pull request Oct 8, 2022

Successfully merging this pull request may close these issues.

OSD Prepare fails due to "unparsable uuid"