IBM Cloud image variant should be 120GB #931

rvanderp3 · 2021-08-23T14:52:51Z

Describe the bug
Requesting the IBM Cloud image variant virtual size be 120GB to match the minimum requirement mentioned in the OpenShift and OKD documentation.

Reproduction steps
Steps to reproduce the behavior:
1.
2.
3.

Expected behavior
The image should be 120GB to match the OpenShift/OKD minimum requirements.

Actual behavior
The image size is 100GB per coreos/coreos-assembler#2041

System details

Fedora CoreOS
RH CoreOS

Ignition config

Additional information

cgwalters · 2021-08-23T15:19:48Z

To clarify, is this somehow specific to OpenShift, or is it really "the default IBM Cloud guidance is bumped to 120G" (e.g. traditional Fedora Cloud, Ubuntu etc. should also bump to 120G)?

rvanderp3 · 2021-08-23T15:29:25Z

> To clarify, is this somehow specific to OpenShift, or is it really "the default IBM Cloud guidance is bumped to 120G" (e.g. traditional Fedora Cloud, Ubuntu etc. should also bump to 120G)?

This is scoped to OpenShift nodes provisioned in IBM Cloud that adhere to the OpenShift minimum requirements.

cgwalters · 2021-08-23T16:43:21Z

Hmm. I guess I'd repeat my argument from here then: coreos/coreos-assembler#2041 (comment)

But I think in the future, if someone comes along and says "It really should be 150GB" or whatever, we should instead tell them "configure it at the infrastructure side" - the cloud should have that knob, and the tooling they use to provision the cloud (Terraform/Ansible/openshift-machine-api-operator/etc.) should support it.

(The future is now)

So, I think the better approach is to change openshift-install to configure this.

Ultimately IMO, there is no single "right" disk size, in the same way there's not a single RAM size or number of vCPUs. People will want to run smaller clusters, and tune things down. Or, for larger clusters they want things bigger.

And the right place to configure these sizes is in something like openshift-install's install-config.yaml, or equivalent for UPI. And for non-OpenShift FCOS use cases, whatever they use to provision nodes, e.g. scripting a CLI like aws or Terraform/Ansible/whatever.

jeffnowicki · 2021-08-24T16:26:15Z

I would assert that the value should at least meet RH OCP minimum storage requirements. Not suggesting that we keep 'bumping' it up. Rather, the value should at least align with RH requirements.

https://docs.openshift.com/container-platform/4.8/installing/installing_bare_metal/installing-bare-metal.html#minimum-resource-requirements_installing-bare-metal

100% agree that if the user wants a larger boot volume size, that could perhaps be an installer capability. That being said, the 'default' should at least meet RH OCP requirements. IBM Cloud VPC has recently added support to parse that value and provision the boot volume size accordingly.

bgilbert · 2021-08-24T16:34:08Z

Fedora CoreOS has use cases beyond OCP/OKD, and those may need far less space than an OpenShift cluster. In general, we ship minimum-size images and encourage users to size their boot disk to meet their needs. IBM Cloud documents a minimum size of 100 GB, so that's what we ship on that platform.

jeffnowicki · 2021-08-24T16:37:08Z

Interesting... the installation docs for IBM Power Systems state 120 GB: https://docs.openshift.com/container-platform/4.8/installing/installing_ibm_power/installing-ibm-power.html#minimum-resource-requirements_installing-ibm-power

jeffnowicki · 2021-08-24T16:48:38Z

The motivation behind this is supporting OCP IPI on IBM Cloud. In general, the minimum storage requirements (from RH) are stated to be 120 GB - https://docs.openshift.com/container-platform/4.8/installing/installing_platform_agnostic/installing-platform-agnostic.html#minimum-resource-requirements_installing-platform-agnostic

@bgilbert thanks for the IBM Cloud reference... and as you stated, IBM Cloud reference indicates a different minimum for a generic custom Linux image (which may not be appropriate for RHCOS).

In the end, I'm looking for a 'reconciliation' such that whatever the value is, both RH and IBM 'officially' support it.

I don't want to see an issue arise where RH engineers tell a client that their deployment is not supported (by RH) because the boot volume size does not meet the RH minimum requirement.

Prashanth684 · 2021-08-24T16:50:25Z

Honestly, that 120GB number was used way back when we did UPI installations on bare metal systems. I am not sure where that number came from, but in all our testing I have never seen even half of it being used. I guess it would be justified in cases where logs fill up the disk, etc., but normally I haven't seen a need for 120G.

Also in case of baremetal deploys, the partition is grown dynamically during the install and the metal image itself is not sized for 120G.

bgilbert · 2021-08-24T16:54:49Z

@jeffnowicki It is possible that RHEL CoreOS images should have a different default, since those images are exclusively intended to support OCP. (Though, as @cgwalters said in #931 (comment), it would be better for OCP to size the disk appropriately at provisioning time.) However, IMO that's a separate discussion from what Fedora CoreOS should ship.

relyt0925 · 2021-08-24T19:03:26Z

So I think the only confusion here is that we need a statement of support from OpenShift that, at a minimum, 100GB boot disks are supported on IBM Cloud. We already use this today across our OpenShift offerings, so that shouldn't be a problem. To be specific, these doc references need to change:
https://docs.openshift.com/container-platform/4.8/installing/installing_bare_metal/installing-bare-metal.html#minimum-resource-requirements_installing-bare-metal

If they cannot be changed, it makes sense to me for this image to be baked by default at the size OCP supports. But I think getting the docs updated with a caveat for IBM at a minimum will do the trick.

relyt0925 · 2021-08-24T19:08:25Z

Note that we already do this across all IBM Cloud OpenShift offerings today, in case there are concerns about potential impacts.

cgwalters · 2021-08-24T21:20:20Z

Hmm. In the IBM Cloud VPC console, it isn't letting me change the 100G size for the default CentOS image. Is that intentional?

It looks to me like the API supports it in https://cloud.ibm.com/apidocs/vpc#create-instance with volume_attachments? Or is that only for secondary volumes?

IOW if automation (e.g. custom Terraform/Ansible or openshift-installer IPI) wanted to change the default image size from 100G, it'd need to create a copy of that disk at the desired size, and then use it for instances?

relyt0925 · 2021-08-25T01:38:14Z

It's detected from the custom image that gets imported, @cgwalters. You just need to build a custom image with a 120 GB boot disk:
1. Resize the image (qemu-img resize PATH 120G)
2. Push it to COS
3. Create an image pointing to COS
4. Boot a machine with the image
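
For anyone scripting the resize step above, here is a minimal sketch in Python. It assumes qemu-img is installed locally; the helper names (build_resize_cmd, meets_minimum, resize_image) are hypothetical and only for illustration. Note that qemu-img takes the size as a single token such as 120G, where G means GiB. Uploading to COS and creating the image from it are not shown.

```python
import json
import subprocess

GIB = 1024 ** 3  # qemu-img's "G" suffix is a power-of-1024 unit


def build_resize_cmd(image_path: str, size_gib: int) -> list[str]:
    """Return the argv that sets the image's virtual size to size_gib GiB."""
    return ["qemu-img", "resize", image_path, f"{size_gib}G"]


def meets_minimum(info_json: str, minimum_gib: int) -> bool:
    """Check the "virtual-size" field (in bytes) of `qemu-img info --output=json`."""
    virtual_size = json.loads(info_json)["virtual-size"]
    return virtual_size >= minimum_gib * GIB


def resize_image(image_path: str, size_gib: int) -> None:
    """Resize the image in place, then confirm the new virtual size."""
    subprocess.run(build_resize_cmd(image_path, size_gib), check=True)
    info = subprocess.run(
        ["qemu-img", "info", "--output=json", image_path],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    if not meets_minimum(info, size_gib):
        raise RuntimeError(f"{image_path} is still smaller than {size_gib} GiB")
```

Calling resize_image("PATH", 120) would run the same command as in the steps above and fail loudly if the virtual size did not change. Only the virtual size grows; the filesystem inside the image is not touched by this step.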

travier · 2021-08-25T09:21:40Z

Fedora CoreOS has use cases beyond OpenShift/OKD so the default image size is not controlled by the OCP/OKD requirements. I would argue that we should not bump the default image size beyond the minimum supported by the platform or we would create unnecessary costs for non OCP/OKD users.

I find it really strange that there would be no feature in the IBM Cloud to specify a different size for the boot disk when importing an image. If this is really the case then this should really be taken to them to discuss as I don't think we will be the only ones impacted by this issue: anyone else using images from other distributions would be impacted too.

miabbott · 2021-08-25T16:01:33Z

The conversation seems to be pointing to an update to the OCP docs that either reduces the documented minimum to 100GB for all platforms or specifically notes that the 100GB minimum applies to IBM Cloud.

I've created an issue in the OCP docs repo (openshift/openshift-docs#35793) for further discussion.

If we don't believe any changes will be made to the disk image for FCOS/RHCOS, I believe we can close this issue.

cgwalters · 2021-08-25T16:16:04Z

> It is possible that RHEL CoreOS images should have a different default, since those images are exclusively intended to support OCP.

I'd personally like to keep FCOS and RHCOS aligned though, since small deltas like this add up to maintenance pain.

cgwalters · 2021-08-25T16:49:55Z

I have a related question, and this is probably a good place to ask.

Basically, would it be fair to say that in IBM Cloud VPC, it's expected that most systems use the boot disk just for "operating system stuff (binaries, journal, etc.)", and e.g. if you want to add something like database storage, then it's effectively required to create a separate block device and set it up as a separate filesystem mount in the OS?

Obviously in other IaaS clouds (GCP/AWS/etc.) this can also be a good pattern, but it's not really required because they make it easy to have a nearly arbitrary size for the root volume. And in some cases, clearly "lifecycle binding" this data and the OS is more ergonomic (e.g. you don't want the data to outlive the instance).

However, there are advantages to such a split because e.g. one can allocate block devices with different levels of performance for such data. In OpenShift unfortunately our support for splitting "OS stuff" into separate block devices is mediocre. It's absolutely supported to do so for e.g. /var/lib/containers, but e.g. openshift/machine-config-operator#1720 makes it more painful than it has to be.

dustymabe · 2021-09-17T21:36:19Z

> The conversation seems to be pointing to an update to the OCP docs that reduces the documented minimums to 100GB for all platforms or perhaps specially note that the 100GB minimum applies to IBM Cloud.
>
> I've created an issue in the OCP docs repo (openshift/openshift-docs#35793) for further discussion.

Fixed in openshift/openshift-docs#36226

> If we don't believe any changes will be made to the disk image for FCOS/RHCOS, I believe we can close this issue.

👍

rvanderp3 added the kind/bug label Aug 23, 2021

dustymabe added the meeting topics for meetings label Aug 23, 2021

jlebon removed the meeting topics for meetings label Aug 25, 2021

miabbott mentioned this issue Aug 25, 2021

change required minimum disk size from 120GB to 100GB openshift/openshift-docs#35793

Closed

dustymabe closed this as completed Sep 17, 2021
