New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
baremetal: Default to emptyDir for the baremetal platform. #332
baremetal: Default to emptyDir for the baremetal platform. #332
Conversation
emptydir never makes sense as a default for any production deployment. If the customer installs this way and pushes a single image to the registry before they fix the storage to something "real" they will make a mess of their cluster. We need to discuss alternatives. |
any suggestions? :-) what do the docs for bare metal UPI clusters say to do? |
basically, set up a PVC and use that, right? Can we put this default in, just so the install completes cleanly, but then the install script for the KNI clusters will immediately reconfigure this to use a PVC using rook? These KNI clusters with rook are the only place where the |
The typical recommendation for baremetal is to use the PVC storage config, which implies separately provisioning a storage solution like ceph or NFS |
I've filed this issue to ensure that the KNI install scripts immediately reconfigure the registry to use ceph once OCS has been deployed: openshift-kni/install-scripts#5 so this PR would just be to ensure would a comment in the code about this help? |
Will that be in place in GA, or that's something that would be done in a future version?
That's ok for RHHI installs, but i imagine other installs will also see the platform as "baremetal" and use emptydir and they won't have KNI install scripts to fix it up after the install completes, right?
definitely, but i still have the above concern about this approach impacting non-RHHI baremetal installs. |
can we uniquely identify RHHI installs somehow? |
Can we please just fix the registry to create a PVC and not block upgrade if the PVC is never satisfied? Fire an alert, but don't block upgrades. |
We can make it a GA blocker. The
That's true. RHHI is the only environment where we support this
Fair concern - hopefully my comment above helps, namely that there's no non-RHHI |
The "baremetal" platform includes automated provisioning of bare metal hosts. emptyDir makes sense as the default, as we can't assume anything else is available. It's possible that for a given deployment, some storage system may be available (such as rook / ceph) and the cluster may be configured to use something else.
ade7d17
to
db50d1f
Compare
Having OCS available isn't sufficient for this behavior, the point is that we need to not be using emptydir. If you're saying that in all baremetal environments we will swap the storage to PVs post-install, then ok I guess. It's still a bit scary since we're dependent on some external bit doing what it's supposed to or else the admin ends up w/ an emptydir registry and doesn't realize it.
I guess we can cross that bridge when we come to it. I'd still prefer something better, but it's something.
It does help, thanks. I'm ok w/ proceeding with this (using emptydir on "baremetal providers" for now given that). Just to confirm, this "baremetal" provider is distinct from the install config the registry operator will see in a true baremetal (not managed) install, correct? We definitely not want this logic to cause the registry to run emptydir in an unmanaged baremetal installation today. |
That's right. This will not affect UPI bare metal installs. This only applies to bare metal installs where we've enabled the extra automated provisioning and management of bare metal hosts, or the "baremetal" IPI platform for lack of a better way to refer to it. The only place that's used right now is RHHI. I would definitely like to figure out a way to make this use a rook/ceph PV automatically, instead of relying on a post-install script. Maybe that's something we can figure out before GA. For 4.2 this will just be a limited beta thing, so the post-install script seems OK. Any suggestions for improvements post-4.2 definitely welcome! |
/test e2e-aws |
@russellb i'm not sure who the right set of people are, but i mentioned to @derekwaynecarr and @adambkaplan that it would be good to rename the "baremetal" provider to something more specific to RHHI, so that if a future baremetal provider wants to do something else, the registry operator can distinguish providers and do the right thing. (alternatively if there is something else in the install config that the registry operator can look at to make the distinction, it would be good to key off that). Do you know who we'd need to involve in such a rename discussion? |
I'm probably the best person to coordinate a rename if necessary. Changing it in 4.3 is probably doable, but after that it's going to be less practical to change. This isn't RHHI specific, though. Architecturally, it's IPI for bare metal hosts. RHHI is just one particular way it's productized, but there could be others in the future. Anything we can do to extend and improve the |
The case that I made to @derekwaynecarr (not sure if he agreed and perhaps i'm missing something about the baremetal IPI intent) is that you could have another baremetal IPI provider that wants, for example, to use noobaa, or some other s3 gateway instead of OCS for registry storage. If we are assuming/requiring that every baremetal IPI installer will
then it's fine. But if we think there's any reason the storage behavior or install flow might be different for another IPI provider, we will need a way to distinguish. |
We're not actually assuming anything, that's why the patch defaults to emptyDir, and expects that this must be dealt with post-install. That leaves all of those options, depending on how the bare metal platform is used and combined with different storage options. |
CC @eparis -- you may be interested in this discussion |
that is an assumption though, right? you are assuming it will be dealt with post-install. I guess i shouldn't have said "assumes OCS/PVC". But what i should have said is: this is ok if it is always safe to assume that for any baremetal IPI implementation, the post-install steps will always update the storage configuration. |
@russellb would it make more sense for us to create a PVC, and require a baremetal "provider" have an automatic storage provisioner? |
the problem there is that the provisioner is OCS, but OCS won't be installed until after the basic install is complete, and the basic install can't complete until the registry comes up. |
Can the installer be enhanced so that for It would then be up to the post-install script to do the following:
|
i don't think that is a path that exists in the installer today and i think there would be significant pushback on adding it. The installer lays down the manifests as provided, it's not going to customize specific manifests based on provider. |
Instead of handling PV config post-install, can we feed the installer some manifests to set up PV during the IPI-ish install? Users who went that route could configure the registry to use whatever storage they were configuring, and install would go smoothly and end up with a happy, functional, production registry. Users who did not add additional manifests would have install hang and exit "the registry never got happy", and folks could investigate the registry, see that it was waiting on storage, and do whatever day-2 stuff they needed to fix it up then. Or is that not feasible for some reason (maybe you need something more interactive to set up PV, and can't just drop in a bunch of manifests into the installer)? |
@bparees is this something that should already work today? or something that requires more work? Are there other relevant issues / PRs i should watch? |
i'm not sure the current or intended state at this point. @dmage @adambkaplan |
OK - I need to unblock this somehow for 4.2, either by merging this PR (and following up in future releases with improvements) or by confirming that the discussed alternative behavior is ready (ability to install without a registry, and allowing it to be configured post-install instead). |
I could live w/ merging this and following up w/ the change to keep the registry in a removed state initially instead. but it is @adambkaplan's call. |
AFAIK we already declared baremetal support in 4.1. So even if this change is indented to fix the problem in 4.3 for RHHI, it can affect other baremetal (UPI) customers. How can we make sure that customers switch the registry to some kind of persistent storage before using it? |
Note that this patch has no effect on UPI bare metal deployments. It only affects bare metal IPI. |
@russellb what sort of support/guarantee are we giving for baremetal IPI installs? I'd say no to this if this is intended for GA code. However, I'd be OK if we considered baremetal IPI developer preview. |
didn't we previously establish that for 4.2, RHHI is the only supported baremetal IPI install? That being the case, we can have confidence that the storage will be "made correct" by the RHHI install process. |
That is correct. |
IPI vs UPI has no meaning here, what does matter is that when people perform a baremetal UPI install they're doing so by selecting |
@sdodson yes, that confused me a bit, for RHHI.next we use |
Architecturally it’s the bare metal platform for infrastructure automation. RHHI is just a product delivery that uses it, among other things. It could also be generally supported in OpenShift some day in the future. Make sense? |
/approve Re-hashing this thread for clarity: For RHHI the installation will be We are considering in 4.3 changing this process so that for @russellb can you please verify that a) my understanding of the process for 4.2 is correct, and b) if the proposed changes for 4.3 make sense. |
Yes.
It makes sense, though I haven't looked at what it will take. Is coming up as |
The registry can be set to DevEx is tracking this in our 4.3 planning - enhancement proposal and JIRA story is forthcoming. |
I've filed #375 for the TODO item to change this once the registry operator can come up with the registry set to Removed. |
@dmage want your take here - I'm OK merging this now and bringing the registry up as |
Hopefully we'll have capacity to improve it in 4.3. /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: adambkaplan, dmage, russellb The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest Please review the full test history for this PR and help us cut down flakes. |
2 similar comments
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
The cluster-image-registry-operator will now use emptyDir by default for the "baremetal" infrastructure platform, so this manifest is no longer needed. openshift/cluster-image-registry-operator#332 Note this is still not a supported configuration, and is only used to bring up the cluster. It should be immediately reconfigured to use a PVC once storage has been set up. For more info on that post-install task, see: openshift-kni/install-scripts#5
The cluster-image-registry-operator will now use emptyDir by default for the "baremetal" infrastructure platform, so this manifest is no longer needed. openshift/cluster-image-registry-operator#332 Note this is still not a supported configuration, and is only used to bring up the cluster. It should be immediately reconfigured to use a PVC once storage has been set up. For more info on that post-install task, see: openshift-kni/install-scripts#5 Closes openshift-metal3#704
The "baremetal" platform includes automated provisioning of bare metal
hosts. emptyDir makes sense as the default, as we can't assume
anything else is available. It's possible that for a given
deployment, some storage system may be available (such as rook / ceph)
and the cluster may be configured to use something else.